Substitution Cipher in JavaScript

Throughout history, keeping messages private has been important. The most widely known example is Caesar’s cipher. Julius Caesar, that Roman general you may have heard of in history class, had many enemies and wanted some of his messages kept secret, so he used a cipher that shifts the alphabet by a fixed number of characters. Replacing characters to make text unreadable to humans is called a substitution cipher.

ABCDEF      HELLO WORLD
vvvvvv  so  vvvvv vvvvv
XYZABC      EBIIL TLOIA

That’s pretty cute, but is it really safe? In Caesar’s day (100-44 BC), not many people were able to read in the first place. Those who could probably considered the text just gibberish, rather than encoded text. Fast forward just over 2000 years. Today, shifting letters in the alphabet is not considered safe. In fact, substitution ciphering (replacing characters with others) is not very common in cryptography at all, but it’s interesting, fun and educational nonetheless. Let’s just say substitution ciphers are rarely complex enough to trick a professional.

Can all this be done in JavaScript? It surely can. Let’s get our hands dirty!

It all starts with a namespace:

var Cipher = {};

Using a character map

When replacing characters with other characters, the first thing you may think of is creating a map with the original characters as keys and their replacements as values. That makes sense. Let’s make one!

var map = {
    a: 'q', b: 'w', c: 'e',
    d: 'r', e: 't', f: 'y',
    g: 'u', h: 'i', i: 'o',
    j: 'p', k: 'a', l: 's',
    m: 'd', n: 'f', o: 'g',
    p: 'h', q: 'j', r: 'k',
    s: 'l', t: 'z', u: 'x',
    v: 'c', w: 'v', x: 'b',
    y: 'n', z: 'm'
};
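
Typing a map like this by hand is error-prone. As a small sketch, the same map could be generated from two strings of equal length. The helper name buildMap is hypothetical and not part of the original code.

```javascript
// Hypothetical helper: pair up two equally long strings into a lookup map
function buildMap(from, to) {
  var map = {};
  for (var i = 0; i < from.length; i++) {
    map[from.charAt(i)] = to.charAt(i);
  }
  return map;
}

// The alphabet on one side, the keyboard rows on the other
var generated = buildMap(
  'abcdefghijklmnopqrstuvwxyz',
  'qwertyuiopasdfghjklzxcvbnm'
);

console.log(generated.a, generated.k, generated.z); // q a m
```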

All we need to do now is remove all characters that are not in the map, and replace the others.

// Convert to an array of characters, filter, iterate and finally join
text.split('').filter(function(v) {
    // Does the character exist in the map?
    return map.hasOwnProperty(v.toLowerCase());
}).map(function(v) {
    // Replace character by value
    return map[v.toLowerCase()].toUpperCase();
}).join(''); // An empty separator; the default would insert commas
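
One detail worth checking when gluing the characters back together: join() without an argument separates the items with commas, so an explicit empty string is needed to get a plain string. A quick illustration with a made-up array:

```javascript
var letters = ['I', 'T', 'S'];

console.log(letters.join());   // I,T,S
console.log(letters.join('')); // ITS
```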

Decoding time! We now have to do the exact opposite. Unfortunately, looking up a key by its value means scanning the whole map for every single character, which is wasteful. Instead, we flip the map once before iterating through the string.
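
As a rough sketch of the idea, flipping a map means building a second object whose keys are the old values. The helper name invertMap and the three-letter sample map are made up for illustration, and the approach assumes every value in the map is unique.

```javascript
// Hypothetical helper: swap keys and values of a one-to-one map
function invertMap(map) {
  var inverted = {};
  for (var key in map) {
    if (map.hasOwnProperty(key)) {
      inverted[map[key]] = key;
    }
  }
  return inverted;
}

var sample = { a: 'q', b: 'w', c: 'e' };
var flipped = invertMap(sample);

console.log(flipped.q, flipped.w, flipped.e); // a b c
```

After flipping once, decoding is the same direct lookup as encoding.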

Here’s the final function:

Cipher.toQWERTY = function(text, decode) {
    // ABCDEF to QWERTY map
    var map = {
        a: 'q', b: 'w', c: 'e',
        d: 'r', e: 't', f: 'y',
        g: 'u', h: 'i', i: 'o',
        j: 'p', k: 'a', l: 's',
        m: 'd', n: 'f', o: 'g',
        p: 'h', q: 'j', r: 'k',
        s: 'l', t: 'z', u: 'x',
        v: 'c', w: 'v', x: 'b',
        y: 'n', z: 'm'
    };

    // Flip the map
    if(decode) {
        map = (function() {
            var tmp = {};
            var k;

            // Populate the tmp variable
            for(k in map) {
                if(!map.hasOwnProperty(k)) continue;
                tmp[map[k]] = k;
            }

            return tmp;
        })();
    }

    return text.split('').filter(function(v) {
        // Filter out characters that are not in our list
        return map.hasOwnProperty(v.toLowerCase());
    }).map(function(v) {
        // Replace old character by new one
        // And make it uppercase to make it look fancier
        return map[v.toLowerCase()].toUpperCase();
    }).join('');
};

Usage:

var text = 'Hello World!';
var encoded = Cipher.toQWERTY(text); // ITSSGVGKSR
var decoded = Cipher.toQWERTY(encoded, true); // HELLOWORLD

In this cipher, we replaced the alphabet (abcdef) with our own (qwerty). If we keep our map and the original text secret, someone guessing blindly would need a very long time to work out how our cipher works.
Unfortunately, there are some technical limits to this method. What if you want to convert Chinese, Russian or Arabic messages? We can’t (or don’t want to) make a map listing all characters, right?

Disclaimer: Because I removed invalid characters and converted the remaining ones to upper-case, this algorithm technically is not a substitution cipher.
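
For comparison, here is a minimal sketch of a variant that keeps the case of each letter and leaves unmapped characters untouched, so the original text can be fully recovered. The function name substitute and the tiny maps are hypothetical and only cover the letters used in the example.

```javascript
// Hypothetical variant: keeps case and leaves unmapped characters untouched
function substitute(text, map) {
  return text.split('').map(function (ch) {
    var lower = ch.toLowerCase();
    // Spaces, digits and punctuation are passed through as-is
    if (!map.hasOwnProperty(lower)) return ch;
    var replaced = map[lower];
    // Restore upper-case if the original character was upper-case
    return ch === lower ? replaced : replaced.toUpperCase();
  }).join('');
}

// A tiny map and its inverse, only covering the letters used below
var tinyMap = { h: 'i', e: 't', l: 's', o: 'g' };
var tinyInverse = { i: 'h', t: 'e', s: 'l', g: 'o' };

var secret = substitute('Hello!', tinyMap);

console.log(secret);                          // Itssg!
console.log(substitute(secret, tinyInverse)); // Hello!
```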

Linear shifting

For our previous cipher, we replaced the alphabet (abcdef) with one of our own (qwerty). Unfortunately, reordering characters means listing all (wanted) characters. That’s troublesome when we want to support multiple languages. Instead, we can use an existing list that’s already on your computer: Unicode!

In this cipher, we want to shift (or rotate) the characters, similar to Caesar’s cipher. Remember how that works? Incrementing a letter turns it into the next letter in the alphabet. So basically we’re doing the exact same as Caesar, but our list is 65536 characters instead of 26.

Here is an example of shifting a string two places to the right:

ABC...XYZ
vvv   vvv
CDE...ZAB
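
To make the diagram concrete, here is a minimal sketch of the classic 26-letter shift. The function name caesarShift is hypothetical, and it only touches the upper-case letters A to Z.

```javascript
// Hypothetical helper: shift A-Z by `shift` positions, wrapping around
function caesarShift(text, shift) {
  var A = 'A'.charCodeAt(0);
  return text.replace(/[A-Z]/g, function (ch) {
    var index = ch.charCodeAt(0) - A;
    // The double modulo keeps the result in 0..25, even for negative shifts
    var shifted = ((index + shift) % 26 + 26) % 26;
    return String.fromCharCode(A + shifted);
  });
}

console.log(caesarShift('ABCXYZ', 2));       // CDEZAB
console.log(caesarShift('HELLO WORLD', -3)); // EBIIL TLOIA
```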

So can we increment a character by a number in JavaScript? Well, it’s not that simple. What we can do is convert a character to the code that represents that character in the Unicode table, and later convert it back to a character, like so:

var character = "p";
// Get code from character at index 0 (first character)
var code = character.charCodeAt(0); // 112
// Convert back to character
var char = String.fromCharCode(code); // "p"
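
Putting the two calls together, a single character can be moved one position up the table. This is only a small illustration with made-up variable names; note that adding a number directly to a string concatenates instead.

```javascript
var letter = 'p';

// Adding a number to a string concatenates, it does not shift
console.log(letter + 1); // p1

// Going through the character code does shift
var next = String.fromCharCode(letter.charCodeAt(0) + 1);
console.log(next); // q
```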

Finally, because we’re shifting back and forth, we have to loop the characters to stay within the limits of Unicode without data loss. When shifting one position to the right, Z would turn into A, and the opposite happens when decoding. That means we need a boundary, and we have to make sure any number beyond the boundary wraps all the way back to the start of the alphabet, or the Unicode table in this case.

var bound = 0x10000;

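Why this number? JavaScript strings are sequences of 16-bit UTF-16 code units, so a single character code ranges from 0 to 0xFFFF. Characters from 0x10000 upward have to be stored as surrogate pairs (two code units each), something JavaScript doesn’t handle very well. This is part of the realm of character encoding. Interesting stuff, but not necessary to understand for now.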
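
If you are curious anyway, the following sketch shows what happens around that limit. The emoji is just an arbitrary example of a character above 0xFFFF, written here as its two code units.

```javascript
var plain = 'p';
var emoji = '\uD83D\uDE00'; // U+1F600, written as its two UTF-16 code units

console.log(plain.length); // 1
console.log(emoji.length); // 2

// Each half of the pair has its own character code
console.log(emoji.charCodeAt(0).toString(16)); // d83d
console.log(emoji.charCodeAt(1).toString(16)); // de00

// String.fromCharCode only keeps the lowest 16 bits of each code
console.log(String.fromCharCode(0x10000 + 65) === 'A'); // true
```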

I can hear you thinking. “If a number is greater than, do…” Let me share a trick with you using the modulo (%) operator.

The modulo operator returns the remainder of the division of two numbers. That means that when the dividend reaches the divisor, the result loops back to 0 and starts all over again.

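a % n = a - (n × trunc(a / n))

Here trunc rounds toward zero, so in JavaScript a negative dividend gives a negative result. By adding the boundary to the dividend, we make sure that numbers below zero (for decoding) end up positive, as long as they are not below -bound.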
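
A quick check in the console shows why the extra addition matters: in JavaScript the result of % takes the sign of the dividend. The bound of 10 is only chosen to keep the numbers readable.

```javascript
var bound = 10;

console.log(-9 % bound);           // -9
console.log((-9 + bound) % bound); // 1
console.log((4 + bound) % bound);  // 4
```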

var num = (x + bound) % bound;

// For example, if bound = 10
// Within range: ( 4 + bound) = 14, 14 % bound = 4
// Beyond range: (16 + bound) = 26, 26 % bound = 6
// Before range: (-9 + bound) = 1, 1 % bound = 1

If you prefer if-statements over this, you should stick to that. I just fancy one-liners.
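
For those who do prefer if-statements, a rough equivalent could look like the sketch below. Both helper names are hypothetical, and both versions assume the input is no more than one bound away from the valid range.

```javascript
// The one-liner from above, wrapped in a hypothetical helper
function wrapWithModulo(x, bound) {
  return (x + bound) % bound;
}

// The same idea spelled out with if-statements
function wrapWithIf(x, bound) {
  if (x < 0) return x + bound;      // before range: move up
  if (x >= bound) return x - bound; // beyond range: move down
  return x;                         // within range: keep as-is
}

console.log(wrapWithModulo(16, 10), wrapWithIf(16, 10)); // 6 6
console.log(wrapWithModulo(-9, 10), wrapWithIf(-9, 10)); // 1 1
```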

First we need to make sure the number used for rotation (or shifting, or increment) is a safe number.

// Force the rotation to be an integer and within bounds, just to be safe
rotation = parseInt(rotation, 10) % bound;
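
To see what this normalisation does to different inputs, here is a small sketch. The helper name normalizeRotation is hypothetical, and falling back to 0 for unusable input is an assumption on my part, not something the original function does.

```javascript
var bound = 0x10000;

// Hypothetical helper: turn any input into an integer between -bound and bound
function normalizeRotation(rotation) {
  var parsed = parseInt(rotation, 10);
  // Assumption: treat input that is not a number as "no rotation"
  if (isNaN(parsed)) return 0;
  return parsed % bound;
}

console.log(normalizeRotation('325'));  // 325
console.log(normalizeRotation(3.9));    // 3
console.log(normalizeRotation(65537));  // 1
console.log(normalizeRotation(-65537)); // -1
console.log(normalizeRotation('abc'));  // 0
```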

After making sure the number is valid-ish, we can tell if we really need to iterate through the string.

// Might as well return the text if there's no change
if(rotation === 0) return text;

Character encoding time! We turn the string to an array, like last time. After that we iterate through the array, convert the characters to the corresponding codes and increment those.

// Turn string to character codes
text.split('').map(function(v) {
    // Return current character code + rotation
    return (v.charCodeAt() + rotation + bound) % bound;
})
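
Run on a short sample, this step produces a plain array of numbers. The sample string and the rotation of 1 are arbitrary choices for illustration.

```javascript
var bound = 0x10000;
var rotation = 1;

var codes = 'abc'.split('').map(function (v) {
  return (v.charCodeAt(0) + rotation + bound) % bound;
});

console.log(codes.join(', ')); // 98, 99, 100

// The very last code wraps around to the start of the table
console.log((0xFFFF + rotation + bound) % bound); // 0
```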

Finally, we turn that array of numbers back into a string using String.fromCharCode(code, ...). It accepts multiple arguments, so we can use .apply() to call it with an array of character codes. Here is the final function:

Cipher.rotate = function(text, rotation) {
    // Surrogate pair limit
    var bound = 0x10000;

    // Force the rotation to be an integer and within bounds, just to be safe
    rotation = parseInt(rotation, 10) % bound;

    // Might as well return the text if there's no change
    if(rotation === 0) return text;

    // Create string from character codes
    return String.fromCharCode.apply(null,
        // Turn string to character codes
        text.split('').map(function(v) {
            // Return current character code + rotation
            return (v.charCodeAt() + rotation + bound) % bound;
        })
    );
};

Usage:

var text = 'Hello world!';
var rotation = 325;

var encoded = Cipher.rotate(text, rotation); // ƍƪƱƱƴťƼƴƷƱƩŦ
var decoded = Cipher.rotate(encoded, -rotation); // Hello world!
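
One caveat with the .apply() call: every character code becomes a separate function argument, and engines limit how many arguments a call may have, so a very long text can fail. A possible workaround, sketched below with a hypothetical helper name and an arbitrary chunk size, is to build the string in pieces.

```javascript
// Hypothetical helper: build the string in chunks to stay well below
// any engine-specific limit on the number of arguments
function codesToString(codes) {
  var chunkSize = 8192; // arbitrary, conservative choice
  var result = '';
  for (var i = 0; i < codes.length; i += chunkSize) {
    result += String.fromCharCode.apply(null, codes.slice(i, i + chunkSize));
  }
  return result;
}

console.log(codesToString([72, 101, 108, 108, 111])); // Hello
```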

Although this method is great for supporting Unicode characters, its security is bad. Even if you keep the key (the amount of rotation) and the original message secret, finding out the key is a matter of a couple of iterations.
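
To illustrate how little effort that takes, here is a rough sketch of such an attack. It assumes the attacker can guess one word that appears in the message. The function names rotate and guessRotation are hypothetical; rotate is a compact stand-in for the cipher above.

```javascript
var BOUND = 0x10000;

// Compact stand-in for the rotation cipher
function rotate(text, rotation) {
  return text.split('').map(function (ch) {
    return String.fromCharCode((ch.charCodeAt(0) + rotation + BOUND) % BOUND);
  }).join('');
}

// Hypothetical attack: try every rotation until a known word shows up
function guessRotation(encoded, knownWord) {
  for (var r = 0; r < BOUND; r++) {
    if (rotate(encoded, -r).indexOf(knownWord) !== -1) return r;
  }
  return -1; // nothing found
}

var intercepted = rotate('Hello world!', 325);

console.log(guessRotation(intercepted, 'world')); // 325
```

At most 65536 attempts are needed, which a computer gets through almost instantly.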

Non-linear shifting

The previous method wasn’t safe, mainly because we are shifting the alphabet linearly. Instead, we need to come up with something more advanced. What if we shift every character differently? We could base that on several things, but why don’t we use a key this time?

Yes, a key. This could be a number, a string, even a blob of bits and bytes. Just like a real key that opens a lock, a key in cryptography is required to encrypt or decrypt a message. In cryptography, when the key to encrypt and decrypt a message is identical, we call this a symmetric key.

Here’s a basic example of how we can use a key to shift each character.

Message:   Hello World
Key:       12345 12345
           vvvvv vvvvv
Encrypted: Igopt Xqupi

I just told you that a key could be a number, but a string as well. Remember we can convert characters into numbers?

// Assuming a-z = 1-26
Message:   Hello World
Key:       ABCAB CABCA
           vvvvv vvvvv
Encrypted: Igomq Zptoe
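
The diagram can be reproduced with a small sketch. It stays within the 26-letter alphabet, counts A as 1, and skips over characters that are not letters without using up a key letter. The function name shiftByLetterKey is hypothetical and this is not the Unicode-based function built in this article.

```javascript
// Hypothetical helper: shift each letter by the matching key letter (A = 1)
function shiftByLetterKey(message, key) {
  var used = 0; // number of key letters consumed so far
  return message.replace(/[a-z]/gi, function (ch) {
    var base = ch === ch.toLowerCase() ? 97 : 65; // code of 'a' or 'A'
    var keyChar = key.charAt(used % key.length).toUpperCase();
    var shift = keyChar.charCodeAt(0) - 64; // A = 1, B = 2, ...
    used += 1;
    // Wrap around within the 26 letters
    return String.fromCharCode(base + (ch.charCodeAt(0) - base + shift) % 26);
  });
}

console.log(shiftByLetterKey('Hello World', 'ABC')); // Igomq Zptoe
```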

The only difference from our previous algorithm is how we define the rotation. Instead of having a static rotation, we have to create one based on the key. Let’s line up the key with our text, get the character at the same index, and get that character code.

var rotation = key[i].charCodeAt();

What if the key has fewer characters than the actual text? We can use the modulo operator here, remember? All we have to do is set a different bound, which is the number of characters in the key.

var rotation = key[i % key.length].charCodeAt();
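
As a quick illustration, the modulo makes the index cycle through the key over and over. The three-letter key is made up for the example. Note that an empty key would break this, since there is no character to pick.

```javascript
var key = 'abc';
var picked = [];

for (var i = 0; i < 7; i++) {
  // i runs from 0 to 6, the index into the key cycles 0, 1, 2, 0, 1, 2, 0
  picked.push(key[i % key.length]);
}

console.log(picked.join('')); // abcabca
```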

This may just work. All we have to do is flip this when we’re decrypting.

if(reverse) rotation = -rotation;

That’s basically it! Here’s the complete function:

// Non-linear unicode rotate
Cipher.keyRotate = function(text, key, reverse) {
    // Surrogate pair limit
    var bound = 0x10000;

    // Create string from character codes
    return String.fromCharCode.apply(null,
        // Turn string to character codes
        text.split('').map(function(v, i) {
            // Get rotation from key
            var rotation = key[i % key.length].charCodeAt();

            // Are we decrypting?
            if(reverse) rotation = -rotation;

            // Return current character code + rotation
            return (v.charCodeAt() + rotation + bound) % bound;
        })
    );
};

Usage:

var text = 'Hello world!';
var key = 'MySecretKey';

var encoded = Cipher.keyRotate(text, key); // "\u0095Þ¿ÑÒ\u0092Üã½ÑÝn" (\u0095 and \u0092 are unprintable control characters)
var decoded = Cipher.keyRotate(encoded, key, true); // Hello world!

This method is the safest of the three by far. If you keep the key secret, decrypting the message without it is much tougher.

A final word

All discussed algorithms are for educational purposes only. None of these should be used for real security. You can use them to send “secret” messages to your friend at most.

For real cryptography in JavaScript, I suggest using CryptoJS, which supports proven algorithms. It contains several one-way hashing and ciphering algorithms such as MD5, SHA, AES, DES, Rabbit and RC4.

Complete source code is available: cipher.js
