Cryptography for JavaScript/Node.js developers: Part 1 Hash Function

Narek Hovsepyan
7 min read, Feb 7, 2018

In this series of articles, we are going to explore ready-to-use cryptographic functions, along with a few definitions that help in understanding the user side of cryptography. We are going to use the crypto-js library published on npm instead of the Node.js built-in module. Why? It doesn’t really matter which library we choose: if you understand one, you will be able to use any library in any language. I chose crypto-js because it is written in pure JavaScript rather than as C++ bindings, so you can run it in any JavaScript environment. In later tutorials, once we are comfortable with crypto-js, we will introduce the bcrypt library for Node.js, which is the de facto standard, especially when dealing with passwords or other critical data.

In Part 1 we are going to look at hash functions. Hash functions are everywhere. A well-built website hashes your password before storing it in its database; website content is hashed before caching so that changes to the cached copy can be detected; your favorite torrent tracker provides hashes of the data so you can check locally that what you received was not modified.

But what is a hash function?

Let’s go a little bit into theory. A hash function is any function that takes input data of arbitrary size and produces output data of a fixed size. The returned value is usually called a hash code, a hash value, or simply a hash. By fixed size we mean that no matter what the input is, the hash function always returns data of the same size. You may think that you can hash numbers just by taking the remainder of division by 9, or hash strings just by taking their first character. Technically you can, but such functions make poor hash functions. Let’s go over the properties a hash function can have.

Deterministic

A hash function should always return the same hash code for a given input. In other words, it should not depend on any external variable or state: the hash code depends only on the input data. This property is called determinism. For example, const hash1 = (number) => number % 9 is deterministic, while const hash2 = (number) => number % 9 + Math.random() is not, because different calls of hash2 with the same input give different outputs. Sometimes a hash function receives an external parameter, for example const hash3 = (number, param) => number % param. In that case hash3 is still deterministic, because for the same input data and the same parameter it returns the same output.
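
To make the difference visible, here is a small sketch that calls each of the three toy functions repeatedly and checks whether the output ever changes. The isStable helper is a hypothetical name introduced only for this illustration.

```javascript
// Toy hash functions from the text (illustration only, not cryptographic).
const hash1 = (number) => number % 9;                  // deterministic
const hash2 = (number) => number % 9 + Math.random();  // not deterministic
const hash3 = (number, param) => number % param;       // deterministic for a fixed param

// Hypothetical helper: call fn repeatedly with the same arguments
// and report whether the output stays the same every time.
function isStable(fn, args, runs = 100) {
  const first = fn(...args);
  for (let i = 1; i < runs; i++) {
    if (fn(...args) !== first) return false;
  }
  return true;
}

console.log(isStable(hash1, [42]));    // true
console.log(isStable(hash2, [42]));    // false (Math.random() changes the result)
console.log(isStable(hash3, [42, 5])); // true
```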

Collision Free
As we said, the input can be data of arbitrary size, while the output has a fixed size. There is an unlimited number of possible inputs, but the output, having a fixed size, can take only a limited number of values. That means two different inputs can have the same hash code, and the hash function may still be a good one. For example, for hash1 from the previous section we have hash1(1) = hash1(10). For a good hash function it should be hard to find two different values a and b such that hash(a) = hash(b). Such values can always be found by brute force, but for some hash functions the search could take millions of years even on a modest cluster of supercomputers.

So let’s define the property of being collision free. We say that a hash function is collision free if it is safe to assume in practice that whenever hash(a) is equal to hash(b), a is equal to b.
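
The brute-force search mentioned above can be sketched in a few lines. The findCollision helper is a hypothetical name; it simply tries inputs one by one and remembers which hash codes it has already seen. For a toy function like hash1 it succeeds almost immediately, which is exactly why hash1 is not collision free.

```javascript
const hash1 = (number) => number % 9;

// Hypothetical helper: try inputs 0, 1, 2, ... and remember which input
// produced each hash code. The first repeated hash code is a collision.
function findCollision(hashFn, maxTries = 1000) {
  const seen = new Map(); // hash code -> first input that produced it
  for (let input = 0; input < maxTries; input++) {
    const code = hashFn(input);
    if (seen.has(code)) {
      return { a: seen.get(code), b: input, code };
    }
    seen.set(code, input);
  }
  return null; // no collision found within maxTries
}

console.log(findCollision(hash1)); // { a: 0, b: 9, code: 0 }
```

hash1 has only 9 possible outputs, so a collision is guaranteed within 10 attempts. A 256-bit hash has so many possible outputs that the same loop would not finish in any realistic amount of time.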

Non-invertible

The last important property is non-invertibility. It means that if you have a hash code, you cannot recover the original data, at least not without spending an enormous amount of computing resources. In other words, given a hash code h, you cannot find any value x such that hash(x) = h.
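
One caveat is worth a sketch: a hash is only hard to invert when the set of possible inputs is large. The example below assumes a hypothetical scenario in which the attacker knows the original data was a 4-digit PIN, and it uses Node's built-in crypto module (rather than crypto-js) only so that it runs without installing anything. The crackPin name is made up for this illustration.

```javascript
const crypto = require('crypto');

// SHA-256 of a string, returned as a hex string.
const sha256 = (text) =>
  crypto.createHash('sha256').update(text, 'utf8').digest('hex');

// Hypothetical attack: hash every possible 4-digit PIN and compare.
// The hash function is never "reversed"; the attacker just guesses all inputs.
function crackPin(targetHash) {
  for (let n = 0; n < 10000; n++) {
    const candidate = String(n).padStart(4, '0');
    if (sha256(candidate) === targetHash) return candidate;
  }
  return null; // the original data was not a 4-digit PIN
}

const leakedHash = sha256('4821');
console.log(crackPin(leakedHash));      // 4821
console.log(crackPin(sha256('hello'))); // null
```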

Other properties

There are other properties a hash function can have, like uniformity or continuity, which we will not cover as they are less important in cryptography. You can read about them on Wikipedia if you are interested.

Cryptographic hash functions

Cryptographic hash functions are hash functions that are useful in cryptography. They should always be deterministic, collision free and non-invertible, and also quick enough to compute. There are multiple algorithms, like MD5 or SHA-3. We are not going to cover the actual algorithms, but we will give a general overview and use cases.

Back to coding

Open the command line terminal
Install crypto-js by typing the following command in the terminal

npm install crypto-js

Then open the Node REPL by typing node in the terminal

$ node

Now we are ready to explore crypto-js. Let’s import it and use some hash functions.

> var Crypto = require('crypto-js')
> Crypto.SHA256("11")
> Crypto.SHA256("11").toString()

Crypto.SHA256() returns an object (a WordArray) that holds the result of the SHA-256 algorithm as raw words, along with methods for working with it further, while the toString() method returns the hash code as a hex string. You can run these calls over and over again to make sure they always return the same value. As these functions do not perform any I/O, they run synchronously, which makes them easy to play with in the REPL.

You can hash numbers, strings, objects and even files. According to the crypto-js documentation, the hash functions accept either strings or WordArray instances, so convert anything else first: numbers with String(), objects and arrays with JSON.stringify(), and file contents with Crypto.lib.WordArray.create().

> Crypto.SHA256(String(11)).toString()
> Crypto.SHA256(JSON.stringify({})).toString()
> Crypto.SHA256(JSON.stringify([])).toString()
> Crypto.SHA256("single string").toString()
> var fs = require('fs')
> Crypto.SHA256(Crypto.lib.WordArray.create(fs.readFileSync("Some file path"))).toString()

Now let’s see what other hash functions crypto-js contains. The inputs are passed as strings.

> Crypto.MD5("124124").toString()
> Crypto.SHA1("asd").toString()
> Crypto.SHA256("asdafaf").toString()
> Crypto.SHA224("asdadad").toString()
> Crypto.SHA512("qqqq").toString()
> Crypto.SHA384("asfasfaf").toString()
> Crypto.SHA3("1").toString()
> Crypto.RIPEMD160("7").toString()

You may ask why there are so many hash functions. First, the diversity means an attacker cannot simply assume which hash function you use, although the secrecy of the algorithm should never be your only protection. Sometimes it is also useful to define a constant nonce variable in the application (often called a pepper) and prepend it to every piece of data before hashing, like Crypto.SHA256(nonce + data).toString(). Secondly, over time a hash function may lose properties such as being non-invertible or collision free. You may ask how, if the algorithm is not changing. Let’s look at MD5, for example. It was an innovative hash function a few decades ago, but it is no longer recommended. Why? There are databases with billions of common words and their MD5 hashes, so an attacker can easily look up the original data for many MD5 hashes. There are also practical algorithms for constructing two different strings or files that have the same MD5 hash. Over time the same thing can happen to any hashing algorithm, which is why it is important to have multiple hash algorithms.
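
The lookup attack on MD5 described above can be sketched with a tiny table. The five-word list stands in for the real databases, the reverseMd5 helper and the nonce value are hypothetical, and Node's built-in crypto module is used instead of crypto-js only so the snippet runs without installing anything.

```javascript
const crypto = require('crypto');

// MD5 of a string, returned as a hex string.
const md5 = (text) =>
  crypto.createHash('md5').update(text, 'utf8').digest('hex');

// A tiny stand-in for the large precomputed databases of common words.
const commonWords = ['123456', 'password', 'qwerty', 'letmein', 'dragon'];
const lookup = new Map(commonWords.map((word) => [md5(word), word]));

// Hypothetical helper: "reverse" a hash by looking it up in the table.
const reverseMd5 = (hash) => lookup.get(hash) || null;

console.log(reverseMd5(md5('password'))); // password

// With a constant nonce added before hashing, the precomputed table no longer matches.
const nonce = 'some-application-wide-secret'; // hypothetical value
console.log(reverseMd5(md5(nonce + 'password'))); // null
```

The table only helps the attacker for inputs it already contains, which is why common words and short passwords are the first to fall.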

Use cases

The most popular use case of hash functions is password hashing. Do NOT store users’ passwords in the database; store their hash codes instead. You can verify the password entered during sign-in by hashing it and comparing the result with the stored hash. Thanks to determinism, the same password always gives the same hash. Thanks to the hash being collision free, nobody can accidentally type a different password and still log in. Thanks to non-invertibility, the users’ passwords stay safe even if an attacker gets access to your database, as long as the passwords themselves are hard to guess.
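
A minimal sketch of the sign-up and sign-in flow might look like this. Several things here are assumptions rather than part of the article: the in-memory Map standing in for a database, the signUp and signIn names, the random per-user salt, and the use of Node's built-in crypto module so the snippet runs without installing anything. It only illustrates the hash-and-compare idea; for real password storage a dedicated library such as bcrypt is the better choice.

```javascript
const crypto = require('crypto');

// Hypothetical in-memory "database": username -> { salt, hash }
const users = new Map();

const hashPassword = (password, salt) =>
  crypto.createHash('sha256').update(salt + password, 'utf8').digest('hex');

function signUp(username, password) {
  // Assumption: a random per-user salt, so equal passwords give different hashes.
  const salt = crypto.randomBytes(16).toString('hex');
  users.set(username, { salt, hash: hashPassword(password, salt) });
}

function signIn(username, password) {
  const record = users.get(username);
  if (!record) return false;
  // Determinism: hashing the same password with the same salt gives the stored hash.
  return hashPassword(password, record.salt) === record.hash;
}

signUp('alice', 'correct horse');
console.log(signIn('alice', 'correct horse')); // true
console.log(signIn('alice', 'wrong guess'));   // false
console.log(users.get('alice').hash.length);   // 64 hex characters, never the password itself
```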

Another popular use case of hash functions is checksums. Imagine you are sending a large 10 GB file to your friend over the network, and once your friend has received it you want to check that it arrived as it is, without any byte corruption, loss, or modification by a man in the middle. You can just hash the file with SHA-256 on both your PC and your friend’s PC and compare the hashes. Thanks to the hash being collision free, if they match you can assume that the files are the same.

Hash functions are everywhere. They are used in source code management systems like Git or Mercurial to get short unique identifiers for files, directories and trees. They are also heavily used in cryptocurrencies and proof-of-work systems, and much more.
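
For a file of that size you would not want to load everything into memory at once. The sketch below reads the file in chunks and feeds each chunk to the hash. It uses Node's built-in crypto and fs modules instead of crypto-js so it runs without installing anything; the sha256OfFile name and the demo file are made up for this illustration.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hash a file in 64 KiB chunks so a 10 GB file never has to fit in memory.
function sha256OfFile(filePath) {
  const hash = crypto.createHash('sha256');
  const buffer = Buffer.alloc(64 * 1024);
  const fd = fs.openSync(filePath, 'r');
  try {
    let bytesRead;
    // readSync returns 0 at the end of the file.
    while ((bytesRead = fs.readSync(fd, buffer, 0, buffer.length, null)) > 0) {
      hash.update(buffer.subarray(0, bytesRead));
    }
  } finally {
    fs.closeSync(fd);
  }
  return hash.digest('hex');
}

// Usage: create a small demo file and print its checksum.
const demoFile = path.join(os.tmpdir(), 'checksum-demo.txt');
fs.writeFileSync(demoFile, 'hello');
console.log(sha256OfFile(demoFile));
```

You and your friend would each run the function on your own copy of the file and compare the two printed strings.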

HMAC

Hash-based message authentication code, or simply HMAC, is a construction built on top of a hash function that is used for data integrity checks as well as for message authentication. Think of HMAC as extending the original hash function with an additional parameter, called the key, without losing the properties of a cryptographic hash function. The idea of the key is that only someone who knows it can generate the code for a specific input. Read more about HMAC here. crypto-js provides HMAC support as well. Let’s see it in the code sample below. You will see that we pass an additional string parameter as the key.

> Crypto.HmacMD5("124124", "key1").toString()
> Crypto.HmacSHA1("asd", "keyx").toString()
> Crypto.HmacSHA256("asdafaf", "some string").toString()
> Crypto.HmacSHA224("asdadad", "4").toString()
> Crypto.HmacSHA512("qqqq", "asd").toString()
> Crypto.HmacSHA384("asfasfaf", "anything").toString()
> Crypto.HmacSHA3("1", "a").toString()
> Crypto.HmacRIPEMD160("7", "").toString()
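
To show what the key buys you, here is a sketch of signing a message and verifying it on the other side. It uses Node's built-in crypto module instead of crypto-js so it runs without installing anything; the sign and verify names, the key, and the messages are hypothetical.

```javascript
const crypto = require('crypto');

// Compute the HMAC-SHA256 code of a message as a hex string.
const sign = (message, key) =>
  crypto.createHmac('sha256', key).update(message, 'utf8').digest('hex');

// Recompute the code with the same key and compare it with the received one.
function verify(message, key, tag) {
  const expected = Buffer.from(sign(message, key), 'hex');
  const received = Buffer.from(tag, 'hex');
  // timingSafeEqual throws on different lengths, so check the length first.
  return expected.length === received.length &&
    crypto.timingSafeEqual(expected, received);
}

const key = 'shared secret'; // hypothetical key known to both parties
const tag = sign('pay 10 to bob', key);

console.log(verify('pay 10 to bob', key, tag));     // true
console.log(verify('pay 1000 to bob', key, tag));   // false: the message was changed
console.log(verify('pay 10 to bob', 'guess', tag)); // false: wrong key
```

Someone who changes the message in transit cannot produce a matching code without knowing the key, which is what makes HMAC suitable for message authentication and not just integrity checks.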

Crypto-js character encodings

A byte is a sequence of 8 bits. A bit is either 0 or 1. Everything is stored as bytes, and strings are too. The characters of a string can be converted into a sequence of bytes in multiple ways. These ways are called encodings. There is ASCII or Latin1 for Latin characters, there is UTF-8 with multi-language support, and many more. The same string can produce different hashes if it is encoded differently, so crypto-js provides support for various encodings in the Crypto.enc submodule. See the code examples and play around to get a better understanding.

> Crypto.SHA256(Crypto.enc.Utf8.parse("кирилица")).toString()
> Crypto.SHA256("кирилица").toString()
> Crypto.SHA256(Crypto.enc.Latin1.parse("кирилица")).toString()
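
The reason the hashes differ is that the hash function only ever sees bytes, and each encoding turns the same characters into different bytes. The sketch below makes the bytes visible using Node's Buffer and built-in crypto module (instead of crypto-js, so it runs without installing anything).

```javascript
const crypto = require('crypto');

// SHA-256 of raw bytes, returned as a hex string.
const sha256 = (bytes) =>
  crypto.createHash('sha256').update(bytes).digest('hex');

const text = 'кирилица';
const utf8Bytes = Buffer.from(text, 'utf8');     // 2 bytes per Cyrillic letter
const latin1Bytes = Buffer.from(text, 'latin1'); // keeps only the low byte of each character code

console.log(utf8Bytes.length, latin1Bytes.length);      // 16 8
console.log(sha256(utf8Bytes) === sha256(latin1Bytes)); // false
```

Latin1 cannot represent Cyrillic letters at all, so that conversion loses information; the point is only that different bytes go in, so different hashes come out.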

Wrap Up

This was the first article in the series on cryptography for JavaScript developers. The next ones will cover symmetric encryption, public/private key encryption, digital signatures and much more.

Press “Follow” if you enjoyed the tutorial.
