Hello World! in the light of security

Published in

Hello World! Series

9 min read · Jun 1, 2018

In this post I will try to cover the most basic security concepts that every software developer should know. So if, just like me, you did not pay much attention in Networks & Security classes, sit tight and grab your popcorn.

Not so loyal customers

Suppose that, to build up an audience for your recently launched web app, you declare that clients will get a reward on every 1000th visit. For some strange reason you have decided to store how many times a user has visited your website in their browser cookie instead of a persistence layer on the server side (you will lose the information if the user clears the cookie from the browser). So every time a client request comes in, you increase the cookie value by 1 and check whether he is eligible for a reward. Something like the following:

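def handle_visit(cookie):
    # First visit: the cookie has no counter yet, so start at 0.
    if "total_visit" not in cookie:
        cookie["total_visit"] = 0
    cookie["total_visit"] += 1
    # Every 1000th visit earns a reward and resets the counter.
    if cookie["total_visit"] >= 1000:
        reward()  # the reward logic itself lives elsewhere
        cookie["total_visit"] = 0
    return cookie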

Now, cookies live in the user’s browser, so your not so loyal clients can tamper with the cookie value. If a user manually sets the cookie value to 999 before he visits the website, he has to be rewarded even though it is his first visit!
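To see how little effort the attack takes, here is a minimal Python sketch. The `handle_visit` function and the plain dict standing in for the browser cookie are assumptions made for illustration; a real web framework hands you cookies through its own API.

```python
def handle_visit(cookie):
    """Count one visit and return (cookie, rewarded)."""
    # Cookie values arrive as strings, so convert before counting.
    total = int(cookie.get("total_visit", "0")) + 1
    rewarded = total >= 1000
    if rewarded:
        total = 0  # reset the counter after a reward
    cookie["total_visit"] = str(total)
    return cookie, rewarded


# An honest first-time visitor is not rewarded.
_, honest_rewarded = handle_visit({})

# A cheater edits the cookie to 999 before the very first visit.
_, cheater_rewarded = handle_visit({"total_visit": "999"})

print(honest_rewarded, cheater_rewarded)
```

The server has no way to tell the two requests apart, because the only record of past visits is the value the client chose to send.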

Hashing to the rescue

A hash algorithm turns an arbitrarily large amount of data into a fixed-length hash value that represents the original data. Certain qualities are expected of cryptographic hash functions:

The hash function should be one way. That means if H(x) = y, then it should be extremely difficult to find x given y.
Collisions should be extremely hard to find: for different values of x we can expect a different y practically every time.
It should be efficiently computable. If the hash function is not easy to compute, it will slow down the program, which is not practical.
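A quick way to get a feel for these properties is to play with a real hash function. The sketch below uses SHA-256 from Python's standard `hashlib` module; the choice of SHA-256 and the sample inputs are mine, just for illustration.

```python
import hashlib

# Inputs of very different sizes produce digests of the same length.
short_digest = hashlib.sha256(b"1").hexdigest()
long_digest = hashlib.sha256(b"x" * 1_000_000).hexdigest()
print(len(short_digest), len(long_digest))  # both are 64 hex characters

# Nearby inputs give unrelated-looking digests.
digest_999 = hashlib.sha256(b"999").hexdigest()
digest_1000 = hashlib.sha256(b"1000").hexdigest()
print(digest_999)
print(digest_1000)
```

Going the other way, from a printed digest back to its input, is the part that is supposed to be extremely difficult.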

Now the trick to solve our specific problem is that instead of setting only the total number of visits, we will also set the hash value H(total_visit) in the cookie. When a user tampers with the total number of visits he also has to tamper with the hash value accordingly, and if the hash function is unknown to him he will not be able to do that. On the server side we will check whether hash_value_sent_by_user == H(total_visit_sent_by_user). If they match, we will increment the total visits and set a new hash value for the new total in the cookie header. The strength of this technique is limited by the user’s unawareness of the hash function used.

We will never need to design our own hash function, as there are popular hash functions out there. Designing your own is also unwise, as it will most likely not fulfill the requirements of a good cryptographic hash function.
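Here is a rough sketch of that check in Python. The `count|hash` cookie layout, the helper names `make_cookie` and `read_cookie`, and the use of SHA-256 for H are all assumptions made for illustration.

```python
import hashlib


def H(total_visit):
    """Hash the visit count (SHA-256 is picked only as an example)."""
    return hashlib.sha256(str(total_visit).encode()).hexdigest()


def make_cookie(total_visit):
    # Store the count and its hash side by side: "<count>|<hex digest>".
    return f"{total_visit}|{H(total_visit)}"


def read_cookie(value):
    """Return the visit count, or None if the hash does not match."""
    count, _, sent_hash = value.partition("|")
    if not count.isdigit() or sent_hash != H(count):
        return None
    return int(count)


cookie = make_cookie(5)
tampered = "999|" + cookie.split("|")[1]  # count changed, hash left untouched
print(read_cookie(cookie), read_cookie(tampered))
```

Changing only the count is now detected. A user who guesses the hash function can still recompute the hash for 999, though, which is exactly the weakness mentioned above.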

Some of the popular hash functions are

MD5 : from Wikipedia

“The MD5 algorithm is a widely used hash function producing a 128-bit hash value. Although MD5 was initially designed to be used as a cryptographic hash function, it has been found to suffer from extensive vulnerabilities. It can still be used as a checksum to verify data integrity, but only against unintentional corruption.”

SHA-1 : from Wikipedia

“In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographic hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest, typically rendered as a hexadecimal number, 40 digits long. It was designed by the United States National Security Agency, and is a U.S. Federal Information Processing Standard.

Since 2005 SHA-1 has not been considered secure against well-funded opponents, and since 2010 many organizations have recommended its replacement by SHA-2 or SHA-3.”

SHA-2 : from Wikipedia

“SHA-2 (Secure Hash Algorithm 2) is a set of cryptographic hash functions designed by the United States National Security Agency (NSA).

SHA-2 includes significant changes from its predecessor, SHA-1. The SHA-2 family consists of six hash functions with digests (hash values) that are 224, 256, 384 or 512 bits: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256.”
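The digest sizes quoted above are easy to check with Python's standard `hashlib` module. This is only a small sanity check, and it assumes your Python build ships all of these algorithms (some locked-down builds disable MD5).

```python
import hashlib

algorithms = ("md5", "sha1", "sha224", "sha256", "sha384", "sha512")

# digest_size is reported in bytes, so multiply by 8 to get bits.
sizes = {name: hashlib.new(name, b"hello").digest_size * 8 for name in algorithms}

for name, bits in sizes.items():
    print(f"{name}: {bits} bits")
```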

Keep in mind that no cryptographic hash function is completely “unbreakable”. It is the amount of time it takes to break them that makes an attack practically impossible. With ever faster hardware and the advent of quantum computing, stronger cryptographic methods will likely be needed in the future.

Cleverer than you thought

Our regular users might be oblivious to hashing techniques, but there will certainly be some users who are aware of all these popular hashing techniques. What if one of your users has read this post already? The moment they correctly guess which hash function was used, all our work will have been in vain.

To solve this issue we will use an elegant technique called a salt. Instead of computing the hash value as H(total_visit), we will compute H(salt+total_visit). The salt is a random value that is completely hidden from the user. So even if the user knows the exact hash function (MD5, SHA-1, whatever it is), he will not be able to compute the correct hash value, as he does not know the salt. The most common use of salts is to protect passwords, where a different salt value is used for every single password; you can study more about it on your own.
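Below is a minimal sketch of the salted check. `SECRET_SALT` and the helper names are hypothetical. The first function follows H(salt+total_visit) literally; the second swaps in HMAC from Python's standard `hmac` module, which is the usual construction when a hash is combined with a secret value, so treat it as my suggestion rather than part of the scheme described above.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; it never leaves the server.
SECRET_SALT = secrets.token_bytes(32)


def salted_hash(total_visit):
    """H(salt + total_visit), as described in the text."""
    return hashlib.sha256(SECRET_SALT + str(total_visit).encode()).hexdigest()


def keyed_hash(total_visit):
    """The same idea using HMAC-SHA256 with the salt as the key."""
    message = str(total_visit).encode()
    return hmac.new(SECRET_SALT, message, hashlib.sha256).hexdigest()


def is_valid(total_visit, sent_hash):
    # compare_digest takes the same time whether or not the values match.
    return hmac.compare_digest(keyed_hash(total_visit), sent_hash)


# Knowing the hash function is no longer enough to forge a cookie.
forged = hashlib.sha256(b"999").hexdigest()
print(is_valid(999, forged), is_valid(5, keyed_hash(5)))
```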

Introducing Eve

So far we have assumed that communication between client and server happens over a secure channel and that nobody else is eavesdropping. In reality there are always bad people listening in on the communication channel.

Because of Eve we can no longer send plain data over the communication channel, as neither Alice nor Bob will be happy if their message is read by someone else. We need some kind of encryption method so that only the intended recipient can decipher the encrypted text.

Cipher text

Encryption can be traced back hundreds of years, to when war generals used encryption techniques to protect sensitive information. It has been used by lovers to express love, and by intellectuals to challenge one another.

Suppose you are Alice and have always had a suspicion that Bob secretly likes you but doesn’t have the courage to say so. Finally gathering all his courage, Bob has sent you a letter like the following:

Efbs Bmjdf,
Nz mpwf gps zpv jt mjlf uif sbhjoh tfb, Tp qpxfsgvm boe effq ju xjmm gpsfwfs cf. Uispvhi tupsn, xjoe, boe ifbwz sbjo, Ju xjmm xjuituboe fwfsz qbjo.
Cpc

I encourage you to figure out for yourself what the text above could mean before reading further.

Clearly Bob is too shy or too clever (maybe both) to express his feelings plainly. Now Alice is a smart girl. She thinks that the text looks a lot like a letter, and at the end there is the signature of the sender, which is a three letter word; moreover the letter comes from “Bob”, which is also a three letter word. She need not be a lady Sherlock to figure out that Bob replaced each letter with the one immediately after it in the alphabet. So the original text was

Dear Alice,
My love for you is like the raging sea, So powerful and deep it will forever be. Through storm, wind, and heavy rain, It will withstand every pain.
Bob

By the way, Bob borrowed the poem from the internet. Shame, Bob!

This is what is called a Caesar cipher, where every character gets shifted by a fixed length. There are only 25 possible choices for the length, so even someone without any cryptographic knowledge can break it by trying them all.
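Both Bob's trick and the brute-force attack fit in a few lines of Python. The function name `caesar_shift` is my own, and the sketch only handles the plain English alphabet.

```python
def caesar_shift(text, shift):
    """Shift every letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)  # leave spaces and punctuation alone
    return "".join(result)


ciphertext = "Efbs Bmjdf,"

# Bob shifted forward by 1, so shifting back by 1 decrypts.
print(caesar_shift(ciphertext, -1))

# Eve does not know the shift, but she can simply try all 25 of them.
for shift in range(1, 26):
    print(shift, caesar_shift(ciphertext, -shift))
```

Only one of the 25 candidates reads as English, so Eve needs no key at all.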

Symmetric-key algorithm

Let’s add some complexity to make the cipher stronger. What if, instead of shifting each character by a fixed length (previously we shifted by one), we shift characters by a varying length? Say we use the code word “sunrise” to encrypt the above text. That means shifting the first character by 19 places (S->19), the next character by 21 places (U->21), the next by 14 places (N->14) and so on, starting over from the beginning of the code word when it runs out. If only the sender and receiver know the code word “sunrise”, they will be able to encrypt and decrypt the message, and the eavesdropper will not be able to retrieve the original text.

If the code word is sufficiently long, it will be difficult for the average person to break the cipher. But there are techniques like frequency analysis which can be used to break the cipher with a little effort. There are also more advanced encryption techniques that are not so easily breakable.

In general, when both sender and receiver need to agree upon a secret key to send messages, it is known as symmetric key encryption. Now the problem is that we have to send the secret key over a secure channel, because if the code word is exposed to an attacker the original text will be revealed immediately. It is not always possible to exchange the key securely (maybe the sender and receiver cannot meet physically), and moreover an important question arises: if we are able to securely exchange the key, why can we not securely exchange the text directly? Kind of a chicken and egg problem.
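This code word scheme is usually known as the Vigenère cipher. Here is a small Python sketch of it using the A=1 ... Z=26 shifts from the example above. The function name `keyword_shift` is hypothetical, and I am assuming that spaces and punctuation are left alone and do not use up a letter of the code word.

```python
from itertools import cycle


def keyword_shift(text, keyword, decrypt=False):
    """Shift each letter by the alphabet position of the next code word letter."""
    # "sunrise" becomes the repeating shifts 19, 21, 14, 18, 9, 19, 5, ...
    shifts = cycle([ord(k) - ord("a") + 1 for k in keyword.lower()])
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            shift = next(shifts)
            if decrypt:
                shift = -shift  # undo the shift applied during encryption
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)
    return "".join(result)


message = "Dear Alice,"
secret = keyword_shift(message, "sunrise")
print(secret)
print(keyword_shift(secret, "sunrise", decrypt=True))
```

The same function with `decrypt=True` and the same code word recovers the message, which is what makes the scheme symmetric.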

Bring Public-key (asymmetric key) cryptography into the game

From Wikipedia “Public-key cryptography, or asymmetric cryptography, is any cryptographic system that uses pairs of keys: public keys which may be disseminated widely, and private keys which are known only to the owner.”

Private keys are kept private, which means nobody shares theirs with anybody.

Public keys are public, which means the eavesdropper will be aware of them.

The public key and private key are mathematically related, but it is computationally infeasible to derive the private key from the public key. Both sender and receiver have their own public and private keys. Using them, they arrive at a shared secret key.

Let’s investigate a public-key cryptography technique called

Diffie–Hellman key exchange:

The first time I learned about this method it was jaw dropping for me because of the method’s utter brilliance.

To put it simply, instead of ‘sharing’ a secret key over an insecure channel, the Diffie-Hellman method ‘builds’ a secret key on both sides by sharing some agreed-upon public values. So even if Eve knows all the public keys, she will not be able to generate the secret key because she lacks the private keys.

Khan Academy has an excellent video explaining Diffie-Hellman key exchange.

By using Diffie–Hellman key exchange we can arrive at a shared secret key even if Eve was listening to every message passed between sender and receiver. Using this shared key we can communicate exactly as explained in the symmetric key section. It looks like all the challenges have been solved and from now on all communication will be safe and secure. Not so fast, my friend!
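If you prefer numbers to videos, here is a toy run of the exchange in Python. The values are deliberately tiny so you can follow them by hand; real deployments use numbers that are thousands of bits long, and the variable names are my own.

```python
# Public values, known to everyone including Eve.
p = 23  # a prime modulus
g = 5   # the public base

# Private values, never sent over the wire.
alice_private = 6
bob_private = 15

# Each side sends g^private mod p over the insecure channel.
alice_public = pow(g, alice_private, p)
bob_public = pow(g, bob_private, p)

# Each side combines its own private value with the other's public value.
alice_secret = pow(bob_public, alice_private, p)
bob_secret = pow(alice_public, bob_private, p)

print(alice_public, bob_public)
print(alice_secret, bob_secret)
```

Both sides end up with the same number without it ever travelling over the channel. Eve sees p, g and the two public values, but recovering a private value from them is the hard part when the numbers are large.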

Introducing man in the middle (MITM) attack

So far we have assumed that the eavesdropper can only listen to the messages passed and do nothing more. But it becomes a real threat if someone sits in the middle between client and server and is able to forge the messages passed between them.

Man in the middle attack. Source: google

Neither client nor server will be aware that they are talking to a middleman instead of the intended recipient. This scenario occurs because no authentication of the recipient is performed.
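To make this concrete, here is a toy sketch of a middleman attacking an unauthenticated Diffie–Hellman exchange. The attacker is called Mallory here (a name I am introducing), and the tiny numbers are for illustration only.

```python
p, g = 23, 5  # public toy parameters

alice_private, bob_private, mallory_private = 6, 15, 13

alice_public = pow(g, alice_private, p)
bob_public = pow(g, bob_private, p)
mallory_public = pow(g, mallory_private, p)

# Mallory intercepts both public values and forwards her own instead.
alice_key = pow(mallory_public, alice_private, p)  # Alice believes Bob shares this
bob_key = pow(mallory_public, bob_private, p)      # Bob believes Alice shares this

# Mallory can compute both keys from the values she intercepted.
mallory_key_with_alice = pow(alice_public, mallory_private, p)
mallory_key_with_bob = pow(bob_public, mallory_private, p)

print(alice_key == mallory_key_with_alice)
print(bob_key == mallory_key_with_bob)
```

Mallory now holds one key per victim, so she can decrypt each message, read or change it, and re-encrypt it for the other side without either of them noticing.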

TLS to the rescue

A man in the middle attack is possible over an unauthenticated channel such as plain HTTP, but if we enforce authentication of the recipient before sharing anything we can avoid the attack, and this is where TLS comes in. Transport Layer Security (TLS) is the successor of the earlier SSL (Secure Sockets Layer) and ensures that no third party can eavesdrop on, tamper with, or forge any message. TLS ensures security by verifying a digital certificate issued by a trusted third party, a Certificate Authority (CA). From Wikipedia: “The client uses the CA certificate to authenticate the CA signature on the server certificate, as part of the authorizations before launching a secure connection. Usually, client software, for example browsers, include a set of trusted CA certificates.”
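In practice you rarely implement any of this yourself. As a small illustration, here is how a client in Python gets certificate checking from the standard `ssl` module; the snippet only builds the configuration and does not open a connection.

```python
import ssl

# The default client context loads the system's trusted CA certificates
# and rejects servers whose certificate or hostname does not check out.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

Wrapping a connected socket with `context.wrap_socket(sock, server_hostname=...)` then performs the TLS handshake, and it raises an error if the server cannot present a certificate that chains to a trusted CA for that hostname.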

Wrapping up

This was an overview of how security is ensured over a communication medium. I suggest the following resources for further investigation