What is cryptography? The basics

My aim in writing this series was to help anyone understand everything bitcoin, from game theory to advancements in development. Recent feedback from a friend who asked what I was writing suggests I may have missed the mark on how steep a learning curve this is. So I thought I’d take a pause before I drop part 2 in the series to try and explain the very basics of the cryptography underneath it all.

The internet is insecure

People have terrible security practices, it’s a fact. Check out Shodan.io (securely). It is a search engine for internet-connected devices, and the scale of it will probably shock you: there are huge numbers of insecure nodes (computers and devices) worldwide still running default usernames and passwords. An IP address can often be traced back to the rough location of your connection, sometimes close to street level. Please do not try to access these devices ffs. The other issue is that the internet itself is not private, and the vast majority of information on it is in plain text or visible code. It’s easy and fast for both people and computers to read. You are looking at some right now, although I’m pretty happy for anyone to copy and share anything I write. That openness is not always ideal: much of our personal data, from passwords to the information assets they protect, should only be known by yourself or the necessary parties.

Let’s take passwords as an example. You use them every day to access websites, apps and remote services such as banking. However, common poor practices mean most people’s passwords are insecure. The most basic way a password works is that you communicate with a server and create a password, the server keeps a copy, and every time you enter your password the server verifies it against that copy. This has a number of serious problems. The passwords people choose are generally poor and susceptible to being broken, commonly by dictionary attack (computationally guessing against a list of words and common phrases), by brute force attack (computationally guessing every possible variation) or by a mixture of both. There is also the issue that all or part of the information could be intercepted on its way to the server, making hacking relatively easy and low cost.

Hashing

This is where hashing comes in. An algorithm can take your password and generate what is known as a hash, and the server then stores only that hash rather than the password itself. If the server is hacked, it becomes incredibly difficult to reproduce your password from what was stolen. Every time you input your password, the server runs the same algorithm on what you typed and compares the result against the stored hash to verify that your password created it. The hash must be long and complex enough that attackers cannot break it, but the same guessing attacks described above can still be performed against it, and simple passwords can be cracked within seconds. In general most servers will only allow so many attempts to enter a password. The security of the stored hash can be improved further by choosing a more complex password yourself, by “salting” (adding extra characters to your password before hashing, so that identical passwords do not produce identical hashes) and by using stronger hashing algorithms. Hashing is what is known as a one way function, as in it “cannot” be reversed from only the hash.
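To make the salt-then-hash idea concrete, here is a minimal sketch using only Python’s standard library. The function names, the 16-byte salt and the iteration count are my own assumptions for illustration, not what any particular website does; PBKDF2 is simply one deliberately slow hashing scheme that ships with Python.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor, chosen only for this example


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). A fresh random salt is made if none is given."""
    if salt is None:
        salt = os.urandom(16)  # random salt, stored next to the hash
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the attempt with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


# The server keeps only the salt and digest, never the password itself.
salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
print(verify_password("password123", salt, digest))
```

The first check succeeds and the second fails, yet nothing the server stored can simply be read back as the original password.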

But this doesn’t completely solve the problem of the man-in-the-middle attack, where the data (your password) is intercepted on its way to the server. This is where encrypting the connection to the server becomes important. The most common method online is TLS (transport layer security), the successor to SSL (secure sockets layer), and it will be obvious from the ’https://’ and usually a padlock icon in your browser.

Encryption

Encryption differs from hashing in that it is a two way function. In cryptography a “cipher” is a complex mathematical function, or algorithm, that can encrypt data from plaintext to ciphertext (very rarely you will see encryption called encipherment) and decrypt ciphertext back to plaintext (again, rarely called decipherment).
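The two way nature of encryption can be shown with a toy cipher. This sketch XORs each byte of the message with a random key of the same length (a one-time pad), which is not what real services use, but it shows the same key turning plaintext into ciphertext and back again. The function and variable names are made up for the example.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the matching byte of key."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


plaintext = b"meet me at noon"
key = secrets.token_bytes(len(plaintext))  # random key, used only once

ciphertext = xor_bytes(plaintext, key)   # encrypt: plaintext -> ciphertext
recovered = xor_bytes(ciphertext, key)   # decrypt: ciphertext -> plaintext

print(ciphertext.hex())
print(recovered)
```

Unlike a hash, the ciphertext can be turned back into the original message, but only by someone holding the key.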

Cryptographic keys

Keys are used to encrypt and decrypt data. They determine the output either way. An encryption key is actually just a string of bits in binary. For a given algorithm, the longer the key the stronger it is and the less susceptible it is to brute force attacks. Very early on there were major problems with symmetric encryption keys, where the same key both encrypts and decrypts. Similarly to our password problem, the author’s key had to be sent to the recipient of the data so they could decrypt it, so the key itself could become the victim of a man-in-the-middle attack. The other major problem is a serious scalability issue known as the key distribution problem. If every pair of parties needs its own key, the number of keys needed is n(n-1)/2, with n being the number of parties communicating. It becomes a big number very quickly in a network.
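A quick sketch of how fast n(n-1)/2 grows, assuming every pair of parties needs its own shared key. The function name is just for illustration.

```python
def symmetric_keys_needed(n: int) -> int:
    """Number of distinct pairwise keys for n parties: n(n-1)/2."""
    return n * (n - 1) // 2


for parties in (2, 10, 100, 1000):
    print(parties, "parties need", symmetric_keys_needed(parties), "keys")
```

Ten parties already need 45 keys, a hundred need 4950, and a thousand need 499500.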

Diffie–Hellman key exchange method

This issue was solved some time in the early 70s at GCHQ, and independently by two American cryptographers, Whitfield Diffie and Martin Hellman, who published their work in 1976. The GCHQ work was kept secret for many years, which is unsurprising given that, as I’ve previously mentioned, Phil Zimmermann was under criminal investigation in the US for arms export violations up until 1996 after PGP was published online. But anyway, what they came up with was revolutionary for cryptography and for our very own Bitcoin: dual keys, known as asymmetric encryption or public key cryptography. Now two keys are used: the public key encrypts plaintext to ciphertext and the private key decrypts it. Using this key pair, anyone can use my public key to send me encrypted data, but only I can unlock it with the private key from the pair, which is never revealed. This is obviously the more secure method of the two for large scale and distributed systems, but it does come with tradeoffs compared to single symmetric keys, in that the keys are much longer and the operations much slower. In bitcoin we use what is known as elliptic curve cryptography to solve this, which retains the best of both in speed, security and scalability.
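The heading names the Diffie–Hellman key exchange, which in its classic form lets two parties agree on a shared secret over an open channel without ever sending that secret. Here is a toy sketch of the arithmetic. The prime 23 and generator 5 are tiny textbook values chosen so the numbers are readable; real systems use enormous numbers or elliptic curves, and the names are my own.

```python
import secrets

P = 23  # public prime modulus (toy size, completely insecure)
G = 5   # public generator


def make_keypair():
    """Pick a random private number and derive the matching public value."""
    private = secrets.randbelow(P - 2) + 1  # somewhere in 1 .. P-2
    public = pow(G, private, P)             # G^private mod P
    return private, public


def shared_secret(own_private: int, other_public: int) -> int:
    """Combine my private number with the other party's public value."""
    return pow(other_public, own_private, P)


alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Only the public values cross the wire; both sides reach the same secret.
alice_shared = shared_secret(alice_private, bob_public)
bob_shared = shared_secret(bob_private, alice_public)
print(alice_shared == bob_shared)
```

An eavesdropper sees P, G and both public values, but with realistically large numbers recovering either private number from them is believed to be infeasible.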

Hashing keys

What you know as a public key in bitcoin (your address) is actually a hash of the public key, for added security. But we can actually use hashing to authenticate data sent too. It is VERY SIMPLE Faketoshi… as we know, hashing operates unidirectionally to create a new unique identifier. It is used in solutions like PGP and Bitcoin as a simple way to “prove” you are the author…Craig. Hashing algorithms don’t last forever: flaws can be found in them, or they can be computationally broken by very powerful computers finding collisions. MD5 and SHA-1 are still commonly used even though both have been found to be insecure.

But anyway, as mentioned, signing a message works because of the way public key cryptography works. Data encrypted with a public key can only be decrypted with the matching private key, which is why it is used to encrypt secrets. The reverse also works, just the other way round: data encrypted with the private key can be decrypted by anyone with the public key. This obviously isn’t useful for secrets, but it can be useful for authenticating the author, or at least the sender anyway. The below diagram visualizes this well.

However, as we’ve all seen, this process is not without its flaws. It in no way proves any connection between the person making the statement and the keys, only that the sender’s keys correspond. A malicious idiot…Craig, could create a new key pair claiming to be Alice. The only way to solve this is to bind the public keys to the owner’s identity using certificate authorities. And yes, there are way too many words in there that crypto anarchists and privacy advocates are not very fond of…
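The “encrypt a hash with the private key, let anyone check it with the public key” idea can be sketched with textbook RSA and deliberately tiny numbers. This is an assumption-laden toy: the key values are a standard classroom example, the hash is squashed to fit the tiny modulus, there is no padding, and Bitcoin itself uses elliptic curve signatures (ECDSA) rather than RSA. The function names are made up for the example.

```python
import hashlib

# Toy RSA key pair built from the primes 61 and 53 (completely insecure).
N = 3233   # public modulus, 61 * 53
E = 17     # public exponent
D = 2753   # private exponent, kept secret by the signer


def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Signer: transform the message hash with the private exponent."""
    return pow(digest_as_int(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone: undo the transform with the public exponent and compare hashes."""
    return pow(signature, E, N) == digest_as_int(message)


message = b"I am the author of this post"
signature = sign(message)
print(verify(message, signature))            # genuine signature checks out
print(verify(message, (signature + 1) % N))  # altered signature is rejected
```

Note what this does and does not show: the check only proves the signature came from whoever holds the private key matching that public key, not who that person actually is.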

These are the very basics surrounding cryptography. In distributed systems there are many other issues to be resolved for security and privacy. Take encrypting a message, for instance. If it is done on a centralised service, a lot of metadata is left behind or in their control: timings, frequency, origin, endpoint and maybe the size or character count of the message, among other issues.

If you enjoyed this feel free to tip a pirate a beer via paynym +noisyfog046 (part 2 soooon)