In Search of the Perfect Hash

The Crypto hash, that is

We hash data in order to get a digital thumbprint of it. In most cases we take data of any size and convert it into a fixed-length hash. For MD5 this is a 128-bit hash (32 hex characters), and for SHA-1 it is a 160-bit hash (40 hex characters).
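To make those sizes concrete, here is a minimal sketch using Python's standard `hashlib` module. The helper name `digest_summary` and the sample inputs are my own choices for illustration. It hashes inputs of very different sizes and reports the digest length, which should stay fixed for each algorithm.

```python
import hashlib


def digest_summary(data: bytes) -> dict:
    """Return {algorithm: (digest size in bits, hex digest)} for MD5 and SHA-1.

    Hypothetical helper for illustration only.
    """
    summary = {}
    for name in ("md5", "sha1"):
        h = hashlib.new(name, data)
        # digest_size is reported in bytes, so multiply by 8 for bits
        summary[name] = (h.digest_size * 8, h.hexdigest())
    return summary


# Inputs of 0 bytes, 5 bytes and 1 MB all map to the same digest length
for size in (0, 5, 1_000_000):
    data = b"a" * size
    for name, (bits, hex_digest) in digest_summary(data).items():
        print(f"{size:>9} bytes  {name:<5} {bits} bits  {len(hex_digest)} hex chars")
```

Each hex character encodes four bits, which is why 128 bits appear as 32 characters and 160 bits as 40.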

The longer the hash, the lower the chance of a collision, where two different sets of data give the same hashed value. A collision between unrelated pieces of data may be tolerable, but it becomes worrying when the data carries meaning and a different message results in the same hashed value. For example, if a message of “Hello” can be changed to “Goodbye” and still produce the same hashed value, then we have changed the meaning of the message without changing its hash.
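As a rough illustration of how hash length affects the chance of an accidental collision, the sketch below uses the standard birthday approximation, p ≈ 1 − exp(−k(k−1)/2^(n+1)), for k random inputs and an n-bit hash. It assumes an ideal hash with uniformly distributed outputs, so it says nothing about deliberate attacks that exploit weaknesses in a specific algorithm. The function names are my own.

```python
import math


def collision_probability(num_items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of num_items random inputs
    share a hash value, assuming an ideal hash_bits-bit hash."""
    exponent = -num_items * (num_items - 1) / (2 ** (hash_bits + 1))
    # expm1 keeps precision when the probability is extremely small
    return -math.expm1(exponent)


def items_for_half_chance(hash_bits: int) -> float:
    """Approximate number of random inputs needed for a 50% collision chance."""
    return math.sqrt(2 * math.log(2)) * 2.0 ** (hash_bits / 2)


for bits in (128, 160):  # the MD5 and SHA-1 output sizes
    print(f"{bits}-bit hash:")
    print(f"  P(collision) for one billion inputs: {collision_probability(10**9, bits):.3e}")
    print(f"  inputs for a 50% chance:             {items_for_half_chance(bits):.3e}")
```

Under this model, the number of inputs needed for a likely collision grows with 2^(n/2), so each additional 32 bits of hash length multiplies that number by 65,536.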

In another example, we might have two images; if they produce the same hash signature, we cannot tell them apart from the hashed value alone. Someone could then go to prison because we detected the hash of a banned image, when that person had actually viewed a different image with the same hash signature. This is one of the reasons that MD5 signatures are often not acceptable in digital forensics investigations.

In this article I will pose a few questions:

  • If I change a single bit in the input of a hash, what is…
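For readers who want to experiment with a single-bit change themselves, here is a minimal sketch. It flips one bit of an input and counts how many bits of the MD5 and SHA-1 digests differ afterwards. The sample message, the bit chosen, and the helper names `flip_bit` and `differing_bits` are assumptions for illustration.

```python
import hashlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (bit 0 = lowest bit of byte 0)."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


message = b"Hello"
changed = flip_bit(message, 0)  # changes the first byte from 'H' to 'I'

for name in ("md5", "sha1"):
    before = hashlib.new(name, message).digest()
    after = hashlib.new(name, changed).digest()
    total = len(before) * 8
    print(f"{name}: {differing_bits(before, after)} of {total} output bits changed")
```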

--

Prof Bill Buchanan OBE FRSE
ASecuritySite: When Bob Met Alice
