System Design Basics 2: Hash Functions

Aqua Education · Published in Web Architects · 5 min read · Jan 16, 2024

Hash functions are used throughout the tech world. As a software engineer, you might not need to know their inner workings, which require a fair amount of math. However, you should know the characteristics of hash functions, why they are used, and the scenarios in which they apply.

A hash function converts binary data of arbitrary length into binary data of a fixed length. The result is usually much shorter than the original content. We usually encode the resulting bits as hexadecimal or base64 to make them human-readable. This human-readable string is called a hash or digest. Let’s look at how to hash a string with SHA-256 in Python.

>>> import hashlib
>>> hashlib.sha256("Web Architects".encode()).hexdigest()
'd9da9e4763d16fa4d792639f77394d567eb3be8c16aa661d43bc0e72f5353ea5'

As you can see, we first need to encode the string into bytes, because the hash function operates on binary data. We then convert the resulting binary digest into hexadecimal so that it is readable.
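To make the fixed-length property concrete, here is a minimal sketch that hashes a one-byte input and a one-megabyte input and compares the size of the results in raw, hexadecimal, and base64 form. The helper name `digest_forms` and the sample inputs are made up for illustration; only `hashlib` and `base64` from the standard library are used.

```python
import base64
import hashlib


def digest_forms(data: bytes) -> dict:
    """Hash data with SHA-256 and return the digest in several encodings.

    Hypothetical helper for illustration only.
    """
    raw = hashlib.sha256(data).digest()  # raw binary digest
    return {
        "raw_length": len(raw),                           # size in bytes
        "hex": raw.hex(),                                 # hexadecimal text
        "base64": base64.b64encode(raw).decode("ascii"),  # base64 text
    }


small = digest_forms(b"a")              # 1 byte of input
large = digest_forms(b"a" * 1_000_000)  # 1,000,000 bytes of input

# Both lines print the same three numbers: 32 64 44
print(small["raw_length"], len(small["hex"]), len(small["base64"]))
print(large["raw_length"], len(large["hex"]), len(large["base64"]))
```

Whatever the input size, SHA-256 yields 32 bytes, which take 64 characters in hexadecimal and 44 in base64. Base64 is the more compact text form, while hexadecimal is easier to compare by eye.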

A hash function has three characteristics that make it useful.

  • Deterministic, i.e. the same input always produces the same output.
  • Similar inputs produce completely different outputs; there is no apparent correlation among the outputs of similar inputs. Changing one character in the previous example produces a completely different hash, as shown below.
>>>…
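The two characteristics above can also be checked in a short script. This is a rough sketch rather than a rigorous test: the helper names `sha256_hex` and `differing_bits`, and the altered input "Web Architectz", are assumptions chosen for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as hexadecimal text."""
    return hashlib.sha256(text.encode()).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = sha256_hex("Web Architects")
again = sha256_hex("Web Architects")    # same input, hashed a second time
changed = sha256_hex("Web Architectz")  # last character altered

print(first == again)  # determinism: prints True
print(differing_bits(first, changed), "of 256 bits differ")
```

For a well-designed hash function such as SHA-256, a one-character change is expected to flip roughly half of the output bits, which is why the two digests look unrelated.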
