The Hamming distance takes its name from Richard Hamming, who introduced it in his work on error-detecting and error-correcting codes. It is used in telecommunications to count the number of erroneous bits in a fixed-length binary word, as an estimate of the error. For this reason it is also called signal distance. Hamming weight analysis of bits is used in several disciplines, including information theory, coding theory and cryptography.
The main limitation of the Hamming distance is that it is defined only for strings of the same length. In return, the algorithm is very simple to implement.
The Hamming distance between two equal-length strings of symbols is the number of positions at which the corresponding symbols are different.
In other words, the Hamming distance measures the minimum number of substitutions needed to convert one string into the other or, seen another way, the minimum number of errors that could have transformed one string into the other.
For example, the Hamming distance between:
- “karolin” and “kathrin” is 3.
- “karolin” and “kerstin” is 3.
- “kathrin” and “kerstin” is 4.
- 0000 and 1111 is 4.
- 2173896 and 2233796 is 3.
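For binary words such as the 0000 and 1111 pair, the same count can also be obtained with bitwise operations. The following is a minimal Python sketch, not the implementation discussed in this article; the function name `hamming_distance_bits` is a hypothetical one chosen for illustration, and it assumes both words are given as non-negative integers of the same nominal width.

```python
def hamming_distance_bits(a: int, b: int) -> int:
    """Count the bit positions at which two non-negative integers differ."""
    if a < 0 or b < 0:
        raise ValueError("expected non-negative integers")
    # XOR leaves a 1 exactly in the positions where the two words disagree,
    # so counting the 1s of the result gives the Hamming distance.
    return bin(a ^ b).count("1")


print(hamming_distance_bits(0b0000, 0b1111))        # 4
print(hamming_distance_bits(0b1011101, 0b1001001))  # 2
```

Leading zeros do not affect the result, because XOR-ing two zero bits contributes nothing to the count.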
The code is really simple. It consists of:
- A check that throws an error if the strings do not have the same length
- The initialization of the distance variable to 0 (the minimum distance)
- A for loop that walks through string1 and string2 character by character and checks whether the two strings have a different character in the same position
- The return statement
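The four steps above can be sketched in Python as follows. This is only an illustrative version under stated assumptions: the function name `hamming_distance` and the choice of `ValueError` are not taken from the original code, while `string1` and `string2` follow the names used in the list.

```python
def hamming_distance(string1: str, string2: str) -> int:
    """Return the number of positions at which the two strings differ."""
    # Step 1: the distance is only defined for strings of equal length.
    if len(string1) != len(string2):
        raise ValueError("strings must have the same length")
    # Step 2: start from the minimum possible distance.
    distance = 0
    # Step 3: compare the strings position by position.
    for i in range(len(string1)):
        if string1[i] != string2[i]:
            distance += 1
    # Step 4: return the accumulated count.
    return distance


print(hamming_distance("karolin", "kathrin"))  # 3
print(hamming_distance("kathrin", "kerstin"))  # 4
print(hamming_distance("2173896", "2233796"))  # 3
```

The explicit index loop is kept on purpose so that each line maps to one of the listed steps.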
Be careful when using newer language constructs for this algorithm: newer doesn’t always mean more efficient. For example: