Malware/Spear Phishing Detection: Meet TLSH — A Locality Sensitive Hash


In digital forensics and cybersecurity, we often use a hashing method to produce a fixed-length hash value that is, in practice, unique to a given set of data. This is known as a cryptographic hash. We can then use it to determine whether the data has been changed. Sometimes, though, we need a hash that shows one file is similar to another. This is the case with malware and spear phishing emails, where the data has often been copied from one source to another and changed only slightly.
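To see why a cryptographic hash cannot be used to measure similarity, the short sketch below hashes two messages that differ by a single word. It uses SHA-256 from Python's standard library. The sample messages and the helper names `sha256_hex` and `differing_hex_chars` are made up for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a 64-character hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_hex_chars(a: str, b: str) -> int:
    """Count the positions at which two equal-length hex strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


# Two illustrative messages that differ by one word.
original = b"Please review the attached invoice and confirm payment by Friday."
modified = b"Please review the attached invoice and confirm payment by Monday."

h1 = sha256_hex(original)
h2 = sha256_hex(modified)

print(h1)
print(h2)
print(differing_hex_chars(h1, h2), "of", len(h1), "hex characters differ")
```

The two digests should look unrelated, even though the inputs are almost identical. That is ideal for detecting change, but it tells us nothing about how close the two messages are.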

Similarity hashing

In malware analysis and text similarity, we often have to identify regions of data within files that are similar. One method is TLSH (Trend Micro Locality Sensitive Hash), which was defined by Oliver et al [1].

VirusTotal uses it, along with ssdeep, to give a similarity hash for malware samples.

It is a fuzzy hashing method that requires at least 50 bytes of data. The hash itself is 35 bytes long with…
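As a rough sketch of how this looks in practice, the code below compares two near-identical messages with TLSH. It assumes the third-party `py-tlsh` package is installed (`pip install py-tlsh`) and uses its `tlsh.hash()` and `tlsh.diff()` functions. The sample emails and the helper `tlsh_distance` are hypothetical, and the import is guarded so the script still runs where the package is missing.

```python
from typing import Optional

try:
    import tlsh  # third-party package: pip install py-tlsh (assumed available)
except ImportError:
    tlsh = None


def tlsh_distance(a: bytes, b: bytes) -> Optional[int]:
    """Return the TLSH distance between a and b (0 means identical digests).

    Returns None if the tlsh package is not installed, or if either input
    is too short or too uniform for TLSH to produce a digest.
    """
    if tlsh is None:
        return None
    h1, h2 = tlsh.hash(a), tlsh.hash(b)
    # py-tlsh reports an unusable input as "TNULL" (an empty string in older versions).
    if h1 in ("", "TNULL") or h2 in ("", "TNULL"):
        return None
    return tlsh.diff(h1, h2)


# Illustrative spear phishing text, with one copy lightly edited.
email_a = (
    b"Dear Alice, our records show that your mailbox is almost full. "
    b"Please sign in to the staff portal within 24 hours to keep your "
    b"account active. Failure to do so will result in suspension. "
    b"Regards, IT Service Desk"
)
email_b = email_a.replace(b"Alice", b"Bob").replace(b"24 hours", b"48 hours")

if __name__ == "__main__":
    print("Distance between the two emails:", tlsh_distance(email_a, email_b))
    print("Distance of an email to itself:", tlsh_distance(email_a, email_a))
```

A lower distance means the inputs are more alike, so lightly edited copies of the same message should score much closer to each other than unrelated text would.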


Prof Bill Buchanan OBE FRSE
ASecuritySite: When Bob Met Alice

Professor of Cryptography. Serial innovator. Believer in fairness, justice & freedom. Based in Edinburgh. Old World Breaker. New World Creator. Building trust.