Entropy And Randomness

--

You are gambling in a casino and there seem to be rather too many occurrences of a certain number. So how can you tell if the casino is cheating? Well, we could measure the entropy of the system and determine how random the numbers actually are.

Encrypted content tends not to have a magic number (other than when we detect it within a disk partition). If we analyse both compressed and encrypted fragments of files, we will see high degrees of randomness.

An important method for detecting compressed and encrypted files is to measure the randomness of the bytes in the file. This measure is known as entropy, and was defined by Claude E. Shannon in his 1948 paper, "A Mathematical Theory of Communication". The maximum entropy occurs when there is an equal distribution of all byte values across the file, and at that point it is not possible to compress the file any further, as it is effectively random.

We determine the frequency (f_i) of each byte value in the file and then use Shannon's formula:

H = - Σ f_i × log2(f_i)

where H is the entropy in bits per byte.

For example, "00 01 02 03" gives f1 = 0.25, f2 = 0.25, f3 = 0.25 and f4 = 0.25, which gives an entropy of 2 bits per byte. In Python, the core summation is:

ent = 0.0
for freq in freqList:
    ent = ent - freq * math.log(freq, 2)
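Putting this together, a minimal sketch (the variable names are illustrative) which computes the entropy of the "00 01 02 03" example:

```python
import math
from collections import Counter

data = bytes([0x00, 0x01, 0x02, 0x03])

# Frequency of each byte value as a fraction of the data length
counts = Counter(data)
freqList = [count / len(data) for count in counts.values()]

# Shannon entropy: H = -sum(f * log2(f))
ent = 0.0
for freq in freqList:
    ent = ent - freq * math.log(freq, 2)

print(ent)  # 2.0
```

Four equally likely byte values each contribute -0.25 × log2(0.25) = 0.5 bits, giving the expected 2 bits per byte.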

If we measure the Shannon entropy of a TrueCrypt volume, we get:

C:\Python27>python en.py "c:\1.tc"
File size in bytes:
3145728
Shannon entropy (min bits per byte-character):
7.99994457357
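The en.py script itself is not listed above; a minimal sketch of such a script (a Python 3 version, with the function name and output format as assumptions based on the printout above) could be:

```python
import math
from collections import Counter

def file_entropy(path):
    """Return (size in bytes, Shannon entropy in bits per byte) for a file."""
    with open(path, "rb") as f:
        data = f.read()
    counts = Counter(data)
    ent = 0.0
    for count in counts.values():
        freq = count / len(data)
        ent -= freq * math.log(freq, 2)
    return len(data), ent

# Usage (path is illustrative):
# size, ent = file_entropy("c:/1.tc")
# print("File size in bytes:", size)
# print("Shannon entropy (min bits per byte-character):", ent)
```

An encrypted container should score very close to the maximum of 8 bits per byte, as in the 7.99994457357 result above.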

--


Prof Bill Buchanan OBE FRSE
ASecuritySite: When Bob Met Alice

Professor of Cryptography. Serial innovator. Believer in fairness, justice & freedom. Based in Edinburgh. Old World Breaker. New World Creator. Building trust.