'C', 'o', 'w', 's', ' ', 'g', 'r', 'a', 'z', 'e', ' ', 'i', 'n', 260, 'r', 'o', 'v', 'e', 259, 'o', 268, 261, 'a', 's', 259, 'w', 'h', 'i', 'c', 'h', 269, 257, 259, 267, 286, 271, 273, 266, 276, 270, 272, 's'

--

We live in a wasteful digital world. Our data streams were originally designed with little thought for reducing latency or for optimizing network bandwidth and storage. But in a world where we need almost instant responses for our data transfers, we often apply compression. The most popular standard for compressing data streams is now gzip, which is a combination of LZ77 (Lempel-Ziv 1977) and Huffman coding.

With gzip, we can compress a string into a stream and convert it to Base64 [here]:

Input:  hello
Compressed: eJzLSM3JyQcABiwCFQ==
Compressed: <Buffer 78 9c cb 48 cd c9 c9 07 00 06 2c 02 15>

And then decompress it [here]:

Input: eJwLSS0uMTQyBgAJ6gI3
Uncompressed: Test123
Uncompressed: <Buffer 54 65 73 74 31 32 33>

The compression ratio does not look good in these cases, as we need repeated character sequences, and enough data to justify the compression overhead. Now let's create strings of repeated characters, and we see that the compressed stream length stays almost the same as the input grows [here]:

--

Prof Bill Buchanan OBE FRSE
ASecuritySite: When Bob Met Alice

Professor of Cryptography. Serial innovator. Believer in fairness, justice & freedom. Based in Edinburgh. Old World Breaker. New World Creator. Building trust.