# Randomness, Information Theory, and Kolmogorov Complexity

This article is written as the final project for Theory of Computation at Smith College, Spring of 2020

**Introduction**

Randomness is a tricky concept to define. Imagine you have to pick a random word — is the word “apple” random? What about “antidisestablishmentarianism”? If one is going solely by numbers, is “101111010100” random while “000000000000” is not? Technically, if one were to flip a coin 12 times, the latter is just as likely to occur as the former, but it is also clearly determined by a pattern. It can be rewritten as “write 0*12”, a description which is shorter than the string itself. This isn’t possible with the former string. This leads one to believe that, while “000000000000” is just as random from a probabilistic standpoint as “101111010100”, in practice it is much less complex.

The Kolmogorov complexity of a given string is the length of the shortest computer program that outputs that string. A string is considered Kolmogorov random if no program shorter than the string itself outputs it; in other words, a Kolmogorov random string has no easy pattern which can be turned into a shorter program describing it. Of course, the Kolmogorov complexity of a given string depends on the language used to describe it. There’s a big difference in the length of a program encoding a string when it’s written in Python, which was designed for human readability, vs. Pyth, which was designed to describe algorithms as succinctly as possible.
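Stated formally (the notation below is a standard one, added here for illustration; U is a fixed universal machine standing in for the chosen language, and |p| is the length of program p):

```latex
K_U(x) = \min \{\, |p| \;:\; U(p) = x \,\}
```

Under this notation, a string x is Kolmogorov random when K_U(x) is at least |x|.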

As an example, we’ll describe two strings in Python. The first is Kolmogorov random while the second is not.
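A minimal sketch of the comparison (the specific strings below are illustrative placeholders, not canonical examples):

```python
# A string with no apparent pattern, produced by keysmashing.
# The shortest known Python program that outputs it simply writes it
# out in full, so it appears to be Kolmogorov random.
keysmash = "ajd8fk2lqpz93mvnx7rty4bq0wue"
print(keysmash)

# A string generated by a simple pattern: the expression "01" * 500
# is a far shorter description than the 1000 characters it produces,
# so this string is not Kolmogorov random.
patterned = "01" * 500
print(len(patterned))  # 1000
```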

As can be seen, there is no simple pattern in the first string. It was created by keysmashing, and there is no Python program which could describe it in fewer characters than its own length (or at least, none is apparent). The second, meanwhile, is generated by a simple pattern, showing that it is not Kolmogorov random.

**Historical Context**

Discussing Kolmogorov complexity is aided by an understanding of information theory — the study of the transmission, compression, and description of information, among other uses. The field is multidisciplinary, with applications in engineering, mathematics, and computer science. Algorithmic information theory is the subfield of information theory concerned with computer science.

Andrey Kolmogorov was a Soviet mathematician and one of the preeminent Russian intellectuals of the 20th century. His work underlies modern-day probability theory; he published the foundational book *Basic Concepts of Probability Theory*. He was appointed the head of Probability and Statistics at the USSR Academy of Sciences while also working at the University of Moscow, focusing on dynamical systems while making contributions to many other fields of mathematics. During this period, while Kolmogorov was primarily focused on dynamical systems, the field of information theory was beginning to blossom.

Although Kolmogorov complexity is not named after him, Ray Solomonoff was the founder of algorithmic information theory and of the mathematical framework that now underlies machine learning. He originated the idea of Kolmogorov complexity, though his work gained little prominence until it was rediscovered by Kolmogorov, who credited Solomonoff once his contributions became apparent. Kolmogorov arrived at the idea through his interest in randomness; Solomonoff arrived at it through his interest in inductive inference. In particular, Solomonoff was interested in the problem of taking a universal Turing machine and an output, and then predicting the input using unsupervised learning. Notably, Solomonoff proved that in inductive inference, for any given real-world problem, the simplest answer to a question is probabilistically the most likely to be correct: Occam’s razor ties neatly into Kolmogorov complexity.

**Properties**

Here are several proofs displaying the properties of Kolmogorov complexity.

Determining whether a string is Kolmogorov random or not is undecidable, as shown by the following proof:

Suppose, for contradiction, that there is a decider D which, given a machine and two strings, accepts if the first string is a program that outputs the second string, and rejects otherwise. Then, given a string x of length n, we could decide whether x is Kolmogorov random by running D on every program of length less than n and accepting iff D rejects all of them. But D cannot exist: to check whether a program outputs x, D would have to determine whether that program ever halts, so D could be used to solve the halting problem. Since this is not possible, creating a machine which determines whether a string is Kolmogorov random must also be impossible.
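The proof can be made concrete with a toy brute-force sketch. Here the “programs” are simply Python expressions and `run` is a hypothetical stand-in for simulating a machine; crucially, these toy programs always halt, which is exactly the guarantee real programs lack:

```python
def run(program):
    """Toy evaluator: a 'program' here is just a Python expression.
    Evaluating one always terminates, which is precisely what cannot
    be assumed about programs on a real machine."""
    try:
        return eval(program)
    except Exception:
        return None

def shortest_description_length(target, candidates):
    # Try candidate programs from shortest to longest and return the
    # length of the first one whose output is the target string.
    # Over all possible programs this quantity is uncomputable; over a
    # finite candidate list it only gives an upper bound.
    for prog in sorted(candidates, key=len):
        if run(prog) == target:
            return len(prog)
    return None

candidates = ['"000000000000"', '"0"*12', '"01"*6']
print(shortest_description_length("000000000000", candidates))  # 6, for '"0"*12'
```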

This result also shows that it is as hard as the halting problem to create a program which, given a machine and a string as input, outputs the Kolmogorov complexity of that string on the machine. While such a program would be remarkably convenient, its uncomputability has led to a new sport in the programming world: code golf, a competition in which programmers try to implement a given algorithm, or produce a given string, in as few characters as possible.

Another property is that for every n, there is at least one string of length n which is Kolmogorov random.

Assume, for contradiction, that no string of length n is Kolmogorov random, meaning every string of length n can be output by a program shorter than itself, and that the alphabet is {0, 1}. For every string of length n, there must then be a program of length less than n which outputs it. There are 2^n possible strings of length n, but only 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1 possible programs of length less than n. By the pigeonhole principle, at least one program would have to encode two different strings. However, that is impossible: a program produces exactly one output. Thus, there must be at least one Kolmogorov random string of length n.
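The counting at the heart of the argument is easy to check directly (a quick sketch):

```python
def programs_shorter_than(n):
    # Binary programs of lengths 0 through n-1:
    # 2^0 + 2^1 + ... + 2^(n-1), which equals 2^n - 1.
    return sum(2**length for length in range(n))

for n in [1, 4, 12]:
    # There are always fewer short programs (2^n - 1) than strings of
    # length n (2^n), so some string of length n has no shorter program.
    print(n, 2**n, programs_shorter_than(n))
```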

One simple and very useful property in the context of computability is that, when moving from one machine A to another machine B, the Kolmogorov complexity of a given string x can increase by at most a constant C that does not depend on x. This is because C need only be as large as a fixed compiler, written for machine B, which translates programs for machine A.
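Stated symbolically (the subscript notation below is added here for illustration, writing K_A and K_B for complexity measured on machines A and B):

```latex
K_B(x) \;\le\; K_A(x) + C_{A,B} \qquad \text{for every string } x,
```

where the constant C_{A,B} is the length of the compiler and is independent of x. This invariance theorem is what lets one speak of “the” Kolmogorov complexity of a string up to an additive constant.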

**Applications**

There are many interesting applications of the ideas of Kolmogorov complexity. Among the most well-known are compression algorithms, graph labeling, number theory, and universal search. They have also been applied to everything from physics to looking for extraterrestrial life. As can be seen in the research of Solomonoff, Kolmogorov complexity is also foundational to the field of machine learning and artificial intelligence.

File compression is, predictably, one of the major application fields of Kolmogorov complexity. The question of how to minimally compress a file in a given descriptive language is a hard one to solve, especially given the uncomputability of the Kolmogorov complexity of a given string. There are two forms of compression algorithms: lossless and lossy. Lossless compression algorithms allow a piece of data to be compressed and then decompressed without losing any information, while lossy algorithms lose some information during compression. For a good example of the use of Kolmogorov complexity in defining and examining the bounds of computable compression functions, this paper explores the complexity of variations of lossless compression algorithms. Kolmogorov complexity effectively describes the limits of lossless compression, making it an important tool in evaluating the performance of compression algorithms.
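The lossless round trip is easy to demonstrate with Python’s standard zlib module (a minimal sketch; real-world compressors differ in detail):

```python
import zlib

# Highly repetitive data: a short description generates a long string,
# so a general-purpose compressor should shrink it substantially.
original = b"the quick brown fox jumps over the lazy dog " * 100

compressed = zlib.compress(original, level=9)
restored = zlib.decompress(compressed)

# Lossless: decompression recovers every byte of the input.
print(restored == original)            # True
print(len(original), len(compressed))  # the repetitive input shrinks considerably
```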

Zipbombs are an interesting case in file compression. While this exploit is no longer possible on most machines, it used to be that antivirus programs could be shut down by forcing them to scan through zipbombs. A zipbomb is a zip file filled with more zip files; after several layers of nesting, the innermost archives contain simple files that compress incredibly well. For example, it is incredibly easy to compress a file containing 7,864,543,924 0’s: the previous sentence just described it in full. These files, with an incredibly small Kolmogorov complexity, decompress into much larger files which overwhelm the antivirus’s memory and cause it to shut down mid-scan, allowing a virus to take over the machine. Most antivirus programs have since changed the way they open zip files as a result.
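The effect a zipbomb stacks layer upon layer can be seen in a single layer with zlib (a small sketch; ten million bytes rather than billions, to keep it quick):

```python
import zlib

# A long run of a single character has tiny Kolmogorov complexity,
# and general-purpose compressors exploit the same redundancy.
data = b"0" * 10_000_000  # ten million zero bytes

compressed = zlib.compress(data, level=9)

# Ten megabytes shrink to roughly ten kilobytes; a zipbomb repeats
# this kind of blow-up across many nested layers.
print(len(compressed) < 20_000)  # True
print(len(data) / len(compressed) > 500)  # True
```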

If anyone reading is feeling particularly inspired to write file compression algorithms, there is a 500,000 euro prize for the first person to compress 1 gigabyte of Wikipedia down to under 116 megabytes.

## Assessment

Kolmogorov complexity is an incredibly useful idea which has far-reaching implications across all of math and computer science, especially in the fields of statistics and computational complexity.

The uncomputability of the Kolmogorov complexity of a string is one of its shortcomings. In response, a measure called Levin complexity was introduced, which addresses this by time-bounding Kolmogorov complexity. However, Levin complexity is still limited by questions of computational complexity. It’s hard to discuss many shortcomings of Levin or Kolmogorov complexity because they’re not ones with a quick fix: we’re limited by the bounds of what computers can do. Kolmogorov complexity remains a concept which is incredibly important to our understanding of computation.
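One common formulation of Levin complexity (the notation is added here for illustration; U is a fixed universal machine and time(p) is the running time of program p):

```latex
Kt(x) \;=\; \min_{p} \{\, |p| + \log_2 \mathrm{time}(p) \;:\; U(p) = x \,\}
```

Charging for running time as well as length means a search over programs can be cut off once it exceeds the best cost found so far, which is what makes the measure more tractable than plain Kolmogorov complexity.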

## Why we chose this topic

Kolmogorov complexity seems like such an outwardly simple idea and yet it provides an elegant way to understand the dynamics of information and computation. Ananda had previously read about information theory and wanted to develop a deeper understanding of how computational complexity connected to storing information, and suggested the topic. Sunshine had read about zipbombs, thought they were interesting, and wanted to explore the theory behind them.

**Contributions**

We each made the following contributions to the final project: Ananda did the research, wrote this post and the script, edited the PowerPoint Presentation, and participated in and edited the video. Sunshine created the PowerPoint presentation and participated in the video.

## References:

The following links are to great further reading on the topic of Kolmogorov complexity.

This article, written by Lance Fortnow, is the primary article we used for this project as we initially wanted to focus on the relationship between Kolmogorov complexity and computational complexity. Here’s a collection of Lance Fortnow’s publications, which informed a lot of other aspects of this article.

We used this set of lecture notes for properties of Kolmogorov complexity and we found many of the applications of Kolmogorov complexity here. The lecture notes also talk about several other properties of Kolmogorov complexity and tie it into probability.

Marcus Hutter is another great source on Kolmogorov complexity and algorithmic information theory.