The Tortoise, the Hare, and the Cyclical Linked List

Edgar Brazda
7 min readDec 13, 2021

--

Introduction

Little did the Greek fabulist Aesop know, when he penned his most famous fable almost three millennia ago, that it would be the namesake for one of computer science’s most enduring and useful algorithms. In this article, I am going to be explaining Floyd’s tortoise and hare algorithm, which is used to detect cycles in linked lists. Note that whilst I have included snippets in this article, an extended version of all the code and tests accompanying this article may be found in my GitHub medium repository.

The Tortoise and the Hare by Jean Grandville (1855)

Catching the Hare

A linked list, of course, is a list of nodes where each node points to the next node in the list. The first node is typically called the head, whilst the last node is typically referred to as the tail. A quick example of how to implement this in Python is:

The ListNode Class

The first thing to note is that the final node points to None by default. Here is a depiction of such an acyclic linked list:

Acyclic Linked List with Five Nodes

That is usually the case. However, we can have an instance where
list nodes point to a node which already exists in our tree:

Cyclic Linked List with Five Nodes

Our first task is to find out whether our linked list has a cycle in it, i.e.
does the final, new node point to a node in the list or None? To do this, we can start implementing our tortoise and hare algorithm. Hares, naturally, are fast. Very fast. They can travel at up to 80 km/h over short distances. By contrast, tortoises are slow. Very slow. At about 0.5 km/h, the tortoise is already testing its limits. In essence, a tortoise could only meet a hare if the two were moving in circles. To apply this concept to a linked list, instantiate two pointers pointing to the head node. (Note: if you are unfamiliar with pointers, a pointer is simply a reference to a specific node in a linked list; it’s basically just assigning the node to a variable.) Let’s call the pointers hare and tortoise. The hare, gifted with its speed, moves forward with alacrity. Move it two nodes forward at each iteration. By contrast, move tortoise forward only one node. If they meet, we know that a cycle exists! If the hare reaches None, then no cycle exists.

Let’s work through the example with the cyclical linked list comprising five nodes, depicted above. Here is where each pointer points at each step:

Catching the Hare I — Cyclic, 5-Node Linked List

First, both of them start at node 1. Then, the hare sprints ahead reaching node 3, whilst the tortoise trudges along to node 2. Next, the hare reaches node 5, whilst the tortoise moves to node 3. Finally, they meet at node 4, which the hare cycling back to 3 and then 4. We have confirmed the existence of a cycle. An implementation of this in Python is:

Catching the Hare II

Here, the ListNode class, from above, is used. As you can see, if the twain meet, we return True. If the hare, i.e. fast, “finishes the race”, we return False. So, there it is, a simple way to detect a cycle in a linked list. More often than not, however, one is not simply required to detect a cycle, but one must actually return the node where the cycle began. This takes the complexity of the task one step further.

A Minor Detour

By now you may be wondering why we do not simply use a set, and return fast if we encounter a node already in the set. Or, indeed, to solve what we’re building towards, we could just return the first value which is already in the set. The reason is space complexity. If you’re dealing with a large list, you can’t simple store every node in a set. For the curious, an implementation of the aforementioned might look something like this:

A Minor Detour

However, as I mentioned, this solution is spatially inefficient, and so we must return to our tortoises and our hares.

Winning the Race

Ultimately, we know how the race turned out. The hare, displaying a sickening level of hubris, takes a nap. The tortoise wins the race and rests thereafter. It’s quite a short piece and you can read it online, for free, in under five minutes. Just make sure to return when you’re done. In any case, let’s turn our attention to how the tortoise won the race, that is to say, how we return the node where the cycle began, instead of just a boolean. This part is, fortunately, laden with interesting mathematics! Let’s first ascertain what we actually want to prove. Floyd’s algorithm essentially states that, after the tortoise and the hare meet, we can start pointers from the head and the meeting point, both traveling one step at a time, then the pointers shall meet at the start of the cycle. Therefore, if we keep the tortoise moving, and start another tortoise from the head, the twain shall meet at the beginning of the cycle. This may sound a bit confusing, so let’s come up with a linked list with a cycle in it. In the following list, the node where the tortoise and hare meet is red and marked with an “M”. In accordance with our usual arrangement, the hare travels two nodes at a time, whilst the tortoise travels one node at a time. Both start at the head of the list. You can work out the meeting point yourself to get comfortable with the list:

A linked list with a cycle. The meeting point of the tortoise and the hare is marked with an “M”.

Once you have familiarised yourself with the list, try and see if Floyd’s algorithm works. Starting one pointer at the head and one at the meeting point, will the two meet at the start of the cycle if both move one node at a time? (They should!) Next, let’s prove Floyd’s algorithm. To do so, let me add some more labels to our linked list.

A linked list with a cycle, labelled.

Once again, we have two pointers, a tortoise and a hare, both pointing to M, and we wish to return the green node. I have labelled our head node H, the distance from H to our target node a, the distance from the target node to the meeting point b, and the distance from the meeting point back to the target node c. Again, for the sake of clarity, Floyd’s algorithm states that two pointers moving one node at a time starting at the two red nodes (H & M) will invariably meet at the green node. Let’s now write some equations based upon our labeling.

First, let’s call the full length of the cycle l. This means that l = b + c. Let’s also call the distance traveled by the tortoise, i.e. the number of steps taken, n. This means that n = a + b. The hare takes twice as many steps, covering a distance comprising a, b, and some number of cycles, at least one, which we shall call k. This means that 2n = a + b + kl, and that k > 0. Finally, to prove Floyd’s algorithm, we need to show that a % l = c. What does this mean? Well, we’re hoping that pointers starting at M and H meet. The tortoise at M has c to travel. The tortoise at H has a to travel. If the tortoise starting at the head is not at the target node when the tortoise starting at M arrives, our tortoise has to continue doing cycles. Therefore, we need a to not necessarily equal c only, but c and a certain number of cycles (which may even be no cycles). Hence, a % l = c. Thus far we have:

  1. l = b + c
  2. n = a + b
  3. 2n = a + b + kl ; k > 0
  4. a % l = c

If we can prove point 4, we shall, in turn, have proved Floyd’s algorithm. First, note that we can insert point 2 into point 3. This yields:

2(a + b) = a + b + kl

Next, we can subtract a + b from both sides:

a + b = kl

Next, we can subtract b from both sides:

a = kl — b

Next, we can slightly rewrite the right-hand side. Naturally, subtracting b from k cycles is the same as adding c to k — 1 cycles. Note, also, that as k > 0, k — 1 will never be negative:

a = (k — 1)l + c

Next, we can modulo l, both sides:

a % l = ((k — 1)l + c) % l

Therefore, of course:

a % l = c

And there you have it! A proof for Floyd’s tortoise and hare algorithm. All that remains is to implement it.

Crossing the Finish Line

The actual implementation is pretty straightforward:

Winning the Race

The crux of the problem is not so much the implementation as the proof. Now, however, you know. If asked, you can both implement and explain Floyd’s tortoise and hare cycle detection algorithm, and why it is preferable to simply using a set. I, finally, wish to exhort you once again to read the text and, thereafter, to check out my GitHub medium repository, where I have posted the entirety of the code, along with various tests and a function to generate linked lists. Festina lente.

Editorial note: In line with the Python documentation, I made the decision to update the code to use is and is not as opposed to ==. Although this makes the code longer than originally, and certain lines now use the former instead of simply interpreting a variable as a boolean, this way is technically correct, which is, of course, the best kind of correct.

--

--