Push Swap in less than 4200 operations

Ulysse Gerkens
13 min readAug 1, 2023

--

You love it or you hate it: it’s Push Swap. There is hearsay about the annoyance of this project, or worse, faults that it doesn’t have. However, once you understand the bizarre project rules, your interest in it may gradually increase. It worked for me until it became a major concern in my life and start to infiltrate my dreams. No joke.

In this article, I share the strategies I adopted to build a program that sort 500 numbers in less than 4200 operations (3784 mean, 3871 worst case). But above all, what I wish I had known from the beginning, my mistakes, and how to achieve a similar result in MUCH less time.

Requirement

Understand the subject and movement mechanics (this tool could help: github.com/stevebalk/push-swap-clicker). Understand how to sort three and five numbers is a good start. For this purpose, I recommend this article: The least amount of moves with two stacks by Jamie Dawson.

Interest of the project

  • Compare classic algorithms and adapt one to the specification of this project. Here is a well-designed website to compare the most classic of them: toptal.com/developers/sorting-algorithms¹.
  • Our algorithm performance is solely judged by its frugality of moves. Thus, it doesn’t need to be shy about checks: measure twice, cut once. While this makes it a poor candidate for digital sorting, it may shine for physical sorting, where the costly moves are the resources to minimize².

Advertisement

  • I will not go into the details of the implementation. It’s your job! However, everything is available and documented on my GitHub with more technical discussion. I’m available for any questions or remarks.
  • I haven’t found a more efficient algorithm on GitHub… BUT: there are some³⁴⁵ I just haven’t found the corresponding code and/or detail about the algorithm used. If you want a more efficient and mathematical approach, there is this article in Korean⁶ (but math is a universal language, isn’t it?).
  • If you find or do a more efficient algorithm, I’m very interested to see how it works! Do not hesitate to reach me via github.com/ulyssegerkens.
  • This is not an introductory tutorial to the project. I'll get straight to the point without explaining the mechanics of Push Swap. Ready?

First chapter — Build your stack and operations

We will need a good foundation if we want to carry out this mission successfully. The first step is quite trivial: you have to create the data structure of the stacks and the actions (= function) that will allow you to modify this data.

I already contradict myself by telling you about the implementation, but these first choices will be decisive for the rest, so I’m making an exception:
For a cleaner code, I recommend splitting this project into three distinct parts, with their header file and src directory: the stack library (data structure and function to act on it), the push_swap program, and the checker program. Both programs use /stack to run, but /stack is a kind of independent library.

Second tip: the push_swap program sorts a list of random numbers. Actually, the value of these numbers is indifferent to us; the only thing that matters is their order. So I recommend you map the values into their rank.
Example: ( 1024, 19, 42, -512 ) will become (4, 2, 3, 1). This will allow us to make much easier comparisons during the sorting.

Finally, I advise you not to implement your stacks as linked lists but as Circular Buffer⁷⁸ in an int array. This is optional but will significantly increase the speed of your program, which will come nice when you’re debugging your program and doing lots of testing.

Second chapter — Chose your algorithm

The best sorting algorithm often depends on the specific constraints and requirements of the problem you’re trying to solve. So the key to choosing the suitable algorithm is understanding our data structure and the operations we can perform on it.

Could you make it simpler?

The most simple approach to sort the stack A could be the following :

while (is_empty(stack_a)){
while (stack_a[0] != min(stack_a))
rotate_a();
push_b();
}
while (is_empty(stack_b)){
push_a();
}

And voila! We have an inefficient but already functional sorting algorithm! It’s like arranging a bookshelf by emptying and refilling it for each book — we could certainly do better. Let’s plan a real strategy!

Divide and conquer approach

My observation of the algorithm my pairs wrote led me to the first key to sort efficiently: split your set of random numbers into smaller pre-sorted chunks, then sort these smaller chunks, the bigger one to the smallest. The next question was how. How to choose the number of chunks, their size, and where to place them? During my research, I observed a lot of complex strategies (such as creating eleven chunks…). The good news: one of the most straightforward strategies is the most efficient!

meme push_swap algo
How I felt after searching “Push Swap algo”⁹ ¹⁰.

To the 3rd & 4th dimensions

After some reflection, I arrived at the following idea: Thanks to the (reverse) rotation operations, which is uncommon for a classic stack problem, we do not have two stacks but four! I swear! But let’s call them ‘position’ to not be confused. So there are: TOP_A, BOTTOM_A, TOP_B, BOTTOM_B. We can easily move any value from one position to another in one, two, or three operations. Now the question is: How to take full advantage of this accessibility?

Implementation tips: enum type for positions and a function to move a value from one position to another.

Three-Way Quick Sort

Here is the second key for the project: the power of recursivity. Programming recursion can be tricky to handle, but when mastered, it’s a superpower that enables you to create clean, short, and efficient functions. Recursivity is also the most common technic to implement the algorithm we’ll use: Quick sort.

The key idea to understand recursivity, but also Quick Sort is the following: reduce the problem, until it becomes extremely trivial. Concerning our Push Swap the difficulty is to automatically sort a big set of numbers. But our dear C language can only compare 2 values. So, let's divide our big set of numbers, again, again and again, until obtaining groups with only 2 numbers that we can compare and eventually swap. Finally, merge these very small groups in the right order.

I’ll let you understand how works the classic Quick Sort by yourself. It’s a very classic algorithm and there is a bunch of resources on the internet. Here, thanks to our 4 locations, we can implement a Three Way Quick Sort. The core steps of the algorithm are:

  1. We split the chunk from its location to the three other locations. So we have chunk min, mid, and max. We always send the chunk max to TOP_A (BOTTOM_A if we’re already on TOP_A). An essential feature of the next step is that values are guaranteed to be in their corresponding chunk. What I mean by this is that you will always find all the values of a full range in the corresponding chunk.
    Let’s give an example: To start, I have a random set of values on TOP_A. I split these values, one by one, into 3 locations: the bigger ones to BOTTOM_A (by rotating A), the middle ones to TOP_B (by pushing to B), and the smaller ones to BOTTOM_B (by pushing to B and then rotating B). Now that I’m done with my first split, I have a range of big values on BOTTOM_A, let’s say 100–66, still in random order; a range of middle values on TOP_B, let’s say 65–33; and the smallest values on BOTTOM_B, 32–0. Nothing is left on TOP_A.
  2. After split(), we recall our recursive function for each split chunk in the proper order: the chunk max first, then mid, then min.
    So, continuing from the previous example, we call the same split function for the max chunk located on BOTTOM_A. It will split the values into TOP_A for bigger values (100–89), TOP_B for middle values (89–78), and BOTTOM_B for smaller values (78–66). We repeat this process on each range, etc. When the first chunk max is completely divided and sorted, we will split chunk mid from the first division, and so on.

These two steps, the problem reduction (by splitting) and the recall, make our recursion. But something is still missing to make a recursive function — the condition that will stop recalls, aka the base case or the termination condition. I let you have the pleasure of finding it by yourself. The question is something like "How far can I simplify this problem? When will I no longer be able to perform recursion?"

In your implementation, the recursive function should look something like this :

void recursive_chunk_sort(t_chunk *chunk_to_sort)
{
if (base_case == true)
{
perform_very_simple_sort();
return;
}

split_chunk(chunk_to_sort, &split_chunks);

recursive_chunk_sort(&split_chunks.max);
recursive_chunk_sort(&split_chunks.mid);
recursive_chunk_sort(&split_chunks.min);
}
// simplification of https://github.com/ulyssegerkens/push_swap/blob/main/src/push_swap/chunk_sort.c

Once found, you will be able to admire the magic of Quick Sort: everything sorts by itself, even though you have only made simple divisions. It's quite an emotional experience the first time!

I chose Quick Sort thanks to this article that contains schemes to figure out how the splitting process works: “push_swap 가이드” 작성자 minckim. In the first case [1], [2] and [3] are mixed (it is not clearly indicated). In the second situation, the 3 separated split chunks contain three ranges of values.

Now that we have a functional algorithm, it’s time to test it… and be terribly disappointed. It sorts 500 numbers in 5636 operations. A poor score does not allow us to succeed in the project with 100%…

This is the point in the story when the hero is exhausted, his dearest friend is dead, and the villain seems invincible… However, as with any story worth telling, there is a silver lining, a light at the end of the tunnel. A final effort may allow us to achieve victory!

dunning kruger effect
Welcome to the Valley of Despair from the pseudo Dunning-Kruger Effect

Third chapter — Optimization

The optimizations of our algorithm will always be done two times: observe its operations meticulously¹² to locate a useless movement, then fix it in the program. Here is a brief summary of what can be observed.

Strengths: Make use of all the available space.
Weaknesses: Operation for all numbers, even if they are already sorted; Do not enjoy the swap operation and double operation; Do parasite operation on chunks due to the positioning system; Clearly not optimized to sort small chunks.

Below is a summary of each optimization I found. It’s probably from here that you can also find your own original solution. Let’s start!

Last in first! Why?

… because it’s the most obvious, of course!

Let’s start with Post sort-optimization. These optimizations are independent of my algorithms and you can apply them with any other method! It involves not directly printing the operations when you call them, but saving them in a linked list. When the stack is sorted, we’ll parse the list, apply two optimizations, and then print it.

The first is the merge operation. It will fix the unused advantage of double operation. It’s pretty simple: sa + sb = ss; ra + rb = rr; rra + rrb = rrr. And that’s it.

The second consists of purging neutral operations: pa + pb; sa + sa; ra + rra; … A more advanced version watches after the neutral operator not just on the next node but while there is no operation on the current stack. I did it but, if I remember well, the spared operations are quite low.

Extend the base case

If you’re reading this line, I hope you found the base case by yourself; If not, spoiler alert! The base case of the algorithm sorts a chunk “statically” when it cannot be split: so when its size is 1 or 2, depending on your implementation. As our dear Quick Sort is inefficient for small chunks and our human brain can easily find the best sort for each combination of numbers until 3 or 4, we can have a significant impact by extending our base case to these sizes.

To The TOP!

After the first split, we have a chunk max on BOTTOM_A, a chunk mid on TOP_b, and a chunk min on BOTTOM_B. The next stage will be the splitting of chunk max. As there is nothing else on TOP_A, let’s say this chunk is on TOP_A in place of BOTTOM_A. It will avoid a lot of rotation! Same for chunk min that will be split much later but also on BOTTOM_B with a free TOP_B.

From here we go into more advanced optimization… but the good news is we just reached the 4200 operation 🥳 So this title wasn’t a clickbait!

Better pivots

Okay, here I must confess something to you: I do not understand my whole program. This optimization is the fruit of intuition: changing the size of the split chunk according to its position and size could be advantageous. I use pivot values to split a chunk, so I try to define these pivots dynamically. In the first instance, I searched for a holistic solution, but after many tries and errors, I found an optimized code with values… that I cannot wholly explain.

I share with you one of these intuitions that seems verified by my tests: If the chunk to split is small, it’s useless to split it in 3. To prevent any values from being placed in the chunk min, I set my first pivot to the smallest value of the chunk.

Sort Asap

During the split process, when we put a value in the chunk max, if the value is by chance on its final position (we are on TOP_A, and after this value, TOP_A is sorted), we’ll reduce the chunk max by this value. So, the chunk max will be smaller, and we’ll avoid unnecessary sorting. This is the basic version. The advanced one does the same but for the 2 and 3 last values placed on TOP_A.

Easy sort

We’ll take advantage of some chunk take that could be already sorted. This optimization increases or variance and does not theoretically improve or worst case, but in practice, thanks to the randomness of our set, it will improve the average performance.

Again, there is a simple version and are more advanced one. The simple one checks if, on the current location, the chunk is already (reverse- ) sorted. If that’s the case, it moves all the chunks on TOP_A. Then, I added two extensions: “at least the beginning” and “it’s just a swap”. The first one does an Easy Sort at the beginning of the chunk (while the first value of the chunk is the next value of TOP_A). The second, based on the first, watches not only the first value but also the second one and takes it if it’s the next value of TOP_A.

Operations report

If you follow this article as a tutorial, here is a list of each implementation’s difficulty represented by 1, 2, or 3 emoji.. which one? hummm… this one 🐬¹³. I advise implementing them in this presented order. I also indicate the average result of each optimization for 500 numbers set (cumulatively).

Before any optimization: 5636 🏁

  • 🐬 Extend base case (for 2): 4700
  • 🐬 Extend base case (for 3): 4675
  • 🐬 To The TOP: 4425
  • 🐬🐬 Post-op: Merge: 4207
  • 🐬🐬 Post-op: Eliminate neutral: 4138 🥳
  • 🐬🐬 Split: Better pivots: 3901
  • 🐬🐬🐬 Split: Sort Asap: 3836
  • 🐬🐬🐬 Easy sort extended: 3784

After optimizations: 3784 🏳️

And for your lovely eyes (yes, I see you since you forgot to put these stickers on your webcam), here is a flowchart of the whole program, optimization in yellow. You will also find screenshots of the algorithm in action.

Fourth chapter — The Checker

As this bonus is quite easy, I will not delve too deeply into it. I’m just drawing your attention to an implementation issue: we could be tempted to reuse the project get_next_line to read the file, but unfortunately, I have not recommended it. I explain why on my Github repository: github.com/ulyssegerkens/push_swap.

Postface — Ideas to go further

It’s here that I transmit to you the torch.

Personal ideas to improve Quick Sort:

  • A custom algorithm to sort small chunks (it could be more efficient on sizes 4, 5, 6, …?)
  • The choice of pivots has a significant impact on the performance. A more dynamic way to calculate them could be gainful. What about splitting into the two closest locations instead of three?
  • In the process of splitting from a top location, if we have several values that go to the bottom of the other stack, then to the bottom of the same stack, we could spare some operation by using rr: ( n*push, then n*rr).
  • Another post-sort optimization that I haven’t implemented, but might save a few operations: ra + pb + rra = sa + pb (and its opposite).

I’m very curious if one of these ideas leads to success… or is just dumb. Do not hesitate to reach me if you try one of them.

Other algorithms that look more efficient :

  • I have been greatly impressed by the projects developed by the students at 42-Seoul, but the push_swap by jeongble⁵⁶ seems to over-perform overall.
  • Two other better solutions³⁴ than mine are not code-sourced and seem to use a chunk pre-sort, then an L.I.S. algo⁹.

I let you on this beautiful reflection from Chat GPT:

Push Swap is like a complex dance: a series of intricate movements that can lead to harmony or chaos.

Did you like this adventure? Follow the next one here :

--

--