Serverless Genetic Algorithm

Hi, this is a write-up of my attempt to make a genetic algorithm run on AWS Lambda serverless architecture. A bit of the what, why, and how, if you will. You can find all the code of this project over here. But first, let's refresh our memories on what a genetic algorithm is and why bother at all.

What are genetic algorithms?

Generally speaking, genetic algorithms are a family of search heuristics that mimic the biological process of adaptation to external conditions. If we boiled the definition down further, we could say it is a way to find a solution to a problem where you know what the end result should look like, but you don't know how to get there.

Here is a classical example. Let's say we want to sort the letters of the alphabet, and for the sake of the experiment let's pretend we don't know any sorting algorithms, aka we don't know any analytical way to solve the problem. A classical implementation would look somewhat like this:
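A minimal sketch of that loop in Python (the function names `grade`, `mutate`, and `evolve` are my own illustrative choices, not taken from the project):

```python
import random
import string

TARGET = list(string.ascii_lowercase)

def grade(candidate):
    # Fitness: how many letters are already in the correct position.
    return sum(c == t for c, t in zip(candidate, TARGET))

def mutate(candidate):
    # Create a small variation: swap two random positions.
    a, b = random.sample(range(len(candidate)), 2)
    child = candidate[:]
    child[a], child[b] = child[b], child[a]
    return child

def evolve(population_size=10, max_iterations=10000):
    best = TARGET[:]
    random.shuffle(best)  # start from a random ordering
    for i in range(max_iterations):
        if grade(best) == len(TARGET):
            return best, i  # perfectly sorted, done
        # A small batch of variations on the current best known solution;
        # keeping `best` in the pool means fitness never goes backwards.
        batch = [mutate(best) for _ in range(population_size)]
        best = max(batch + [best], key=grade)
    return best, max_iterations
```

Nothing here ever compares two letters the way a sorting algorithm would; the alphabet falls into place purely through variation and selection.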

This is a very simplified version of the process, just to demonstrate the principle. But, in general, it works the same way natural selection does. We have a gradient defined by the grade function, and then we iterate by creating small batches of variations on the currently best-known solution and picking the best one at each iteration. Provided that we have variability and direction, we will inevitably end up with a perfectly sorted alphabet, without actually sorting it.

Why does this work?

This is actually a fascinating topic, and if you have time, I highly recommend reading The Blind Watchmaker to fully understand the principles. But for the sake of simplicity, let me explain it this way.

Let's say you put a monkey in front of a typewriter and ask it to type 26 letters. The chances that it will type a perfectly sorted alphabet are astronomically small. Millions of monkeys later, we still probably won't see a perfect solution to the problem.

Now let's add the element of memory into the mix. Let's say the monkey makes 10 attempts, and it can memorize one of the better solutions. The monkey's memory is imperfect, so with the next batch it's going to screw up the memorized sequence and create 10 variations on it.

In this new system the monkey can narrow down to the perfectly sorted alphabet in 50–100 iterations.

Not too shabby for a creature that has no clue of what it is doing, ey?

We are a bit smarter than monkeys though: we have a better understanding of randomness and selection processes, and we can fine-tune the evolution to serve our needs.

Why is this useful?

Well, any software engineering sophomore can bubble-sort their way out of the task of sorting an alphabet. But there are so many tasks out there where the complexity is so high that we humans are no better than monkeys trying to sort an alphabet. Actually, it is the other way around: there are infinitely more problems that we have no idea how to solve than problems we have formal methods for solving. We need magic wands like genetic algorithms to even approach them. Here are some examples:

Genetic algorithms are used in construction to optimize the weight-bearing properties of complex structural elements. They are used in car manufacturing to minimize the amount of metal and plastic needed to produce car parts of the same strength.

There is a famous case where a satellite micro-antenna was designed via a GA process. Look at it: we humans have very little idea how it works, but it does, and it outperforms anything designed by hand.

I myself used genetic algorithms to create a more efficient keyboard layout.

As you can see, there are plenty of problems where the sheer complexity of the task is so high that it doesn't lend itself easily to analytical approaches.

What does serverless have to do with this?

The problematic part about genetic algorithms is that they are quite CPU-intensive. We're basically fooling around with randomness and trying a lot of variations, which means we are bound by the number of CPUs we have.

For example, in my keyboard design experiment I had to analyze thousands and thousands of variations, and I only had 4 CPUs in my laptop. So an average run would take a day or two before it exhausted the options. I would have to fiddle with population sizes and selection algorithms to get anywhere. And the thing is, keyboard layout analysis is not that complicated a thing; you just run a layout against a couple of megabytes of real-life text and you have your answer. There are way more intense problems out there.

Here is where the population size comes into play. If you have a 26-letter alphabet, you might get away with population sizes of 5–10. But if you're dealing with a 100+ key keyboard layout, you probably want something around 50. Some real-life problems might have thousands of variables in the solution, where you want population sizes in the hundreds. This quickly becomes a choke point on normal computers with a limited number of CPUs.

You know what has a loooot of CPUs though? That's right, serverless. Imagine we ran a genetic algorithm in a serverless environment. Technically, there is almost no difference time-wise between processing 10, 50, or 200 variations in a generation. Maybe even thousands.

Yes, it will be slower to boot, but if you're solving a serious enough problem and it takes at least a few seconds to grade a solution, booting a hundred parallel computations will still be way faster than going sequentially.
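A quick back-of-the-envelope calculation makes the point; all numbers below are made up for illustration, not measurements:

```python
# All numbers are assumptions for illustration, not measurements.
grade_seconds = 5       # time to grade one candidate solution
population = 100        # candidates per generation
cold_start_seconds = 2  # rough lambda startup overhead

# One CPU grades candidates one after another.
sequential = population * grade_seconds        # 500 seconds per generation

# A hundred lambdas grade all candidates at once.
parallel = cold_start_seconds + grade_seconds  # 7 seconds per generation
```

Even with a generous cold-start penalty, the parallel generation finishes roughly 70x faster in this toy scenario, and the gap only grows with population size.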

My naive solution

Now, a fair warning: I'm not an expert in either serverless or AI. As a co-worker of mine likes to put it, I just know enough to be dangerous. I'm pretty sure my solution could be taken much further. But here is my proof-of-concept implementation:

Basically, I used regular AWS Lambda functions and direct invocation. There are several sections of the algorithm that are implemented as individual functions:

  1. kicker — just a function to start the process
  2. looper — tracks the iteration process and is responsible for termination
  3. populate — generates a fresh population based on a current winner
  4. aggregate — spawns parallel calculations for each solution in a population and then picks a winner.
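The interesting part is the fan-out in aggregate. Here is a sketch of the idea, with a pluggable `invoke_grader` callback standing in for the actual Lambda call (in the real thing that would be something like boto3's lambda `invoke`); the names and structure are mine, not the project's:

```python
from concurrent.futures import ThreadPoolExecutor

def aggregate(population, invoke_grader, max_workers=32):
    """Fan out one grading call per candidate, then pick the winner.

    `invoke_grader` stands in for a direct AWS Lambda invocation;
    it takes a candidate solution and returns its numeric grade.
    """
    # Threads are fine here: each worker just waits on network I/O,
    # so a large pool is cheap and the lambdas run truly in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        grades = list(pool.map(invoke_grader, population))
    best_index = max(range(len(population)), key=lambda i: grades[i])
    return population[best_index], grades[best_index]
```

The winner and its grade then go back to looper, which either terminates or hands the winner to populate for the next generation.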

It is very simple internally; I didn't even go for any complex winner-selection schemes, just picked the straight-up winner on each iteration and that's it. Impressively enough, with a population size of 100 variations (without duplicate tracking) it would go from a fully reversed alphabet to a perfectly ordered one in 15–20 iterations. That is one lucky monkey!

You can poke at this project over here, it is fully deployable and ready to go.

Closing thoughts

There are obviously quite a few things to consider here.

Firstly, the cost. Yes, we all know that Lambda is way cheaper for many applications. But is it going to be cheaper than renting a 64-core VM for a day or two? I don't know.

Secondly, the termination loop. One probably needs to be really careful about having sensible termination conditions on that thing. You can't press Ctrl+C anymore if your code gets stuck in a loop.

Thirdly, duplicate tracking is going to be a problem. Normally you would store hashes of already-processed solutions in memory to get useful new variations at every step. In a serverless environment, that memory has to live somewhere outside the current process, maybe in a database or something similar, which will slow down the population-creation process.
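For illustration, the bookkeeping could look like this, with a plain in-memory set standing in for the external store (a real version might swap it for, say, a DynamoDB table keyed by the hash — an assumption, not what the project does):

```python
import hashlib

def solution_hash(candidate):
    # A stable fingerprint of one candidate solution.
    return hashlib.sha256("".join(candidate).encode()).hexdigest()

def fresh_variations(current_best, mutate, seen_hashes, count=10, max_tries=1000):
    """Generate `count` variations that have not been graded before.

    `seen_hashes` stands in for an external store; in a serverless
    setup every hash lookup and insert becomes a network round trip,
    which is exactly the slowdown described above.
    """
    out = []
    tries = 0
    while len(out) < count and tries < max_tries:
        tries += 1
        child = mutate(current_best)
        h = solution_hash(child)
        if h in seen_hashes:
            continue  # already graded this one, skip it
        seen_hashes.add(h)
        out.append(child)
    return out
```

The `max_tries` cap matters: near the end of a run, most mutations have already been seen, and without a cap the populate step could spin forever looking for something new.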

And finally, how much of our knowledge about genetic algorithms actually applies in a vastly scalable infrastructure? Computer science has quite a few tricks up its sleeve when it comes to optimizing the selection process of genetic algorithms. But those tricks were developed in the physical-computer era, when people had to use small batches and be very careful about the gradient descent. Now that one can process literally thousands of variations in parallel, how much of that knowledge still applies? Or maybe there are new tricks we can develop to exploit the large population sizes that are available now?

Interesting times we live in.