Using a Genetic Algorithm to Optimize Developer Conference Schedules

Or, scheduling conferences the (NP-)hard way

I’ve co-organized a bunch of developer conferences in the past ~7 years and let me tell you: everything about organizing an event is hard. Finding speakers, estimating attendance, securing a venue, scheduling, letting people know about the event, getting them badged, getting them wifi, getting them fed… all of this is hard.

I’ll explain this bellow. For now you’ll have to trust me this is actually cool.

What makes scheduling interesting is that—in theory, at least—it’s the one task that can be automated. It can be formulated as a Computer Science exercise.

Problem: You have a list of talks, and a venue. Order the talks (and add coffee-breaks, lunch-breaks, day-breaks) to optimize for projected attendee happiness given the following constraints:

  • Limited time.
  • Limited space.
  • Limited attention spans (people need breaks at relatively regular intervals).
  • Limited willingness to starve (people want food at relatively regular intervals).
  • Social norms (for example, people want only one lunch per day, at around noon).
  • Keynote-y sessions should start the conference.
  • Free-form sessions (like lightning talks or unconferences) should end the day.
  • Each day of the event should start with exciting sessions.
  • Early morning and after-lunch sessions should be energetic.
  • Some sessions should be scheduled together (because they share a topic that people see as a whole).
  • Some sessions should not be scheduled together (because they share a topic that people see as monotonous).
  • Codelab sessions should be taking place in rooms with spare electrical outlets.
  • For multi-track conferences, some sessions shouldn’t be overlapping (because people can’t clone themselves to attend both).
  • For multi-track conferences, sometimes you want to “reset seating” by forcing people to move from room to room.
  • Each session can have different length (or length constraints).
  • Some speakers have their own constraints, like “my talk needs to go sometime after this other talk” or “I want the main room” or “I don’t do mornings”.

Now you have all the information you need to schedule a conference. Go.

The single-track, 2-day conference that I’m co-organizing these days has ~14 talks, ~7 breaks, and 2 lunches, which comes down to 23! different permutations of schedule. (That’s 23 factorial, not just 23 with an exclamation mark, thank you very much.)

23! = 25,852,016,738,884,976,640,000

If evaluating each schedule took a microsecond, evaluating them all would take 819 million years.

Most conference organizers don’t have 819 million years for scheduling, so they do what people have been doing since before Computer Science ruined everything — they find some local optimum using their organic brains. Usually, they:

  1. add more constraints (like “every session must be 45 minutes” and “day 1 is topic A, day 2 is topic B”) that significantly prune the decision tree — just to keep those organic brains from exploding
  2. start with scheduling sessions with the tightest constraints (like the keynote)
  3. fill in the rest more or less randomly
  4. swap things around until they see no glaring problems

Inevitably, after they’ve done all of the above there comes new information which considerably changes the inputs. A speaker cancels. The venue can’t provide one of the rooms. There are now two keynotes instead of one. And so on. For any sufficiently large conference, this type of new information will come several times before the schedule is published (and at least once after it’s printed).

For any such change, the schedule needs to be updated. That mostly happens via swapping things around because at some point no organizer really has the time and patience to redo the schedule from scratch.

Let’s automate! Right? Why not just feed the problem to the machine? Well… sequence-dependent setup scheduling is NP-hard. It doesn’t just feel like a hard problem, it is provably hard.

Fortunately, there is a lot of software that specializes on this problem. Those are called linear programming solvers and they’re available as plugins for commercial products like MATLAB or SAP, but also as free libraries like Google Optimization Tools.

Unfortunately, Linear Programming frightens and confuses me. It’s hard for me to understand how to frame a problem with this many constraints (that themselves sometimes include their own constraints) in LP terms, i.e. using matrices describing linear equations and inequalities.

Also, LP is not nearly as much fun as Genetic Programming.

Oooh, Genetic Programming. I’ve been a big fan of Genetic Programming for a long time now. I like how you only describe a fitness function and then let simulated evolution do its work. How the solutions converge to seemingly intelligent design although what’s behind is chaotic, simple and dumb (competition, sex and mutation). And of course how it all resembles nature.

So I picked up my 2014 genetic programming library and hacked together a project in one afternoon and one morning.

This is the same gif as above. I’m just saving you scroll miles.

The animated gif you see above is the algorithm running its course in realtime.* It takes its time but it’s still a lot faster than a human would be. More importantly, in half a minute it arrives at a quality of program that no person would have the persistence to achieve.

*) I actually slowed the beginning down so that you can see the progress of the first few generations.

Before I explain what happens, I need to give you some basic vocabulary:

  • Phenotype is, in our case, an instance of the conference schedule. In nature, that would be how a specimen (like you) looks and behaves.
  • Genotype is, in our case, a coded way of expressing the schedule using a list of integers. In nature, genotype is a specimen’s DNA.
  • Generation is a bunch of phenotypes. In our case, it’s 100 different conference programs. In nature, this is “millennials” or “baby boomers”.
  • Mutation is a small change to a genotype. In our case, it could manifest in swapping two neighbouring talks. In nature, it’s slight changes in the DNA that happen much more frequently than you would think.
  • Fitness is a phenotype’s ability to do well. In our case, its computation is rather involved, but basically it’s a score of projected attendee pain. For example, a conference schedule that calls for 16 hours of straight talks with no break has a high projected attendee pain. In nature, fitness is your ability to find someone with whom to have children. (See below.)
  • Selection is a way to decide which phenotypes get to pass their genetic information to the next generation. In our case, there’s a random tournament among the schedules where the winner of each one-on-one “fight” is decided based on their fitness. In nature, selection involves things like “dating pools”, “sex-appeal”, “marriage” and “having children”.
  • Niching (from the word niche) is a way for a genetic algorithm to keep its population’s diversity. In our case, schedules that are too similar to other schedules are penalised through something called fitness sharing (paper). In nature, niching is seen in the fact that people have different preferences in sexual partners. Example: not all women on Earth insist on only having children with the “Sexiest Man Alive 2017” (whoever that is). For reasons why you’d want to keep diversity high, see also: inbreeding.
  • Sex is a way to combine the genetic information of two phenotypes. In our case, it’s cutting up the lists of integers (the genotypes) of two schedules and stitching them together in new ways, chromosome-style. In nature, sex is “principally the insertion and thrusting of the penis, usually when erect, into the vagina” (thanks, Wikipedia!).
Obligatory Charles Darwin (image source)

What happens during the course of the animated gif above:

  1. A generation is created at random. You read that right. The first 100 schedules are completely random, with phenotypes like “conference with zero talks on day 1” or “conference consisting only of a series of lunch breaks”.
  2. This bunch of random phenotypes/schedules are evaluated, producing a fitness score for each.
  3. The best phenotypes/schedules have sex among themselves, creating offspring schedules. Some of the children receive random mutations as well.
  4. The children form a new generation.
  5. Repeat steps 2–4 until happy with the best specimen in the latest generation.

This is all it takes to arrive at a schedule that is actually really, really good. Again, the genetic algorithm itself doesn’t rearrange the schedules according to any knowledge of scheduling. The genes could describe anything else (a turbine design, a vector path, a neural network), and the algorithm would still work towards better solutions. That’s the power of evolution.

As you can see from the graph, the evolution gets rid of the worst genetic information pretty quickly. Ten generations in, and you don’t see the hilariously bad schedules anymore. By generation #200, the best schedule is almost production-quality, and improvements are sparse.

The presented gif and the graph stop at generation #1000. I normally run about 5000 generations which often still gets some additional improvement, although not much.

On full speed, evolving 5000 generations takes about 30 seconds. That’s an improvement over the 819 million years, although going back to Linear Programming, I’m sure a good LP algorithm would be finished in a fraction of that time, so I can’t really brag. Then again, setting up the LP algorithm would take me more than an afternoon and a morning, so evolution still wins in my book. (And also, LP doesn’t do multi-objective optimization as easily as evolutionary algorithms, and this problem is definitely multi-objective.)

For those interested, here’s what a typical session input looks like:

new Session('Architecture for Flutter Apps', 30,
tags: [flutter, architecture, third_party],
avoid: [architecture, third_party],
seek: [flutter])

This says that there’s a session around 30 minutes long with some tags (flutter , architecture, third_party) and that it should avoid being next to talks tagged with either architecture or third_party (because we don’t want to have a string of just architecture, or just 3rd party talks) and seek being next to talks that are tagged with flutter (because we do want to have strings of talks about Flutter).

There are also tags like keynote or energetic that have a special meaning for the fitness function.

Along with the ~14 sessions, some other preferences are fed into the machine, such as the expected length of the conference (in days), the maximum time to go without food, etc. All in all, it’s some 450 lines of code.

That’s it. Building this project was fun. It probably didn’t save me time in scheduling this particular conference (DartConf is pretty straightforward, scheduling-wise) but it will let me avoid some major frustration with rescheduling. And I’m going to use the code for future conferences.

But—who am I kidding—this was mostly an excuse to play around with my favorite type of algorithm. If you enjoyed this article, I encourage you to find a genetic programming library for your platform-of-choice, and start hacking.

Or, if you’re feeling lazy, at least try the genetic programmer’s favorite past-time and breed some cars.

UPDATE: The first conference scheduled by this algorithm—DartConf—took place on January 22–23, 2018. I’ve written a postmortem.