Nexu’s approach to the task scheduling problem

Published in

#NexuDev

4 min readJan 12, 2018

I can remember one of my teachers talking about how a CPU prioritizes internal tasks and executes them accordingly. The formula to obtain each task’s priority depends on the first value assigned to it and the time passed since its arrival. It seemed obvious to me how it worked until we had to reproduce this situation at Nexu: which customer should we serve next?

Different clients at different times

Nexu’s day to day operation relies on giving customers follow-up until they buy their car through our service. Some key obstacles have arised:

We can’t follow-up every single customer; there are more arrivals each day than there can be dispatches.
Clients can be somewhat categorized. Each category has different probabilities of closing a transaction.
We’ve found that the faster a response is given to arriving customers, the higher the chances of them actually buying their car.

Some clients receive follow-up but do not buy their car. Analyzing our data and detecting patterns we are able to create categories where we can group similar clients. We assign each client’s category an initial score depending on their probability of actually closing the transaction. Lower score = higher priority.

It is intuitive that if two clients with the same initial score arrive at different moments, the one who arrived first has a higher priority. Likewise, if two clients arrive at the same time, the one in a better category has priority. So far so good.

The real problem happens when two clients belonging to different categories arrive at different moments in time. Lets say category A has a higher initial score than category B. We have Alice in category A and Bob in category B. If Alice arrives before Bob, she will be served first. But who should be served first if Bob arrives before Alice?

Bob because he arrived earlier? Or…

Alice because she is in category A?

We need a way of transforming time into score to make that decision. This way we can add Alice’s and Bob’s initial score with their respective value of time and serve the one with lower current score.

Transforming time into score.

The current score for each client can be seen as InitialScore + f(TimeWaiting), where we must play with some function f to get the expected result. This function could be lineal, but there was a problem with it.

The order in which we would serve each client, following a linear function of time, would depend only on the initial score and the time of arrival!

If A and B arrive at the same time the order is preserved.

Arriving at different points in time the order is still preserved.

The realistic way of creating this tranformation of time into score would be by a non-linear equation. And it makes sense: the first minutes of a client’s time are crucial, after that there is no big difference between waiting 40 or 50 minutes, but there is a big difference again between waiting one or two days. Our transformation of time into score would now look something like this:

There are different crucial points in time.

How computationally complex is this approach? We can create a set for our waiting clients, ordered by score (Sets have unique members and can be ordered by some value). Because it is ordered we always take the first one in it to attend next. But the key concept of all this story is this:

If the conversion of time into score is not linear, the order of the set is not maintained through time.

In an ordered set we calculate the position of new items at arrival. Here, we must calculate every client’s score when we have resources to serve a new one. Let’s see some graphic examples.

Different clients arriving at the same time. Order is preserved: A is always on top.

Clients with same category and different arrivals. Order is preserved throughout time: B is always on top.

Order is **not** preserved: sometimes A is on top, sometimes B.

This changed our solution. Calculating each client’s score at arrival would be waste of resources. We must continuously check every client’s score in the set and take the one with the highest priority.

Final thoughts

We solved the problem, it was computationally more complex than what we initially thought. Now that I’ve deeply understood our solution and why its neccessary, I ask myself:

How does task scheduling really work on a computer?
Is a linear approximation sufficient?

I’ve found this problem and it’s solution quite interesting. I hope that other important programs consider it as well and, if they haven’t, that they find this document usefull.

Nexu’s approach to the task scheduling problem

Different clients at different times

Transforming time into score.

Final thoughts

Written by Emmanuel Byrd