Open sourcing Beeline: How we do our routing?

Open Government Products
Data.gov.sg Blog
Published in
7 min readDec 15, 2017

Engineering: Daniel Sim | Copywriting: Terri Teo

Beeline is Singapore’s first crowd-sourced transportation platform. Introduced to the public in 2015, the platform helps to band commuters together through their suggestions, and establish transport routes which can then be serviced by private transport operators.

If you’re wondering why Beeline takes the route it takes, or maybe why it’s not closer to your daily appointments, read on to find out more.

What is Beeline’s routing algorithm used for?

If you visit Beeline.sg’s suggest page today, you can have a look at our routing algorithm at work.

Beeline’s algorithm aims at making it easier for people to propose routes that can serve other people with suggestions similar to theirs. This is a critical part of Beeline’s value proposition, as it allows bus operators to serve a larger number of customers with the same bus fleet, and price their services lower through economies of scale. Simply put, more customers correspond to lower prices. After all, Beeline’s goals are to make rides not only adaptive but affordable.

Source: Beeline

Our approach to routing

What Beeline aims to do is classified in computer science literature into a category of optimisation problems called the travelling salesman problem:

“Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?”

More specifically, the problem that Beeline solves is a variant of TSP but with additional constraints:

  • Beeline needs to consider the order of pick-up and drop-off points for every passenger, whereas TSP doesn’t need to account for the order (you can’t drop-off someone which you’ve not picked up).
  • Beeline cannot do doorstep to doorstep pick-up/drop-offs. Detours take time, and this problem is compounded since Beeline caters to multiple passengers per trip.

It’s a hard problem to solve optimally, even without considering the effects of factors on the ground like traffic congestion.

We decided to constrain the problem by:

  • Limiting the number of possible start and end points
  • Ignoring optimality, and using algorithms that are empirically good enough

Let’s look at the two parts separately:

Limiting the number of possible start and end points

Routing is a computationally expensive task. This is why a subscription to Google’s Directions API (under the Standard plan) still limits you to 100,000 queries/day.

The problem with routing on a large scale is that we typically need to compute the travel time from every suggested pick-up/drop-off point to every other suggested pick-up/drop-off point (i.e. compute the distance matrix).

Once we have computed the distance matrix, finding the fastest route is easy (easier). However computing a full distance matrix is computationally intensive. Instead, we pre-compute a distance matrix for around 5,000 known bus stops.

At Beeline, we’ve received about 50,000 suggestions to date, and that translates to 2.5 billion suggestion pairs (70 years on the Standard Plan of the Directions API!). And if we had to pick people up from every suggested stop and home, that would probably translate to an unacceptably high number of stops, twists and turns on the route which would adversely affect travel time.

Therefore, unlike uberPOOL or GrabShare, customers can’t choose their pick up and drop off points to their utmost convenience (like X, Y or Z). It’s simply more productive with prearranged pickup points (like A and B) — usually a bus stop², though it could also be the lobby of an office for example.

Suggestions are not served to the doorstep, but at the most efficient locations to route them by

This reduces the problem of 2.5 billion pairs to under 2.5 million pairs³. While still expensive, building the distance matrix now only takes slightly less than under a day. We perform travel time calculations using the open-source GraphHopper library and OpenStreetMap data.

That done, we now only have to snap each person to a bus stop near his home or office, and perform lightning fast travel time computations by looking up our distance matrix.

Ignore optimality, and use algorithms that are empirically good enough

For the routing algorithm itself, we were inspired by the ruin-recreate principle.

In the ruin-recreate principle, we start with a set of non-optimal bus routes, and then gradually achieve more and more optimal bus routes by, for example, randomly removing 40% of stops from existing routes (ruin), and then slotting the affected passengers into the remaining bus routes (recreate).

Passengers are inserted into routes using various strategies. The open-source project Jsprit implements a number of strategies (e.g. regret insertion and best insertion).

However, the ruin-recreate principle is usually intended for global routing problems. For example, if we could cancel all existing public bus routes and recreate bus routes all over again, we would use this algorithm. Or, if we wanted to deploy 1000 buses to try and serve all the 50,000 suggestions tomorrow, at huge cost and risk to the bus operators, we would use this algorithm too.

But we don’t do that. Instead, we need a more incremental approach. We abandon the ruin part of ruin-recreate, and focus only on the recreate strategy which is equivalent to Jsprit’s best insertion strategy:

  1. Start with a seed suggestion (an initial OD pair that we will serve first)
  2. Serve the seed suggestion
  3. Shuffle all other route suggestions
  4. Try to serve the next suggestion (at any stop within 500m of his home / office) if it does not result in an unacceptably long detour. Otherwise skip ahead to the next suggestion

We repeat the algorithm a couple of times to achieve some diversity in the routes. Once done, we filter out routes that are excessively similar and present you the results.

Results

We have used some of these autogenerated routes to plan our first tranche of crowdstart routes in February this year. Unfortunately, we have also been caught up with other operational issues until now and haven’t been able to work on this algorithm as much as we would like. If you have some time, why don’t you revisit your suggestions (note: you need to “verify your email” first in order to see your past suggestions), see what routes the algorithm has proposed for you, and tell us what you think of them? If any of your proposed routes have 30 people or so, tell us about it and we can create a new crowdstart for you!

Update - 20 Dec 2017: (edited “Our approach to routing” to define the problem clearer)

There was a question about why didn’t we use existing open source tools (e.g. Google’s Optimization Tools), and instead decided to build our own to solve this Vehicle Routing Problem (VRP). The problem that Beeline seeks to solve, though seemingly similar to the problems that other VRP solvers are handling, differs as follows:

Existing Vehicle Routing Problem (VRP) Solvers

  • Generally limited to route creation involving doorstep to doorstep pick-up/drop-off

Beeline’s Routing Algorithm

  • Goes further to select the best out of multiple public bus stops (near one’s doorstep).
Left: Solution provided by typical VRP solver; Right: Solution provided by Beeline

If you have any suggestions on how we can improve Beeline, leave us a comment below!

Caveats

  1. Some bus stops are restricted during peak hours so we apologise in advance if we cannot accede to certain requests.
  2. The routing algorithm is not ready to handle a large number of concurrent requests, so please be patient if it takes a while to generate a new route

Footnotes

  1. The travelling salesman problem (TSP) asks, “what is the shortest route a salesman should take to serve all his customers.” The mTSP asks, “What is the best way for a group of salesmen to serve their customers.” Whether the “best” means “fewest number of salesmen”, “shortest distance by any salesman”, “shortest average distance” etc. is entirely up to you.
  2. For the purposes of open sourcing, the algorithm makes use of public bus stops.
  3. There are around 5,000 bus stops in Singapore, so there’ll be approximately 2.5 million travel times between bus stop pairs that needs to be calculated.
  4. “Best” being the most convenient, common bus stop among numerous users with different routes in the area

Icons made by Freepik from www.flaticon.com is licensed by CC 3.0 BY

--

--

Open Government Products
Data.gov.sg Blog

We are Open Government Products, an experimental division of the Government Technology Agency of Singapore. We build technology for the public good.