Moving the Needle: Predicting Unmet Demand in On-Demand Ecosystems

Contributors: Wengen Li, Patrick Leung, Winnie So, Raymond Chan, Michal Szczecinski

Kamil Bujel
GOGOX Technology
Sep 18, 2020 · 9 min read


Hourly Unmet Demand and Driver Idle Ratio (scaled)

You hail yet another taxi, tell the driver the destination and then, without uttering a single word, they drive away without you…🤷🏻‍♂

I’m pretty sure that many of our fellow Hong Kongers can relate to that experience quite well.

In fact, some of them (including some of our fellow GOGOVANers!) have developed what they themselves call a “fear of getting rejected”: they only hail taxis if they really have to.

Hourly Unmet Demand vs Demand (scaled) in one of the regions where we operate

We wish we could say that at GOGOVAN, our clients can be sure that every time they need to move their goods, we will get the job done and complete their order. Unfortunately, that is not yet the case!

However, as per one of our OKRs, we strive to make User Experience delightful. That is why every time we are unable to fully serve a client, we become more committed to achieving this aim. After all, our ultimate goal is to create a highly efficient marketplace, where we can serve our customers fully and in a timely manner.

That is what led us to establish a data team-led Marketplace Optimisation project 🎊. It brings Data, Operations, Product and Customer Service together to achieve perfection in the way we serve our marketplace.

However, we must not forget that our marketplace doesn’t only consist of customers — but also our amazing and supportive drivers that, from the beginning, have shown their commitment to achieving our vision. In return, we aim to provide them with a platform that allows them to multiply their earnings and makes it easier to find orders.

Yet sometimes, we do not achieve these goals.

Hourly Unmet Demand in Hong Kong

It is challenging to ensure that our clients are assigned a driver who will deliver their goods when, at the same time, there may be drivers sitting idle elsewhere, eager to have orders assigned to them.

As a result, we decided that one of our first projects in the newly created theme would be predicting cancelled order requests, or to put it simply: Unmet Demand Prediction.

Why start with this project?

Because we believe that it is one of the building blocks for our future projects: without knowing how many orders will be cancelled, we cannot really work on Dynamic Pricing, Driver Working Region Recommendations, or any other fancy demand-related project we choose.

Of course, we had already had the chance to complete quite extensive research on our ecosystem and learn more about the behaviour and interaction patterns of our users. We had a basic understanding and rule-based methods to predict what type of trips were likely to be cancelled, and when. Having laid the groundwork, we decided to focus on the next steps: gaining a competitive advantage with the help of Machine Learning.

What makes predicting unmet demand hard?

GOGOVAN is a platform that matches users with drivers. The (unmet) demand in our platform is dependent on multiple spatiotemporal features, such as date, time, number of active/busy drivers in regions, traffic, weather, historical demand, marketing campaigns or online activity.

Unmet Demand vs some other chosen features (scaled)
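
To give a concrete (and simplified) picture, assembling such a feature set per region and hour could look roughly like the sketch below. This is only an illustration: the table and column names (orders, drivers, created_at, cancelled, and so on) are hypothetical placeholders, not our actual schema.

```python
import pandas as pd

def build_hourly_features(orders: pd.DataFrame, drivers: pd.DataFrame) -> pd.DataFrame:
    """Aggregate raw events into one row per (region, hour).

    `orders` is assumed to hold one row per order request with `created_at`,
    `region`, `order_id` and a boolean `cancelled` flag; `drivers` is assumed
    to be an hourly snapshot of active drivers per region.
    """
    demand = (
        orders.assign(hour=orders["created_at"].dt.floor("H"))
              .groupby(["region", "hour"])
              .agg(demand=("order_id", "count"),
                   unmet_demand=("cancelled", "sum"))
              .reset_index()
    )
    supply = (
        drivers.groupby(["region", "hour"])
               .agg(active_drivers=("driver_id", "nunique"))
               .reset_index()
    )
    feats = demand.merge(supply, on=["region", "hour"], how="left")
    # Calendar features capture the temporal part of the signal.
    feats["hour_of_day"] = feats["hour"].dt.hour
    feats["day_of_week"] = feats["hour"].dt.dayofweek
    return feats

# feats = build_hourly_features(orders_df, drivers_df)
```

Weather, traffic, marketing and online-activity signals can be joined onto such a frame in much the same way.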

Given the stochastic nature of the problem and its complexity, it is hard to make accurate predictions. However, the closer we are to the truth the better — our main goal is to accurately predict Unmet Demand, and specifically to detect anomalies or extreme values.

Variety of models and evaluation metrics

There are a variety of solutions out there that aim to forecast (unmet) demand, both Statistical (e.g. ARIMA) and Machine Learning (e.g. Linear Regression).

As we found during our Exploratory Data Analysis, our Unmet Demand is strongly correlated with the Unmet Demand and Demand of the previous hour (a correlation coefficient of about +0.7). At the same time, we know from domain expertise and experience that it also depends on other factors, some of which we mentioned earlier.
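
As a rough illustration of that EDA check, and assuming `feats` is the hourly frame from the sketch above, the lagged correlation can be computed per region with pandas:

```python
# Lag features: what happened in the previous hour, computed per region.
feats = feats.sort_values(["region", "hour"])
feats["unmet_demand_prev_hour"] = feats.groupby("region")["unmet_demand"].shift(1)
feats["demand_prev_hour"] = feats.groupby("region")["demand"].shift(1)

# Correlation of Unmet Demand with its own and Demand's previous-hour values
# (both came out at roughly +0.7 in our data).
print(feats["unmet_demand"].corr(feats["unmet_demand_prev_hour"]))
print(feats["unmet_demand"].corr(feats["demand_prev_hour"]))
```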

Consequently, we decided against classical Statistical Methods and focused instead on Machine Learning solutions, which have proven to work better with such a large, multidimensional set of features.

Drivers in Tong Mei, Mong Kok and Sha Tin are struggling to find orders, while 20 minutes away, in Central, many orders get cancelled (mock illustration)

Having decided on our methods and focus, we now need to define our evaluation metric. As our goal is to precisely detect spikes in Unmet Demand and avoid false positives, we realised that our metric should penalise larger errors more heavily. That is why we chose RMSE as our evaluation metric.

We plan to use our model to predict the unmet demand for the following hour, for each region separately, as that gives us enough time to act on the prediction in our ecosystem.
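
To make that setup concrete, here is a minimal sketch of the one-hour-ahead target and the RMSE metric, again building on the hypothetical `feats` frame:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

# Target: Unmet Demand in the *next* hour, built separately for each region.
feats["target_next_hour"] = feats.groupby("region")["unmet_demand"].shift(-1)

def rmse(y_true, y_pred):
    # RMSE penalises large errors more heavily than MAE, which suits a goal
    # of catching spikes rather than getting the average hour exactly right.
    return np.sqrt(mean_squared_error(y_true, y_pred))
```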

Features

Our feature correlation heatmap. We used it to drop features that are highly correlated in order to avoid overfitting.

For training, we decided to use a year’s worth of GOGOVAN historical trip data.

Why such a long period?

Mainly due to seasonality. As every on-demand platform provider does, we experience peaks and troughs in demand throughout the year (to learn more, read this great piece on seasonality).

We also test our models on two months of data.

Avg Unmet Demand in HK. The gaps in the plot show the dates which we removed from the training dataset.

During the EDA, we removed some dates that were missing too many features or exhibited anomalous Unmet Demand (due to numerous factors, such as our backend being down, schema changes, replica lag or external API calls failing).

Of course, this is not an ideal way of dealing with such issues, but given that most of them have recently been resolved by our switch to a streaming architecture, we decided that, in many cases, filling these nulls would unnecessarily skew the results and entail too much extra work for little benefit.
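
For illustration, the clean-up and the chronological train/test split could look roughly like this; the flagged dates below are placeholders, not our real incident log:

```python
# Hypothetical dates flagged during EDA (backend outages, schema changes, etc.).
bad_dates = pd.to_datetime(["2019-10-01", "2019-12-24"])

clean = feats[~feats["hour"].dt.normalize().isin(bad_dates)]
# Rather than imputing, drop the remaining rows with missing lags or target.
clean = clean.dropna()

# Chronological split: roughly a year for training, the last two months for testing.
cutoff = clean["hour"].max() - pd.DateOffset(months=2)
train = clean[clean["hour"] <= cutoff]
test = clean[clean["hour"] > cutoff]
```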

In the beginning, we gathered a list of more than 80 features that could possibly have an impact on Unmet Demand. Given that some of them were highly correlated, which increases the risk of overfitting, we decided we first needed to settle on the optimal number of features and then select them.
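
One common way to prune such pairs, in the spirit of the heatmap above, is to drop one feature from every pair whose absolute correlation exceeds a threshold; a rough sketch (the 0.9 cut-off is illustrative):

```python
import numpy as np

candidate_cols = [c for c in train.columns
                  if c not in ("region", "hour", "target_next_hour")]
corr = train[candidate_cols].corr().abs()

# Look at the upper triangle only, so each pair is considered once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
feature_cols = [c for c in candidate_cols if c not in to_drop]
```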

Recursive Feature Elimination

Results of RFE

First, we set out to determine the number of features we should use: for that, we ran Recursive Feature Elimination, a method in which the least important features are dropped at each iteration and an optimal number is determined.

We found that the optimal number of features for us to use is about 20–30. That should ensure we avoid the common traps and risks that come with using too many features, while focusing on the most important signals.
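
In scikit-learn terms, this step can be sketched with RFECV; the estimator, step size and cross-validation scheme below are illustrative rather than our exact production settings:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV
from sklearn.model_selection import TimeSeriesSplit

X_train, y_train = train[feature_cols], train["target_next_hour"]

# RFECV drops the least important features at each step and cross-validates
# every intermediate feature count; TimeSeriesSplit respects temporal order.
rfe = RFECV(
    estimator=RandomForestRegressor(n_estimators=100, random_state=42),
    step=1,
    cv=TimeSeriesSplit(n_splits=3),
    scoring="neg_root_mean_squared_error",
)
rfe.fit(X_train, y_train)

print(rfe.n_features_)  # the optimum came out at around 20-30 features for us
selected = [c for c, keep in zip(feature_cols, rfe.support_) if keep]
```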

Ridge & Lasso as feature selectors

We also used the classical feature selection models: Ridge and Lasso Regressors. With their help, we settled on our final set of features.
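
For illustration, Lasso can be used as such a selector roughly as follows (Ridge works analogously, with small coefficients pruned by a threshold instead of exact zeros); the alpha value here is just an example:

```python
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Lasso shrinks the coefficients of uninformative features to exactly zero,
# which makes it a convenient selector. Scaling keeps coefficients comparable.
selector = make_pipeline(StandardScaler(), Lasso(alpha=0.01))
selector.fit(X_train[selected], y_train)

coefs = selector.named_steps["lasso"].coef_
final_features = [c for c, w in zip(selected, coefs) if abs(w) > 0]
```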

Model Selection

Having established which features we will use, it’s time to settle on a model.

Baselines

In order to have a reliable point of comparison for RMSE, we decided to use a few naive solutions as baselines (a sketch of how we score them follows the list):

  • UD = 0: RMSE=1.46

As we saw before, Unmet Demand is often simply 0 in some regions, so we decided that UD = 0 is a viable heuristic.

  • UD = AVG(UD): RMSE=1.33

In this case, we always assume that Unmet Demand will be equal to the historical average for each region. As we can see, this already performs better than the first baseline.

  • UD = MOVING_3_HOUR(UD): RMSE=1.35

Lastly, we decided to include a 3-hour moving average for each region as the third baseline. This time, it performs slightly worse than a simple average.
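
For reference, the three baselines can be scored against the held-out set roughly as follows, reusing the rmse helper from earlier:

```python
import numpy as np

y_test = test["target_next_hour"]

# Baseline 1: always predict zero Unmet Demand.
print(rmse(y_test, np.zeros(len(y_test))))

# Baseline 2: predict each region's historical average Unmet Demand.
region_avg = train.groupby("region")["unmet_demand"].mean()
print(rmse(y_test, test["region"].map(region_avg)))

# Baseline 3: predict the trailing 3-hour moving average per region.
moving_avg = (
    test.groupby("region")["unmet_demand"]
        .transform(lambda s: s.rolling(3, min_periods=1).mean())
)
print(rmse(y_test, moving_avg))
```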

Linear Regression

RMSE = 1.28

We see an improvement over our baselines, which is already good. However, looking at the graph below, this is not exactly what we are looking for:

We do not give up and continue looking.

Stochastic Gradient Descent

RMSE = 1.20

Great, that looks like another improvement, at least on paper. However, let’s see what the predictions actually look like:

Well, this is still not quite what we’re looking for… The good news is that the model seems to have picked up the general trend and the spikes; however, it simply lacks the momentum.

Let’s keep on looking…

Random Forests Regression

RMSE = 0.97

Wow, this is a huge improvement compared to the previous gains. However, given our previous experiences, let’s not get our hopes up until we see the prediction graph:

Spot on! 😍

As we can see, this model not only performed the best, but it also achieved what we aimed for: it picked up the momentum and distinguished between spikes of varying magnitudes.

What is also important is that there are barely any false positives: we want to avoid situations where we predict high unmet demand (which causes us to take action to tackle it) while, in fact, there is no unmet demand at all.
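
Putting the three candidates side by side, the comparison boils down to something like the sketch below; the hyperparameters shown are illustrative, not the ones we tuned for production:

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_tr, y_tr = train[final_features], train["target_next_hour"]
X_te, y_te = test[final_features], test["target_next_hour"]

candidates = {
    "linear_regression": LinearRegression(),
    # SGD is scale-sensitive, so its inputs are standardised first.
    "sgd": make_pipeline(StandardScaler(), SGDRegressor(max_iter=2000, random_state=42)),
    "random_forest": RandomForestRegressor(n_estimators=300, random_state=42),
}

for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    print(name, rmse(y_te, model.predict(X_te)))
```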

How do we use it?

Our Realtime Marketplace Monitoring tool — Ops Data Brain (randomised data)

These Unmet Demand predictions are definitely not going to waste.

Our Operations and Customer Service teams have these predictions readily available in our Realtime Marketplace Tracking Tool, Ops Data Brain. It gives them the opportunity to act on these predictions by, for example, notifying our drivers, asking clients to place orders 1 or 2 hours later, or encouraging them to use another of our products, GOGODELIVERY.

Unmet Demand Prediction Sparklines in Ops Data Brain (randomised)

That way, we are able to react quickly whenever there is a possibility of high unmet demand and keep our completion ratio high. With time, the actions this algorithm suggests could also be easily automated, removing the need for human input altogether. Initially, however, our goal is to verify it in a “semi-automated” setting, where its predictions first go through a sanity check by a member of our Ops/CS teams before we take the recommended actions, just to avoid pitfalls like those.

We will soon also write more on Ops Data Brain, its architecture and components — stay tuned!

Summary

In this article, we have delved into Unmet Demand at GOGOVAN, its root causes and the methods we have used to predict it. We have demonstrated our process of feature and model selection, underlining the most important goal of the model: the ability to accurately predict spikes in Unmet Demand.

Our final Random Forest-based model performs up to the standard we currently require and is running in production as we speak (write, in fact).

We are fully aware that there are better methods out there, especially based on Recurrent Neural Networks. However, being pragmatic, we decided against using them for now, as the model’s current results are more than sufficient.

However, we cannot wait to explore all these other interesting techniques in time, especially using Reinforcement Learning to balance our supply and demand.

An article on Ops Data Brain is coming soon!

If you are interested in another research work we have done, please read our article on Route Optimisation here.

If you want to find out more about our Data Team, please see our Head of Data’s article here.

We are always looking for top-notch Applied Operations and ML Research talent. Please get in touch if interested! (Onsite and remote.)
