Highlights from the Global Optimization Challenge

Published in

JD Technology Blog

8 min read · Nov 8, 2018

The Global Optimization Challenge (GOC) is an open data challenge that JD.com hosted from June to October 2018. Participating teams competed directly on the JDATA platform, where their solutions and algorithms were evaluated and ranked in real time. Some 847 teams and 3,500+ contestants participated in this very first GOC, competing for a share of the $250,000 total prize pool. In this post, we want to share the joys and lessons of the GOC and discuss how the GOC problems relate to the real business at JD.

Among the 3,500+ contestants who participated, 76% were undergraduates, 18% were graduate students, and 6% were others. The participating teams came from more than 600 universities and organizations. Most of the teams (86%) were from China, with the rest hailing from countries including the U.S., Singapore, Germany, Britain, Australia, Canada, and Japan. The average team size was about four people.

GOC Problems

Unlike many data challenges that focus on prediction problems, the GOC required teams to provide solutions to specific problems at JD. We selected the Urban Truck Routing and Scheduling problem and the Two-Stage Inventory Control problem. The Urban Truck Routing and Scheduling problem is a difficult Vehicle Routing Problem (VRP) that considers transportation capacity, demand stochasticity, pick-up and delivery time windows, a heterogeneous fleet, and electric recharging stations (Figure 2).

Figure 2: The Urban Truck Routing and Scheduling Problem

In the Two-Stage Inventory Control problem, the teams are asked to design an inventory replenishment and allocation policy that determines how many units of each product to store at a regional distribution center and at local warehouses on a daily basis under certain operational constraints (Figure 3).

Figure 3: The Two-Stage Inventory Control Problem

Urban Truck Routing and Scheduling Problem

This problem is a large-scale NP-hard problem with constraints including 1) vehicle load capacity, 2) charging capacity for electric vehicles, 3) service time for customers, 4) the number of vehicles available, and 5) other business constraints. The teams needed to consider more than 1500 nodes (Figure 4) and calculate a feasible solution within 5 minutes on a super PC.

The VRP is not only a classical challenge in academia but also an extremely important real-life business problem. Because there are so many combinatorial choices for the routes, solving a VRP with just 100 nodes by pure enumeration would require the processing power of all the world’s computers running for many years. Brute-force enumeration is simply not a practical option, whatever the hardware. Luckily, thanks to modern optimization methods that build on foundations such as the simplex method George Dantzig developed in 1947, medium-sized VRPs can be solved relatively efficiently on PCs. However, solving a large-scale VRP to optimality is still too slow even with state-of-the-art commercial solvers, because businesses often require answers within minutes or seconds. The challenge facing the GOC teams was to design heuristic algorithms that can solve the large-scale VRP efficiently in practice.
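To make the scale concrete, the sketch below counts the visiting orders for a single vehicle (n! for n customers) and shows one of the simplest construction heuristics, a capacity-aware nearest-neighbor rule. It is only an illustration of what "heuristic" means here, not any team's method: the function names, the dictionary-based inputs, and the straight-line distances are all assumptions, and time windows and recharging are ignored.

```python
import math


def count_visit_orders(n_customers: int) -> int:
    """Number of orders in which one vehicle can visit n customers (n!)."""
    return math.factorial(n_customers)


def nearest_neighbor_routes(depot, locations, demand, capacity):
    """Greedy construction heuristic for a capacitated VRP (illustrative only).

    depot: (x, y) of the depot.
    locations: {customer: (x, y)}; demand: {customer: units}.
    capacity: units one vehicle can carry (assumed identical for all vehicles).
    """
    if any(units > capacity for units in demand.values()):
        raise ValueError("a single customer's demand exceeds vehicle capacity")
    unvisited = set(locations)
    routes = []
    while unvisited:
        route, load, position = [], 0, depot
        while True:
            # Customers whose demand still fits in the remaining capacity.
            feasible = [c for c in sorted(unvisited) if load + demand[c] <= capacity]
            if not feasible:
                break
            # Move to the closest feasible customer (Euclidean distance).
            nearest = min(feasible, key=lambda c: math.dist(position, locations[c]))
            route.append(nearest)
            load += demand[nearest]
            position = locations[nearest]
            unvisited.remove(nearest)
        routes.append(route)
    return routes


if __name__ == "__main__":
    # 100! has 158 digits, far beyond anything that can be enumerated.
    print(len(str(count_visit_orders(100))))
    print(nearest_neighbor_routes(
        depot=(0, 0),
        locations={"a": (1, 0), "b": (2, 0), "c": (0, 5)},
        demand={"a": 2, "b": 2, "c": 3},
        capacity=4,
    ))
```

A greedy rule like this runs in milliseconds but can be far from optimal; competitive approaches typically start from such a construction and then improve it with local search.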

Figure 4: Distribution Map for the Urban Truck Routing and Scheduling Problem

Two-Stage Inventory Control Problem

Here we will discuss the two-stage inventory control problem in detail for those interested in diving deeper.

For each product, the first stage decision is called inventory replenishment. The inventory replenishment decision is to determine how many units of the product to order from the supplier. Ordering too many will result in excess inventory and ordering too few will increase the risk of not having enough stock to fulfill customer demand.

The second stage is called inventory allocation. The goal is to determine how many units of the product to transship from the regional distribution center to local warehouses. This process is strategically important because the in-stock rates at the distribution center/warehouses are directly related to JD’s 211 delivery promise[1], a key commitment the company maintains in order to continue offering the best online shopping experience to its customers.

The key to the inventory allocation decision is to avoid stockouts at both the regional distribution center and each of the local warehouses by adjusting the inventory levels. The challenges lie in 1) accurately predicting the customer demand for each location, 2) setting priorities among the products when allocating them under tight transportation capacity limits, and 3) managing the trade-off between proactive allocation (storing more inventory at the local warehouses in advance) and reactive allocation (storing more inventory at the regional distribution center to provide risk pooling). The inventory allocation decision is a crucial business decision JD faces every day, and we plan a future post on how to solve this problem in the real business environment at JD.com.

Figure 5: Major Challenges in the Inventory Allocation Decision

To help the teams approach the inventory allocation problem progressively, the competition was divided into two rounds. The first round consisted of two tracks, Forecast and Inventory, and teams could choose either one. The Forecast Track asked the teams to make quantile demand predictions for 1,000 products based on the previous two years of sales data. The Inventory Track asked the teams to design inventory allocation policies given known product demand distributions. The performance of an inventory allocation policy was measured by its total cost, the sum of the holding, stockout, and transportation costs. The second round essentially combined the two tracks from the first round: it asked the teams to design both the inventory replenishment and allocation policies directly from historical sales data, further challenging them to integrate machine learning and optimization techniques smoothly to deal with complex real business problems. We allowed teams to merge between the first and second rounds so contestants could share knowledge and learn from each other.
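As a rough illustration of a total-cost measure built from holding, stockout, and transportation costs, here is a minimal one-day sketch. The exact scoring formula used in the competition is not given in this post, so the per-unit cost rates, the function name, and the dictionary layout below are all assumptions.

```python
def one_day_total_cost(inventory, demand, shipped,
                       holding_rate=1.0, stockout_rate=10.0, transport_rate=0.5):
    """Sum of holding, stockout, and transportation costs for one day.

    inventory: {location: units on hand at the start of the day}
    demand:    {location: units requested during the day}
    shipped:   {warehouse: units sent from the regional distribution center}
    The three rates are hypothetical per-unit costs, not competition values.
    """
    holding = 0.0
    stockout = 0.0
    for location, on_hand in inventory.items():
        requested = demand.get(location, 0)
        # Units left over at the end of the day incur a holding cost.
        holding += holding_rate * max(on_hand - requested, 0)
        # Units requested but not available incur a stockout cost.
        stockout += stockout_rate * max(requested - on_hand, 0)
    transport = transport_rate * sum(shipped.values())
    return holding + stockout + transport


if __name__ == "__main__":
    cost = one_day_total_cost(
        inventory={"rdc": 10, "w1": 5},
        demand={"rdc": 4, "w1": 8},
        shipped={"w1": 3},
    )
    print(cost)
```

With a stockout rate well above the holding rate, as assumed here, a policy is pushed toward holding some extra stock, which is why the demand quantiles matter more than the average demand.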

Figure 6: Two-Round Design of the Two-Stage Inventory Control Problem

Winners’ Tricks

Big congrats to the winning teams at the first GOC! The top 20 teams presented their solutions at JD headquarters in Beijing on October 16, 2018 (Figure 7).

Figure 7: Final Presentations in Beijing. Picture 1: winning team representatives with judges and JD business leaders; Picture 2: first-place team TP_AI presents their solution to the Two-Stage Inventory Control Problem; Picture 3: first-place team NJUSME presents their solution to the Urban Truck Routing and Scheduling Problem

Among the questions we asked the winning teams were: What is the key to your solution? What was your team’s most innovative idea? What was the most important decision you made in the competition? We selected some common elements from the answers on the Two-Stage Inventory Control problem and highlight them below:

Trick 1: Estimate the quantiles, not the mean.

“Customer demand is uncertain. We found that for the inventory control problem, it is crucial to get a good quantile estimation of demand from historical sales. In order to get better-quality estimates of the quantiles, we used an ensemble method to incorporate the quantiles of the historical sales directly into the master model. This greatly improved the performance of our algorithm.”

— Team SCUTSF, 3rd place
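To see why a quantile can be a better target than the mean, the sketch below scores forecasts with the pinball (quantile) loss, a standard way to evaluate quantile predictions. This is not Team SCUTSF's ensemble; the helper names and the toy sales history are assumptions made for illustration.

```python
import math


def pinball_loss(actual, predicted, q):
    """Average pinball (quantile) loss of predictions for quantile level q."""
    total = 0.0
    for y, y_hat in zip(actual, predicted):
        diff = y - y_hat
        # Under-forecasts are weighted by q, over-forecasts by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(actual)


def empirical_quantile(history, q):
    """Smallest observed value with at least a fraction q of the data at or below it."""
    ordered = sorted(history)
    index = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[index]


if __name__ == "__main__":
    history = [0, 1, 2, 3, 10]          # toy daily sales with one spike
    mean_forecast = sum(history) / len(history)
    q90_forecast = empirical_quantile(history, 0.9)
    n = len(history)
    print(pinball_loss(history, [mean_forecast] * n, 0.9))
    print(pinball_loss(history, [q90_forecast] * n, 0.9))
```

On this toy history the 90% quantile forecast scores a lower pinball loss than the mean forecast, because the loss penalizes running short more heavily than holding extra stock.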

Trick 2: For demand forecast, select the right features.

“We created a few important features for the demand forecast based on our analysis of the historical data. For example:

Weighted average sales from previous x days
Number of days with positive sales in the previous x days
Average weekly sales in the previous 4 weeks
Average sales from the days having promotions in the previous x days
Average sales from the days having no promotions in the previous x days
Applying logarithm operator to the sales data in training

We found those features greatly improved the accuracy of the forecast.”

— Team FLY, 4th place
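The features listed above could be computed along these lines. This is a minimal sketch rather than Team FLY's code: the function names, the linearly increasing weights in the weighted average, and the use of log(1 + sales) to cope with zero-sales days are assumptions, and the history is assumed to cover at least 28 days.

```python
import math


def _mean(values):
    """Average of a list, or 0.0 when the list is empty."""
    return sum(values) / len(values) if values else 0.0


def build_features(sales, promo, x=7):
    """Demand features from daily sales and promotion flags (oldest day first)."""
    recent_sales = sales[-x:]
    recent_promo = promo[-x:]
    # Assumed weighting: the most recent day gets the largest weight.
    weights = range(1, len(recent_sales) + 1)
    weighted_avg = sum(w * s for w, s in zip(weights, recent_sales)) / sum(weights)
    return {
        "weighted_avg_sales": weighted_avg,
        "days_with_sales": sum(1 for s in recent_sales if s > 0),
        "avg_weekly_sales_4w": sum(sales[-28:]) / 4,
        "avg_promo_sales": _mean(
            [s for s, p in zip(recent_sales, recent_promo) if p]),
        "avg_non_promo_sales": _mean(
            [s for s, p in zip(recent_sales, recent_promo) if not p]),
    }


def log_transform(sales):
    """Log-scale the training target; log1p keeps zero-sales days finite."""
    return [math.log1p(s) for s in sales]


if __name__ == "__main__":
    sales = [0] * 21 + [1, 0, 2, 3, 0, 4, 5]
    promo = [False] * 26 + [True, True]
    print(build_features(sales, promo, x=7))
    print(log_transform(sales[-3:]))
```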

Trick 3: For inventory allocation, be careful how you use the transportation capacity.

“We decompose the complex optimization problem into three sequential decision steps in the inventory allocation decision. We first determine the amount of inventory that needs to be reserved at the RDC [regional distribution center], then determine what products to allocate to each FDC [local warehouse] under the capacity constraints, and finally determine the amount of allocation for each product and to each FDC. We found the sequence of making those decisions is as important as improving the performance of each step.”

— TP_AI, 1st place
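The quote does not spell out how each step was solved, so the following is only a hypothetical sketch of the three-step sequence: reserve stock at the RDC, rank products per FDC, then set quantities under a capacity limit. The ranking rule (largest shortfall first), the unit-based capacity, and every name in the code are assumptions, not TP_AI's method.

```python
def allocate(rdc_stock, rdc_reserve, fdc_stock, fdc_target, capacity):
    """Three-step allocation sketch (illustrative, not the winning policy).

    rdc_stock, rdc_reserve: {product: units}
    fdc_stock, fdc_target:  {fdc: {product: units}}
    capacity:               {fdc: max units that can be shipped to it}
    """
    # Step 1: hold back the reserve at the RDC; only the rest can be shipped.
    available = {p: max(units - rdc_reserve.get(p, 0), 0)
                 for p, units in rdc_stock.items()}
    plan = {}
    for fdc in sorted(fdc_target):
        # Step 2: choose which products to send, largest shortfall first.
        shortfall = {p: max(target - fdc_stock[fdc].get(p, 0), 0)
                     for p, target in fdc_target[fdc].items()}
        ranked = sorted((p for p in shortfall if shortfall[p] > 0),
                        key=lambda p: (-shortfall[p], p))
        # Step 3: set quantities, limited by shortfall, RDC stock, and capacity.
        remaining = capacity[fdc]
        plan[fdc] = {}
        for product in ranked:
            qty = min(shortfall[product], available.get(product, 0), remaining)
            if qty > 0:
                plan[fdc][product] = qty
                available[product] -= qty
                remaining -= qty
    return plan


if __name__ == "__main__":
    print(allocate(
        rdc_stock={"A": 10, "B": 5},
        rdc_reserve={"A": 4, "B": 0},
        fdc_stock={"f1": {"A": 0, "B": 1}},
        fdc_target={"f1": {"A": 5, "B": 3}},
        capacity={"f1": 6},
    ))
```

Changing the order of the steps, for example filling FDCs before fixing the RDC reserve, would generally produce a different plan, which is the point the team makes about sequencing.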

Trick 4: Use online test chances wisely.

“When tuning model parameters, we split the given data into training and validation sets and trained the model offline. However, we often found that the ideas that worked well in the offline training could not improve the result in the online test, which each team could use only twice a day. We conjectured that the demand pattern of the dataset in the online test may be very different from our training sample in the offline test. We learnt to carefully design the algorithm so that we can utilize the online test as much as possible, and to use the offline test as a supporting process.”

— LEAP, 2nd place

Trick 5: Improve on the weak point, don’t get stuck.

“The biggest lesson for us is that working in the right direction is way more important than working hard. We spent most of our time designing inventory models. But after we tried more than ten inventory control policies, we realized that there was little room for further improvement. Before the last week of the competition, we were not in the top 10. In the final week, we decided to focus on improving the demand forecast. Every day we worked in the morning and waited until midnight, when the last submitted solution was evaluated. We made continuous improvements by testing new ideas every day, including using an SVR model to replace the LSTM model. Turning our focus to improving the forecast was the most important decision we made because it allowed us to make a big improvement in the last week!”

— Wrongmans, 5th place

We hope you have a better idea about the GOC and how these optimization and machine learning techniques can be applied to real world problems at JD. If you are interested in solving these problems yourself, we look forward to having you in the next GOC. And since JD.com’s data scientists and software engineers work every day on equally exciting challenges, please stay tuned for more posts!

We are open to all kinds of academic research collaborations. Please contact jd_tech_blog@jd.com if you are interested.

Written by JD Smart Supply Chain