Estimating the optimum “Estimated Time of Arrival”

Published in Tata 1mg Technology, Aug 21, 2020

When visiting an e-commerce site, people have come to expect to know the day and, if possible, the time at which their purchased items will arrive. This estimated time of arrival (ETA), which is generally promised to the customer at the time of purchase, makes a huge difference to the brand-customer relationship, which is the new currency in the system. At 1mg, India’s most trusted consumer health platform, we process a large number of pharmacy orders, lab test requests and doctor consultation chats every day. For each order, our aim remains to set the right expectations on delivery time, so as to reduce anxiety and ensure a delightful order fulfilment experience for the customer.

The Challenge:

An ETA is an estimate of the actual turn-around time (TAT) for an order. It drives the first impression of the efficiency of order fulfilment in a company. There are many challenges in predicting a reasonable TAT for an order:

First and foremost is knowing what you have in store and what/how much you need to procure to fulfil the majority of orders without suffering from overstocking/understocking. Predicting the TAT for this procurement process followed by packaging of the order is an important portion of the complete TAT prediction.
Second is having a well-oiled operations and logistics unit. However, India is a vast country, and anyone promising to deliver to every nook and corner of it has to deal with a huge variance in the availability of facilities and services. Also, the variable operational efficiency of the warehouses and logistics units at different times of the year (festivals/heavy rains/pandemic) leads to massive uncertainty in estimating an accurate arrival time for every order.

The Journey:

To take a first crack at predicting the ETA for an order, we started by implementing a heuristic model which took into account:

The displacement between the vendor and the customer city
The geographical state in which the delivery address lies
The promised SLAs by the logistics partner and
Whether the order started getting packed before or after the daily cutoff (2 PM).

Using a set of conditions on these variables, we communicated a range of days as the ETA for a customer’s order. To further reduce the chances of an ETA being breached, 1–2 days were added so that the ETA erred on the safer side. Our ETAs were not great, but nevertheless our breach rate was less than 5%, which was satisfactory for us.
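The exact rule set is not listed in this article, so the Python sketch below only illustrates the shape of such a heuristic. Every threshold, state name and function name in it is a hypothetical placeholder and not the production rule set.

```python
from datetime import datetime
from typing import Tuple

# All numbers below are illustrative assumptions, not 1mg's actual rules.
CUTOFF_HOUR = 14          # the 2 PM packing cutoff
SAFETY_BUFFER_DAYS = 1    # padding added to keep the breach rate low
BUCKET_SIZE_DAYS = 2      # assumed width of the communicated range
REMOTE_STATES = {"Assam", "Sikkim"}  # hypothetical slow-to-reach states


def rule_based_eta(distance_km: float, state: str, partner_sla_days: int,
                   packing_started_at: datetime) -> Tuple[int, int]:
    """Return (start_day, end_day) of the ETA range, counted from order placement."""
    days = partner_sla_days
    if distance_km > 1000:                      # long-haul shipments get an extra day
        days += 1
    if state in REMOTE_STATES:                  # harder-to-serve states get an extra day
        days += 1
    if packing_started_at.hour >= CUTOFF_HOUR:  # packing began after the cutoff
        days += 1
    start = days + SAFETY_BUFFER_DAYS
    return start, start + BUCKET_SIZE_DAYS


eta = rule_based_eta(1200, "Assam", 3, datetime(2020, 3, 2, 15, 30))
```

Stacking a safety buffer on top of already cautious rules is what keeps the breach rate low, and it is also what makes the communicated range so conservative.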

Over the last few years, major overhauls were made: a live inventory tracking system, a better order processing and packaging flow, and improvements in logistics, all of which led to better turn-around times for order deliveries. However, we did not change much in the way we were estimating our actual TAT. On analysing the situation, we found that close to 70% of our orders were being delivered on or before the first date of the communicated ETA range, which made our ETA prediction heuristics extremely conservative. In a lot of cases, we were delivering even two days before the promised date. Unsurprisingly, we had a high rate of order cancellation within the first hour of order placement, that is, as soon as the user saw the ETA.

Simply subtracting a couple of days from the predicted ETA was the first obvious solution that came to mind for tackling the high order cancellation percentage. But we knew that this would lead to a much higher breach percentage than we could afford. This is because a customer whose order arrives late is extremely unhappy and rarely shies away from giving us a bad rating or cancelling the order. Bad ratings lead to a poor Net Promoter Score (NPS), something that a rapidly growing company like ours is really protective of. Also, an unhappy customer is the last thing we want at 1mg.

To understand and solve this issue, we went back to the drawing board to list the KPIs or metrics that the business associated with the communicated ETA:

Key observations based on analyzing the Rule-based ETA:

{Assume T0 is order placement time, ‘X’ is the start date of potential delivery, ‘Y’ is the end date of potential delivery and the first ETA communicated to the customer after order placement is “X to Y”. The bucket size of the ETA i.e. Y - X is 2 days.}

Breaching the ETA has a very negative effect on customer feedback as well as on order cancellations. NPS becomes negative in such cases, and almost half of the orders whose ETA is breached get cancelled.

Higher “Total orders delivered after Y %” and “ETA Breached & Cancelled %” in the above metric table for the Timeline “1st Aug’19 to 15th Aug’19” result in lower “NPS Score” and higher “Post-Val Cancellation %” as compared to the other timeline.

For all ETA buckets, delivering within the ETA bucket leads to a lower NPS than communicating a longer ETA and delivering before it. For example, when we communicated an ETA of 2 to 3 days and delivered within the ETA, we got an average NPS of “41”. However, when we communicated an ETA of 3 to 4 days and delivered before the ETA, we got an NPS of “50.0”.

High “Total orders delivered before X %” with low “Total orders delivered after Y %” and “ETA Breached & Cancelled %” leads to the best NPS Score and a low “Post-Val Cancellation %”.

Being very conservative with our ETA predictions (a non-competitive ETA) leads to high order cancellation as soon as the customer is shown the ETA.

The above analysis helped us fix the goalposts for our ETA model. Ideally, we wanted our ETA model to achieve the following:

Low TAT breach: the model should predict “Y” so as to minimize the orders delivered after “Y” (say, <5%).
The remaining 95% of orders, delivered “before X” or “between X & Y”, should follow a distribution that minimizes cancellations and maximizes NPS.
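To make these targets measurable, each delivered order can be classified against its communicated range [X, Y]. The sketch below is one possible way to compute the three shares; the function names, bucket labels and sample orders are assumptions made for illustration.

```python
from collections import Counter
from datetime import date
from typing import Dict, List, Tuple


def classify(delivered: date, x: date, y: date) -> str:
    """Place a delivery relative to the communicated range [X, Y]."""
    if delivered < x:
        return "before_X"
    if delivered <= y:
        return "within_X_Y"
    return "after_Y"  # ETA breach


def eta_metrics(orders: List[Tuple[date, date, date]]) -> Dict[str, float]:
    """orders holds (delivered, X, Y) triples; returns the percentage share per bucket."""
    counts = Counter(classify(d, x, y) for d, x, y in orders)
    total = len(orders)
    return {k: 100.0 * counts[k] / total
            for k in ("before_X", "within_X_Y", "after_Y")}


# Made-up orders for illustration only.
x, y = date(2020, 3, 4), date(2020, 3, 6)
sample = [
    (date(2020, 3, 3), x, y),  # delivered early
    (date(2020, 3, 5), x, y),  # delivered inside the range
    (date(2020, 3, 6), x, y),  # delivered on Y, still inside the range
    (date(2020, 3, 8), x, y),  # breach
]
metrics = eta_metrics(sample)
```

With such a breakdown, the first goal is a cap on the "after_Y" share, and the second goal concerns how the rest is split between the other two buckets.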

The Proposed Solution:

We propose that the ETA communicated to a user for their order be calculated using the output of two prediction systems. The first system is responsible for accurately predicting the actual TAT for an order, while the second uses the predicted TAT to choose the ETA range that best meets the multiple business objectives.

The following attempts were made to solve the first part of the problem, i.e. “Predicting TAT accurately”.

Predicting actual TAT piece by piece:

Following is the Pharmacy order flow diagram from order placement until the first ETA communication:

To accurately predict the ETA, we needed to be precise with the following calculations at the backend:

Estimated order packaging time: Varies according to the availability of SKUs in the in-hand inventory, the number of items ordered and how busy the vendor is at the time of receiving the order.
Delivery pickup time: The wait-time between the order getting packed and shipped for delivery.
Shipment time: Depends on the customer’s and vendor’s location/city, the delivery partner serving that order, the shipment picked time/day etc.

We started with predicting the ETA for the orders served by our third-party logistics partners because:

These orders constitute ~65% of the total orders fulfilled by 1mg.
These orders can be delivered inter-state as well as intra-state, which gives us a huge variance in the delivery TAT and created an urgent need to replace the rule-based system, since it could not incorporate such complex variability in the order-related features.

We built a separate model for predicting each of the above-mentioned portions of the TAT, and the ETA communicated to the user was the sum of these three delays.

The TAT prediction method for each of the process queues was as follows:

Packaging Queue: Predicted as the 70th percentile of the time a vendor took to pack an SKU in the same inventory scenario (in stock/out of stock) over the last 3 weeks.
Shipping Queue: The wait-time between the order getting packed and shipped for delivery. (A static table consisting of the daily pickup slots committed by the delivery partner to the vendor was used to predict this TAT)
Manage Delivery Queue: A gradient boosting model that took into account the geographical features related to the customer and vendor’s location, features related to the order dispatched time and properties of the delivery partner fulfilling the order was used to predict the delivery time.

Retraining frequency: This model was retrained once every 3 weeks on the most recent delivery data gathered.
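The first two queue estimates and the final sum can be sketched with the Python standard library alone. This is only an illustration of the approach described above: the sample packaging times, the pickup slot hours and the fixed shipment value (a stand-in for the gradient boosting model's output) are all hypothetical.

```python
import statistics
from datetime import datetime, timedelta
from typing import List


def packaging_estimate_hours(past_hours: List[float]) -> float:
    """70th percentile of recent packaging times for one vendor/SKU/stock scenario."""
    # quantiles(n=10) returns the 9 decile cut points; index 6 is the 70th percentile.
    return statistics.quantiles(past_hours, n=10, method="inclusive")[6]


def pickup_wait_hours(packed_at: datetime, slot_hours: List[int]) -> float:
    """Hours until the next committed pickup slot (slot_hours: sorted hours of the day)."""
    for hour in slot_hours:
        slot = packed_at.replace(hour=hour, minute=0, second=0, microsecond=0)
        if slot >= packed_at:
            return (slot - packed_at).total_seconds() / 3600
    # No slot left today: wait for the first slot tomorrow.
    first = (packed_at + timedelta(days=1)).replace(
        hour=slot_hours[0], minute=0, second=0, microsecond=0)
    return (first - packed_at).total_seconds() / 3600


history = [2, 3, 3, 4, 4, 5, 6, 6, 8, 12]   # hypothetical packaging times, in hours
placed = datetime(2020, 3, 2, 9, 0)
pack_h = packaging_estimate_hours(history)
packed_at = placed + timedelta(hours=pack_h)
wait_h = pickup_wait_hours(packed_at, [11, 17])  # hypothetical daily pickup slots
shipment_h = 48.0                                # stand-in for the boosting model's output
total_tat_hours = pack_h + wait_h + shipment_h
```

Using a high percentile instead of the mean for packaging time builds a small safety margin into the first component, which is consistent with the conservative behaviour of this model.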

Performance Comparison with the Rule-Based Model:

We deployed the DS model (the piece-wise model described above) in production in parallel with the rule-based (RB) model to check its accuracy and its effect on the order cancellation percentage. Compared with the rule-based model, it achieved ~1.4% lower Post-Val cancellations and 16% better accuracy in our communicated ETAs, at the cost of a ~5% increase in ETA breach, which was overall a huge win for us.

The ETA metrics comparing both the models for the orders placed and delivered between 1st March 2020 and 15th March 2020 can be seen below:

Model-wise Comparison of the ETA metrics

We were also able to establish the statistical significance of the reduced Post-Val Cancellation % for the DS model by running a t-test over the daily cancellation rates of both models.
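The article does not say which variant of the t-test was used. As a minimal sketch, assuming Welch's unequal-variance version, the test statistic and degrees of freedom for two series of daily cancellation rates can be computed as follows. The daily rates shown are invented for illustration and are not our production figures.

```python
import math
import statistics
from typing import List, Tuple


def welch_t(a: List[float], b: List[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = statistics.variance(a) / len(a)   # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)   # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical daily Post-Val cancellation rates (%), one value per day.
rule_based = [9.8, 10.4, 10.1, 9.9, 10.6, 10.2, 10.0]
ds_model = [8.5, 8.9, 8.7, 8.4, 9.0, 8.8, 8.6]
t_stat, dof = welch_t(rule_based, ds_model)
```

A large positive statistic indicates that the rule-based model's daily cancellation rate is higher than the DS model's by more than day-to-day noise would explain. A p-value can then be read from the t distribution with `dof` degrees of freedom, for example via `scipy.stats.ttest_ind(rule_based, ds_model, equal_var=False)`.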

Need for a new ETA model:

We noticed from the DS model that showing conservatively accurate ETAs serves our business purpose in the best manner.
Everything was working fine until we decided to start showing ETAs inspired by this conservatively accurate model on our Product Description Pages (PDPs) as well. The ETA on the product pages had to be consistent with the ETA we commit to the customer after order placement. The core issue was that we couldn’t afford to use a non-competitive ETA model on PDPs, as an ETA that is conservative in any sense on the product page can drive the customer away due to the long expected wait time.
Also, we wanted to try predicting the ETA as a whole with a single model rather than in parts, since a single model would not compound the errors made by the model for the previous step.
Getting rid of the rules written for predicting TAT in the packaging queue (PQ) and shipping queue (SQ) was also a motivation.

Hence, we developed a Neural Network model that was trained on raw order features such as:

['order_placed_timestamp', 'customer_city', 'customer_pincode', 'vendor_id', 'vendor_code', 'vendor_city', 'vendor_pincode', 'delivery_partners_code', 'delivery_partner_type', 'no_of_unique_skus', 'rx_count', 'otc_count', 'order_type', 'packaging_timestamp', 'customer_latitude', 'customer_longitude', 'vendor_location', 'sku_eta', 'parallel_orders', 'is_fulfilled_by_inventory', 'order_placed_weekday', 'order_placed_hour', 'order_placed_2pm', 'vendor_latitude', 'vendor_longitude', 'distance']
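Some of these features are derived from others in the list. The sketch below shows one plausible derivation of the time-based features and of distance. It assumes that distance is the great-circle (haversine) distance between the customer and vendor coordinates and that order_placed_2pm is a 0/1 flag for orders placed at or after 2 PM; the article specifies neither, so treat both as assumptions.

```python
import math
from datetime import datetime
from typing import Dict, Tuple


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def derived_features(placed: datetime, customer: Tuple[float, float],
                     vendor: Tuple[float, float]) -> Dict[str, float]:
    """Build the derived columns from the order timestamp and the two locations."""
    return {
        "order_placed_weekday": placed.weekday(),     # Monday = 0
        "order_placed_hour": placed.hour,
        "order_placed_2pm": int(placed.hour >= 14),   # assumed meaning of the flag
        "distance": haversine_km(*customer, *vendor),  # assumed to be haversine, in km
    }


# Example: a customer in Delhi and a vendor in Mumbai (approximate city coordinates).
features = derived_features(datetime(2020, 3, 2, 15, 30),
                            (28.6139, 77.2090), (19.0760, 72.8777))
```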

In addition to the above-mentioned raw features, aggregated features generated from the past month’s data for these raw features were also fed to the NN model. Some examples of this set of features are:

Min, max, mean, 90th percentile of the TAT for the given customer_city when the order is delivered by a particular delivery partner.
Min, max, mean, 90th percentile of the TAT for the given order_type (RX/OTC/Both) when the order is served by a particular vendor.
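As an illustration of the first aggregate above, the sketch below groups past TATs by (customer_city, delivery_partners_code) and computes the four statistics. The row layout, the key names in the output and the sample values are assumptions made for this example.

```python
import statistics
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (customer_city, delivery_partners_code)


def tat_aggregates(rows: List[Tuple[str, str, float]]) -> Dict[Key, Dict[str, float]]:
    """rows holds (customer_city, delivery_partners_code, tat_days) from the past month."""
    groups: Dict[Key, List[float]] = defaultdict(list)
    for city, partner, tat in rows:
        groups[(city, partner)].append(tat)

    out: Dict[Key, Dict[str, float]] = {}
    for key, tats in groups.items():
        # Index 8 of the 9 decile cut points is the 90th percentile;
        # quantiles() needs at least two data points, so fall back for singletons.
        if len(tats) > 1:
            p90 = statistics.quantiles(tats, n=10, method="inclusive")[8]
        else:
            p90 = tats[0]
        out[key] = {"min": min(tats), "max": max(tats),
                    "mean": statistics.mean(tats), "p90": p90}
    return out


# Hypothetical past deliveries.
rows = [("Pune", "DP1", 2.0), ("Pune", "DP1", 3.0), ("Pune", "DP1", 4.0),
        ("Pune", "DP2", 5.0)]
agg = tat_aggregates(rows)
```

At prediction time, each order would be joined to the row matching its own city and delivery partner, so the network sees the recent behaviour of that lane alongside the raw features.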

We compared the “predicted delivery date” accuracy of the very conservative Rule-based ETA model, the conservatively accurate DS Boosting Model, the new PQ Neural Net model, and the blend (average of predictions) of the DS Boosting model and the new PQ Neural Net model (Mean DS and PQ). The predicted delivery TAT accuracy for each of the models is shown below:

Model-wise Accuracy Comparison of the predicted TAT

Since the blend of the DS Boosting Model and the PQ Neural Net (“Mean DS and PQ” in the above chart) was far more aggressive than the in-production Rule-Based and DS Boosting models (very few orders were delivered “Before X-2” or “On X-2”), it gave the Product Page ETA model the liberty to be aggressive in its predictions, which could help increase our order conversion rate, without increasing the inconsistency between the two ETA models deployed at different stages of a single pipeline.
It also achieved high accuracy (~60% of orders were predicted to be delivered within [X, Y]) while keeping breach in check (only 5%) over the backtesting period, and hence served our goal of predicting TATs accurately with a low ETA breach percentage.
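The blend itself is simply the per-order mean of the two models' predictions. A minimal sketch, with made-up prediction values in days:

```python
from typing import List


def blend(ds_pred: List[float], pq_pred: List[float]) -> List[float]:
    """Element-wise mean of the two models' predicted TATs (in days)."""
    if len(ds_pred) != len(pq_pred):
        raise ValueError("prediction lists must be the same length")
    return [(d + p) / 2 for d, p in zip(ds_pred, pq_pred)]


# Hypothetical per-order predictions from the DS Boosting and PQ Neural Net models.
blended = blend([4.0, 3.0, 6.0], [3.0, 2.0, 5.0])
```

Averaging pulls the conservative boosting predictions towards the neural net's predictions, so the blend lands between the two models for every order.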

Due to the above-mentioned wins, this blended Neural Network model (Mean DS and PQ) was deployed in production, and it helped us achieve the best accuracy results in comparison to the other in-production ETA models.

Conclusion:

In this article, we discussed the traditional approach to calculating the ETA for an order and how we redefined it using our first set of ETA models. In the next article, we’ll talk about handling, from the ETA perspective, unprecedented scenarios that cause sudden disruptions in the supply chain, such as the COVID-19 pandemic, and about predicting the optimal ETA bucket from the accurately predicted one.