Analytics in Action: Predicting Shipment Volume in Logistics to Drive Business Value

The trucking industry has a massive influence on the United States economy. Trucking companies serve as a lifeline between producers and consumers, moving pretty much everything from gasoline to Gatorade.

Source: Bureau of Transportation, McKinsey Global Institute

Despite the industry’s size, the commercial side of the broader sector has seen little innovation. A McKinsey study found that companies in the sector lag behind peer B2B companies in commercial capabilities. However, this gap can be closed through advanced analytics, potentially generating an additional 3 to 5 percent return on sales.

Enter Transfix. Transfix handles the complexities involving the mass movement of goods across multiple freight lanes through advanced analytics. Offering a wide range of data-powered logistics solutions for both shippers and carriers, Transfix is well-positioned to lead the trucking industry to further commercial success.

Linking Trucking Demand and Supply: The Issue of Shipper Volume

Through Columbia’s Analytics in Action masterclass, our team was able to work closely with Transfix to address one of their fundamental business problems: the issue of shipper volume.

Transfix maintains relationships with both shippers and carriers, linking the logistics requirements of large shippers with the small trucking companies that execute them.

Illustration of trucking supply and demand

Problems start to arise, however, when the shipping requirements of Transfix’s customers stray from their pre-contracted demand. Under- or over-utilization of these shipping contracts tips the delicate balance between supply and demand, often resulting in lost value. Transfix has difficulty optimizing its bidding strategy because of the uncertainty surrounding real shipper volume.

With access to more accurate volume projections, Transfix could acquire or renew customers more decisively, confident in the true terms of each contract. Our team therefore decided to develop a model that predicts total contract volume.

Understanding the Complexities of the Data

We were provided data covering about 220,000 shipments, 5,000 contract lanes and 83 shippers, complete with details on equipment used, start and end dates, and pick-up and delivery locations. We aggregated the count of shipments under each unique contract lane so that we could compare projected and actual performance.
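The aggregation step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the field names (`shipment_id`, `lane_id`), the toy records and the helper `actual_volume_by_lane` are hypothetical and are not taken from the Transfix data.

```python
from collections import Counter

# Hypothetical shipment records: one dict per shipment, tagged with its contract lane.
shipments = [
    {"shipment_id": 1, "lane_id": "A"},
    {"shipment_id": 2, "lane_id": "A"},
    {"shipment_id": 3, "lane_id": "B"},
]

# Hypothetical contracted (projected) shipment counts per lane.
# Lane "C" has a contract but never appears in the shipment records.
contracted = {"A": 4, "B": 1, "C": 10}


def actual_volume_by_lane(shipments, contracted):
    """Count shipments per contract lane, keeping lanes with zero shipments."""
    counts = Counter(s["lane_id"] for s in shipments)
    # Iterate over the contracts, not the shipments, so unused lanes show up as 0.
    return {lane: counts.get(lane, 0) for lane in contracted}


actual = actual_volume_by_lane(shipments, contracted)
for lane in contracted:
    print(lane, "contracted:", contracted[lane], "actual:", actual[lane])
```

Starting from the list of contracts rather than from the shipments matters here: a lane with no shipments would otherwise silently drop out of the comparison.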

Predicting a specific contract volume with any degree of accuracy proved extremely challenging because of the nature of the data itself. We discovered that only 60% of total pre-agreed volume obligations were converted into actual shipments. Even more surprising, 47% of all contract lanes yielded no shipments at all. Contracts were not just being underutilized; many were not being used at all.

Histogram of contract fulfillment rates (Actual Shipments / Contracted Shipments)

This discovery, combined with the overall distribution of contract fulfillment rates, caused us to reassess our approach and our output variable. We eventually decided that predicting actual contract volume was no longer feasible and that we should instead focus on predicting fulfillment rates.
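To make the output variable concrete, here is a minimal sketch of the fulfillment rate (actual shipments divided by contracted shipments) and of the two summary figures discussed above: the volume-weighted conversion and the share of lanes with no shipments. The lane volumes and the helper names `fulfillment_rates` and `summarize` are made up for illustration.

```python
def fulfillment_rates(actual, contracted):
    """Per-lane fulfillment rate = actual shipments / contracted shipments."""
    # Lanes with a contracted volume of 0 are skipped to avoid division by zero.
    return {
        lane: actual.get(lane, 0) / volume
        for lane, volume in contracted.items()
        if volume > 0
    }


def summarize(actual, contracted):
    """Volume-weighted conversion and share of lanes with zero shipments."""
    rates = fulfillment_rates(actual, contracted)
    total_contracted = sum(contracted[lane] for lane in rates)
    total_actual = sum(actual.get(lane, 0) for lane in rates)
    zero_lanes = sum(1 for rate in rates.values() if rate == 0)
    return {
        "overall_conversion": total_actual / total_contracted,
        "share_zero_lanes": zero_lanes / len(rates),
    }


# Hypothetical lane volumes (not the real data).
contracted = {"A": 4, "B": 1, "C": 10}
actual = {"A": 2, "B": 1, "C": 0}

rates = fulfillment_rates(actual, contracted)
summary = summarize(actual, contracted)
print(rates)
print(summary)
```

The two summary numbers answer different questions. The first weights every lane by its contracted volume, while the second counts lanes equally, which is why both can be reported side by side.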

Modifying our Approach and Testing Different Models

In the absence of an analytical model to predict contract volumes, Transfix used a straightforward approach: assuming 80% fulfillment for all contracts. This proved surprisingly effective.

We used this as a naïve baseline to beat when building and testing new models. We explored three modeling approaches and measured the success of each one by its Mean Absolute Error (MAE).
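For reference, the MAE of a constant-rate baseline takes only a few lines to compute. The sketch below assumes fulfillment rates are expressed as fractions (0.8 for 80%). The sample rates are invented and the function names are illustrative.

```python
def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted| over all contracts."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def constant_baseline(n, rate=0.8):
    """Naive baseline: predict the same fulfillment rate for every contract."""
    return [rate] * n


# Hypothetical fulfillment rates: many zeros plus one heavy over-fulfillment.
rates = [0.0, 0.0, 0.5, 1.0, 2.5]

mae_80 = mean_absolute_error(rates, constant_baseline(len(rates), 0.8))
mae_0 = mean_absolute_error(rates, constant_baseline(len(rates), 0.0))
print(f"MAE of 80% baseline: {mae_80:.2f}")
print(f"MAE of 0% baseline:  {mae_0:.2f}")
```

Because MAE is measured in the same units as the fulfillment rate, a single contract fulfilled at several times its contracted volume can add a lot to the error of any constant prediction.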

First, we used several classification methods to predict which fulfillment bucket each contract belongs to, then used the median of the predicted bucket as the contract’s expected fulfillment. The best model here was a random forest with three buckets: 0%, between 0% and 100%, and greater than 100%. This model resulted in an MAE of 97.9%.
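The bucket-then-median idea can be sketched as follows. The classifier itself is left out: the code takes predicted bucket labels as given, so any model (such as the random forest mentioned above) could supply them. Where a rate of exactly 100% falls is an assumption made here; the bucket names and sample rates are illustrative.

```python
from statistics import median


def bucket_of(rate):
    """Map a fulfillment rate to one of three buckets.

    Assumption: a rate of exactly 100% is placed in the middle bucket.
    """
    if rate == 0:
        return "zero"
    if rate <= 1.0:
        return "partial"
    return "over"


def bucket_medians(train_rates):
    """Median fulfillment rate of each bucket in the training data."""
    groups = {}
    for rate in train_rates:
        groups.setdefault(bucket_of(rate), []).append(rate)
    return {bucket: median(values) for bucket, values in groups.items()}


def predict_rates(predicted_buckets, medians):
    """Turn predicted bucket labels into numeric fulfillment predictions."""
    return [medians[bucket] for bucket in predicted_buckets]


# Hypothetical training rates and hypothetical classifier output.
train_rates = [0.0, 0.0, 0.4, 0.6, 1.0, 1.5, 3.0]
medians = bucket_medians(train_rates)
predictions = predict_rates(["zero", "partial", "over"], medians)
print(medians)
print(predictions)
```

Using the median rather than the mean keeps a few extreme over-fulfilled contracts from dragging up the prediction for the whole bucket.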

Second, we used various regression techniques to predict the expected volume directly. However, a naïve model that predicts 0% for every contract produced better results than these, matching the previous random forest with an MAE of 97.9%.

Our final approach was to account for the heavy concentration of 0% fulfillment contracts by first making a binary prediction of whether fulfillment would be 0% or not, and then using volume regression techniques for the contracts predicted to yield shipments. Using a random forest for the binary step and a 30% naïve model for the volume step resulted in an MAE of 93.1%.
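A two-stage predictor of this kind can be sketched as below. This is an interpretation rather than the team's code: it assumes the binary model outputs a probability that a contract yields any shipments, that a 0.5 threshold turns it into a yes/no decision, and that the "30% naïve model" means predicting a flat 30% fulfillment for every contract expected to ship. The function name and inputs are hypothetical.

```python
def two_stage_predict(p_nonzero, threshold=0.5, nonzero_rate=0.3):
    """Two-stage fulfillment prediction.

    Stage 1: treat a contract as "will ship" if its predicted probability of
             any shipments is at least `threshold` (assumed cut-off).
    Stage 2: predict a flat `nonzero_rate` for those contracts, 0.0 otherwise.
    """
    return [nonzero_rate if p >= threshold else 0.0 for p in p_nonzero]


# Hypothetical stage-1 probabilities for four contracts.
probabilities = [0.10, 0.90, 0.50, 0.49]
predictions = two_stage_predict(probabilities)
print(predictions)
```

The flat second stage could be swapped for any regression model. The structure stays the same: the first stage decides whether a contract ships at all, and the second only has to handle the contracts that do.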

Our binary random forest yielded a 77% AUC, indicating significant potential for accurately identifying 0% fulfillment contracts. Unfortunately, frequent outliers and inconsistent patterns within the data made predicting exact volume highly challenging. As a result, advanced regression techniques were consistently outperformed by simple multipliers.
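As a reminder of what the AUC figure measures, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The labels and scores are invented, and treating "yields shipments" as the positive class is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = contract yields shipments) and model scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect one. The pairwise loop is quadratic in the number of examples, so it suits small illustrations rather than large datasets.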

A Holistic Approach to Identify Contract Business Value

Intent on adding business value beyond the project’s scope, the team decided to lay down the framework for a holistic approach to identify the business value of each contract. Our final recommendation to Transfix was not just to improve our volume prediction model, but to supplement it with two other models: one that calculates the profit per shipment and another that determines the probability of winning a contract at a given price.

Multiplying the outputs of the three models would result in the expected contract value at a given price. The final step would be to determine the price that maximizes expected contract value.
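The calculation described above can be illustrated with a toy example. Everything specific here is a placeholder: the linear win-probability curve, the cost per shipment, the expected volume and the price grid are made-up stand-ins for the three models. Treating profit per shipment as price minus cost is also an assumption.

```python
def win_probability(price, max_price=2000.0):
    """Toy stand-in for the bid-win model: falls linearly as price rises."""
    return max(0.0, 1.0 - price / max_price)


def expected_contract_value(price, expected_volume, cost_per_shipment):
    """Expected value = expected volume x profit per shipment x P(win | price)."""
    profit_per_shipment = price - cost_per_shipment  # assumed profit model
    return expected_volume * profit_per_shipment * win_probability(price)


def best_price(candidate_prices, expected_volume, cost_per_shipment):
    """Pick the candidate price with the highest expected contract value."""
    return max(
        candidate_prices,
        key=lambda p: expected_contract_value(p, expected_volume, cost_per_shipment),
    )


# Hypothetical inputs: 40 expected shipments, cost of 1000 per shipment,
# and candidate prices from 1000 to 2000 in steps of 100.
candidates = range(1000, 2001, 100)
price = best_price(candidates, expected_volume=40, cost_per_shipment=1000)
value = expected_contract_value(price, expected_volume=40, cost_per_shipment=1000)
print(price, value)
```

The example shows the trade-off the final step has to resolve: a higher price raises profit per shipment but lowers the chance of winning the contract, so expected value peaks somewhere in between.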

The end-to-end holistic model provides a better understanding of expected contract value

Final Thoughts

Our final model was ultimately able to offer a 20% improvement over Transfix’s current baseline. However, we believe that improving average contract compliance and reducing the number of 0% fulfillment contracts would go a long way toward making the data more workable and further improving predictive power. Nevertheless, Transfix saw the tremendous potential of our proposed holistic model, which serves as the foundation for maximizing the business value of each contract.

Contributors: Assylan Kassymov, Duncan Bender, Jin Pu, Pablo Clark, Xerxes Escaño, Yijia Luo
