Content-Based Recommendation Model to Improve the Shipment Selection for Trucking Companies

Rajat Rajbhandari, PhD
dexFreight
Jan 14, 2021 · 9 min read

by Mauro Jimenez and Rajat Rajbhandari, PhD

Summary

In this blog post, we discuss how we built and tested a recommendation model that presents relevant shipments to carriers. The platform receives over 50,000 shipments a day from various sources, and carriers find it time-consuming to filter that many shipments using predefined parameters. Hence, we built a model that dynamically assigns weights to shipments matching a carrier’s preferences and the carrier’s history of similar shipments moved in the past. The model then attaches a numerical matching percentage to an already filtered set of shipments. This post describes the complexity and nuances of building and running such models.

Introduction

Recommendation models are used in many online marketplaces such as Amazon, Airbnb, and Netflix. These models present users with “recommended” items to purchase or watch based on their usage history as well as their stated preferences. In the dexFreight marketplace, trucking companies (referred to as users) are presented every day with thousands of shipments for booking, which can overwhelm them and, worse, cause them to pass on the opportunity to book those shipments.

Hence, we built a recommendation model using machine learning to present trucking companies with the most relevant candidate shipments and reduce the time they spend searching and filtering. This allows users to spend more time on the other essential tasks of moving freight.

Figure 1 below shows the user interface in the dexFreight marketplace through which users are presented with a curated list of recommended shipments along with matching percentages.

Figure 1. List of shipments recommended to carriers along with matching percent.

The recommendation model uses TF-IDF and cosine similarity to find matching shipments based on the user’s history. However, we found that there is a trade-off between the model’s execution time on one side and, on the other, the number of new input shipments analyzed and how far back the users’ historical data reaches.

Objectives

In this blog post, we present a summary of the research we conducted to assess the trade-off mentioned above. Our objectives for this research are the following:

  • Describe the purpose of the recommendation model in the dexFreight marketplace.
  • Understand the components of a recommendation model.
  • Develop a model to recommend shipments to users and analyze trade-offs.

Brief description of the shipment data

In this section, we describe the dataset of over 500,000 records used to build the shipment recommendation model. Each record is a shipment offer posted by freight brokers and shippers in the marketplace. Our challenge is to present users with a set of relevant shipments. Each shipment record has a structure with three levels of attributes, as shown in Figure 2. The first level holds the main labels, such as mode, origin, destination, trailer size, etc. The second level further segments the records with keys related to the first-level attributes in a hierarchical structure; for example, second-level attributes related to mode include less than truckload, full truckload, etc. The third level contains values or parameters that further segment the second-level attributes; for example, a third-level attribute for flatbeds is the length of the trailer.

A pre-processing unit was implemented to represent the data from the third-level attributes numerically. The pre-processing unit is based on the data distribution and the weights associated with the term frequency of each individual attribute.

Figure 2. Hierarchical attributes of shipments.
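To make the hierarchy concrete, the sketch below shows what such a record might look like before pre-processing. The field names and values are purely illustrative and are not dexFreight’s actual schema.

# Illustrative shipment record with three levels of attributes.
# Field names and values are hypothetical, not dexFreight's actual schema.
shipment = {
    "mode": {                          # level 1: main label
        "full_truckload": {            # level 2: segmentation of the mode
            "equipment": "flatbed",    # level 3: values / parameters
            "trailer_length_ft": 48,
        }
    },
    "origin": {"city": "Laredo", "state": "TX"},
    "destination": {"city": "Atlanta", "state": "GA"},
    "weight_lbs": 42000,
}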

Insights about the learning model

In order to understand the learning of the shipment recommendation model and the users’ behavior, we analyzed shipments that share similar attributes. We treat the output of the recommendation model as a random variable because we have to wait until a new set of shipments arrives as input before a recommendation can be made.

Thus, we created a dictionary for the attributes that are values or parameters of a shipment. These attributes can be treated as indicators of an event, which is the simplest way to describe discrete random variables. This assumption leads to a numeric domain composed of indicators (the attributes of the shipments), each with two states. Ignoring the frequency of those events, we have two clusters of indicators: those marking the occurrence of a load attribute and those marking its absence.
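As a minimal sketch of this indicator representation (the attribute dictionary below is hypothetical and far smaller than the real one):

# Sketch: represent a tokenized shipment as 0/1 indicators over an
# attribute dictionary (1 = attribute present, 0 = absent).
# The dictionary below is hypothetical and deliberately tiny.
ATTRIBUTE_DICTIONARY = ["dry_van", "flatbed", "reefer", "ftl", "ltl", "length_53"]

def to_indicator_vector(shipment_tokens):
    """Map a tokenized shipment to a binary vector over the dictionary."""
    tokens = set(shipment_tokens)
    return [1 if attribute in tokens else 0 for attribute in ATTRIBUTE_DICTIONARY]

print(to_indicator_vector(["ftl", "dry_van", "length_53"]))
# -> [1, 0, 0, 1, 0, 1]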

The reason for using this concept is to account for the uncertainty of those attributes: a probability framework provides a consistent process for tokenizing attributes and gives us clues for pattern recognition in the shipment data. When combined with decision theory, it allows us to make optimal predictions given all the shipment information available to us, even though that information may be incomplete or ambiguous.

Although the absence or presence of a shipment attribute gives us the boundaries for the amount of information required by the recommendation model, the frequency of an attribute may provide additional information. However, the most frequent value is not necessarily the most relevant one. Let’s break down this statement about the relevance of frequency with an example.

Consider a carrier that has booked four shipments in the past (index 1–4 below) using 53-foot dry van trucks, while a broker has posted a load (index 5 below) whose size and weight require a truck with flatbed equipment.

After tokenization, the pre-processing unit generates a count matrix from this information, where each row represents a shipment and each column an attribute of the shipment. This representation in a vector space is called an embedding. The attributes of the loads are also known as queries.
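A rough sketch of this tokenization and count-matrix step using scikit-learn’s CountVectorizer is shown below; the attribute tokens are illustrative, not the platform’s actual vocabulary.

# Sketch of the count-matrix step with scikit-learn's CountVectorizer.
# The five "documents" mirror the example above: four historic dry-van
# shipments (index 1-4) and one new flatbed posting (index 5).
# Tokens are illustrative only.
from sklearn.feature_extraction.text import CountVectorizer

shipments = [
    "ftl dry_van length_53 tx ga",   # index 1
    "ftl dry_van length_53 tx fl",   # index 2
    "ftl dry_van length_53 ok tx",   # index 3
    "ftl dry_van length_53 tx ga",   # index 4
    "ftl flatbed length_48 tx ga",   # index 5 (new posting)
]

vectorizer = CountVectorizer()
count_matrix = vectorizer.fit_transform(shipments)

print(vectorizer.get_feature_names_out())   # the attribute dictionary
print(count_matrix.toarray())               # rows = shipments, columns = attribute counts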

If the recommendation model takes into account only the raw attribute frequency (Figure 3), that is, the count of an attribute divided by the total number of tokens, it would respond with similar percentages for all shipments, ignoring the carrier’s preference.

This issue is compensated for by using the inverse document frequency (equation below), which means that the recommendation system needs to consider the frequency of an attribute both across all records and within the shipment itself. The term frequency and the inverse document frequency are directly related to the amount of information the recommendation model can retrieve.

Figure 3. Term frequency equation.
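For reference, a standard formulation of these quantities is the following (the exact variant shown in Figure 3 may differ slightly):

tf(t, d) = count of attribute t in shipment d / total number of attribute tokens in d
idf(t) = log( N / (1 + number of shipments containing t) ), where N is the total number of shipments
tf-idf(t, d) = tf(t, d) × idf(t)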

After applying the inverse document frequency, one can observe, by looking at the shipments with the highest weight on the equipment-type attributes (dry van and flatbed), that the equation assigns more weight to attributes that occur less frequently than to those that occur frequently.

Thus, in order to know which attributes are similar between shipments, we applied cosine similarity, which measures the cosine of the angle between two vectors (here, the shipments) projected in a multidimensional space.
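The sketch below puts the two steps together: TF-IDF weighting of the attribute tokens followed by cosine similarity between a carrier’s historic shipments and new postings. The tokens and the averaging of the scores into a single matching percentage are illustrative assumptions, not the production pipeline.

# Sketch: TF-IDF weighting plus cosine similarity to score new postings
# against a carrier's historic shipments. Tokens are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

historic = [
    "ftl dry_van length_53 tx ga",
    "ftl dry_van length_53 tx fl",
    "ftl dry_van length_53 ok tx",
    "ftl dry_van length_53 tx ga",
]
new_postings = [
    "ftl flatbed length_48 tx ga",   # equipment type does not match history
    "ftl dry_van length_53 tx ga",   # close match to history
]

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(historic + new_postings)
historic_vectors = matrix[:len(historic)]
new_vectors = matrix[len(historic):]

# Rare attributes (flatbed, length_48) receive higher idf weights than
# frequent ones (dry_van, ftl).
for term, idf in zip(vectorizer.get_feature_names_out(), vectorizer.idf_):
    print(f"{term:>10s}  idf = {idf:.2f}")

# Average similarity of each new posting to the carrier's history,
# reported as a matching percentage.
scores = cosine_similarity(new_vectors, historic_vectors).mean(axis=1)
for posting, score in zip(new_postings, scores):
    print(f"{posting!r}: {score:.0%} match")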

Recommendation model architecture

The recommendation model architecture includes a candidate-generation block that takes into account the user’s stated preferences, filter attributes such as equipment type, origin, and destination, and a threshold on the percentage of similarity among the candidate shipments. In the future, we will include a feedback loop based on the users’ actual selection of shipments. The components of the architecture are shown in Figure 4.

Figure 4. Recommendation model architecture.
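A rough sketch of the candidate-generation step is shown below; the preference keys and the threshold value are illustrative assumptions, not the production configuration.

# Sketch of candidate generation: hard-filter on stated preferences,
# then keep shipments whose similarity score clears a threshold.
# Preference keys and the threshold value are illustrative only.
def generate_candidates(shipments, scores, preferences, threshold=0.4):
    candidates = []
    for shipment, score in zip(shipments, scores):
        if preferences.get("equipment") and shipment["equipment"] != preferences["equipment"]:
            continue
        if preferences.get("origin_state") and shipment["origin_state"] != preferences["origin_state"]:
            continue
        if score >= threshold:
            candidates.append((shipment, score))
    # Rank surviving candidates by matching percentage, highest first.
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)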

Furthermore, in order to avoid introducing systematic bias into the data, one should consider the trade-off between performance and the pre-processing filters used to reduce hardware requirements. The proportion of items the recommendation model can recommend is related to coverage, that is, the percentage of all items that can ever be recommended given the amount of information available.

Results and validation

In order to develop an unbiased recommendation model, we considered three different types of users based on the size of their historic shipment data and on user preferences such as equipment type and origin-destination. These preferences can be set by the user or derived from historic criteria, such as how long a shipment offer has been on the market and the percentage of booked shipments selected within a time frame.

Types of users based on the number of shipments processed in the past:

  • More than 10,000 historic shipment records.
  • Between 5,000 and 10,000 historic shipment records.
  • Between 1,000 and 5,000 historic shipment records.

In doing so, we avoid assumptions about the data distribution when replicating the influence of users’ behavior. Furthermore, assumptions about the users’ preferences could be added to increase the performance of the recommendation model, but the selection of the respective parameters can introduce systematic bias. Prior knowledge can correct bias in the data, but due to the high variance in the logistics industry, this remains a design constraint.

In the future, we will include carriers with a smaller number of historic records to account for the bootstrapping problem that arises when they are new to the market or new to the dexFreight platform.

Comparison of model performance based on user-preferences

The hypothesis we wanted to test concerns recommendations for the different types of users: how the model’s execution time is affected by the number of previously booked shipments (the user’s historic shipments) and by the number of new shipments (the batch size) being given a matching percentage.

We executed the model with few user-preference attributes in order to achieve a shorter execution time. Analyzing the results, we see a positive correlation between the model’s execution time and the type of user. Figure 5 shows the execution time for the three types of users across different batch sizes of input shipments; the number of historic shipments and the depth of the attributes affect the performance of the model. The execution time (on the y-axis) depends heavily on the number of past shipments considered, which is expected, whereas the batch size made little difference. Also, the execution time is not linearly proportional to the number of past shipments (shown in the legend) used in the model.

Figure 5. Recommendation model performance.
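For readers who want to reproduce a similar measurement, the sketch below times the TF-IDF and cosine-similarity step for different history sizes and batch sizes on synthetic data; the token vocabulary and sizes are illustrative, not our production setup.

# Sketch of the timing experiment: measure the execution time of the
# TF-IDF + cosine-similarity step for different history and batch sizes.
# The data is synthetic and purely illustrative.
import random
import time
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

TOKENS = ["ftl", "ltl", "dry_van", "flatbed", "reefer",
          "tx", "ga", "fl", "ok", "length_53"]

def random_shipment():
    return " ".join(random.sample(TOKENS, 5))

for history_size in (1_000, 5_000, 10_000):
    history = [random_shipment() for _ in range(history_size)]
    for batch_size in (100, 500, 1_000):
        batch = [random_shipment() for _ in range(batch_size)]
        start = time.perf_counter()
        vectorizer = TfidfVectorizer()
        matrix = vectorizer.fit_transform(history + batch)
        cosine_similarity(matrix[history_size:], matrix[:history_size])
        elapsed = time.perf_counter() - start
        print(f"history={history_size:>6}  batch={batch_size:>5}  time={elapsed:.3f}s")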

Furthermore, Figure 6 shows that the average cosine similarity score (on the y-axis) lies between 0.373 and 0.529 for all three user types with 95% confidence (a higher cosine similarity score means a higher level of matching). Each batch in this figure consists of a set of randomly selected shipments proposed as candidates to the recommendation system. The similarity score did not differ much between 1,000 and 10,000 historic shipments. However, the score dropped as the batch size of input shipments to be matched grew. This shows the strong influence of stated user preferences on the similarity score.

Figure 6. Cosine Similarity for different batch sizes and type of users.

In Conclusion

We have described our content-based recommendation system for the dexFreight marketplace, including candidate generation and ranking of the suggested shipments using cosine similarity, and we have tested the trade-off of using pre-processing filters. The evaluation of the model was performed offline, so we cannot directly measure the model’s influence on user behavior. At this stage, the goal of the model is to filter out unnecessary user-preference attributes, resulting in a relatively small set of new shipments that might be of interest to the users. Instead of a standard 80/20 (training/testing) validation of the model, we created an experiment with a fixed number of new candidate shipments using different batch sizes.

We are already integrating a feedback loop based on users’ actual selection of the shipments recommended by the model in order to fine-tune it. By doing so, the model will become more responsive to users’ actual preferences (which may differ from their predefined preferences) and will not be limited to the historic shipments booked by the users or trucking companies.

Feel free to send your thoughts and comments at rajat@dexfreight.io.

Follow dexFreight on Twitter, Telegram, LinkedIn, and Newsletter for project updates and announcements.


Rajat Rajbhandari, PhD
dexFreight

CIO|Co-founder at dexFreight, blockchain author, evangelist, transportation nerd, systems expert, bullshit filter, unapologetically introvert, father, and more…