KeepTruckin’s Freight Visibility Platform

Amit Goldy · Published in motive-eng · Mar 4, 2021

In 2015, nearly 18.1 billion tons of goods worth about $19.2 trillion were moved along the U.S. National Transportation Network. This number is projected to grow to 27 billion tons by 2045. At KeepTruckin, we have built a Freight Visibility Platform to give brokers, shippers, and other customers real-time visibility into their loads. In this article, we discuss our thought process as we tackled the technical challenges in designing the various components of this platform (these include the API design, ETL pipeline, access control model, workflow system, and more).

First Things First

In designing the Freight Visibility Platform, we had to take into account the requirements of brokers and carriers, along with additional requirements we added ourselves.

Primary Consumers and Their Requirements

The Freight Visibility Platform primarily serves anyone representing the shipping party, chiefly brokers and shippers. In the trucking industry, a shipper is the party that wants to transport goods, and a broker is the intermediary between the shipper and carriers. These primary users’ expectations for our visibility service include:

  • APIs that are easy to understand and consume
  • Ability to associate a load with the vehicle carrying it, and to detect and flag incorrect load-to-vehicle associations
  • An API to pull the location of the vehicle carrying a load
  • A push mechanism to receive locations on their configured endpoints
  • Minimal auth-related overhead

Carriers and Their Requirements

Carriers are companies that own fleets of vehicles and provide transportation. Our Freight Visibility Service accesses carriers’ data, and we must therefore consider their requirements, which center around data security; these include:

  • Full visibility and control of who has permissions to access their data
  • Complete visibility into who is accessing their data, and how often

KeepTruckin Requirements

Our additional concerns in designing our service included:

  • Rate limiting to prevent abuse and system overloads
  • Permission override access for the support team
  • Storing historical tracking data to serve billing and other business use cases

Next, the Factors and Our Approach

Broker Workflow

As part of our visibility service platform, we exposed a set of REST APIs. KeepTruckin has an app marketplace for integrations with our products. A broker must first develop and publish an application in our marketplace and have all permissions set up before making use of these APIs.

Figure 1. Freight Visibility Platform broker workflow

A broker must subscribe to a vehicle to track it. Subscriptions can be managed using the POST and PUT /subscribe APIs. Both pull and push mechanisms work only within the time window of an active subscription.
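As a rough illustration of the broker side of this flow, the sketch below creates a subscription via POST /subscribe. The base URL, request fields, and time window format are assumptions made for the sake of the example, not the published API schema.

```python
import requests

# Hypothetical sketch: create a tracking subscription via POST /subscribe.
# The base URL and field names are illustrative, not the published API schema.
BASE_URL = "https://visibility.example.com"
ACCESS_TOKEN = "oauth-access-token"  # OAuth 2.0 token (see the proxy-fleet section below)

payload = {
    "vehicle_id": "12345",                 # hypothetical field: vehicle to track
    "start_time": "2021-03-04T08:00:00Z",  # hypothetical field: subscription window start
    "end_time": "2021-03-05T08:00:00Z",    # hypothetical field: subscription window end
}

resp = requests.post(
    f"{BASE_URL}/subscribe",
    json=payload,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=10,
)
resp.raise_for_status()
subscription = resp.json()  # pull and push both work only within this subscription's window
```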

The set also includes an API that helps brokers determine the vehicle most likely associated with a load, given a location of interest such as the load’s pickup or drop-off location. This serves as a tool for verifying the accuracy of the electronic logging device (ELD) ID, or other tracking ID, that a broker may have received against a load, whether offline or over other channels.

Managing Permissions Between Brokers and Fleets

First Challenge: Partner and Permission Overlap

The first challenge we encountered when onboarding a broker was partner overlap. We found that many of our own carrier partners were already in partnership with the broker. When we went to set up permissions for fleets that were common partners of both KeepTruckin and the broker, we found that the broker already had permission to track these fleets’ vehicles, perhaps using an intermediary visibility provider. This meant we could apply those same permission grants in our system during onboarding. The challenge was to find the mutual partners between KeepTruckin and a broker without disclosing either party’s entire customer base, or even the size of the customer base.

To solve this, we asked the broker to share a list of hashes of the Department of Transportation (DOT) numbers (these are numbers assigned to all registered commercial vehicles) of all their partner fleets. The list was interspersed with hashes of an unknown quantity of random numbers to eliminate any risk of leaking the size of the broker’s customer base.

Figure 2. Permission grant during onboarding for mutual customers

The hash function chosen generated a one-way hash, eliminating the possibility of recomputing DOT numbers given the hashes. We then compared this list with a similar list generated against the DOT numbers of KeepTruckin’s customer fleets. The broker was granted permission to track loads on any of the vehicles belonging to the overlapping fleets.
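A minimal sketch of the matching step, assuming SHA-256 as the one-way hash and random decoy values mixed into the broker’s list (the exact hash function and decoy scheme we used may differ):

```python
import hashlib
import secrets

def hash_dot(dot_number: str) -> str:
    """One-way hash of a DOT number (SHA-256 assumed here for illustration)."""
    return hashlib.sha256(dot_number.encode()).hexdigest()

# Broker side: hash every partner fleet's DOT number and mix in an unknown
# number of random decoys so the list length doesn't reveal the customer-base size.
broker_partner_dots = ["1234567", "7654321", "1112223"]
decoys = {hash_dot(str(secrets.randbelow(10**9))) for _ in range(10 + secrets.randbelow(50))}
broker_hashes = {hash_dot(d) for d in broker_partner_dots} | decoys

# KeepTruckin side: hash our own customer fleets' DOT numbers and intersect.
kt_fleets_by_dot = {"7654321": "fleet_42", "9998887": "fleet_77"}
mutual_fleets = [fleet for dot, fleet in kt_fleets_by_dot.items()
                 if hash_dot(dot) in broker_hashes]

# Tracking permission is granted only for the overlapping (mutual) fleets.
print(mutual_fleets)  # -> ["fleet_42"]
```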

Second Challenge: Giving Fleets Full Control of Access to Their Data

We wanted to give full control of permission management to the fleets themselves. To enable this, we asked the brokers to create their visibility applications in the KeepTruckin App Marketplace.

Figure 3. KeepTruckin App Marketplace: visibility applications

A fleet manager can grant a broker the permission to track loads on any of their fleet’s vehicles by installing the broker’s application, available in the KeepTruckin App Marketplace (Figure 3). Removing the application revokes the permission.

Minimizing OAuth 2.0 Overhead for Brokers

Our freight visibility APIs use OAuth 2.0 for authorization. Typically, a partner who develops an application in the KeepTruckin App Marketplace must store and manage various states (auth grants, access tokens, refresh tokens) for every fleet that installs their application. In the context of freight tracking, brokers wanted to minimize the amount of auth state they had to manage in order to use the freight visibility APIs, and we had to reduce it while still keeping the APIs behind OAuth 2.0 for security and consistency.

To achieve this, we create a proxy fleet for every broker when onboarding.

Figure 4. Authorization using a proxy fleet

Brokers now only need to remember and manage the auth-related states for one proxy fleet (Figure 4). Permission management (as described in the previous section) takes care of the mapping between the broker-specific proxy fleet and all the real fleets. A broker can now call the APIs using the access token generated for the proxy fleet and have access to all real fleets mapped to it.
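On the broker’s side, auth handling then reduces to a standard OAuth 2.0 token flow for a single fleet. The sketch below assumes a conventional refresh-token grant; the token endpoint and client credentials shown are placeholders.

```python
import requests

TOKEN_URL = "https://visibility.example.com/oauth/token"  # placeholder endpoint

# The only auth state the broker keeps: the proxy fleet's tokens and client credentials.
proxy_fleet_auth = {
    "client_id": "broker-app-client-id",
    "client_secret": "broker-app-client-secret",
    "refresh_token": "stored-refresh-token",
    "access_token": None,
}

def refresh_proxy_fleet_token() -> str:
    """Standard OAuth 2.0 refresh-token grant, performed once for the proxy fleet."""
    resp = requests.post(TOKEN_URL, data={
        "grant_type": "refresh_token",
        "refresh_token": proxy_fleet_auth["refresh_token"],
        "client_id": proxy_fleet_auth["client_id"],
        "client_secret": proxy_fleet_auth["client_secret"],
    }, timeout=10)
    resp.raise_for_status()
    body = resp.json()
    proxy_fleet_auth["access_token"] = body["access_token"]
    proxy_fleet_auth["refresh_token"] = body.get("refresh_token",
                                                 proxy_fleet_auth["refresh_token"])
    return proxy_fleet_auth["access_token"]

# Every visibility API call, for any mapped real fleet, uses this single access token;
# the platform resolves the proxy fleet to the real fleets the broker is permitted to track.
```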

Associating a Vehicle to a Load

The electronic logging device ID (ELD ID), or other tracking ID, that brokers receive against a load is sometimes incorrect. Part of our visibility service is a mechanism for brokers to verify that this ID is accurate. To this end, we built two versions of an association API. Using it, a broker can provide a fleet identifier and a location of interest, such as a load’s pickup location, and see which of that fleet’s vehicles is most likely to have picked up that load. If there is no potential match, they can flag the tracking ID as bad.

Proximity-Based Association (v1)

We started with the simpler version of the API, in which we compute the distances of all of a fleet’s vehicles to the given point of interest and return them ordered by proximity.

For large fleets with many vehicles, consistency during pagination becomes a concern, given that the vehicles we list may be in constant motion. If we recompute the distances when each new page is fetched, we risk missing some vehicles completely or repeating some vehicles. To maintain consistency, when the API is called, we compute and cache the distances of all vehicles from the given location for 300 seconds. All successive pages are served from the cache. The cache key is derived from the identity of the broker, the fleet, and the given location.
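A simplified sketch of this caching pattern, assuming a haversine distance and a Redis cache keyed on the broker, the fleet, and the location (the helper names and cache client are illustrative):

```python
import json
import math

import redis

r = redis.Redis()
CACHE_TTL_SECONDS = 300  # distances are frozen for 5 minutes so pagination stays consistent

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    rad = math.radians
    dlat, dlon = rad(lat2 - lat1), rad(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(rad(lat1)) * math.cos(rad(lat2)) * math.sin(dlon / 2) ** 2
    return 2 * 6371 * math.asin(math.sqrt(a))

def vehicles_by_proximity(broker_id, fleet_id, lat, lon, page, page_size, fetch_vehicle_locations):
    """Return one page of the fleet's vehicles, ordered by distance to (lat, lon)."""
    cache_key = f"proximity:{broker_id}:{fleet_id}:{lat:.5f}:{lon:.5f}"
    cached = r.get(cache_key)
    if cached is None:
        # First page: compute the full ordering once and freeze it for CACHE_TTL_SECONDS.
        vehicles = fetch_vehicle_locations(fleet_id)  # -> [(vehicle_id, lat, lon), ...]
        ordered = sorted(
            ((vid, haversine_km(lat, lon, vlat, vlon)) for vid, vlat, vlon in vehicles),
            key=lambda pair: pair[1],
        )
        r.set(cache_key, json.dumps(ordered), ex=CACHE_TTL_SECONDS)
    else:
        ordered = json.loads(cached)
    # Successive pages are served from the cached snapshot, so no vehicle is missed or repeated.
    start = (page - 1) * page_size
    return ordered[start:start + page_size]
```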

The current location information of a vehicle is read from the cache. The response of the API may also include historical locations of vehicles, depending on a broker-specific configuration. This information is stored in and read from DynamoDB.

Multiple-Criteria-Based Association (v2)

The problem with v1 is that it ignores other signals that are often more telling for associating a load with a vehicle, such as the direction in which the vehicle is moving relative to the location of interest, or a mismatch between the actual ETA and the broker-specified ETA.

In v2 we look at three different factors and assign each a weight to determine the load-to-vehicle association (Figure 5). We also return an explicit likelihood score with each vehicle, enabling brokers to make more informed decisions. A simplified sketch of the scoring follows the list of criteria below.

Figure 5. Criteria of association (v2) with weights

  • Vehicle location against the estimated time of arrival (ETA): The broker passes an ETA and a location of interest when calling this API. We take the current location of the vehicle and compute the ETA to the given location of interest. This computation is done by our Track & Trace service. A score is computed by comparing the passed ETA with the computed ETA.
  • Hours of service (HOS) cycle left: We look at the remaining HOS of the vehicle’s driver, compare it with the destination ETA, and compute a matching score.
  • Direction of the vehicle: We look at the vehicle’s N most recent locations and compare its bearing at each location with the bearing from that location to the given location of interest. The score depends on how many of those points match.
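Here is a minimal sketch of how the three scores might be combined into the likelihood returned with each vehicle. The weights below are placeholders standing in for the ones shown in Figure 5, and the per-criterion scoring functions are deliberately simplified.

```python
# Placeholder weights standing in for the actual weights shown in Figure 5.
WEIGHTS = {"eta": 0.5, "hos": 0.2, "direction": 0.3}

def eta_score(broker_eta_min: float, computed_eta_min: float) -> float:
    """Closer agreement between the broker-passed ETA and our computed ETA -> higher score."""
    diff = abs(broker_eta_min - computed_eta_min)
    return max(0.0, 1.0 - diff / max(broker_eta_min, 1.0))

def hos_score(remaining_hos_min: float, computed_eta_min: float) -> float:
    """The driver should have enough hours of service left to reach the destination."""
    return 1.0 if remaining_hos_min >= computed_eta_min else remaining_hos_min / computed_eta_min

def direction_score(points_matching: int, points_checked: int) -> float:
    """Fraction of recent points where the vehicle's bearing points toward the location of interest."""
    return points_matching / points_checked if points_checked else 0.0

def association_likelihood(eta_s: float, hos_s: float, dir_s: float) -> float:
    """Weighted likelihood returned with each vehicle in the v2 response."""
    return (WEIGHTS["eta"] * eta_s
            + WEIGHTS["hos"] * hos_s
            + WEIGHTS["direction"] * dir_s)
```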

API Rate Limiting

Because the association API is computationally expensive, we needed a mechanism to throttle incoming requests. We achieved this using Bottleneck, KeepTruckin’s in-house rate limiting service, developed in Go.
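Bottleneck itself is out of scope for this post, but the idea of throttling an expensive endpoint can be sketched with a simple per-broker token bucket. The sketch below is a generic illustration, not Bottleneck’s actual implementation.

```python
import time

class TokenBucket:
    """Generic token bucket: refills `rate` tokens per second, allows bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# One bucket per broker; when a bucket is empty the association call is rejected (e.g. HTTP 429).
buckets: dict[str, TokenBucket] = {}

def allow_request(broker_id: str) -> bool:
    bucket = buckets.setdefault(broker_id, TokenBucket(rate=5, capacity=10))
    return bucket.allow()
```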

Location Update Push Using Webhooks

A broker can opt to receive location updates for a subscribed vehicle on their configured endpoint. The option is driven via an application-level configuration.

This push use case was served using KeepTruckin’s Webhook Dispatcher Service (which consumes new messages from Kafka and writes them to external systems over partner-configured REST endpoints) and our Webhook Pusher Service, developed specifically for this use case (Figure 6).

Figure 6. Webhook services

When all configurations are correctly set up and a broker creates a new subscription to track a vehicle, the webhook flow is triggered. Depending on the starting time of the subscription, a message with relevant details is either enqueued in Amazon SQS immediately, or put in a Backburner queue with a delay such that when the subscription’s start time arrives, the message is enqueued in SQS.

The Webhook Pusher Service keeps polling SQS for new messages. As soon as it finds a new message, it queries the cache for the vehicle’s latest location, and pushes a message to Kafka with various details about the subscribed vehicle, including its current location. It also then writes a new message back in SQS after a configured delay. The Webhook Dispatcher Service consumes the message written in Kafka and sends the information to the broker over their configured REST endpoint. The Webhook Pusher Service picks up the message that was added to SQS with a delay, and the flow repeats itself, with updated location details pushed into the broker’s system at regular intervals. The loop ends when the subscription period ends, or the broker’s permission is revoked.
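A condensed sketch of one iteration of this Webhook Pusher loop, assuming boto3 for SQS and the kafka-python client for the producer; the queue URL, topic name, and payload fields are illustrative.

```python
import json
import time

import boto3
from kafka import KafkaProducer  # kafka-python client, an illustrative choice

sqs = boto3.client("sqs")
producer = KafkaProducer(value_serializer=lambda v: json.dumps(v).encode())

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/vehicle-tracking"  # illustrative
TOPIC = "vehicle-location-updates"  # illustrative topic name
PUSH_INTERVAL_SECONDS = 300         # configured delay between successive location pushes

def subscription_expired(sub: dict) -> bool:
    """The loop ends when the subscription period is over (or permission was revoked upstream)."""
    return time.time() >= sub.get("end_time_epoch", float("inf"))

def poll_and_push(get_latest_location):
    """One iteration: read a subscription message, push its latest location, re-enqueue with delay."""
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        sub = json.loads(msg["Body"])
        location = get_latest_location(sub["vehicle_id"])  # latest location, read from the cache
        # Hand the update to the Webhook Dispatcher Service via Kafka.
        producer.send(TOPIC, {"subscription": sub, "location": location})
        # Schedule the next push for this subscription after the configured delay.
        if not subscription_expired(sub):
            sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=msg["Body"],
                             DelaySeconds=PUSH_INTERVAL_SECONDS)
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```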

Activity Dashboard for Fleet Managers

We gave fleet managers visibility into who was accessing their data and how often through an activity dashboard available in the embedded app marketplace (Figure 7).

Figure 7. Activity dashboard for fleet managers (with artificial data)

The challenge here was storing information about every tracking event of every vehicle, which can grow very large, very quickly. The ease of slicing and dicing the data made a relational database a tempting solution, but the size of the table and the need for processes to clean up underutilized old metrics made us consider alternatives.

We settled on Redis for two reasons. The first was its speed of reads and writes. Speed of writes was important because we were plugging in additional code to capture metrics in existing APIs, and didn’t want to add significant latency. We also considered using a message queue and processing the events asynchronously, but we dropped this idea in favor of writing to Redis synchronously, the latter being a simpler approach with less added maintenance overhead. Our second reason for choosing Redis was the ability to set TTLs when writing to it. This meant less overhead for purging old metrics.

Our ultimate solution consisted of three sorted sets in Redis. Multiple sets were a consequence of choosing a datastore that only supported key-values and simple data structures, rather than columns with indexes on them. We identified the fields on which we wanted to let the users filter. For each, we added a new sorted set, with the score being the value on which to filter. To optimize writes to multiple sets for each tracking event, we used Redis transactions. Using this approach, a request to track 100 vehicles saw less than 20ms of added latency due to the newly inserted code for capturing metrics. For managing TTLs at the level of individual items in the set, we used an approach similar to Redis’s passive expiry approach; that is, every time a set is accessed, we remove from it all items older than N days using Redis’s ZREMRANGEBYSCORE command (this is possible only because all scores are equal to or derived from timestamps).
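A sketch of this write-and-trim pattern, with hypothetical key names for two of the filterable fields; the MULTI/EXEC transaction and the ZREMRANGEBYSCORE trimming mirror the approach described above.

```python
import time

import redis

r = redis.Redis()
RETENTION_DAYS = 30
SECONDS_PER_DAY = 86400

def record_tracking_event(fleet_id: str, broker_id: str, vehicle_id: str) -> None:
    """Write one tracking event into per-field sorted sets, scored by its timestamp."""
    now = time.time()
    member = f"{broker_id}:{vehicle_id}:{now}"
    # Write to every sorted set atomically (a MULTI/EXEC transaction under the hood).
    pipe = r.pipeline(transaction=True)
    pipe.zadd(f"activity:{fleet_id}:by_broker", {member: now})   # hypothetical key names
    pipe.zadd(f"activity:{fleet_id}:by_vehicle", {member: now})
    pipe.execute()

def read_recent_activity(fleet_id: str, field: str = "by_broker"):
    """Passive expiry: trim items older than the retention window on every read."""
    key = f"activity:{fleet_id}:{field}"
    cutoff = time.time() - RETENTION_DAYS * SECONDS_PER_DAY
    r.zremrangebyscore(key, "-inf", cutoff)  # works because all scores are timestamps
    return r.zrangebyscore(key, cutoff, "+inf", withscores=True)
```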

Metrics for Business Intelligence (BI) and Billing

One of the requirements was to let the business monitor and analyze brokers’ usage patterns and bill them based on usage.

We could not serve these analytical use cases from our relational database because we don’t store individual tracking events there. Moreover, we were reluctant to burden our primary relational database with additional internal analytical use cases. The tracking events stored in Redis that power the activity dashboard discussed above are stored only temporarily, expiring after one month. Retaining these events in the cache for a longer period would have significantly increased cache utilization, affecting both performance and billing, and making Redis unsuitable for this use case.

We, therefore, created a data pipeline where every API hit puts a message containing relevant metrics in Kafka. The message is then processed asynchronously by PySpark streaming jobs, storing the output in AWS S3. The data from S3 is then consumed by ETL jobs, which transform the data to the desired formats and move them to our data lake. Finally, we use Redash to connect to the data source and serve the relevant reports and dashboards.
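A minimal PySpark Structured Streaming sketch of the Kafka-to-S3 leg of this pipeline; the topic name, message schema, and S3 paths are illustrative.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StringType, TimestampType

spark = SparkSession.builder.appName("freight-visibility-metrics").getOrCreate()

# Illustrative schema for the per-API-hit metric messages.
schema = (StructType()
          .add("broker_id", StringType())
          .add("fleet_id", StringType())
          .add("endpoint", StringType())
          .add("requested_at", TimestampType()))

raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "kafka:9092")
       .option("subscribe", "api-usage-metrics")  # illustrative topic name
       .load())

metrics = raw.select(from_json(col("value").cast("string"), schema).alias("m")).select("m.*")

# Land the parsed metrics in S3; downstream ETL jobs pick them up from here.
query = (metrics.writeStream
         .format("parquet")
         .option("path", "s3a://example-bucket/freight-visibility/metrics/")           # illustrative
         .option("checkpointLocation", "s3a://example-bucket/freight-visibility/checkpoints/")
         .trigger(processingTime="1 minute")
         .start())
```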

To Conclude: Our Architecture Diagram

Figure 8 summarizes the design decisions discussed above in an overall architecture diagram of the Freight Visibility Platform.

Figure 8. Architecture diagram

The Freight Visibility Platform has succeeded in its mission to give brokers, shippers, and other customers real-time visibility into their loads, while meeting the requirements of all stakeholders.

Last, but Never Least: Contributions

Thanks to everyone who helped deliver this platform.

We’re excited to be a part of the Medium community! We plan to post many new stories, so be sure to follow our publication. Learn more about our engineering team here. Want to join KeepTruckin? We’re hiring!
