Down to the last mile: Revolutionizing route optimization with NVIDIA cuOpt

Published in Slalom Data & AI · May 2, 2024 · 10 min read

Photo by Maarten van den Heuvel on Unsplash

By Teddy Crane, Behrooz Koohmareh Hosseini, Prekshya Singh, and Ryan Sitter

Thousands of thought pieces have been written over the past decade on the need for real-time agility and digitization in supply chain management. However, many players across industries have not yet moved beyond traditional supply chain planning and execution methods because the journey from AI conception to production is complex and requires both technical proficiency and strategic vision.

Slalom is dedicated to unlocking the transformative potential of AI for our customers, serving as a driving force behind substantial value generation within their businesses. Our series of Innovation Days, held across major US cities, provides an exclusive platform for interaction with our leading technology innovators.

A prime example of our technological advancements is our collaboration with Kawasaki Heavy Industries, where we leveraged the NVIDIA cuOpt route optimization engine to revolutionize Kawasaki’s track maintenance and inspection processes, as illustrated in this blog.

One area that is especially ripe for disruption is last-mile routing and scheduling, where goods are transported from fulfillment centers to the customers’ doorstep, dock, or worksite. This final and relatively short leg of the end-to-end supply chain is critical to profitability, often accounting for 40%–50% of total logistics cost. As delivery volume increases during high growth periods or seasonal peaks, for example, inefficient last-mile operations can result in margin erosion of up to 20%–25%.

The human quest for optimization

In essence, optimization problems have existed for as long as human beings have been engaged in goal-directed behavior and decision-making. These problems involve finding the best solution from a set of feasible options given a set of constraints. However, it can be challenging to determine which solution will work best with the unquantifiable human factors surrounding a given problem. To illustrate this, let’s look at some examples from history.

In ancient Greece (around 300 BC), Euclid, now considered the father of geometry, proved what amounts to an early optimization result: among all rectangles with a given perimeter, the square encloses the greatest area. Centuries earlier, according to legend, Princess Dido was granted as much land as she could encircle with the hide of an ox; brilliantly, she cut the hide into thin strips and used them to enclose the largest possible area, the territory that became ancient Carthage. In graph theory, Guan’s route problem, famously known as the Chinese postman problem, asks for the shortest closed path that traverses every edge of a connected graph at least once. While optimization has become a science, the pursuit of it is inseparable from the inherent human desire to achieve the best possible outcomes.

World supply chains can be revolutionized by real-time route and schedule optimization (photo by Denys Nevozhai on Unsplash)

In today’s world of technology and AI, we have access to more data than ever. But how do we make sense of all this information and use it effectively in our processes?

With the advent of AI-driven optimization, organizations across industries are seeking to quantify and improve efficiency and productivity, time and cost savings, real-time decision-making, and environmental impact, among other outcomes. These metrics serve as indicators of a company’s AI adoption and business optimization, and they point to what AI’s role in the company’s functions could be in the future. For example, if you work in the utilities sector, you can use the technology to shorten the time it takes to provide services to your customers. Similarly, in transportation, optimization can free specialists to focus on high-value business cases. In healthcare, optimization can lead to higher customer satisfaction rates. Ultimately, optimization is about creating more value in your industry.

Kawasaki: Enhancing rail repair with intelligent track maintenance support

In 2021, Kawasaki set out to create a digital product offering for the US railroad market. Slalom partnered with Kawasaki to create a digital platform, leveraging NVIDIA cuOpt, to collect and analyze rail data. The platform provides optimized maintenance routes, making it easier and less expensive for US rail companies to keep their tracks safe.

In the initial requirements-gathering phase, the team realized that the on-train devices being built to collect rail defect data would likely generate far more data than the manual scheduling processes prevalent in the rail industry today could handle. Because routing and scheduling problems across the industry are highly constrained, requiring multiple human-in-the-loop validations to ensure that rail lines are safely shut down without impacting freight and passenger traffic, a fast and accurate route optimization solver was essential. With this in mind, Slalom and Kawasaki turned to NVIDIA technology, using cuOpt to create an application that optimizes routes and schedules for rail repair teams.

The key requirements and constraints in the optimization problem included speed, solution accuracy, the ability to “lock” a task in a specific time slot, and the ability to rapidly regenerate schedules. Since cuOpt met all these requirements, with the ability to return accurate fleet routing solutions in under a minute, Kawasaki incorporated it into its product offering, Maintenance Advisor.

With cuOpt, rail maintenance managers reduced the multi-hour process of scheduling repair work for the week down to a single task that takes 10 minutes or less. This also means that maintenance managers can spend less time on paperwork and more time ensuring that rail lines are safe for freight and passenger use.

Additionally, since cuOpt reaches a solution so quickly, maintenance managers can account for external factors, such as scheduling conflicts with track usage by trains, staff absences or sickness, major weather events, and infrastructure events and closures. This means that, thanks to cuOpt, more rail defects can be repaired, and rail lines are safer for their end users.

Revolutionizing the last-mile problem with NVIDIA cuOpt

NVIDIA cuOpt is a record-breaking GPU-accelerated optimization engine for solving vehicle routing problem (VRP) variations such as fleet optimization, last-mile delivery, and field dispatch. The speed and efficiency of cuOpt can be attributed to two main features: (1) a heuristic approach to solving optimization problems, and (2) the parallel computation and memory hierarchy of NVIDIA GPUs, which accelerate the search. NVIDIA cuOpt supports NVIDIA A100 Tensor Core and newer GPUs.

One of the main challenges of running optimization problems like the VRP is “time to solution”: the computational runtime needed for each optimization run. As the number of variables and locations in the VRP increases, finding the optimal solution becomes dramatically more challenging.
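To make that scaling concrete, consider the simplest possible case: a single vehicle leaving a fixed depot. The number of candidate visit orders grows factorially with the number of stops, before any fleet sizing, time windows, or capacity constraints are even introduced. A quick back-of-the-envelope calculation shows the explosion:

```python
import math

# A single vehicle leaving a fixed depot can visit n stops in n! different
# orders, so exhaustive enumeration becomes hopeless almost immediately.
for n in (5, 10, 15, 20):
    print(f"{n:>2} stops -> {math.factorial(n):,} possible visit orders")
```

Twenty stops already admit more than two quintillion orderings, which is why practical solvers search intelligently rather than exhaustively.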

Traditionally, linear programming or mixed-integer programming (MIP) has been used to solve these problems, returning valid solutions based on the best approximation found within the available time (often a local optimum rather than the global one). However, the VRP and most operational optimization problems require frequent re-optimization (e.g., every few hours or daily), which becomes impractical if each run takes several hours to produce the best routes.

Combining heuristic approaches with GPU-accelerated computing is therefore advantageous for near-real-time solving and quick runtimes. NVIDIA cuOpt has been shown to solve a problem with 10,000 locations in about 30 seconds on a single NVIDIA A100 GPU. Slalom and Kawasaki’s use case, with roughly 100 locations and many constraints, also ran on a single NVIDIA A100 in about 30 seconds. This speed also allows for dynamic rerouting to adapt to unexpected changes in scheduling requirements.
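To give a feel for what “heuristic” means in this context, below is a deliberately simple, CPU-only construction heuristic (nearest neighbor) for a single vehicle. It is a sketch for intuition only and is not cuOpt’s algorithm; cuOpt applies far more sophisticated, GPU-parallel search, but the underlying trade-off is the same: return a good route quickly rather than a provably optimal one slowly.

```python
import numpy as np

def nearest_neighbor_route(cost_matrix: np.ndarray, depot: int = 0) -> list[int]:
    """Greedy construction heuristic: always visit the closest unvisited stop.

    Illustration only; not cuOpt's algorithm. A heuristic like this returns a
    good (not provably optimal) route almost instantly, which is what makes
    frequent re-optimization practical.
    """
    n = cost_matrix.shape[0]
    unvisited = set(range(n)) - {depot}
    route, current = [depot], depot
    while unvisited:
        nxt = min(unvisited, key=lambda j: cost_matrix[current, j])
        route.append(nxt)
        unvisited.remove(nxt)
        current = nxt
    route.append(depot)  # return to the depot
    return route

# Tiny example with six random locations (illustrative data only)
rng = np.random.default_rng(42)
points = rng.random((6, 2))
costs = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
print(nearest_neighbor_route(costs))
```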

How it works: Leveraging cuOpt to solve VRP

A typical production-tier solution for routing problems involves dedicating a large amount of computational capacity and waiting for the results. We are disrupting traditional design patterns for scheduling and optimization solutions by using cuOpt, which generates results in a fraction of the time taken by other leading VRP solvers. This increase in schedule generation speed comes with little impact on existing business processes, allowing users to generate more schedules and increase the dynamic impact of those schedules.

Adopting cuOpt also enables businesses to evaluate their current processes, which could potentially help identify other inefficiencies for optimization. In a field where computation speed has traditionally been the primary bottleneck in scheduling, cuOpt removes this bottleneck, allowing businesses to re-optimize other portions of their workflows. It not only minimizes the impact on existing processes but also generates results that are within 5% of the best-known solution.

Below is an example of a solution architecture for solving the VRP with NVIDIA technologies. The architecture is platform agnostic and can be applied on any platform from which NVIDIA cuOpt can be called (e.g., AWS, Azure, or Databricks).

Sample end-to-end cloud-agnostic architecture for leveraging NVIDIA cuOpt at enterprise scale

1. Data layer

The historical/source data (e.g., stops, route coordinates, latitude/longitude, or shift schedules) can come from a database (e.g., Azure SQL, Amazon Redshift, Snowflake, Databricks’ lakehouse, Microsoft Fabric’s OneLake, or PostgreSQL) or arrive in a streaming fashion. In either case, the data is collected and fed into cuOpt to solve the optimization problem.
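As a minimal sketch of the batch path, the snippet below pulls the day’s stops from a hypothetical PostgreSQL table; the connection string, table, and column names are illustrative rather than taken from any specific solution. A streaming path would deliver the same fields through a message queue or event hub instead.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection string, table, and columns; adapt to your own schema.
engine = create_engine("postgresql+psycopg2://user:password@db-host:5432/logistics")

stops = pd.read_sql(
    """
    SELECT stop_id, latitude, longitude, service_minutes, window_start, window_end
    FROM last_mile_stops
    WHERE delivery_date = CURRENT_DATE
    """,
    engine,
)
print(stops.head())
```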

2. Computation and optimization layer

Once the matrices of distance and travel time between stops are built, the optimization can be run by calling NVIDIA cuOpt.
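As an illustration, the sketch below builds a straight-line (great-circle) distance matrix from the stop coordinates loaded in the data layer. In production, distance and travel-time matrices would more often come from a road-network routing service, since drive time rarely tracks straight-line distance.

```python
import numpy as np

def haversine_matrix(lat: np.ndarray, lon: np.ndarray) -> np.ndarray:
    """Pairwise great-circle distances (km) between all stops."""
    R = 6371.0  # mean Earth radius in kilometers
    phi = np.radians(lat)[:, None]
    lam = np.radians(lon)[:, None]
    a = (
        np.sin((phi - phi.T) / 2) ** 2
        + np.cos(phi) * np.cos(phi.T) * np.sin((lam - lam.T) / 2) ** 2
    )
    return 2 * R * np.arcsin(np.sqrt(a))

# Using the hypothetical `stops` DataFrame from the data-layer sketch:
# cost_matrix = haversine_matrix(stops["latitude"].to_numpy(),
#                                stops["longitude"].to_numpy())
```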

When using NVIDIA cuOpt, users have multiple deployment options based on their requirements. Currently, there are two deployment methods: an NVIDIA-hosted CUDA-X microservice and a self-hosted containerized service. Because both methods share the same API specification and the same underlying GPU acceleration technology, the choice between managed and self-managed becomes a question of the best technology fit for each user rather than a trade-off forced by a specific hosting option. This allows for easy adoption of cuOpt regardless of cloud maturity, cloud provider choice, or cloud presence versus on-premises deployments (cuOpt can also run on bare metal, provided the right hardware is present). Once the deployment is in place, input data can be ingested into the API built around the cuOpt solution, arriving as a stream and/or as regular batch loads (for instance, daily) to drive dynamic route or schedule optimization.
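As an illustration of the self-hosted path, the sketch below posts a small routing problem to a locally deployed cuOpt service. The endpoint path and payload keys are assumptions modeled on the cuOpt routing microservice and can differ between versions, so treat this as the general shape of a request and confirm the exact schema against the cuOpt API reference for your deployment.

```python
import requests

# Assumed local deployment; verify the endpoint and payload schema against
# the cuOpt API reference for your version.
CUOPT_URL = "http://localhost:5000/cuopt/routes"

# Tiny illustrative cost matrix (location 0 is the depot); in the pipeline
# described above, this would be the distance or travel-time matrix built
# in the computation layer.
cost_matrix = [
    [0.0, 2.0, 4.0, 3.0],
    [2.0, 0.0, 1.5, 2.5],
    [4.0, 1.5, 0.0, 1.0],
    [3.0, 2.5, 1.0, 0.0],
]

payload = {
    "cost_matrix_data": {"data": {"0": cost_matrix}},
    "task_data": {"task_locations": [1, 2, 3]},
    "fleet_data": {"vehicle_locations": [[0, 0], [0, 0]]},  # two vehicles at the depot
    "solver_config": {"time_limit": 10},  # seconds the solver may spend searching
}

response = requests.post(CUOPT_URL, json=payload, timeout=60)
response.raise_for_status()
solution = response.json()
print(solution)
```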

The solver can pursue multiple optimization objectives, including minimum route distance, minimum travel time, and minimum variability between vehicle routes. In the case of the VRP, the solver output (step 3.1) is fed into a web back end (step 3.2). Scheduling then proceeds in an iterative loop (step 3.3), allowing for human feedback and the application of “overrides” to the optimal schedule (typically driven by environmental or business factors).

3. BI and visualization

If the output fed into the back end is approved by a human in the loop (step 3.3), the results can be shown in tabular or visual form in a front-end web application or a visualization tool (such as Power BI or Streamlit). If the human in the loop does not approve the schedule or route in step 3.3, the problem is solved again in step 3 by calling the API built on the historical data, and the loop repeats until the schedule is approved.
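A lightweight review screen for this approval step might look like the Streamlit sketch below. The shape of the solver response parsed here (the vehicle_data and route keys) is assumed for illustration and would need to match what your deployed solver actually returns.

```python
import pandas as pd
import streamlit as st

# Example solver response with an assumed shape; in practice this would be
# the JSON returned by the optimization step.
solution = {
    "vehicle_data": {
        "truck_1": {"route": [0, 3, 5, 0]},
        "truck_2": {"route": [0, 2, 1, 4, 0]},
    }
}

routes = pd.DataFrame(
    [
        {"vehicle": vehicle, "stop_order": order, "location": location}
        for vehicle, data in solution["vehicle_data"].items()
        for order, location in enumerate(data["route"])
    ]
)

st.title("Proposed routes for review")
st.dataframe(routes)

if st.button("Approve schedule"):
    st.success("Schedule approved; publish to the field teams.")
else:
    st.info("Adjust constraints or apply overrides, then re-run the solver (step 3.3).")
```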

cuOpt’s strengths include flexibility and speed. Although cuOpt is currently highly specialized for solving the VRP and other traveling-salesman-style problems, NVIDIA recently announced its GPU-accelerated linear programming solver at GTC, a global AI conference, which extends cuOpt’s applicability to other NP-hard problems such as bin packing and non-routing scheduling. Over the next decade, we are likely to see a paradigm shift from data center compute that is primarily CPU driven to compute that is primarily GPU driven. The adoption of tools like cuOpt will be crucial for industries as this shift continues and GPU acceleration becomes the norm.

Conclusion

Integrating AI and optimization into the workplace can have numerous benefits, such as new revenue generation, cost savings, faster time to decision, and efficiency gains. However, it is essential to consider the challenges that come with these rewards. As with any collaboration, the output of AI is highly dependent on the input, so it is crucial to invest sufficient time and effort to reap the maximum benefits of AI in the workplace.

  • Develop a cross-functional, phased approach when considering how new technology can be integrated with your current tech stack.
  • Consider the end user, set measurable goals and data-driven key performance indicators (KPIs), and create an action plan to help determine who the right stakeholders and business partners are for your organization.
  • Ensure that your choice of technology and model addresses actual needs, maintains high data quality, monitors bias, and promotes transparency and accountability while remaining compatible with your current architecture.

Adopting a product mindset of continually iterating, refining, and shipping, along with the use of GPUs and a heuristic computational approach, can greatly improve risk mitigation and accuracy. This approach enables continuous, near-real-time optimization, making it an effective strategy for achieving better outcomes going forward.

By leveraging NVIDIA accelerated computing platforms, we can solve complex problems at scale. Our approach is not limited to isolated use cases; we strive to enhance and optimize the entire ecosystem of products and services, creating a synergy that propels businesses toward true disruption. Our aim is to help you reimagine how AI can integrate into your work stream, extending beyond a proof of concept to become an integral part of your suite of products.

Slalom is a next-generation professional services company creating value at the intersection of business, technology, and humanity. Learn more about how Slalom can help you succeed with NVIDIA and read our AI thought guide.
