Solving Complex Supply Chains with Adversarial Optimisation

Published in

GAMMA — Part of BCG X

14 min read · Nov 16, 2021

Decision making in complex systems often involves search spaces of an explosive number of possibilities and outcomes. By “letting trade-offs fight to equilibrium”, insights that reveal ways to optimise these systems start to emerge.

By Kelvin Hsu, Thomas Sandeman, and Artem Vladimirov

Optimising large supply chain networks is an essential part of reducing inefficiencies and capturing missed opportunities, and at scale it can generate or salvage tremendous value across the entire system. Companies may try to cut costs by shoehorning off-the-shelf optimisers into this critical task. But these readymade products often fail to grasp the underlying network dynamics and instead treat the system as a black box to be optimised using a top-down objective. Solutions obtained this way are rarely insightful, are often inefficient, and can take longer to reach as the optimiser struggles to navigate large, complex search spaces.

For more than 20 years, BCG has been a pioneer in the field of simulation, including the development of digital supply chain twins. In this article, we share how we developed an “adversarial” approach to optimise supply chains for a steel producer by driving trade-offs inherent within the network dynamics to equilibrium.

Fighting to equilibrium

In complex systems, decision making often calls for the careful balancing of trade-offs. It requires searching through a staggering number of possibilities and outcomes, and navigating through countless intricate dynamics that determine how, subject to various uncertainties, individual components may interact over time. The process of balancing trade-offs can be particularly challenging when optimising supply chains in operations-centric industries such as steel manufacturing.

Given their complexity, finite systems cannot avoid creating trade-offs. One such trade-off might be physical, as in the need to balance steel mass conservation constraints throughout a supply chain. At other times, it might involve the need to balance priorities from competing production demands. Insights into how these trade-offs interact can be uncovered using a bottom-up approach that simulates the larger system to highlight the interplay between constituent parts. By taking an adversarial approach that involves “letting trade-offs fight to equilibrium”, these insights can reveal ways to optimise the system. Solutions obtained in this manner are often easier to interpret and faster to reach because of the way they explicitly incorporate the dynamics of the problem. This can make the difference between being able to arrive at an approximate solution or, given practical time and resource constraints, never arriving at any solution at all.

A case for optimising inventories for contract exports

Typically, our clients prioritise supplying the domestic market, exporting excess volumes only “as available”. We refer to these exports as spot-exports. These spot-exports serve as a release valve for any remaining excess capacity at various points of the supply chain. Given the uncertainty from domestic demands of each steel product and capacity levels for each production asset, these excesses can vary from week to week.

Spot-market prices and margins are typically much lower than those achievable through long-term contracts. This presents an opportunity to lock in higher-margin contract exports, then utilise inventories throughout the supply chain to buffer volatility.

The challenge is to determine the right monthly contract export volume for each available steel product and the month-by-month inventory targets across each point of the supply chain. This is made more complex when a supply chain is:

· Constrained: The supply chain network is distributed across the nation with point-of-production constraints.

· Dynamic: Events and decisions in previous months affect those in future months.

· Stochastic: There is uncertainty from volatilities in production capacities and domestic demands.

Abstracting supply chain dynamics for simulation

A simulation approach allows us to capture the stochastic nature of demand and supply. Determining the right level of abstraction is key to balancing two primary concerns that often pull in opposite directions because of the need to:

· Capture enough intricacies so that solutions can be truly useful

· Make assumptions and simplifications so that problems remain feasible

To structure the simulation, we broke the supply chain into generic components and created rules that defined how each component should interact with the others. This allowed us to construct a highly configurable supply chain across multiple echelons.

We used two core objects, production assets and demand handlers (green boxes), to describe the supply chain. The objects are characterised by physical or priority rules that govern the monthly flow of steel mass through the supply chain.

To abstract one level further, production assets need not correspond to physical machines. Instead, they are characterised by the type of output they create. This means that each asset’s available capacity is the sum of the available capacities from the part of each and every physical machine that produces its specific characterising output. This one-output-per-production-asset simplification allowed us to further map inventory points one-to-one to these abstracted production assets. We then placed these points at the output of each asset (white boxes).

Capacity uncertainties are still allocated to physical machines so that statistical independence remains across physical machines instead of across the abstracted production assets. As such, the production of each asset at each point in time depends on the combined capacity and working time available to each machine, which also factors in scheduling and planned-maintenance impacts.
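The aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the model used in the project: the `Machine` class, the fixed `shares` per output type, and the uniform availability factor are all hypothetical. The point it shows is that randomness is drawn once per physical machine and only then summed into the abstracted, one-output production assets.

```python
import random
from dataclasses import dataclass

@dataclass
class Machine:
    name: str
    nominal_capacity: float      # tonnes per month at full availability
    shares: dict[str, float]     # output type -> fraction of capacity devoted to it

def sample_availability(rng: random.Random) -> float:
    """Hypothetical per-machine availability factor in [0.8, 1.0]."""
    return rng.uniform(0.8, 1.0)

def asset_capacities(machines, rng):
    """Sum each machine's share of its sampled capacity into per-output assets."""
    capacities = {}
    for machine in machines:
        # One independent draw per physical machine, not per abstracted asset.
        available = machine.nominal_capacity * sample_availability(rng)
        for output, share in machine.shares.items():
            capacities[output] = capacities.get(output, 0.0) + available * share
    return capacities

machines = [
    Machine("mill_A", 100.0, {"coil": 0.6, "plate": 0.4}),
    Machine("mill_B", 50.0, {"coil": 1.0}),
]
caps = asset_capacities(machines, random.Random(0))
```

Because both the "coil" and "plate" assets draw on the same sampled availability of `mill_A`, their capacities are correlated, which is what allocating uncertainty at the machine level implies.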

Each of these assets is responsible for letting its upstream know how much feed it requires. Each asset must consider any feed requirements from its own downstream, along with its own demands (domestic and export) and inventory targets.
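As a rough sketch of this pull logic, the request an asset sends upstream can be written as the sum of its downstream feed requirements, its own demands, and the gap to its inventory target. The function names and the `yield_rate` parameter are assumptions made for illustration; the article does not specify how process losses were handled.

```python
def production_request(own_demand, downstream_feed, inventory_target, inventory):
    """Tonnes this asset plans to produce in the period (never negative)."""
    gap_to_target = inventory_target - inventory
    return max(0.0, own_demand + downstream_feed + gap_to_target)

def feed_request(production, yield_rate=1.0):
    """Feed to ask of the upstream asset; yield_rate < 1 models process losses."""
    return production / yield_rate

# Two-asset chain, slab -> coil. Requests propagate from downstream to upstream.
coil_production = production_request(
    own_demand=80.0, downstream_feed=0.0, inventory_target=30.0, inventory=10.0
)
coil_feed = feed_request(coil_production, yield_rate=0.95)
slab_production = production_request(
    own_demand=20.0, downstream_feed=coil_feed, inventory_target=15.0, inventory=25.0
)
```

Here the slab asset holds more stock than its target, so its surplus offsets part of the feed that the coil asset asks for.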

Demand handlers are characterised by the type and priority of products whose demand they handle, as well as by the production assets they point to as suppliers. The handlers are responsible for distributing demands as production requests to each supplier at the start of the period, and for fulfilling demands at the end of the period — with the resulting products at the right priority ordering. To capture both long-term forecast errors and short-term demand variability, we encoded uncertainty separately between annual demand projections and monthly demand variability.
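One way to encode the two levels of demand uncertainty is to draw a single forecast-error factor per playout and an independent variability factor per month. The sketch below assumes Gaussian multiplicative noise and hypothetical parameter names; the actual distributions used are not stated in the article.

```python
import random

def sample_monthly_demand(annual_forecast, annual_error_sd, monthly_sd, rng, months=12):
    """Return one playout's monthly demands for a single product."""
    # Long-term forecast error: one draw shifts the whole year.
    annual_factor = max(0.0, rng.gauss(1.0, annual_error_sd))
    base = annual_forecast * annual_factor / months
    # Short-term variability: one independent draw per month around that base.
    return [max(0.0, base * rng.gauss(1.0, monthly_sd)) for _ in range(months)]

rng = random.Random(42)
demand = sample_monthly_demand(1200.0, annual_error_sd=0.05, monthly_sd=0.10, rng=rng)
no_noise = sample_monthly_demand(1200.0, annual_error_sd=0.0, monthly_sd=0.0, rng=rng)
```

Separating the two sources lets a playout be consistently high or low for a whole year while still fluctuating from month to month.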

The result was a supply chain represented by a highly configurable and flexible directed acyclic graph. The graph was formed by composing production assets and associated inventory points throughout and ended with demand handlers at its leaves.
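A directed acyclic graph of this kind can be represented as a plain mapping from each node to its suppliers. The sketch below uses Python's standard `graphlib` module to derive an upstream-first order; the node names are hypothetical and the real network was considerably larger.

```python
from graphlib import TopologicalSorter

# Each key lists the nodes that feed it (its upstream suppliers).
# Production assets are internal nodes; demand handlers sit at the leaves.
suppliers = {
    "slab": [],
    "hot_rolled_coil": ["slab"],
    "cold_rolled_coil": ["hot_rolled_coil"],
    "domestic_coil_demand": ["hot_rolled_coil", "cold_rolled_coil"],
    "contract_export_demand": ["cold_rolled_coil"],
}

# Suppliers always appear before the nodes they feed.
upstream_first = list(TopologicalSorter(suppliers).static_order())
# Reversed, the same order lets pull signals travel from demands back to the source.
downstream_first = upstream_first[::-1]
```

`TopologicalSorter` raises `CycleError` if the configuration accidentally contains a loop, which gives a cheap validity check on a configurable network.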

With the network structure of the supply chain defined, we could then focus our attention on structuring the execution of the optimisation, as well as on the temporal stages of each simulation. In general, we find it helpful to define abstraction layers early on to help create structure and break down the overall optimisation. The layers include:

1. Optimisation: A single optimisation run is comprised of multiple iterations, with each iteration being a single simulation.

2. Simulation: A single simulation is comprised of multiple playouts under the same initial conditions, with the difference between playouts arising from stochasticity in domestic demand volumes and production capacities. (We typically use 100 playouts.)

3. Playout: A single playout is comprised of multiple time periods. The time periods can be either months or weeks. (In practice, we found months to be a more appropriate period.)

4. Time Period: A single time period is comprised of multiple stages.

5. Stages: A single stage is comprised of multiple unit steps, each of which is an instance where a single production asset or a single demand handler performs some action.
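The five layers above map naturally onto nested loops. The skeleton below is only a structural sketch: the function names, the seeding scheme, and the `step` callback are assumptions, and a real optimiser would update its decision variables between iterations.

```python
import random

N_STAGES = 5  # number of action stages within one time period

def run_period(nodes, rng, step):
    """Layer 4: one time period, split into stages (layer 5) of unit steps."""
    for stage in range(1, N_STAGES + 1):
        for node in nodes:               # one unit step per node per stage
            step(stage, node, rng)

def run_playout(nodes, n_periods, rng, step):
    """Layer 3: one playout is a sequence of time periods."""
    for _ in range(n_periods):
        run_period(nodes, rng, step)

def run_simulation(nodes, n_playouts, n_periods, seed, step):
    """Layer 2: many playouts that differ only in their random draws."""
    for playout in range(n_playouts):
        rng = random.Random(seed * 1_000_003 + playout)
        run_playout(nodes, n_periods, rng, step)

def run_optimisation(nodes, n_iterations, step, n_playouts=100, n_periods=12):
    """Layer 1: each optimisation iteration is a single simulation."""
    for iteration in range(n_iterations):
        run_simulation(nodes, n_playouts, n_periods, seed=iteration, step=step)
        # Decision variables would be updated here before the next iteration.

calls = []
run_optimisation(
    ["asset", "handler"], n_iterations=2,
    step=lambda stage, node, rng: calls.append((stage, node)),
    n_playouts=3, n_periods=4,
)
```

With 2 iterations, 3 playouts, 4 periods, 5 stages and 2 nodes, the callback fires 240 times, which makes the cost of each added layer easy to reason about.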

In the 4th layer (Time Period), we broke down the supply chain dynamics within each period into distinct stages. In doing so, we paid particular attention to how operational priorities were factored in and how the states that emerged were interpreted. This approach had the added advantage of being easy to pause so that we could stop and interpret each step at any point. The five action stages are:

Stage 1: Pull signal creation: Domestic and contract export demands ask their suppliers (assets) for production.

Stage 2: Pull signal execution: Assets execute planned productions.

Stage 3: Pull signal resolution: Demands take supplies from suppliers.

Stage 4: Push execution: Assets execute with remaining capacity.

Stage 5: Push resolution: Assets remove excess via spot exports.
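To make the five stages concrete, here is a deliberately small sketch for a single asset serving one domestic and one contract export demand. It assumes that the push stages top inventory up to its target and send anything beyond that to spot exports; that reading, and all names below, are illustrative rather than taken from the original model.

```python
def simulate_period(capacity, inventory, inventory_target, domestic, contract):
    """Run one time period for a single asset and return the resulting state."""
    # Stage 1: pull signal creation. Demands ask the asset for production.
    planned = domestic + contract
    # Stage 2: pull signal execution. The asset produces the plan, up to capacity.
    produced = min(capacity, planned)
    available = inventory + produced
    # Stage 3: pull signal resolution. Demands take supply, domestic first.
    domestic_met = min(domestic, available)
    available -= domestic_met
    contract_met = min(contract, available)
    available -= contract_met
    # Stage 4: push execution. The asset runs with its remaining capacity.
    available += capacity - produced
    # Stage 5: push resolution. Keep stock up to target, spot-export the excess.
    spot_export = max(0.0, available - inventory_target)
    return {
        "inventory": available - spot_export,
        "spot_export": spot_export,
        "missed_domestic": domestic - domestic_met,
        "missed_contract": contract - contract_met,
    }

comfortable = simulate_period(100.0, 10.0, 20.0, domestic=60.0, contract=20.0)
squeezed = simulate_period(50.0, 5.0, 20.0, domestic=60.0, contract=20.0)
```

In the comfortable month all demand is met, inventory ends at its target of 20, and 10 tonnes go to spot export. In the squeezed month there is nothing left to push, and both demands record a shortfall.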

In the 5th layer (Stages), we paid particular attention to executing unit steps in a way that encoded priorities. Some of these priorities were plain physical requirements. For example, upstream assets had to execute productions first so that downstream assets would have feed to execute theirs. At other times, these priorities encoded business requirements. For example, demand handlers for domestic demands generally should have a higher priority than those for contract exports, or producing feed for downstream products should come first before meeting demands for the midstream product in question.
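A simple way to encode such business priorities is to serve requests in priority order until supply runs out. The sketch below is an assumption about how this could look; the request names and the convention that a lower number means a higher priority are hypothetical.

```python
def allocate_by_priority(supply, requests):
    """Serve (name, priority, volume) requests in ascending priority number."""
    allocation = {}
    for name, _priority, volume in sorted(requests, key=lambda r: r[1]):
        granted = min(volume, supply)
        allocation[name] = granted
        supply -= granted
    return allocation

requests = [
    ("contract_export", 2, 30.0),
    ("downstream_feed", 0, 50.0),   # feed for downstream products comes first
    ("domestic", 1, 40.0),          # domestic demand outranks contract exports
]
allocation = allocate_by_priority(100.0, requests)
```

With 100 tonnes available, feed and domestic demand are met in full and the contract export receives only the remaining 10 tonnes.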

An adversarial approach to inventory-export optimisation

A simulation on its own can give you only a range of likely results for a given scenario. We needed to determine the optimal levels at which the steel producer’s contract exports would lock in, along with the corresponding inventory targets required to ensure that domestic demands continue to be met. The large number of products and inventory points created a huge combination of options from which to choose, which in turn required a sophisticated approach to finding the optimal configuration.

It was at this point that we used the adversarial approach that alternates between an “attack” stage and multiple “defense” stages. Each attack stage increases the levels of contract exports to lock in through spot-to-contract conversions for each product. Each defense stage increases inventory targets across the supply chain to buffer volatility in demands and capacities. Both stages are tunable in terms of how aggressive or conservative the steps are.

The two types of stages are adversarial in the sense that they work against each other: Raising commitments for contract exports presents a higher risk of not meeting demands, while raising inventory targets leaves less volume for exports. This represents a trade-off. The attack stage is an “attack” because spot-to-contract conversions are the direct source of value we are attempting to capture. The defense stages are “defensive” because, by protecting existing demand commitments, they serve to mitigate the side effects of capturing this value.

Before allowing an attack stage to execute, we repeated the defense stages until the constraints were met. Consequently, each attack stage was followed by multiple iterations of defense stages. In essence, we struck at opportunities as they came, but only if they did not jeopardise our bottom line.
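The alternation can be illustrated with a toy, single-product model. Everything here is a sketch under stated assumptions: one inventory point, stock that starts each playout at its target, deterministic hand-written scenarios instead of sampled playouts, and hypothetical names throughout. The toy attacks at the median of spot exports so that the defense stage is actually exercised.

```python
def quantile(values, q):
    """Empirical quantile by floor index; q=0 is the minimum, q=1 the maximum."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]

def simulate(scenarios, contract, target):
    """Play out every scenario; collect per-period spot exports and missed demands."""
    spot, missed = [], []
    for scenario in scenarios:
        inventory = target                         # assumption: stock starts at target
        for capacity, domestic in scenario:
            available = inventory + capacity
            served = min(domestic + contract, available)
            missed.append(domestic + contract - served)
            available -= served
            excess = max(0.0, available - target)  # stock above target is spot-exported
            spot.append(excess)
            inventory = available - excess
    return spot, missed

def optimise(scenarios, attack_q=0.5, defend_q=1.0, learning_rate=0.5,
             tol=1e-3, max_rounds=200):
    contract, target = 0.0, 0.0
    for _ in range(max_rounds):
        spot, missed = simulate(scenarios, contract, target)
        if max(missed) > tol:
            target += quantile(missed, defend_q)   # defense: raise inventory target
            continue                               # keep defending until constraints hold
        step = learning_rate * quantile(spot, attack_q)
        if step <= tol:
            break                                  # nothing left to convert: equilibrium
        contract += step                           # attack: convert spot to contract
    return contract, target

# Each scenario is a list of (capacity, domestic demand) pairs, one per period.
scenarios = [
    [(100.0, 70.0), (90.0, 80.0), (100.0, 60.0), (100.0, 80.0)],
    [(95.0, 75.0), (100.0, 70.0), (85.0, 80.0), (100.0, 75.0)],
]
contract, target = optimise(scenarios)
```

On this toy data the loop settles with a contract commitment just under 20 and an inventory target just under 15, with no missed demand in any period. If `max_rounds` is exhausted first, the returned values may still violate the constraints, so a production version would need to check for that.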

What resulted was an adversarial optimisation procedure that rapidly converged to a stable solution, with each optimisation stage or step being highly interpretable in terms of its purpose. By identifying the trade-offs involved in directly driving the source of value, we were able to focus the optimisation on these trade-offs directly and avoid the need to estimate value in dollar margins.

The choice of step size is important. As contract exports can come only from executing spot-to-contract conversions, a reasonable choice for the magnitude of increase in each “attack” step should depend on the amount of spot exports currently being done. Specifically, we make the step depend on a quantile of the spot-export distribution. The chosen percentile is a hyperparameter, and the step is scaled by a learning rate of less than one so that convergence is smoother and free of oscillations. Since this move serves to increase the objective, there is no need to take aggressive steps. In practice, we choose the 0th percentile, which is simply the minimum, making these the most conservative steps in spot-to-contract conversions.
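A minimal sketch of such an attack step is shown below. The floor-index quantile, the function names, and the sample values are assumptions chosen for illustration.

```python
def quantile(values, q):
    """Empirical quantile by floor index; q=0 is the minimum, q=1 the maximum."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]

def attack_step(spot_exports, q=0.0, learning_rate=0.5):
    """Increase in contract commitment for one attack stage."""
    return learning_rate * quantile(spot_exports, q)

# Spot exports observed for one product across playouts (illustrative values).
spot_exports = [12.0, 30.0, 8.0, 21.0, 15.0]
conservative = attack_step(spot_exports)        # based on the minimum, 8.0
bolder = attack_step(spot_exports, q=0.5)       # based on the median, 15.0
```

With `q=0` the step can never exceed the spot volume that was available in every observed case, which is why it is the most conservative setting.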

Increasing export commitments, however, can raise the risk of unmet demand for existing commitments. We know that having higher inventory levels would provide better buffers to meet demand in cases of above-average domestic demands and below-average production capacities. As such, increasing export commitments can serve as a driving force for increasing inventory targets, while also hinting that the magnitude of the increase should depend on the amount of currently experienced missed demands. Again, we make the increase in inventory targets dependent on a quantile of the missed-demands distribution, with the corresponding percentile expressed as a hyperparameter. Since this move serves to meet constraints that cannot be compromised, we want to do this aggressively. In practice, we choose the 100th percentile, which is the maximum, making these the most aggressive steps to remove unmet demands.
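The defense step mirrors the attack step but reads from the missed-demand distribution. The sketch below uses hypothetical names and values, and it also shows why an aggressive quantile matters when most playouts miss nothing.

```python
def defense_step(missed_demands, q=1.0):
    """Increase in an inventory target for one defense stage (floor-index quantile)."""
    ordered = sorted(missed_demands)
    return ordered[int(q * (len(ordered) - 1))]

# Missed demand at one inventory point across playouts (illustrative values).
missed = [0.0, 0.0, 3.5, 0.0, 1.2]

inventory_target = 40.0
inventory_target += defense_step(missed)     # raised by the worst shortfall, 3.5

median_step = defense_step(missed, q=0.5)    # 0.0: the median playout missed nothing
```

Because shortfalls tend to be rare, a median-based step would leave the target unchanged here, whereas the maximum reacts to the worst case immediately.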

After the two opposing stages interact for a period of time, they eventually reach a state of equilibrium, expressed as a schedule of monthly inventory targets and export commitments. This is a state where as much contract export volume as possible has been squeezed from spot exports for each product, without jeopardising existing demands and commitments.

In fact, because we encode the priorities to ensure that downstream products receive higher priorities during the simulation, we are able to capture another type of opportunity in addition to spot-to-contract conversion: Product upgrades. We were able to move more products further downstream in the supply chain than previously possible by supplementing feed bottlenecks with additional inventories. Steel products produced further downstream also fetch higher premiums.

Lessons in effective abstractions and heuristics

Effective abstractions enable useful representations of dynamical systems for simulation and scenario generation. Heuristics then exploit known system characteristics and dynamics to quickly reach sensible solutions. Together, by forming an in-depth, bottom-up understanding of system properties, abstractions and heuristics make solving a large and complex problem both feasible and practical.

For physical systems such as a wide network of supply chains, there are often highly interpretable heuristics that can be leveraged to push the search in the right direction and promptly arrive at a solution. Additional effort may be needed at first to understand the operational dynamics and driving sources of value in the system. However, by uncovering these heuristics, this early investment pays off in the practicality and interpretability of the resulting solution.

We learned numerous lessons from this experience:

1. When the system dynamics are complex, it is key to balance the level of granularity and detail for the simulation. On one hand, it can be useful to identify the natural, foundational “atoms” of the complex system and view all subsequent dynamics as derivations from a minimal set of logical rules that describe how these atoms interact with each other. On the other hand, modeling at this level can be prohibitively expensive or impractical. It is often worthwhile to formulate the right level of abstraction for the problem, rather than treating the system as a black box into which an existing technique must be shoehorned.

In our case, we were able to simplify the problem space and make it feasible by first identifying that tracking the type and volume of mass flowing through the system each month was enough to provide a practical recommendation for inventory scheduling.

2. If a decision can be made “without regret”, the approach can be simplified by incorporating the decision directly within the simulation. We can often make these kinds of decisions with the help of a few mild assumptions. If there are no complicated trade-offs involved in a decision, it may not be necessary to include them as part of the optimisation. We can encode some of the necessary conditions for the optimal solution in the way we choose to simulate the system.

In our case, these conditions were encoded via the ordering of our five action stages. For example, we assumed that, all things being equal, downstream products fetch higher margins than upstream products. Under this assumption, the system should have been pushing steel as far downstream through the supply chain as physically possible, but only after existing demands were met. Importantly, this was one of those decisions we were able to make “with no regrets” in that, once existing demands were met, there was no other reason (except physical feed constraints) to prevent us from making this decision. This is unlike performing spot-to-contract conversions or increasing inventory buffers, where we know there are secondary consequences that can be calculated only with an optimisation procedure. Therefore, as long as we accept the assumption, it does not matter how much higher the margin is: We will always be better off making this decision.

3. When the search space is large, top-down specifications of a final objective can be inefficient or difficult to solve for. While each individual component and step of the simulated dynamic supply chain is relatively simple, the combined system can be quite complex. If we were to treat the system as a black box and apply an off-the-shelf generalised optimisation algorithm, it would very likely take many iterations to converge and be difficult to debug or interpret. This is because the algorithm would have limited problem-specific guidance for searching over the solution space.

In our case, if we were to specify a top-down objective over which to optimise, such as the expected profit margin, we would not have been able to easily obtain gradients of the objective with respect to the decision variables. These variables included inventory targets across the supply chain and the contract export volume to commit for each product. Importantly, forming the financial objective itself requires another layer of estimation regarding the margin from both domestic and contract export sales. Furthermore, it requires that stochasticity caused by other external factors be captured throughout the year.

4. Thinking about effective abstractions and heuristics when facing a new problem can reveal useful properties to reason about the system. This not only helps produce results efficiently but can produce useful insights from the way the solution is reached.

In our case, both the attack and defense stages corresponded respectively to what a decision maker would do directly to pursue higher margins and separately protect existing commitments. We can let the optimisation determine the exact trade-off between the two stages. We make the important directional decisions and let the system balance itself out. Furthermore, because the drivers of each stage and step of the optimisation are clear, it becomes an intuitive process to tune exactly how aggressive or conservative they should be, or to spot where any imbalances, bottlenecks, or even opportunities may lie within the system.

5. Coming up with tailored optimisation strategies can be effective, insightful and — last but not least — fun! Simulating a system from the ground up raises all sorts of curious questions about the core properties and scenarios one must capture. Optimising it further requires iterating (with human discussions) on the properties of a good solution before boiling it down to smaller steps (the computer iterations) to get there.

The problem-formulating process is as dynamic as the problem itself and brings great satisfaction and joy when it eventually leads to an effective and insightful solution.

Balancing abstraction levels as a critical problem-solving trade-off

In this article, we have shared how we formulated an adversarial approach to balance difficult trade-offs. Once the approach is formulated, algorithms can figure out what the correct balance should be. However, there remains another critical trade-off that algorithms cannot necessarily help us with, but that is perhaps even more pressing to balance correctly: Finding the right abstraction level across a wide spectrum ranging from formulating a generic abstract framework to developing a domain-specific solution.

Decision making in complex systems often involves search spaces of an explosive number of possibilities and outcomes. At the top-down extreme, we can invest effort in constructing a value objective to be optimised under a general-purpose optimisation algorithm. These solutions may benefit from the simplicity and clarity in what the optimal solution should look like, but may lack the in-depth knowledge of the problem at hand needed to solve it efficiently. At the bottom-up extreme, we can choose to use highly granular and detailed agent-based or even physics-based simulations to model the problem. These solutions may benefit from a forward model that most closely replicates reality, but may also require so much detail that the truly important aspects of the problem are obfuscated.

Good solutions require a careful balance within this spectrum. Throughout this effort to formulate a domain-specific solution for a large supply chain network, we worked with domain experts in the steel manufacturing industry to understand where the major levers are, what the important drivers are, how critical factors influence the system, and which level of detail the solution should capture. It is this collaboration between industry and functional experts that is critical in determining the abstraction level at which the problem can best be solved.