How to Quickly Optimize a Non-Linear Optimization Problem by Exploiting Local Properties

Published in

GAMMA — Part of BCG X

9 min readSep 12, 2022

Algorithmically Set Prices and Adjust Hundreds of Thousands of Individual SKUs to Meet Pricing Goals

In 2021, a large specialty retailer sought out BCG to help it improve its pricing methodology. With thousands of stores across the U.S. and hundreds of thousands of individual Stock Keeping Units (SKUs) — and operating in an increasingly volatile market — the company needed a more agile, scalable way to determine optimal pricing than its current, largely manual system. In response, and taking global constraints into account, we developed an algorithm to find the optimal price for each SKU. Pricing can have a dramatic impact on profit and customer perception, as is evidenced by the wealth of literature on pricing strategies.

During the first phase of our engagement with the client, our team researched the history of the company’s pricing. Using AI and machine learning, we isolated the impact on demand of pricing increases and decreases, and computed price elasticities based on such factors as seasonality and product type. With this information in hand, we then moved into a second phase, during which we built a pricing tool that would give the company a more data-driven way to set prices. As we will explain below, the pricing tool uses machine learning — and human intuition — to calculate how much the company should increase, decrease, or leave untouched individual SKU prices to reach any number of strategic pricing goals.

For the AI system we had in mind to work we had to be aware of its potential pitfalls — and design the system in such a way that it took into consideration what humans, such as pricing analysts, might know. For example, a pricing analyst will manage the overall price perception at a category level because he or she is aware of the long-term effect of perception in retail. But with a few years of data, and since the company has carefully managed its perception, we were not hampered by the lack of specific historical data we would otherwise need to train the model on price perception. We also considered the fact that the pricing analyst also knows the importance of avoiding price wars — and how price gouging can anger customers and potentially damage the brand.

Solving the Problem for all SKUs

Our approach to building the right solution, therefore, was to leverage price elasticities at a tactical level to find the best price for each SKU, while maximizing overall revenue or profit. We did this while still meeting a global strategic constraint represented by a pricing goal the client management team had set at a category level. We took into consideration that each SKU has a minimum price (often manufacturer imposed) and maximum price (also called insult price).

For the pricing tool to be the most useful to management, it had to be able to very quickly compute optimal SKU prices and their impact on overall profitability. In this way, management could use the tool to construct various scenarios: If the company decided to raise average prices by 5%, these would be the best SKU prices; if the goal is to raise average prices by 10%, these prices would be best.

To summarize:

Choosing the Best Approach

As we initially studied the client’s problem, we considered using publicly available solvers. The problem, however, was that the function was nonlinear — which immediately excluded a mixed-integer programming solver. Off-the-shelf gradient descent is not adapted to our constraints. And finally public nonlinear constrained optimization solvers have no guarantee of finding the optimal solution and, like black-box optimizers, have an excessively long run time.

We also considered a simple heuristic, optimizing each SKU independently and applying a scaling factor to meet the strategic constraint. But the heuristic actually yielded a negative impact on profit (since the profit curve is flatter around the optimal point, going all the way to the optimal point leads to price increases that, compared to the impact on the average weighted price, do not add significantly in terms of profit). We also brainstormed other heuristics (such as increasing SKUs to their independent optimal prices in order of inelasticity), but they suffered from similar problems, and could require more iterations to meet the global constraint. When optimizing prices that have a decent baseline, precision matters.

Our goal was to find a solution that would be as accurate as possible and could still be computed fast enough to enable real-time tests of various pricing scenarios. The solution we have developed exploits the property of our optimization problem and quickly finds the optimal solution for all values of the strategic constraint parameter. The strategy could be summarized by transforming the problem such that greedy solutions (which are usually easy to compute) are optimal. In the final analysis, the AI/ML-based pricing tool met all the client’s criteria.

Before we go in more technical details of our solution, let’s formulate the full problem more formally to set the technical context:

Objective: Max ∑_i profit_i(ie sum the profit of all SKUs)Decision variable: Pi (price for SKU i)Constraints: 
(1) ∑_i(Wi * Pi) /∑_i(Wi) = Constant 
(2) Pi >= mi, Pi <= Mi, for all i (min and max prices)
(3) The quantity follows a negative exponential demand.
Quantity new price = Quantity base price * exp(elasticity * ((new_price/old_price)-1), with negative elasticity convention.
(4) profit i  = Qi * (Pi – Ci), (i.e., profit is simply quantity * (revenue – cost))Parameters:
Wi (weight of SKU i)
mi (minimum price for sku i)
Mi (maximum price for sku i)
Elasticity i: the price elasticity of sku i
Quantity base price, the baseline quantity of each sku
Baseline base price, the baseline base price of sku
Ci: cost of sku i

The Unlock

We can precompute the solutions for any feasible constant (in constraint 1) by leveraging the fact that the exponential function derivative is strictly monotonic (within some bounds), with the computational complexity of a simple sort.

As a reminder, any SKU has a profit curve that looks like something like this:

Let’s set an example to build up the intuition, and then go through the full optimal heuristic:

Without loss of generality, in our example we optimize 3 SKUs, with elasticities -0.6, -0.7, and -1.1, base prices of $1.00, base quantities of 1, and costs of 0. We also assume that we will optimize in the range of price of [0.5; 1.8] due to business constraints on the prices.

For context, we can plot the three profit curves along price changes. (Notice that with a price of $1.00, we do not change any prices and, therefore, we have the baseline quantity of 1 for all SKUs):

To optimize, let’s start the prices of all SKUs at their minimums (i.e., 0.5). If the average weighted price needs to increase by epsilon, we also need to determine which SKU we would increase the price for. Clearly, it would be the SKU that can give us the highest profit increase — the one with the highest derivative at the current price point.

Let’s plot the derivative of the profit curve:

We notice that at the minimum price, SKU “a” has the highest derivative, meaning it gives the highest increase in profit for an epsilon increase in price (which makes sense since this SKU is the least elastic). We also notice that after the price increase, SKU “a” ’s derivative will be lower. If we repeat this process to reach any feasible average weight, the SKUs will take turns getting a price increase, and we will arrive at an optimal path of price increase:

At first, it is best to increase the price of SKU “a.” But this has the impact of decreasing the derivative along the profit curve, making it more optimal to start increasing SKU “b.” At this point, we begin alternately increasing different SKUs based on the derivative value. By allowing ourselves to make price changes for one SKU then another, we can build a combined numerical derivative plot that shows which SKU is getting a price increase in which order. This, in turn, leads to the optimal path of price increase:

Notice on the graph that the SKUs take turns getting price increases. This is what makes the problem difficult to solve. If there were simply three colored lines, one after the other, then we could solve this problem using a simpler algorithm — and still get an optimal solution.

For each point on this optimal path, we can compute a corresponding profit and average weighed price. And for a given average weighted price, we can find the optimal price path that gets us there.

The Algorithm:

A step-by-step code walkthrough is available here:

Based on this approach, we can therefore use the following algorithm, which is optimal within the level of the numerical precision desired:

We break each profit curve in mini-steps to the desired level of precision for the solution.
We compute the associated numerical derivative of each mini-step of each curve.
We sort the values of the derivatives from greatest to lowest.
Since, for each mini-step, there is a weighted average price (Constant defined earlier in constraint (1)) value associated with it, we get the desired weighted average price (within the chosen precision) by finding the closest match.
Once a match is found, we apply the optimal path of price increase that yields that weighted average price.

In other words, we have transformed a “hard” problem to solve in such a way that a greedy solution is optimal as well — and provides an optimal answer quickly. Our transformed problem is equivalent to the original one and yields the same optimal solution — but is much easier to compute.

We should note that in sorting the derivative as per the above example, we have assumed that the derivative function of the profit curves is always decreasing. After a certain price increase there is, however, an SKU inflection point at which the derivative comes back down. For our purposes, we can simply choose not to explore price increases past that point.

Using ML Pricing in a Bionic Company

Clearly, there is a tremendous competitive advantage in management being able to instantly see how it can precisely adjust the pricing of thousands of individual products to reach a strategic pricing goal. The pricing tool’s ability to very quickly determine price changes needed to reach these various goals gives management the ability to fine tune its pricing strategy to meet larger business objectives.

As we note in our description of the bionic company, technology reaches its fullest potential when it is combined with the flexibility, adaptability, and comprehensive experience of humans. The pricing tool itself is built on mathematical rules and so, accordingly, the prices it calculates can be viewed as suggestions, not dictates. For the company to reach optimal performance, it must add human insight to the equation, as relevant product and pricing managers review pricing options and apply additional, non-mathematical considerations that might affect the final SKU pricing. These human insights themselves can be informed, for example, by applying an analytical thought process such as using various approaches based on observed price variances when setting minimum and maximum prices. Approaches to solving these kinds of problems can vary, but it is this overall fusing of artificial intelligence with human intelligence that enables companies to compete effectively in a complex, rapidly changing world.

How to Quickly Optimize a Non-Linear Optimization Problem by Exploiting Local Properties

Written by Herrmann Charles