Attention to Bigger Picture Is All You Need

Boran Deniz Bakır
Trendyol Tech
Published in
7 min readAug 1, 2024
Generated by DALL·E 3

Introduction

Have you ever wondered what happens after you click that order button in e-commerce? As the confirmation screen displays a green check icon, it marks the beginning of a business’s epic journey through rigorous sequential decision making. Orders are fulfilled from warehouses, where items are allocated from specific locations. Then, they are picked from these allocated areas and packed for their final destination — you. Let’s explore the steps between allocation and picking, which considerably extend the Order to Cargo (OTC) time and increase the waiting time for your product.

Allocation to Picking

AAbout 80% of the time in the warehouse before packing is spent between allocation and picking. Reducing this time will shorten the OTC, assuming that the cut-off times (when packages leave the warehouse) are dynamic.

Figure 1: Allocation to Packing Flowchart

First, the picking locations of the items in your order are determined through allocation. Next, these allocated orders are bundled together by zone using a worklist creation algorithm. This algorithm, triggered manually, aims to maximize the average items per zone (avgIPZ). Consequently, this process creates a worklist buffer for the picklist creation algorithm, which combines the worklists in each zone to maximize items per corridor. Once the picklist is created, we immediately run the picker routing algorithm. Given the item locations on the picklist, we generate a picking route for the picker to follow. After picking, orders are sorted and packed.

Picklist creation and picker routing triggered when the picker demands a picklist. Allocation is done automatically when the order received. However, there is only one algorithm that does not have an automated running, it is worklist creation. Let’s delve into Worklist creation and algorithmic improvement.

Worklist Algorithmic Improvement

In Trendyol, main warehouses are divided into zones. An algorithm identifies the necessary zones for each multi-item order, creating a pool of Zone Combinations linked to specific orders and items. The task is then to group these combinations into worklists.

The flow is divided into two parts: same-zone and different-zone. Different-zone handles orders with more than one allocated zone, while same-zone deals with single-zone orders. This division exists because pickers are assigned to only one zone. Same-zone orders do not require further division, so they move directly to the picklist creation algorithm. In contrast, different-zone worklist creation must bundle orders to maximize the average items picked per zone (avgIPZ), allowing for the use of fewer pickers to handle the maximum number of items [1]. To reduce lead time, this work focuses on analyzing and improving the performance of the different-zone worklist creation.

The current algorithm efficiently combines orders to maximize the avgIPZ quickly. We proposed and developed a search mechanism called Adaptive Large Neighborhood Search (ALNS) to enhance this base_greedy approach [2]. In this context, we aim to integrate ALNS for the different-zone worklist creation algorithm, which is essentially an assignment problem, similar to a modified bin packing algorithm.

Figure 2: ALNS flow chart

The main constraints are strict lower and upper bounds on the number of orders in a worklist and a fixed number of worklists. The ALNS framework adheres to these constraints in its insertion heuristics. Starting with a heuristic solution, the algorithm first uses removal heuristics to select zone combinations to remove from worklists, storing them in an order bag. Next, orders from the bag are inserted into worklists through insertion algorithms. The algorithm checks if the new worklists have a better avgIPZ than the best solution (initially the base_greedy heuristic). If so, it updates the best and current solutions. Termination constraints are checked before further processing. The algorithm ends and returns the best solution if it reaches 200,000 iterations(MaxI), exceeds 29 seconds, or hits the non-improved iteration threshold (MaxI*0.40).

The chosen removal and insertion heuristics are based on their performance. Initially, each removal-insertion pair receives 0.2*MaxI points. If a pair’s avgIPZ surpasses the best solution, it gains an additional 0.01*MaxI points; if it fails, it loses 1 point. Pair scores are bounded between 0.1*MaxI and 0.4*MaxI points to prevent overuse of top-performing pairs. The removal heuristic is selected based on the total points of its insertions, while insertion heuristics are chosen using weighted wheel selection based on the removal algorithm. For example, Removal_1(R1) is chosen with Score(I1|R1 + I2|R1), and R2 with Score(I1|R2 + I2|R2).

| testy_small dataset | Execution Time(ms)  | coverage | avgIPZ_%_change |
|---------------------|---------------------|----------|-----------------|
| alns_simulation_1 | 170.49 --> 23870.25 | 9 | 2.01 |
| alns_simulation_2 | 170.49 --> 23834.57 | 12 | 1.81 |
| alns_simulation_3 | 170.49 --> 23921.02 | 12 | 2.31 |
| alns_simulation_4 | 170.49 --> 23923.71 | 10 | 2.85 |

Table 1 base_greedy vs base_greedy + ALNS

It's time to test ALNS. We compared base_greedy with base_greedy + ALNS as shown in Table 1. Four runs(because stochastic nature of ALNS) with 100 different subproblems were conducted in. Coverage indicates improvement over the base_greedy, averaging 11 subproblems with a 2.25% improvement rate. Execution time increased from 0.17 to 24 seconds (Tests done in M1 Max chip with 32 GB RAM).

The outputs show that ALNS underperformed, especially given the increase in execution time. We now question whether ALNS performed poorly or if our base_greedy algorithm already achieved low gap values, leaving little room for ALNS improvement.

One Step Back and Look at the Bigger Picture

Since ALNS underperformed, we propose a new, poorly performing algorithm (namely novum_greedy) to test ALNS’s search capabilities.

| testy_small dataset | Execution Time(ms) | coverage | avgIPZ_%_change |
|---------------------|--------------------|----------|-----------------|
| alns_simulation_42 | 2.71 --> 170.49 | 100 | 56.81 |

Table 2 novum_greedy vs base_greedy

The comparison between Novum_Greedy and Base_Greedy algorithms, as presented in Table 2, indicates that Base_Greedy achieves an average IPZ that is 56.81% higher than the newly proposed greedy algorithm, demonstrating potential for improvement. Novum_Greedy constructs 4,145 worklists, whereas Base_Greedy constructs 4,218 across 100 subproblems, utilizing 2% fewer orders. Now, let’s apply ALNS to Novum_Greedy and transfer this 2% to the order bag.

| testy_small dataset | coverage | avgIPZ_%_change |
|---------------------|----------|-----------------|
| alns_simulation_1 | 100 | 75.12 |
| alns_simulation_2 | 100 | 76.17 |
| alns_simulation_3 | 100 | 74.51 |
| alns_simulation_4 | 100 | 74.64 |

Table 3 novum_greedy vs novum_greedy + ALNS

Achieving an average improvement of 75.11% over Novum_Greedy’s objective, our ALNS demonstrates its effective search mechanism, as illustrated in Table 3. Now, we are curious about how our new, poorly performing greedy algorithm combined with ALNS compares to Base_Greedy.

| testy_small dataset | avgIPZ_%_change |
|---------------------|-----------------|
| alns_simulation_1 | 11.97 |
| alns_simulation_2 | 12.66 |
| alns_simulation_3 | 11.55 |
| alns_simulation_4 | 11.69 |

Table 4 base_greedy vs novum_greedy + ALNS

Novum_Greedy combined with ALNS outperforms the base_greedy algorithm by 12% with 2% fewer orders, indicating the potential of our approach. Broadly, we need to consider the number of orders to select, which orders to choose, and the timing for triggering the algorithm. This leads us to sequential decision-making, as depicted in Figure 1.

More on Sequential Decision Making

Figure 3: Bigger Picture

Executing allocation, worklist, picklist, and picker-routing simultaneously maximizes the use of new orders but significantly constrains execution time. As demonstrated by Zalando, this approach introduces randomness and results in ‘poor’ picklists [3]. An alternative method involves solving the mathematical model for each step with buffer time, optimizing for T0 but missing the opportunity to combine new orders with previous ones at T1. For example, while picking items in a specific warehouse location, new orders arriving at T1 are not taken in the picking process due to this myopic approach.

Balance is crucial in this problem. Incorporating new orders while enhancing objectives should be addressed together. Questions about how many orders to choose, which ones to select, and when to execute the algorithm can be tackled with reinforcement learning or advanced deep learning techniques, considering cut-off times and active pickers. While the broader scope and potential improvements are acknowledged, the questions of how many and when to choose orders remain for the future iterations. Our ALNS model excels in selecting the optimal orders for the current window.

Conclusion

The ALNS framework demonstrated improvements over the Base_Greedy algorithm, achieving an average of 11 subproblems and a 2.25% improvement rate, despite a significant increase in execution time for the worklist creation algorithm. When we introduced novum_greedy algorithm enhanced with ALNS, we achieved a 12% improvement rate using 2% less input. This raises important questions about setting the right strategy for sequential decision-making in order processing.

In the grand scheme, we must address the critical questions of how many orders to select, which ones to prioritize, and when to execute the algorithm. While ALNS showcases effective search capabilities with input selection, future iterations should delve into reinforcement learning or advanced deep learning techniques to further refine and optimize these decisions. This exploration will unlock the full potential of our approach, paving the way for more sophisticated and intelligent order processing strategies.

References

[1] Previous different zone iteration.

[2] Pisinger, David, and Stefan Ropke. “A General Heuristic for Vehicle Routing Problems.” Computers & Operations Research, vol. 34, no. 8, Aug. 2007, pp. 2403–2435, www.sciencedirect.com/science/article/pii/S0305054805003023, https://doi.org/10.1016/j.cor.2005.09.012.

[3] Khan, Imran, et al. “Joint Order Selection, Allocation, Batching and Picking for Large Scale Warehouses.” ArXiv (Cornell University), 9 Jan. 2024, https://doi.org/10.48550/arXiv.2401.04563. Accessed 10 Jun. 2024.

Join Us

We’re building a team of the brightest minds in our industry. Interested in joining us? Visit the pages below to learn more about our open positions.

--

--