Alibaba’s AI solution for the 3D bin packing problem
This article is part of the Academic Alibaba series and is taken from the paper entitled “Solving a New 3D Bin Packing Problem with Deep Reinforcement Learning Method” by Haoyuan Hu, Xiaodong Zhang, Xiaowei Yan, Longfei Wang, and Yinghui Xu. The full paper can be read here.
The bin packing problem (BPP) is a classic and important optimization problem in logistics and production systems. Put simply, it is the problem of how to use space in the most economical way when placing a given number of items into containers (or “bins”).
Although it comes in many variations, the most challenging is 3D BPP, where cuboid-shaped items of different sizes must be packed into bins orthogonally. Besides being a popular research direction, 3D BPP also has many practical applications: applying an effective bin packing algorithm can reduce computation time and total packing costs, and increase resource utilization.
In general, the surface area of a packed bin is determined by the sequence in which the items are packed, their spatial locations, and their orientations. Of these, item sequence plays a key role in minimizing the total surface area taken up by the items. A popular approach to the problem is to use approximation, or heuristic, algorithms. In typical versions of this problem, the size and cost of bins are fixed, and the overall objective is therefore to minimize the number of bins required.
However, a limitation of traditional 3D BPP solutions is that they are restricted to scenarios where the bin size is fixed. Where the bin size is flexible, the cost of a bin is proportional to its surface area. This means that heuristic algorithms must be designed specifically for different scenarios depending on the bin size, and therefore have limited general application.
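To make the surface-area cost concrete, here is a minimal sketch of the standard formula for the surface area of a cuboid. The function name and the example dimensions are illustrative assumptions, not taken from the paper.

```python
def surface_area(length, width, height):
    """Surface area of a cuboid bin: two copies of each of its three faces."""
    return 2 * (length * width + width * height + length * height)


# Two hypothetical bins with the same volume (64 units) but different shapes.
cube = surface_area(4, 4, 4)   # 96
slab = surface_area(16, 4, 1)  # 168
print(cube, slab)
```

Both bins hold the same volume, yet the flat slab needs far more material than the cube. This is why a flexible bin turns the problem into one of finding a compact arrangement rather than counting bins.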
Inspired by recent developments in artificial intelligence and deep reinforcement learning (DRL), especially Pointer Networks, the tech team at Alibaba has overcome these limitations by proposing a DRL-based method that optimizes the sequence of items to be packed into the bin.
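The core idea of a pointer-style decoder is that each output step "points" at one of the input items that has not been chosen yet. The sketch below illustrates only that selection step, in plain Python. The `score_fn` argument is a hypothetical stand-in for a trained network, and the volume-based scores are an assumption made purely for illustration, not the model described in the paper.

```python
import math


def masked_softmax(scores, used):
    """Softmax over item scores, giving zero probability to items already chosen."""
    top = max(s for s, u in zip(scores, used) if not u)  # for numerical stability
    exps = [0.0 if u else math.exp(s - top) for s, u in zip(scores, used)]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_decode(score_fn, n_items):
    """Build a sequence by repeatedly pointing at the most probable remaining item."""
    used = [False] * n_items
    sequence = []
    for _ in range(n_items):
        probs = masked_softmax(score_fn(sequence), used)
        choice = max(range(n_items), key=probs.__getitem__)
        used[choice] = True
        sequence.append(choice)
    return sequence


# Hypothetical items (length, width, height) and a toy scorer that prefers large volumes.
items = [(2, 3, 4), (1, 1, 1), (5, 2, 2)]


def volume_scores(prefix):
    # A real model would condition on `prefix`; this placeholder ignores it.
    return [float(l * w * h) for l, w, h in items]


order = greedy_decode(volume_scores, len(items))
print(order)  # [0, 2, 1]
```

The mask is what guarantees that the output is a valid permutation of the items. During training, one would typically sample from the probabilities instead of taking the maximum.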
Unlike typical 3D BPP solutions with fixed-sized bins, the Alibaba tech team focused on the problem of designing a bin with the least surface area that could be used to pack all the items. In real-world business scenarios, such as cross-border e-commerce, fixed-sized bins are not available, and soft, flexible materials are usually used to pack all the items. Therefore, minimizing the bin surface area brings economic benefits. Details of the decision variables applied in the DRL-based approach are outlined below.
During the development of this approach, the DRL method was used to find a better sequence in which to pack the items. Then Alibaba’s heuristic algorithm was used to choose item orientation and position to yield the least wasted space.
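The division of labor described above, where one component chooses the sequence and a heuristic chooses orientation and position, can be sketched with a deliberately simple stand-in. The heuristic below is not Alibaba's algorithm: it is an assumed toy rule that keeps one bounding box and attaches each new item to whichever face, in whichever axis-aligned orientation, yields the smallest surface area. A brute-force search over sequences takes the place of the DRL model.

```python
from itertools import permutations


def surface_area(length, width, height):
    return 2 * (length * width + width * height + length * height)


def greedy_pack(sequence):
    """Toy heuristic: grow one bounding box by attaching each item to a face of it."""
    box = tuple(sequence[0])
    for item in sequence[1:]:
        best = None
        for a, b, c in set(permutations(item)):  # axis-aligned orientations
            candidates = [
                (box[0] + a, max(box[1], b), max(box[2], c)),  # attach along x
                (max(box[0], a), box[1] + b, max(box[2], c)),  # attach along y
                (max(box[0], a), max(box[1], b), box[2] + c),  # attach along z
            ]
            for cand in candidates:
                if best is None or surface_area(*cand) < surface_area(*best):
                    best = cand
        box = best
    return box


# Hypothetical order of three items; exhaustive search stands in for the learned policy.
items = [(4, 1, 1), (2, 2, 1), (3, 2, 2)]
areas = {order: surface_area(*greedy_pack(order)) for order in permutations(items)}
best_order = min(areas, key=areas.get)
print(best_order, areas[best_order])
```

Exhaustive search is only feasible for tiny orders, since the number of sequences grows factorially with the number of items. A learned policy that proposes a good sequence directly avoids that search.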
To evaluate the effectiveness of this new approach, the model was trained with 150,000 samples and tested with a further 150,000. To ensure real-world applicability, the tests were classified into three categories based on the number of items in a customer order, i.e. 8, 10, and 12, as would be the case in the e-commerce domain. The training procedure is outlined below.
Analysis of the numerical results shows that Alibaba’s DRL-based method achieves a 5% performance improvement when compared with the heuristic approach. If applied to real-world business, such as e-commerce, this method would offer real economic benefits and improve efficiency.
Moving forward with this approach, the Alibaba tech team will focus on investigating more effective network architectures and training algorithms. Additionally, integrating the selection of orientation and empty maximal space into the neural network architecture will be studied further.