Efficiency of the Shotput Packing Algorithm

Stephanie Hutson
Jul 7, 2016 · 10 min read


The Shotput packing algorithm is used at present for determining the best box size for an order we receive into our system. Through a series of testing the accuracy of the Shotput box packing algorithm against the pyShipping packing algorithm, it was determined Shotput packs significantly and consistently more efficiently than pyShipping.

Note: The links in this article no longer work. If you have any questions, please feel free to reach out.


Though the algorithms being tested are specifically looking at their box packing algorithm efficiency, the problem lies in the greater context of selecting the correct box for shipment. After originally implementing pyShipping into the Shotput codebase and discovering inaccuracies, I was tasked with writing a new packing algorithm. I wrote the box packing algorithm in the context of needing to achieve the following goals:

  • Determine the smallest box and with fewest number of parcels required to pack and ship a predetermined set of products/items.
  • Reduce the number of items needing to be packed into the last parcel, thereby allowing it to be repacked into a smaller box.
  • Maintain and improve upon the speed and efficiency of a generating details for a shipment

We use the algorithm for every order that goes into the system. We take boxes available to a team — some teams we work with have specific boxes, they use, where as others use boxes available to everyone — from there we send each box accompanied by all of the products for an order through a weeding out process.

To cut down on runtime and the number of boxes we need to look for, we check in a speed optimal way that all the items with their given dimensions will fit into a box. For each of the acceptable boxes we send the products through the packing algorithm, which returns for each box, an arrangement of how the products will be organized into different boxes. Once we have determined which box can be used which both requires the fewest number of parcels to pack and is the smallest in volume relative to others that require the same number of parcels to be packed, this box is selected as the box for the order. Because there can be a variety of sizes of products shipped, and the products have been packed, selecting for the largest products in the first parcels, often the last parcel may contain fewer or smaller products, so we resend the contents of the last box through the algorithm to select the smallest box which will fit all the products.

By reducing the number of parcels, or the size of the box required, we can reduce shipping cost and our environmental impact of each order we ship out, while upholding and improving customer and client experiences.


In comparison to pyShipping, an open source python library for 3D bin packing:

What was tested:

  • Number of bins of a certain dimension that were required to pack a set of items
  • When the number of bins is equal, the number of items packed in the last bin
  • Runtime

How it was tested:

I tested using three types of datasets:

Large — a large range of box and item size, and a large number of items packed
Medium — a medium range of box and item sizes — close in dimensions, and a moderate number of items packed
Small — a small range of box and item sizes — closest in dimensions, small number of items packed

Through use of the python random library I generated boxes with random dimensions and a random number of items with random dimensions. I then sent the data through both the packing algorithms (pyShipping and Shotput). The number of parcels required and the number of items in the last box (with a qualifier for when all items fit into a single box) were recorded.

I repeated those steps 10,000 times and retrieved the packing and efficiency data. Though 10,000 trials with 3 different data sets may seem excessive, as bin-pack is an NP-Hard problem, I wanted my results to be unconditionally evident, significant, and repeatable.

To test the processing time, I used the dimension ratios from test case 2, and incrementally increased the number of items packed with 1000 trials at each item set size. Rather than using a random range of the number of parcels, like I do for trials 1, 2, and 3, I use a set, hard coded number of items.

According to their readme, pyShipping has been optimized to have overall similar speed as David Pisinger’s 3D bin packing solution in C. PyShipping was discovered to be a little slower and a little more accurate than Pisinger’s algorithm, from which we can deduce that the speed and accuracy of our algorithm relative to a third algorithm. Through pyShipping’s research they determined that they on average use fewer parcels to pack 500 items. Because I use randomly selected numbers of items to be packed, with a wide range, I cannot use the same metric. However, I gathered statistical data on the number of parcels saved through use of our algorithm vs pyShipping, as well as the percent fewer of the overall number of parcels needed. From that we can deduce relative to pyShipping’s success against Pisinger’s, the success of Shotput’s.


For the sake of using only data which can be interpreted, we can exclude in our statistical analysis the cases where all items for both algorithms were packed into a single box, though it will remain reported.

Test Case 1 (Large):


  • Box size: 50–400 units for each dimension
  • Item size: 1–50 units for each dimension
  • Items packed: 100–1000 items

The Shotput algorithm improved upon pyShipping in the number of parcels require to pack the same randomly sized items 77.93% of the time. All instances of a tie in the number of parcels required, resulted in either all items fitting into one box for both algorithms, or Shotput packing fewer items in the last box — and therefore is more efficient. If we exclude the ties where all items fit into a single parcel, we are left with 8416 times that we performed better, and 18 where pyShipping outperformed Shotput, or in other words 99.78% of the time, the Shotput algorithm packed more efficiently.

In the cases where pyshipping performed better, all were instances where the pyShipping required 1 box, where the Shotput algorithm fit all but one item into the first box. (Example 138 items to be packed, pyShipping fits all in one bin, Shotput fits 137 in one bin, and the last item in a second bin.

Test Case 2 (Medium):


  • Box size: 100–200 units for each dimension
  • Item size: 20–100 units for each dimension
  • Items packed: 20–100 items

This test case tests boxes that were closer in size to the items being packed with less variability and fewer number of items. Shotput packed using fewer parcels 97.8% of the time, and packed more efficiently — including fewest parcels in the last bin, and excluding instances where they were all in one bin 99.55% of the time. .21% of the time there was a tie, and .45% of the time pyShipping performed better.

The large mean and median are an indicator of the large average number of parcels needed to pack an average of 60 items.

Test Case 3 (Small):


  • Box size: 100–200 units for each dimension
  • Item size: 40–100 units for each dimension
  • Items packed: 10–30 items

This test case is closest to how Shotput’s algorithm is currently used. For this case I used smaller boxes and smaller items. As a result, there were more ties, and more variation. 85.88% of the time, Shotput used fewer number of parcels to pack the same items, where .58% of the time, pyShipping required fewer parcels. 92.24% of the time the Shotput algorithm packed more efficiently, while pyShipping packed more efficiently 3.2% of the time.

Time Component:

The importance of the time component was placed lowest on the evaluation of the two algorithms as for our immediate needs at Shotput — which is to say our immediate needs require precision over scalability — however I think it is important to mention that pyShipping’s time component is faster across the board. For packing of fewer parcels — which is closest to our real world application, the difference is calculable but negligible from a user stand point, however as complexity increases, as does the disparity between pyShipping and Shotput time components.

From from an exponential best fit of both algorithms, we see that Shotput increases at a slightly higher exponential rate than pyShipping, but also if we were to say `y = cx^p` where p is 1.8 and 1.7 respectively, the linear component, c, is nearly 4x greater in Shotput’s algorithm. As we also know Pisinger’s algorithm runs slightly faster than pyShipping, these results leave Shotput something to shoot for as we continue to build and adapt the algorithm and put greater emphasis on scalability.

Of note: When I tested the medium box and item size ranges with 500 items to be packed, the mean percent parcels saved was 53.2% with a standard deviation of 5.4% while the average number of parcels saved was 56, with a standard deviation of 25. Over all 1000 trials, Shotput packed fewer parcels 100% of the time.


The results were very clear. After compiling the data from 28,230 trials over test cases 1, 2, and 3, (30,000 trials minus ties where all items were packed in one bin for both algorithms), Shotput’s box packing algorithm successfully required fewer parcels for the same items 92.5% of the time, and packed more efficiently 97.0% of the time. The z-score on these results on a binary distribution is 156 with a standard deviation of 86.6. In other words, we can say with off the chart certainty that the Shotput shipping algorithm is more precise and more accurate that pyShipping.

In our current phase of development, and the current use cases (represented by test cases 2 and 3) of the Shotput box packing algorithm, the algorithm is fast, and from a user stand point does not preform badly relative to pyShipping. Both algorithms have a Big O better than O(n)^2, which is far better than a brute force approach at O(n)^6. Because Shotput scales at a slightly higher exponential rate and a noticeably greater linear rate, as complexity increases, as does the difference in runtimes. This data can be used to entice the next stage in optimization and development, while maintaining accuracy.

From the statistical analysis, we see the most striking success from test case 2, at 41% fewer parcels on average used with the Shotput algorithm vs the pyShipping algorithm. Test cases 1 and 3 show the largest standard deviations, however from further analysis of results we can see that the data is skewed. In test case one, the mean (3.68) number of parcels saved is significantly higher than the median (2) while the median percent parcels saved is 50%, vs the mean of 45%. This is an indicator that a large portion of the test cases resulted in 1 or 2 parcels being packed by both Shotput and pyShipping, resulting in a greater percentage difference for fewer parcels different. The mean being so high is an indicator that in the dataset there is a wide range of how many parcels are required for different trials. In general the large standard deviations are an indicator of wide ranges of results, but as we seen by the low number of successes of pyShipping in outperforming Shotput, where the median parcels saved is always lower than the mean, that wide range is in favor of the Shotput algorithm.

Note: additional tests were run with 1,000 trials for num parcels averages at 10, 100, 200, and 300 with small item and box dimensions

One thing that isn’t tested in the remaining volume in the final parcel, which when implemented into the larger context of the our use case, will affect the overall success. As in the test cases, all the products are of random dimensions, depending on which items are placed in the final parcel could be significant. The Shotput algorithm iterates through the items to pack from largest to smallest, resulting in the smallest items which won’t fit in the other parcels in the final parcel, but without additional testing, we do not know the volumes left available in the final parcel by the pyShipping algorithm, and it is possible that in cases of ties where one algorithm packs more items into the last parcel, the other has an overall better efficiency in a tie.

Because there are checks ahead of time for boxes that will not be useable, we are able to limit the number of times the algorithm needs to be hit per use case. With improvements in packing efficiency this significant, the increased runtime for a more accurate result is favorable to a quicker but less efficient option. In summary, the Shotput algorithm performs significantly better at efficient packing than pyShipping’s algorithm, from which we can deduce it’s performance is also more efficient than Pisinger’s algorithm.

Documentation for using the Shotput Box Packing API can be found at https://warehousing.shotput.com/developers/box-docs.

The Chain

Simplifying commerce and logistics with hi-tech ops |…

Stephanie Hutson

Written by

The Chain

The Chain

Simplifying commerce and logistics with hi-tech ops | Shotput.com

More From Medium

Also tagged Algorithms

Related reads

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade