Weighted Random Selection

This is a question I was asked to solve during an interview for a games company I ended up working for. Given a list of items where each item has a weight (integer value), select a random item from the list based on that weight.

The key requirement — items with a higher weight value are more likely to be returned. So even the lowest weighted value will still sometimes be returned from the function.

While my brain was cranking up, I thought to create a new list where the weight determines how many times an element appears in the new list.

You then just pick an item from that array at random. However this is a pretty limited answer due to the alarming size of the new array as the size of the list increases. Not very efficient for anything except quite small lists.

I finally came up with this solution — there are many ways of solving this problem but I think this one is easy to understand. Essentially the algorithm is

  1. Add up all the weights for all the items in the list
  2. Pick a number at random between 1 and the sum of the weights
  3. Iterate over the items
  4. For the current item, subtract the item’s weight from the random number that was originally picked
  5. Compare the result to zero. If less than or equal to zero then break otherwise keep iterating. The key is that the larger the weight the more likely to be less than zero when compared to the random selection between zero and the sum of weights.
  6. If not less than zero, continue iterating over the list, all the while subtracting more and more weights off the random number chosen from the sum of the weights.

Remember — that random number chosen from the sum of the weights gets lower and lower each pass through the array of items, so you will always hit zero i.e. it is guaranteed that something will be picked every time. But it will be more likely that an item with a larger number will be picked.

The key part of the algorithm in pseudo-code is

randomWeight = rand(1, sumOfWeights)
for each item in array
randomWeight = randomWeight - item.Weight
if randomWeight <= 0
break // done, we select this item

Here is an example of this I wrote in Go. In this example the list contains 6 games and each game has a weight. Lets say the weight in this example means the likelihood the current user specifically would buy the game (perhaps coming from a recommendation engine). Our function will select a game that is more likely to be of interest to the user, possibly for surfacing to them through a UI in this make-believe scenario. This code demonstrates the selection over a 10,000 iterations.

This will give results where the higher weighted games are selected more often than lower weighted games. In this case, our user is more of an RPG gamer so those games are weighted higher and are more likely to be selected than Sonic, Mario or CoD. Example, in this 10,000 run Fantasy was selected 3196 times compared to 576 for Mario.

map[CoD:1272 Link:2531 Fantasy:3196 Mario:576 Destiny:1816 Sonic:609]

Notes: math/rand is the right choice here over crypto/rand — see this illuminating blog post about the two rand packages

Show your support

Clapping shows how much you appreciated Peter Kelly’s story.