Subsets II

Michael Yu
the clubhouse
Published in
2 min readAug 17, 2018

Question: Given a collection of integers that might contain duplicates, nums, return all possible subsets (the power set).

Note: The solution set must not contain duplicate subsets.

Example:

Input: [1,2,2]
Output:
[
[2],
[1],
[1,2,2],
[2,2],
[1,2],
[]
]

First thought is to solve this problem without dealing with duplicates. So, take the case input=[1,2,3] Because of the nature of power sets and subsets of a set, backtracking is probably a good technique to use.

Indeed, if we start with an and empty subset,[] , and we loop through our input, each time appending on a new value to each existing subset, we can get something that works like this:

input = [1,2,3]START:          [[]]index = 0
SUBSETS: [[], [1]]
index = 1
SUBSETS: [[], [1], [2], [1,2]]
index = 2
SUBSETS: [[], [1], [2], [1,2], [3], [1,3], [2,3] ,[1,2,3]]

Perfect! Let’s write this in code.

subsetsWithDup(nums):
subsets = [[]]
for num in nums:
subsets_length = len(subsets)
for i in range(0, subsets_length):
subsets += [subsets[i] + [num]]
return subsets

Cool, that’s simple enough. But we still have to deal with the duplicate case.

When looking at an input like [1,2,2] and taking into account the solution we wrote. The moment we hit the 2nd 2, we end up duplicating the subset [1,2] .

There’s a simple way to avoid this. Every time we see a number that is a duplicate, append all unique subsets of ONLY that number to each existing subset. It’s confusing when put into words-take a look at this example:

input = [1,2,2]START:          [[]]index = 0
SUBSETS: [[], [1]]
index = 1
SUBSETS: [[], [1], [2], [1,2]]
And then we notice that 2 is duplicated once. so:
SUBSETS: [[], [1], [2], [1,2], [2,2], [1,2,2]]

Notice, [2,2] & [1,2,2]
are simply [] & [1]
with [2,2] appended to them.

This is an interesting solution, but how do we tell how many time a number is duplicated? There’s many ways to do it but one simple way is to have a dictionary (hashmap) where the key is the number and the value is the number of times it appears.

Nice, let’s write this in code:

import collectionssubsetsWithDup(nums):
subsets = [[]]
num_appears_map = collections.defaultdict(int)
for num in nums: num_appears_map[num] += 1 for num in num_appears_map:
subsets_length = len(subsets)
for i in range(0, subsets_length):
for j in range(1, num_appears_map[num]+1):
subsets += [subsets[i] + [num]*j]
return subsets

Yay. Thanks for reading.

--

--