Competitive programming: a more efficient way of representing and manipulating sets of integers

Published in

Geek Culture

6 min readSep 20, 2021

Bitwise operators offer a way of performing several operations on sets of natural numbers, without storing the actual numbers. In this article, we will understand how this technique works, what operations it supports and how they are implemented. In particular, we will learn how to find the union, intersection and minimum element of a set of integers in constant time.

General idea

The idea behind this technique is to represent a set S of n integers as one single integer X such that integer a is in S precisely when the bit at index (a + 1) of X (counting from the right) is set to 1. For example, the singleton {3} would be represented by the number whose binary representation is 00001000. Likewise, the set {3, 5} would be represented by 00101000. In other words, the representation of {a} is the integer 2^a, and the representation of the set A is the sum of 2^k, for all elements k in A.

With that in mind, we can implement basic operations on sets using bitwise operators.

Creating a singleton

Setting the i-th bit of a number to 1 and all the others to 0 can be done using the left shift operator, usually represented by the symbol <<. If u and v are integers then u << v adds v 0’s to the right of the number. For example, 4 << 3 = 100000, because 4 in base 10 is 100 is base 2. For all x, 1 << x is the number whose (x + 1)-th bit is set to 1 and with 0’s everywhere else, i.e. 2^x.

Therefore, for all natural numbers a, the singleton {a} is represented by 1 << a.

Calculating the union of two sets

So far we’ve only worked with singletons. Now we would like to be able to take the union of several singletons to make a set with multiple values. This can be simply performed using the or bitwise operator. This operator is called the disjunction operator. or takes two numbers u and v and returns the number z = u | v such that for every index i, the i-th bit of z is set to 1 exactly when the i-th bit of u or the i-th bit of v (or both) is set to one.

Using this definition, if A = {a} and B = {b} then the set A ⋃ B is represented by the disjunction of the representations of A and B.

For example, {2} = 00000100 (third bit set to 1) and {5} = 00100000 (sixth bit set to 1) and 00000100 | 00100000 = 00100100.

Finding the intersection of two sets

The intersection of two sets A and B is the set of all elements which belong to both A and B. This corresponds to the number where the i-th bit is set to 1 if, and only if, the i-th bit of A and the i-th bit of B are both set to 1. This is exactly what the bitwise conjunction, &, does.

Given two numbers A and B which represent two sets of natural numbers, the intersection of A and B is given by A & B.

Finding the symmetric difference of two sets

Finding the difference between two sets is actually not as simple as the union and the intersection. Instead, we can compute their symmetric difference. The symmetric difference of two sets A and B is the set of all elements which belong to exactly one of the two sets. Equivalently, it is their union, minus their intersection.

It is represented by the exclusive disjunction of the representations of A and B. The exclusive disjunction of u and v, denoted u xor v, has its i-th bit set to 1 exactly when the i-th bit of one of its operand is set to 1 and the i-th bit of the other operand is set to 0. This should make it clear that this corresponds to the symmetric difference of two sets.

Subsets and membership

There exist two last fundamental operations we want to be able to perform on our sets: checking whether some set A is a subset of another set B, and checking whether some natural number a belongs to some set A. Both these operations are actually derived from the ones we saw earlier.

To check whether A is a subset of B, we will check whether the intersection of A and B is A itself. Therefore, A ⊆ B if, and only if, A & B = A.

Similarly, the number a belongs to A if, and only if, the intersection of the singleton {a} and A is non empty, i.e. (1 << a) & A != 0.

Finding the minimum of a set

It is also possible to easily find the minimum element of a set. This requires to calculate the two’s complement of a number A, which we’ll denote -A. It is given by the formula -A = (not A) + 1, where not negates all bits, i.e. turns 0’s into 1’s and 1’s into 0's.

With our representation, the minimum element of a set is the right-most bit set to 1.

Therefore, if we negate a set, the minimum is the right-most bit set to 0.

When we add 1, we obtain the set -A which satisfies min -A = min A. The reason for this is explained in the image below.

The image shows the calculation of (not A) + 1. The minimum of the set represented by the sum is exactly the the right-most bit set to 0 in the first term. This is true because carries are accumulated until we reach a position where both digits are zero. This is the position of the right-most zero of the first term, i.e. the position of the minimum of A. The carry is thus consumed and the addition yields a one: the minimum element of the sum is indeed the same as A.

At this point, there are no more carries, and since we’re only adding zeros to the digits of A, there won’t any more in the future. Therefore, all bits to the left of the right-most one are the same as in (not A). It follows that the minimum of A and -A is actually the only element which is both in A and -A.

Therefore, A & -A represents the singleton containing the minimum of A.

The set of the first n natural numbers

We can directly create the set of all natural numbers less than some natural number n.

Let’s consider the number X(n) whose n-th bit (from the end) is set to 1, with all other bits set to 0. If we add to it the number with all bits set to 1 then we’ll get a representation of the set of all integers between 1 and n - 1.

The image above shows why this is true when the 1 is at the sixth position, but the idea is the same in general: we evaluate 0 + 1 to 1 until we reach the position where both bits are set to one. At this point, the sum evaluates to 10, so we write 0 and carry the 1. We keep a carry of 1 all the way up to the last digit. The sum is the representation of the set of the first n natural numbers.

The number whose binary representation has 1’s everywhere is generally, -1, if we’re using a signed type based on two’s complements, so we would implement implement this operation as (1 << n) - 1.

References

Competitive Programmer’s Handbook by Antti Laaksonen.
Competitive Programming in Python by Christoph Dürr and Jill-Jênn Vie.