Find union of sections (in Python)

Given an array of sections in form [[1,5], [3,7]], return all unions of the sections.

Andriy Lazorenko
4 min readMay 29, 2019

Beginning of section is denoted as el[0], end is denoted as el[1]. Union of two overlapping sections is determined as below: [1,5], [3,7] -> [1,7]

TLDR: code to the article is available here:

https://github.com/AndriyLazorenko/medium_blog/blob/master/Union%20of%20sections.ipynb

1. Quick attempt at a pairwise solution

Let’s focus on comparing two sections first to get the idea of union. There are 4 possible scenarios on how sections might look relative to each other:

4 positions of sections relative to each other. 1st position contains coordinates for reference

Let’s code checks for each of these cases and produce a united section:

def unite_pair(pair):
if pair[0][0] <= pair[1][0] <= pair[0][1]:
if pair[0][1] > pair[1][1]:
return pair[0]
elif pair[0][1] <= pair[1][1]:
return [pair[0][0], pair[1][1]]
elif pair[1][0] <= pair[0][0] <= pair[1][1]:
if pair[1][1] > pair[0][1]:
return pair[1]
elif pair[1][1] <= pair[0][1]:
return [pair[1][0], pair[0][1]]

The code will try to unite a pair of sections. Let’s try if algorithm works on several examples:

test_1 = [[1,5], [3,7]]
test_2 = [[1,10], [3,7]]
test_3 = [[3,10], [1,5]]
test_4 = [[1,10], [11,20]]
test_5 = [[1,10], [10,20]]
test_list = [test_1, test_2, test_3, test_4, test_5]
expected = [[1,7],[1,10],[1,10],[[1,10],[11,20]], [1,20]]
for i, el in enumerate(test_list):
pair = [el[0], el[1]]
received = unite_pair(pair)
print(f"Test case:{pair}, expected: {expected[i]}, received: {received}")
try:
assert received == expected[i]
print("Test ok")
except AssertionError as err:
print("Test failed")

Okay, the test fails on one of the cases. The case differs from others because there is no union between sections there to be found, which is the 5th case that can happen. Let’s handle this case:

def unite_pair(pair):
if pair[0][0] <= pair[1][0] <= pair[0][1]:
if pair[0][1] > pair[1][1]:
return pair[0]
elif pair[0][1] <= pair[1][1]:
return [pair[0][0], pair[1][1]]
elif pair[1][0] <= pair[0][0] <= pair[1][1]:
if pair[1][1] > pair[0][1]:
return pair[1]
elif pair[1][1] <= pair[0][1]:
return [pair[1][0], pair[0][1]]
else:
return pair

Output is now as follows:

Test case:[[1, 5], [3, 7]], expected: [1, 7], received: [1, 7]
Test ok
Test case:[[1, 10], [3, 7]], expected: [1, 10], received: [1, 10]
Test ok
Test case:[[3, 10], [1, 5]], expected: [1, 10], received: [1, 10]
Test ok
Test case:[[1, 10], [11, 20]], expected: [[1, 10], [11, 20]], received: [[1, 10], [11, 20]]
Test ok
Test case:[[1, 10], [10, 20]], expected: [1, 20], received: [1, 20]
Test ok

2. Scaling the pairwise union

There might be more than 2 sections in an input array, so we should think on scaling the solution to produce a union of all the possible sections. Bruteforce solution would try to apply pairwise comparison of section 1 with each of the remaining sections, then - comparing union to the rest of the elements and so on. Complexity of the described solution is huge and I won’t even bother to estimate it.

What if we try sorting first? If we traverse an array of sections, sorted by 1st coordinates, we would be able to do pairwise comparisons till we hit a break in union. In that case, we can return result to a results array containing united sections. We can then produce a union array in a single array traversal! Let’s code it!

First of all, we will not need to handle cases 3 and 4 on the picture above, so we can cut our if statements as follows:

def unite_pair(pair):
if pair[0][0] <= pair[1][0] <= pair[0][1]:
if pair[0][1] > pair[1][1]:
return pair[0]
elif pair[0][1] <= pair[1][1]:
return [pair[0][0], pair[1][1]]
else:
return pair

Now, we can write a function accepting array of any size and returning array of united sections:

def union_of_sections(sections):
sections = sorted(sections)
for_ret = list()
while len(sections) > 1:
one = sections.pop(0)
two = sections.pop(0)
pair = [one, two]
united_pair = unite_pair(pair)
try:
len(united_pair[0])
for_ret.append(united_pair[0])
sections = [two] + sections
except TypeError as err:
sections = [united_pair] + sections
for_ret.extend(sections)
return for_ret

Here, a try-catch block is used to handle a case when sections are not overlapping. In that case an array of arrays will be returned, 1st section should be appended to array of united sections to be returned and only 2nd section should be appended to a beginning of sections list; in other cases - united pair should be appended to a start of sections list. This approach enables iteration on sections array while it contains at least 2 items to compare using unite_pair() method. A couple of test cases just to make sure we’re okay:

united_test_1 = [el for test_case in test_list for el in test_case]
test_6 = [[1,5], [6,9], [10,12]]
test_list = [united_test_1, test_4, test_6]
expected = [[[1,20]],[[1,10],[11,20]], test_6]
for i, el in enumerate(test_list):
received = union_of_sections(el)
print(f"Test case:{i}, expected: {expected[i]}, received: {received}")
try:
assert received == expected[i]
print("Test ok")
except AssertionError as err:
print("Test failed")

The tests are green, we’ve cracked the problem!

Test case:0, expected: [[1, 20]], received: [[1, 20]]
Test ok
Test case:1, expected: [[1, 10], [11, 20]], received: [[1, 10], [11, 20]]
Test ok
Test case:2, expected: [[1, 5], [6, 9], [10, 12]], received: [[1, 5], [6, 9], [10, 12]]
Test ok

3. Final notes. Complexity

We will estimate the resulting complexity. Complexity of sorted() is O(nlog(n)) and complexity of array traversal is O(n). Final complexity is O(nlog(n)), which is good enough :)

P.S.: this story is a part of my series on algorithmic challenges. Check out other cool algorithms decomposed with tests and jupyter notebooks!

--

--