Mastering data structures in Ruby — Sets

Ale Miralles
Nov 6, 2018 · 6 min read
S = { 1, 2, 3 }

Empty

A set with no elements is the empty set.

{} => true

Equal

Two sets are equal if they contain the same members. (Remeber that in the case of sets the order doesn’t matter.)

{1, 2, 3} = {3, 2, 1} => true

Subset

Set “a” is a subset of “b” if “b” contains all of the elements that are present in “a”.

{3, 4} C {2, 3, 4} => true

Union

A union of sets “a” and “b” produces a set “c” that contains all the elements from “a” and “b”.

{1, 2, 3} U {4, 5, 6} => { 1, 2, 3, 4, 5, 6 }

Intersect

An intersection of two sets produces a new set that contains only the elements that are present in both sets.

{1, 2, 3} N {2, 3, 4} => { 2, 3 }

Difference

A difference between two sets is a set that contains all the elements for the first set except the ones that are present in the second.

{1, 2, 3} - {2, 3, 4} => { 1 }

Properties

  • The intersection of a set with the empty set is the empty set.
  • The union of set the “a” and the empty set is the set “a”.
  • The intersection of a set with itself is the original set.
  • The union of a set with itself is the original set.
  • We have to guarantee no dupes which convert traditional O(1) operations, like inserts, into O(n) operations.

Sets interface

What follows is a brief description of the set’s method and their complexity.

Method names, summaries, and complexity.

Implementation Details

To store set members, we are going to use the singly linked list we built in the first post of this series, which is nice because we get to reuse a lot working code that helps us to save a bunch of work.

Initialize

The only thing this method does is to initialize the set’s internal storage. The complexity of this method is O(1).

def initialize
@list = LinkedList.new
end

Insert

This method inserts a new member into the set. Since sets can’t contain duplicates, the first this method does is to check if the member is not already there. If it’s not, it inserts the new member otherwise it does nothing.

def insert member
return if contains? member
@list.insert member
end

Remove

This method removes a member from the current set, or it does nothing if the element doesn’t exist.

def remove member
node = @list.find { |nd| nd.value == member }
@list.remove node if node
end

Union

This method returns a set that contains all members of the current and the other set.

def union other
res = Set.new
@list.each { |nd| res.insert(nd.value) }
other.each { |nd| res.insert(nd.value) }
res
end

Intersect

This method returns the intersection of the current set with the other set.

def intersect other
res = Set.new
@list.each do |nd|
res.insert(nd.value) if other.contains?(nd.value)
end
res
end

Diff

This method returns a set that contains all of the elements that are present in the current set but not in the other.

def diff other
res = Set.new
@list.each do |nd|
res.insert(nd.value) unless other.contains?(nd.value)
end
res
end

Contains

This method returns true if the set includes the given member. To find a member this method walks the internal list and check each node data until it finds the member or the list runs out of nodes.

def contains? member
@list.find_first { |nd| nd.data == member }
end

Subset

This method returns true if the current set is a subset of the other set and false otherwise.

def subset? other
return false if self.count > other.count
@list.each do |nd|
return false unless other.contains(nd.value)
end
true
end

Equal

This method returns true if the current set is equal to the other set and false otherwise. As well as we did with the subset method, we do a quick check on set’s length first to see if we can exit fast if that check fails, we have to go the slow path instead.

def equal? other
return false if self.count != other.count
subset? other
end

Count

This method returns the number of elements in the current set. The complexity of this method is O(1).

def count
@list.length
end

Each

This method walks the members in the current set yielding them one a time. The complexity of this method is O(n).

def each
return nil unless block_given?
current = @list.head
while current
yield current&.data
current = current.next
end
end

Print

This method prints the contents of the current set. The complexity of this method is O(n).

def print
@list.print
end

When to use sets

Sets are commonly used to data correlation. Or, in other words, to find relationships between different groups of data.

Data Structures From Scratch — Ruby Edition.

amiralles

Concise programming articles for those who code

Ale Miralles

Written by

There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors.

amiralles

amiralles

Concise programming articles for those who code