Sets versus Arrays

A set is a type of collection (a way to group related items together) that is similar to an array but differs in a few ways. Sets store distinct values in an unordered collection, whereas arrays are ordered lists that can contain duplicate values. In addition, a set can only hold types that conform to the Hashable protocol. Hashing is what allows the set to keep its values distinct, and it makes checking whether an element is present much faster than in an array: a set lookup takes constant time on average, while an array has to be scanned element by element. Swift’s basic types (such as String, Int, Double, and Bool) are Hashable by default and can be used with sets. If you want to use one of your own classes or structs with a set, the type needs to conform to the Hashable protocol.
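As a minimal sketch of these differences (the `City` struct is a hypothetical type introduced only for illustration), the following shows a set dropping duplicates and a custom type being stored in a set. For a struct whose stored properties are all Hashable, declaring the conformance is enough, because the compiler synthesizes the implementation.

```swift
// An array keeps every value in order; a set keeps only distinct values.
let numbersArray = [1, 2, 2, 3, 3, 3]
let numbersSet = Set(numbersArray)

print(numbersArray.count) // 6
print(numbersSet.count)   // 3

// Membership test: average constant time for a set, linear scan for an array.
print(numbersSet.contains(2))   // true
print(numbersArray.contains(2)) // true

// Hypothetical custom type. All stored properties are Hashable,
// so Swift synthesizes the Hashable conformance for us.
struct City: Hashable {
    let name: String
    let state: String
}

var cities: Set<City> = []
cities.insert(City(name: "Austin", state: "Texas"))
cities.insert(City(name: "Austin", state: "Texas")) // equal value, not added again
print(cities.count) // 1
```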

When to use Sets!

Use a set when the order of the elements is not important, or when you need to ensure that each item appears only once.

Let’s take a closer look!

Let’s choose a problem and try to solve it once using a set and once using an array, then we’ll decide which one is faster based on the time of execution and the number of iterations.

We have a dictionary that maps each state to its capital, and we want to know which states have a capital whose name shares no characters with the name of the state. We will answer this question first using a set. We’ll start by iterating over the dictionary and creating two sets of characters for each entry, one for the name of the state and the other for the name of the capital. Then we’ll check the result of their intersection (one of the fundamental set operations, which determines the values two sets have in common): if it is empty, the state is part of our answer. Easy, right?!

Intersection Operation
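To make the operation concrete, here is a small sketch that intersects the characters of one state name with those of its capital. The two strings are sample values chosen for illustration.

```swift
// Building a Set from a String yields a Set<Character>.
let stateLetters = Set("texas")
let capitalLetters = Set("austin")

// intersection(_:) returns the values present in both sets.
let shared = stateLetters.intersection(capitalLetters)
print(shared.sorted()) // ["a", "s", "t"]

// isDisjoint(with:) answers "is the intersection empty?" directly.
print(stateLetters.isDisjoint(with: capitalLetters)) // false
```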

The code will be:
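A minimal sketch of this approach might look like the following. The function name, the four-entry sample dictionary (standing in for the full 50-state dictionary), and the choice to compare letters case-insensitively while ignoring spaces are all assumptions made for illustration.

```swift
// Assumed sample data: a small stand-in for the full 50-state dictionary.
let capitals: [String: String] = [
    "Texas": "Austin",
    "South Dakota": "Pierre",
    "Ohio": "Columbus",
    "Maine": "Augusta",
]

// Hypothetical function name. Returns the states whose capital
// shares no letters with the state's own name.
func statesWithDisjointCapitalsUsingSets(_ capitals: [String: String]) -> [String] {
    var result: [String] = []
    for (state, capital) in capitals {
        // Steps 1 and 2: one set of letters per name
        // (lowercased, spaces dropped; both are assumptions).
        let stateLetters = Set(state.lowercased().filter { $0.isLetter })
        let capitalLetters = Set(capital.lowercased().filter { $0.isLetter })

        // Step 3: a single intersection decides the answer for this state.
        if stateLetters.intersection(capitalLetters).isEmpty {
            result.append(state)
        }
    }
    // Dictionaries are unordered, so sort for a stable result.
    return result.sorted()
}

print(statesWithDisjointCapitalsUsingSets(capitals)) // ["South Dakota"]
```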

Now, let’s solve it with arrays: we create an array of characters for the name of the state and another for the name of the capital, then loop over the capital’s characters and check whether each one exists in the corresponding state’s array. The code for this will be:
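A sketch of the array version, under the same assumptions as before (hypothetical function name, a small sample dictionary instead of all 50 states, case-insensitive comparison that ignores spaces), could be:

```swift
// Assumed sample data: a small stand-in for the full 50-state dictionary.
let capitals: [String: String] = [
    "Texas": "Austin",
    "South Dakota": "Pierre",
    "Ohio": "Columbus",
    "Maine": "Augusta",
]

// Hypothetical function name. Same question as before, answered with arrays.
func statesWithDisjointCapitalsUsingArrays(_ capitals: [String: String]) -> [String] {
    var result: [String] = []
    for (state, capital) in capitals {
        // Steps 1 and 2: one array of letters per name.
        let stateLetters = Array(state.lowercased().filter { $0.isLetter })
        let capitalLetters = Array(capital.lowercased().filter { $0.isLetter })

        // Step 3: check every capital letter against the state's array.
        // Each contains(_:) call scans the state's array linearly.
        var sharesLetter = false
        for letter in capitalLetters {
            if stateLetters.contains(letter) {
                sharesLetter = true
                break
            }
        }
        if !sharesLetter {
            result.append(state)
        }
    }
    // Dictionaries are unordered, so sort for a stable result.
    return result.sorted()
}

print(statesWithDisjointCapitalsUsingArrays(capitals)) // ["South Dakota"]
```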

As we can see:

  • Both functions build the two character collections 50 times, because our dictionary has 50 keys.
  • Using a set, we perform the comparison in one line of code, and this line is executed once per state.
  • Using an array, we need a for loop whose body runs up to once per character of each capital’s name, for all 50 states, and every existence check inside it has to scan the state’s array.
  • So we would expect the set to be faster than the array for these types of problems. We can check this by calling each function 100 times and measuring the average execution time of each, as in the screenshot below.
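One way to take such a measurement is sketched below. The `averageTime` helper and the three-entry sample dictionary are assumptions introduced for illustration, and the two closures are compact stand-ins for the set and array functions. The printed numbers depend on the machine and build settings, so no specific result is claimed here.

```swift
import Foundation

// Hypothetical helper: runs `body` `runs` times and returns
// the average duration of one run in seconds.
func averageTime(runs: Int, _ body: () -> Void) -> TimeInterval {
    var total: TimeInterval = 0
    for _ in 0..<runs {
        let start = ProcessInfo.processInfo.systemUptime
        body()
        total += ProcessInfo.processInfo.systemUptime - start
    }
    return total / Double(runs)
}

// Assumed sample data standing in for the full dictionary.
let pairs = ["Texas": "Austin", "South Dakota": "Pierre", "Ohio": "Columbus"]

// Set-based comparison, averaged over 100 runs.
let setAverage = averageTime(runs: 100) {
    for (state, capital) in pairs {
        _ = Set(state.lowercased()).isDisjoint(with: Set(capital.lowercased()))
    }
}

// Array-based comparison, averaged over 100 runs.
let arrayAverage = averageTime(runs: 100) {
    for (state, capital) in pairs {
        let stateLetters = Array(state.lowercased())
        _ = Array(capital.lowercased()).allSatisfy { !stateLetters.contains($0) }
    }
}

print("set:   \(setAverage) s")
print("array: \(arrayAverage) s")
```

With names this short, the cost of hashing can offset much of the set’s advantage, so it is worth measuring in an optimized build and with larger inputs before drawing conclusions.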

Although the difference in run time in our example is very small, it could become significant with large data sets, so always consider using sets as an alternative to arrays.