# A brief tour of lists in Scala and algorithmically processing them in SICP exercises

## Let’s continue our journey through Scala using SICP exercises to demonstrate the elegance and simplicity of this powerful programming language in processing list data structures.

This article builds off of part 1, Quickly learning the basics of Scala through Structure and Interpretation of Computer Programs examples. That article also covers reasons for why you may want to learn Scala.

We continue our exploration of Scala using examples that solve exercises from the classic book, “Structure and Interpretation of Computer Programs” (SICP). Small exercises for the reader are also included.

Today we’re going to start working with some data in that we’re going to learn how to create and processes lists of elements. A list is a simple data structure that consists of a sequence of elements. There are many cases where we’ll need a collection of elements and lists can be a reasonable data structure in many of these cases. E.g., if we wanted to programmatically analyze some tweets from a famous person then we could store their tweets in a list for processing in Scala.

A list is constructed as a linear chain of elements whereby each element points to the next element in the list. Totally reasonable if that doesn’t sound intuitive. I think the concept is best illustrated with an example. The following figure shows a list of three elements; 12, 99, and 37.

Each list element consists of a value and a pointer to the next element. You can see how the first element of the list contains the value 12. This element also points to the second element in the list. The second element contains the value 99 and points to the third element. Finally, the third element of this list contains the value 37 and points to a special “end-of-list” terminal.

In Scala, we call this special terminal element `Nil`. We need such a terminal value to let us know when the list ends. We can think of `Nil` as a special element of the list that neither contains a value nor points to another value. It just exists to cap off the end.

So let’s go ahead and build the 12-to-99-to-37-to-Nil list in Scala.

Interestingly, we have to build the list starting from the tail. This makes sense when we reflect on how we’re going to build this list. We can’t create the first element until we have the second one since the first element needs to know what it’s going to point to when we create it. Similarly, we can’t create the second element until the third has been created.

So let’s start by creating the last non-`Nil` element of the list that contains the value 37 and points to `Nil`. Because lists are used all through Scala, the list element is given a special, concise name, `::`. Yes, the formal name of the list element in Scala is just two colons. While unusual, as you work more with lists, you’ll come to appreciate the novel naming and brevity. Here’s the Scala code for creating a list with just the single element 37.

We’re simply calling `::` like a function and this corresponds to creating a new list element of type `::`. Each `::` element has two attributes:

• `head` : the value of the list element (e.g., 37 in this example)
• `tail` : the pointer to the next element in the list (`Nil` in example)

E.g.,

Note that the Scala shows the single-element list `::(37, Nil)` in a special format that downplays the role of individual list elements and just shows the values contained in the list. `List(37)` is the clearer representation of this list with just this one element. Scala could also show this as `::(37, Nil)`, but that’ll be harder to interpret when we get to longer lists.

Now let me introduce a generally cool Scala feature that motivates the `::` naming convention: `::(head, tail)` can also be written as `head :: tail`. E.g.,

Hence, `::` can be thought of as a binary operator in math, similar to say `+` or `-`. Do you agree that this infix notation is a simpler syntax than `::(37, Nil)`? We can appreciate it more when we see that multiple usages of `::` can be chained in creating the rest of the list. Here’s what the code looks like to construct the full list with conventional constructor function calls.

Contrast that to the infix notation.

Would you agree that the infix notation is a more legible way to define the elements that make up a list?

And we can do even better. The `List(12, 99, 37)` notation shown in the shell is a valid Scala expression to create a list of these three elements in the shown order. This is because `List` is a function for constructing lists. It can take any number of arguments, including zero. Behind the scenes, it does the work of creating the individual list elements and it returns the start of the list. The `List` function will be helpful as we create lists to process in today’s exercises.

Lists also introduce a new concept in terms of data types in that we want to be able to differentiate between lists with different types of elements. E.g., differentiating a list of `Int` values from a list of `String` values. This is necessary to ensure that all elements in a list conform to the same element type so we can write code that knows how to process all elements in an arbitrary list of a certain element type.

E.g., say we wanted to write a function to compute the sum of the values in a list of integers.

What would happen if we passed in a list of strings? E.g., `List("a", "b")`. If our code expected integers, then it would encounter an error when it tried to treat strings as integers. To avoid such errors, each Scala list type specifies the type of its list elements. The syntax for specifying a list of type `X` is `List[X]`. E.g., `List[Int]` and `List[String]`.

This allows us to write the following type declaration for the list argument in our function that expects a list of integers.

And thereby Scala ensures that we can never accidentally call `sumListIntegers` with a list of anything other than integers.

# SICP List Exercises

Let’s go ahead and implement`sumListIntegers` using our understanding of lists. As in part 1 of this series, we’re going to use recursion in place of looping to consider each element in computing the sum of a list.

Let me show you the example code and then explain what it’s doing.

The main strategy is that we recurse on the sub-list formed by removing the first element. E.g., `List(12, 99, 37).tail` is the list `List(99, 37)`. We then combine together the result of the recursive call with the value of the first element through addition. In turn, the recursive call to `sumListIntegers` with argument `List(99, 37)` again makes a recursive call, this time with an argument of `List(37)`. And we’re not done yet! That call, in turn, calls `sumListIntegers` with the tail of `List(37)` which is `Nil`. The recursion then terminates because the empty list `Nil` has a sum of zero and we can return that.

That’s a lot of words. Let’s consider what the expansion looks like in math/code.

Take some time if you’d like and think through this recursion strategy to process all elements in a list. When you’re ready, you can then try out creating a recursive function to compute the length of a list by filling out the following template. I.e., count the number of elements in a list.

You can test your solution in Scastie. (Note, you have to remove the lines that start with `>` when running your code. I added those lines to show the expected output for each example input.)

Next, let’s consider a function that constructs a new list from its input list. E.g., let’s write a function to add five to every element of a `List[Int]`. This can be done recursively with the following code.

Can you follow through how the recursive function calls build a new list that contains the results of adding five to each element of the input list? If not, take some time and work through the individual examples and reason through what the recursion chain looks like. I.e., write out the recursion expansion as we did with `sumListIntegers`.

Once you feel comfortable with the example, can you generalize the example to be a higher-order function that takes an arbitrary input function, `func`, in addition to a `l: List[Int]` and creates a new list that is the result of applying `func` to every element in the input list? Recall that a higher-order function is a function that takes other functions as input.

Let us call this new function `map` since it is mapping the function `func` across every element of the input list to create a new list. Here’s a template to get you started.

Again, you can test your solution in Scastie.

Once you’ve completed that exercise, let me show you a neat and more legible way to recurse over lists using Scala’s `match` expression. You may recall from the previous article how `match` allows us to define different conditions for matching a value. Scala then evaluates different code based upon which condition matches.

Here’s a neat way to implement `add5toElement` using `match` to determine whether or not we’re dealing with the `Nil` element.

You can see that we’re matching against the first element in the list. The first `case` matches `Nil` and terminates recursion since we’ve reached the end of the list. The second `case` statement is more interesting. Here we’re using deconstruction to not only match an `::` element, but to also unpack its `head` and `tail` attributes into the local `val`s of `h` and `t`. We can then use these variables in processing the list element and recursing on the tail.

This starts to hint at some of the power of `match`. You’ll find many cases in Scala where `match` can be used to not only match against data but to also concisely access the attributes of a data type like `::`. We’ll further explore `match` in later articles.

Can you modify your `map` function to use `match`?

We actually didn’t need to implement `map`. It’s already built into Scala. Specifically, lists in Scala have a `map` method (i.e., a function attached to the data structure) that we can use to apply a function to every element in the list, returning a new list of the results of applying the function. E.g.,

It was still a great exercise to implement `map` to learn about recursion on lists, which is a common pattern in Scala. Nonetheless, you can commonly avoid writing your own recursive functions by leveraging builtin Scala algorithms like `List.map`.

As we’ll explore more in the future, `map` can work with lists of any element type as long as you pass in a function that can accept such elements. E.g., it doesn’t just work on `List[Int]`, but it also works on `List[String]` and any other type of list as long as we have an appropriate function to map across the list. Further, the mapping function can have any return type and we get a list of elements with the return type of the given function.

This type flexibility is possible due to placeholder types in Scala. We’re gonna briefly look at simplified code examples for `List` that demonstrate this concept. In doing so, we’ll encounter a few parts of Scala that we haven’t yet studied and don’t be surprised if you can’t follow all of the code. I just want to introduce the idea that Scala has generic data structures like `List` that can hold any type of element and that this is accomplished through placeholder types. Here’s the simplified example code.

No need to understand the details yet. Just see if it makes sense how both `::[A]` and `NilClass[A]` implement the behavior of `List[A]` in terms of methods `head`, `tail`, and `map`. Further, these data structures can work on any generic type of list values as represented by the placeholder type `A`.

Only showing this example to hint at the power of Scala’s types. See if you can reason through why a calling `map` on a`List[A]` with a function of type `f: A => B` generates a `List[B]` where `A` and `B` can be any arbitrary Scala types. They can even be the same type.

To reiterate, `::` and `Nil` are two types of `List` elements that have different behavior. Hence why we can use `List[Int]` to represent a sequence of `::` elements that have `Int` values and terminate with `Nil`. We’ll explore this topic of traits and classes more in future articles.

There’s also a builtin list method called `fold` that can be used to aggregate across all elements in a list. E.g., we could re-implement `sumListIntegers` as follows.

Note how we call fold with two different parameter lists. The first parameter list provides a starting value of aggregation, in this case, zero. The second parameter list takes a single function that operates on two elements.

Also, this example introduces the concept of an anonymous function in the code`(aggregate: Int, element: Int) => aggregate + element`. Anonymous functions can be useful to make our code more concise by not needing to define and name every function we write.

The “folding” function will be called multiple times. Each time it receives the current value of aggregation so far, as well as an element of the list. The function computes the new aggregation value, in this case by adding the list element value to the aggregate. This is repeated for every element in the list and results in a final aggregation value that returned by `fold`.

The name fold corresponds to how we create the recursion chain of form `f(f(f(0, 37), 99), 12)` for the folding function `f` with an input of `List(12, 99, 37)`. We’re essentially repeatedly calling `f` and folding the list down to a single aggregate value through the recursion chain.

FYI, here’s a simplified implementation of `fold` in Scala. Take a few minutes and see if you can reason through some parts of what’s going on. Totally fine if you find some parts confusing and don’t yet understand them. We’ll be exploring the details of placeholder types and classes in the future.

Both `map` and `fold` can be used in many cases to simplify our processing of lists. Further, as we’ll see in future installments of this series, there are other Scala data structures that also implement `map` and `fold` methods for processing their elements. Lastly, there are many other useful algorithms implemented in built-in methods of Scala data structures that can simplify our code and we’ll explore them over time.

As always, thanks for following along on our journey through Scala. And please let me know if any part of this article could be made more clear so as to help future readers. Send any feedback to matthew.hagy@gmail.com.

Part three is out and it’s particularly fun in that you get to analyze Reddit data in your browser and learn a bit about Redditors. Checkout Interactively exploring Reddit posts using basic Scala in your browser.

--

--

Software Engineer and fmr. Data Scientist and Manager. Ph.D. in Computational Statistical Chemistry. (matthagy.com)