A brief tour of lists in Scala and algorithmically processing them in SICP exercises

Let’s continue our journey through Scala using SICP exercises to demonstrate the elegance and simplicity of this powerful programming language in processing list data structures.

Matt Hagy
11 min readFeb 25, 2019

This article builds off of part 1, Quickly learning the basics of Scala through Structure and Interpretation of Computer Programs examples. That article also covers reasons for why you may want to learn Scala.

We continue our exploration of Scala using examples that solve exercises from the classic book, “Structure and Interpretation of Computer Programs” (SICP). Small exercises for the reader are also included.

Today we’re going to start working with some data in that we’re going to learn how to create and processes lists of elements. A list is a simple data structure that consists of a sequence of elements. There are many cases where we’ll need a collection of elements and lists can be a reasonable data structure in many of these cases. E.g., if we wanted to programmatically analyze some tweets from a famous person then we could store their tweets in a list for processing in Scala.

A list is constructed as a linear chain of elements whereby each element points to the next element in the list. Totally reasonable if that doesn’t sound intuitive. I think the concept is best illustrated with an example. The following figure shows a list of three elements; 12, 99, and 37.

From Wikipedia article Linked list

Each list element consists of a value and a pointer to the next element. You can see how the first element of the list contains the value 12. This element also points to the second element in the list. The second element contains the value 99 and points to the third element. Finally, the third element of this list contains the value 37 and points to a special “end-of-list” terminal.

In Scala, we call this special terminal element Nil. We need such a terminal value to let us know when the list ends. We can think of Nil as a special element of the list that neither contains a value nor points to another value. It just exists to cap off the end.

So let’s go ahead and build the 12-to-99-to-37-to-Nil list in Scala.

Interestingly, we have to build the list starting from the tail. This makes sense when we reflect on how we’re going to build this list. We can’t create the first element until we have the second one since the first element needs to know what it’s going to point to when we create it. Similarly, we can’t create the second element until the third has been created.

So let’s start by creating the last non-Nil element of the list that contains the value 37 and points to Nil. Because lists are used all through Scala, the list element is given a special, concise name, ::. Yes, the formal name of the list element in Scala is just two colons. While unusual, as you work more with lists, you’ll come to appreciate the novel naming and brevity. Here’s the Scala code for creating a list with just the single element 37.

We’re simply calling :: like a function and this corresponds to creating a new list element of type ::. Each :: element has two attributes:

  • head : the value of the list element (e.g., 37 in this example)
  • tail : the pointer to the next element in the list (Nil in example)

E.g.,

Note that the Scala shows the single-element list ::(37, Nil) in a special format that downplays the role of individual list elements and just shows the values contained in the list. List(37) is the clearer representation of this list with just this one element. Scala could also show this as ::(37, Nil), but that’ll be harder to interpret when we get to longer lists.

Now let me introduce a generally cool Scala feature that motivates the :: naming convention: ::(head, tail) can also be written as head :: tail. E.g.,

Hence, :: can be thought of as a binary operator in math, similar to say + or -. Do you agree that this infix notation is a simpler syntax than ::(37, Nil)? We can appreciate it more when we see that multiple usages of :: can be chained in creating the rest of the list. Here’s what the code looks like to construct the full list with conventional constructor function calls.

Contrast that to the infix notation.

Would you agree that the infix notation is a more legible way to define the elements that make up a list?

And we can do even better. The List(12, 99, 37) notation shown in the shell is a valid Scala expression to create a list of these three elements in the shown order. This is because List is a function for constructing lists. It can take any number of arguments, including zero. Behind the scenes, it does the work of creating the individual list elements and it returns the start of the list. The List function will be helpful as we create lists to process in today’s exercises.

Lists also introduce a new concept in terms of data types in that we want to be able to differentiate between lists with different types of elements. E.g., differentiating a list of Int values from a list of String values. This is necessary to ensure that all elements in a list conform to the same element type so we can write code that knows how to process all elements in an arbitrary list of a certain element type.

E.g., say we wanted to write a function to compute the sum of the values in a list of integers.

What would happen if we passed in a list of strings? E.g., List("a", "b"). If our code expected integers, then it would encounter an error when it tried to treat strings as integers. To avoid such errors, each Scala list type specifies the type of its list elements. The syntax for specifying a list of type X is List[X]. E.g., List[Int] and List[String].

This allows us to write the following type declaration for the list argument in our function that expects a list of integers.

And thereby Scala ensures that we can never accidentally call sumListIntegers with a list of anything other than integers.

SICP List Exercises

Let’s go ahead and implementsumListIntegers using our understanding of lists. As in part 1 of this series, we’re going to use recursion in place of looping to consider each element in computing the sum of a list.

Let me show you the example code and then explain what it’s doing.

The main strategy is that we recurse on the sub-list formed by removing the first element. E.g., List(12, 99, 37).tail is the list List(99, 37). We then combine together the result of the recursive call with the value of the first element through addition. In turn, the recursive call to sumListIntegers with argument List(99, 37) again makes a recursive call, this time with an argument of List(37). And we’re not done yet! That call, in turn, calls sumListIntegers with the tail of List(37) which is Nil. The recursion then terminates because the empty list Nil has a sum of zero and we can return that.

That’s a lot of words. Let’s consider what the expansion looks like in math/code.

Take some time if you’d like and think through this recursion strategy to process all elements in a list. When you’re ready, you can then try out creating a recursive function to compute the length of a list by filling out the following template. I.e., count the number of elements in a list.

You can test your solution in Scastie. (Note, you have to remove the lines that start with > when running your code. I added those lines to show the expected output for each example input.)

Next, let’s consider a function that constructs a new list from its input list. E.g., let’s write a function to add five to every element of a List[Int]. This can be done recursively with the following code.

Can you follow through how the recursive function calls build a new list that contains the results of adding five to each element of the input list? If not, take some time and work through the individual examples and reason through what the recursion chain looks like. I.e., write out the recursion expansion as we did with sumListIntegers.

Once you feel comfortable with the example, can you generalize the example to be a higher-order function that takes an arbitrary input function, func, in addition to a l: List[Int] and creates a new list that is the result of applying func to every element in the input list? Recall that a higher-order function is a function that takes other functions as input.

Let us call this new function map since it is mapping the function func across every element of the input list to create a new list. Here’s a template to get you started.

Again, you can test your solution in Scastie.

Once you’ve completed that exercise, let me show you a neat and more legible way to recurse over lists using Scala’s match expression. You may recall from the previous article how match allows us to define different conditions for matching a value. Scala then evaluates different code based upon which condition matches.

Here’s a neat way to implement add5toElement using match to determine whether or not we’re dealing with the Nil element.

You can see that we’re matching against the first element in the list. The first case matches Nil and terminates recursion since we’ve reached the end of the list. The second case statement is more interesting. Here we’re using deconstruction to not only match an :: element, but to also unpack its head and tail attributes into the local vals of h and t. We can then use these variables in processing the list element and recursing on the tail.

This starts to hint at some of the power of match. You’ll find many cases in Scala where match can be used to not only match against data but to also concisely access the attributes of a data type like ::. We’ll further explore match in later articles.

Can you modify your map function to use match?

We actually didn’t need to implement map. It’s already built into Scala. Specifically, lists in Scala have a map method (i.e., a function attached to the data structure) that we can use to apply a function to every element in the list, returning a new list of the results of applying the function. E.g.,

It was still a great exercise to implement map to learn about recursion on lists, which is a common pattern in Scala. Nonetheless, you can commonly avoid writing your own recursive functions by leveraging builtin Scala algorithms like List.map.

As we’ll explore more in the future, map can work with lists of any element type as long as you pass in a function that can accept such elements. E.g., it doesn’t just work on List[Int], but it also works on List[String] and any other type of list as long as we have an appropriate function to map across the list. Further, the mapping function can have any return type and we get a list of elements with the return type of the given function.

This type flexibility is possible due to placeholder types in Scala. We’re gonna briefly look at simplified code examples for List that demonstrate this concept. In doing so, we’ll encounter a few parts of Scala that we haven’t yet studied and don’t be surprised if you can’t follow all of the code. I just want to introduce the idea that Scala has generic data structures like List that can hold any type of element and that this is accomplished through placeholder types. Here’s the simplified example code.

No need to understand the details yet. Just see if it makes sense how both ::[A] and NilClass[A] implement the behavior of List[A] in terms of methods head, tail, and map. Further, these data structures can work on any generic type of list values as represented by the placeholder type A.

Only showing this example to hint at the power of Scala’s types. See if you can reason through why a calling map on aList[A] with a function of type f: A => B generates a List[B] where A and B can be any arbitrary Scala types. They can even be the same type.

To reiterate, :: and Nil are two types of List elements that have different behavior. Hence why we can use List[Int] to represent a sequence of :: elements that have Int values and terminate with Nil. We’ll explore this topic of traits and classes more in future articles.

There’s also a builtin list method called fold that can be used to aggregate across all elements in a list. E.g., we could re-implement sumListIntegers as follows.

Note how we call fold with two different parameter lists. The first parameter list provides a starting value of aggregation, in this case, zero. The second parameter list takes a single function that operates on two elements.

Also, this example introduces the concept of an anonymous function in the code(aggregate: Int, element: Int) => aggregate + element. Anonymous functions can be useful to make our code more concise by not needing to define and name every function we write.

The “folding” function will be called multiple times. Each time it receives the current value of aggregation so far, as well as an element of the list. The function computes the new aggregation value, in this case by adding the list element value to the aggregate. This is repeated for every element in the list and results in a final aggregation value that returned by fold.

The name fold corresponds to how we create the recursion chain of form f(f(f(0, 37), 99), 12) for the folding function f with an input of List(12, 99, 37). We’re essentially repeatedly calling f and folding the list down to a single aggregate value through the recursion chain.

FYI, here’s a simplified implementation of fold in Scala. Take a few minutes and see if you can reason through some parts of what’s going on. Totally fine if you find some parts confusing and don’t yet understand them. We’ll be exploring the details of placeholder types and classes in the future.

Both map and fold can be used in many cases to simplify our processing of lists. Further, as we’ll see in future installments of this series, there are other Scala data structures that also implement map and fold methods for processing their elements. Lastly, there are many other useful algorithms implemented in built-in methods of Scala data structures that can simplify our code and we’ll explore them over time.

As always, thanks for following along on our journey through Scala. And please let me know if any part of this article could be made more clear so as to help future readers. Send any feedback to matthew.hagy@gmail.com.

Part three is out and it’s particularly fun in that you get to analyze Reddit data in your browser and learn a bit about Redditors. Checkout Interactively exploring Reddit posts using basic Scala in your browser.

--

--

Matt Hagy

Software Engineer and fmr. Data Scientist and Manager. Ph.D. in Computational Statistical Chemistry. (matthagy.com)