CODEX

Prefer lists to arrays in Scala

Alonso Del Arte
Published in CodeX
5 min read · Jan 15, 2021

Java provides both arrays and lists. So does Scala. Java lists have many advantages over arrays. In Scala, the advantages of lists over arrays are even greater than in Java.

Because most people learning Scala come at it from the perspective of Java, we’re going to start off by reviewing a couple of the ways in which arrays are declared and initialized in Java.

Here’s one way to declare an array of five specific integers in Java:

int[] numbers = {43, -7, 8, 21, 58};

There are a couple of other ways to do it, but they all involve the “square brackets,” and most also involve the “curly braces.” For example:

int[] numbers = new int[5];
numbers[0] = 43;
numbers[1] = -7; // etc.

The syntax in Scala is different, but I think you’ll get used to it very quickly.

val numbers = Array(43, -7, 8, 21, 58)

Note that those are parentheses, neither “square brackets” nor “curly braces.” If anything, it’s likelier that on going back to Java, you’ll forget the Java array syntax and have to look it up online. I know I have.

On the local Scala REPL, you get immediate feedback regarding type inference.

scala> Array(43, -7, 8, 21, 58)
res2: Array[Int] = Array(43, -7, 8, 21, 58)

Both the first Java line shown above and the Scala line with “val” compile to pretty much the same thing. If you use the Java disassembler tool (javap with the -c command-line option), you will see plenty of differences, but you will also see that a lot of consecutive lines are exactly the same.

 0: iconst_5      // Push 5 (array size) on the stack
 1: newarray int  // New int array
 3: dup           // Duplicate top stack element
 4: iconst_0      // Push 0 (first index) on the stack
 5: bipush 43     // Push 43 (1st element) on the stack
 7: iastore       // Store 43 at array index 0
 8: dup
 9: iconst_1      // Push 1 (second index) on the stack
10: bipush -7     // Push -7 (2nd element) on the stack
12: iastore       // Store -7 at array index 1
13: dup
14: iconst_2      // Push 2 (third index) on the stack
// etc.

As far as the JVM’s concerned, Scala’s Array[Int] is exactly the same as Java’s int[].
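The second Java form shown earlier (allocate first, then assign element by element) has a Scala counterpart too. Here is a minimal sketch of it, assuming Scala 2.13 or later; the object name ArrayDeclarations is only a wrapper for illustration.

```scala
object ArrayDeclarations {
  // Allocate five slots first; every Int slot starts out as 0.
  val numbers = new Array[Int](5)

  // Indexing uses parentheses, not square brackets.
  numbers(0) = 43
  numbers(1) = -7
  numbers(2) = 8
  numbers(3) = 21
  numbers(4) = 58

  // The one-line declaration, for comparison.
  val sameNumbers = Array(43, -7, 8, 21, 58)
}
```

Both vals end up holding the same five integers. The one-line form is usually the more convenient of the two.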

Note that in Scala the “square brackets” [ and ] are not used for arrays in the same way as in Java. Instead, they serve the same purpose as the “angle brackets” < and > in Java: to enclose type parameters for generic types like ArrayList<E>.

Essentially, you’re supposed to regard arrays in Scala as being very much like generic types in the Java Development Kit (JDK), generic types in the Scala Development Kit (SDK), generic types in third-party libraries, and generic types you might design yourself.
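To make the point about “square brackets” concrete, here is a small sketch, assuming Scala 2.13 or later. The method firstOrElse is hypothetical, made up for this illustration; it is not part of the Scala library.

```scala
object SquareBrackets {
  // Type parameters go in square brackets, for arrays and for lists alike.
  val numbers: Array[Int] = Array(43, -7, 8, 21, 58)
  val names: List[String] = List("Dasher", "Dancer")

  // A hypothetical generic method: its type parameter T is declared
  // in square brackets too, where Java would write <T>.
  def firstOrElse[T](items: Array[T], fallback: T): T =
    if (items.length == 0) fallback else items(0)
}
```

Calling SquareBrackets.firstOrElse(SquareBrackets.numbers, 0) gives back the first element, while an empty array gives back the fallback.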

That means, for example, if you wanted to use a Java array-list in your Scala program, you’d declare it something like this:

import java.util

val reindeerNames = new util.ArrayList[String]()
reindeerNames add "Dasher"
reindeerNames add "Dancer"
reindeerNames add "Prancer" // etc.

Though in my opinion, the only good reason to use any of the java.util data structures in a Scala project is for the sake of Java interoperability (e.g., a function in a framework expects an ArrayList<E>).
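When interoperability does force a java.util collection on you, one option is to convert at the boundary and use Scala collections everywhere else. The following is only a sketch, assuming Scala 2.13 or later, where scala.jdk.CollectionConverters is available; the object and val names are made up for illustration.

```scala
import java.util
import scala.jdk.CollectionConverters._

object ReindeerInterop {
  // Stand-in for a list handed over by some Java API.
  val javaNames = new util.ArrayList[String]()
  javaNames.add("Dasher")
  javaNames.add("Dancer")
  javaNames.add("Prancer")

  // asScala wraps the Java list as a mutable Buffer;
  // toList then copies the elements into an immutable List.
  val scalaNames: List[String] = javaNames.asScala.toList

  // asJava wraps the Scala list as a java.util.List view.
  val backToJava: util.List[String] = scalaNames.asJava
}
```

If a Java function insists on a concrete ArrayList<E> rather than the List<E> interface, new util.ArrayList[String](scalaNames.asJava) makes the copy it wants.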

Do give credit where credit is due: a Java array-list offers important benefits over using arrays directly, like automatically resizing the backing array when the exact number of elements is not known in advance.

Still, it’s better to use Scala collections in a Scala program.

val reindeerNames = List("Dasher", "Dancer", "Prancer", "Vixen", "Comet",
  "Cupid", "Donner", "Blitzen", "Rudolph")

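It’s easy to sort arrays in Scala as long as there’s an implicit Ordering[T] for the relevant type T. The standard library supplies one for Int, for String, and for any T that is Comparable<T> (the Scala trait Ordered[T] extends Java’s Comparable[T]).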

scala> numbers.sorted
res3: Array[Int] = Array(-7, 8, 21, 43, 58)

Don’t expect to be able to call a method like that on an array in Java, where you’d reach for the static Arrays.sort() instead. The only reason you can do it in Scala is the implicit conversion to ArrayOps[T].
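A few more sorting calls, sketched under the assumption of Scala 2.13 or later. The Item case class and its values are hypothetical, invented for this illustration.

```scala
object SortingSketch {
  val numbers = Array(43, -7, 8, 21, 58)

  // sorted returns a new array; numbers itself is left as it was.
  val ascending = numbers.sorted

  // Passing an Ordering explicitly changes the sort order.
  val descending = numbers.sorted(Ordering[Int].reverse)

  // A hypothetical type with no natural ordering:
  // sortBy picks the key to sort on.
  case class Item(name: String, priceInCents: Int)
  val items = Array(Item("widget", 250), Item("gadget", 100))
  val cheapestFirst = items.sortBy(_.priceInCents)
}
```

Note that none of these calls modifies the array it is called on, unlike Java’s Arrays.sort(), which sorts in place.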

But beware: arrays are mutable data structures, with all the problems that come with that, including problems with concurrency.

scala> numbers(0) = 74

scala> numbers
res5: Array[Int] = Array(74, -7, 8, 21, 58)

In a toy example like the numbers array, the mutability is not a problem at all. But in a program with even just two concurrent threads trying to access the same array, you might run into problems like bottlenecks and race conditions.
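The trouble with mutability can show up even without threads: any other part of the program holding a reference to the same array sees the change. Here is a minimal single-threaded sketch, assuming Scala 2.13 or later; the names alias, fixedNumbers and changedCopy are made up for illustration.

```scala
object MutabilitySketch {
  val numbers = Array(43, -7, 8, 21, 58)

  // A second reference to the same array, such as another
  // part of the program might hold.
  val alias = numbers
  alias(0) = 74 // numbers(0) is now 74 as well

  val fixedNumbers = List(43, -7, 8, 21, 58)

  // updated builds a new list; fixedNumbers itself cannot change.
  val changedCopy = fixedNumbers.updated(0, 74)
}
```

With the list, whoever holds fixedNumbers can count on it still starting with 43, no matter what anyone else does with changedCopy.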

Some of you might have no interest in writing multi-threaded programs. Even so, lists are still preferable to arrays, because of the mental model: as humans, we tend to think of lists rather than arrays.

For example, when you go shopping, you often have a shopping list, but you’ve probably never had a “shopping array.” Your shopping is limited not by an arbitrary number of spaces for items to buy but by how much money you can spend on the items, and how much those items cost.

Maybe you want to buy ten widgets of a certain brand, but you find a deal by which you can buy two widgets and get one free. So maybe you take the deal and buy fifteen widgets for the price of ten.

So if you’re implementing a shopping cart for an online store with Java or Scala, you should probably use a collection from either the JDK’s java.util package or the SDK (from scala.collection.immutable or scala.collection.mutable) rather than an array.
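As a rough sketch of what such a cart could look like with an immutable list, assuming Scala 2.13 or later: the CartItem type, the item names and the prices are all hypothetical, not taken from any real shopping framework.

```scala
object ShoppingCartSketch {
  // Hypothetical item type, made up for this illustration.
  case class CartItem(name: String, quantity: Int, unitPriceInCents: Int)

  val emptyCart: List[CartItem] = Nil

  // Prepending with :: takes constant time and leaves
  // the earlier cart untouched.
  val withWidgets = CartItem("widget", 10, 250) :: emptyCart
  val withGadget = CartItem("gadget", 1, 1999) :: withWidgets

  // Total price of a cart, in cents.
  def total(cart: List[CartItem]): Int =
    cart.map(item => item.quantity * item.unitPriceInCents).sum
}
```

The cart grows as items are added, with no need to decide on a number of slots up front.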

In some cases, such as when inter-operating with Java frameworks, you’ll need to work with arrays. But they can be conveniently converted to lists or sets, thanks to the toList and toSet methods available through ArrayOps.

scala> numbers.toList
res6: List[Int] = List(43, -7, 8, 21, 58)
scala> numbers.toSet
res7: scala.collection.immutable.Set[Int] = HashSet(-7, 21, 43, 8, 58)
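For a small interoperability example, Java’s String.split() hands back an array. The sketch below, assuming Scala 2.13 or later, converts it once at the boundary and shows the trip back with toArray; the object and val names are made up for illustration.

```scala
object BoundaryConversion {
  // String.split is a Java method, so it hands back a plain array.
  val fromJava: Array[String] = "Dasher,Dancer,Prancer".split(",")

  // Convert once at the boundary and work with the
  // immutable list from then on.
  val names: List[String] = fromJava.toList

  // toArray goes the other way, for when a Java API expects an array.
  val backToArray: Array[String] = names.toArray
}
```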

This is all well and good for toy examples with few elements. What about when you need to deal with thousands of objects at a time? For example, let’s say you want a list of the first million prime numbers.

A simple Java implementation using an array of a million integer primitives, and with only the most basic optimizations (skip all even numbers higher than 2, only check potential factors up to the square root of n), can deliver results in less than three seconds.

Compare that to a Scala implementation using immutable lists: it’ll take so long you’ll probably stop it after a couple of minutes, if not sooner. It was only after replacing List with scala.collection.mutable.ArrayBuffer that I was able to bring the running time down under four seconds.
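The code behind those timings isn’t shown here, so what follows is only a sketch of how the ArrayBuffer version might look, assuming Scala 2.13 or later; the name firstPrimes is made up. One plausible reason for the slowness of an immutable List (an assumption, not something measured here) is that appending to the end of a List takes time proportional to its length, whereas appending to an ArrayBuffer takes amortized constant time.

```scala
import scala.collection.mutable.ArrayBuffer

object PrimesSketch {
  // Returns the first `count` primes, found by trial division.
  def firstPrimes(count: Int): List[Int] = {
    val primes = ArrayBuffer.empty[Int]
    if (count > 0) primes += 2
    var candidate = 3
    while (primes.length < count) {
      // Only primes up to the square root of the candidate need trying.
      val limit = math.sqrt(candidate.toDouble).toInt
      val isPrime =
        primes.iterator.takeWhile(_ <= limit).forall(candidate % _ != 0)
      if (isPrime) primes += candidate // amortized constant-time append
      candidate += 2 // skip all even numbers higher than 2
    }
    primes.toList
  }
}
```

For example, PrimesSketch.firstPrimes(10) gives List(2, 3, 5, 7, 11, 13, 17, 19, 23, 29). The mutable buffer stays hidden inside the function, and callers only ever see an immutable List.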

But… this example with a million primes is not very realistic. How often will you actually need to sift through a few million integers to pick out a million of them? Probably not very often.

In summary, Scala has arrays mostly for inter-operating with Java. In most cases, the faster access of arrays offers hardly any advantage, except for the kind of task you’d rather be using something like C or C++ for anyway.

Alonso Del Arte is a Java and Scala developer from Detroit, Michigan. AWS Cloud Practitioner Foundational certified.