Scala: comprehending the for-comprehension

Linas Medžiūnas
Jan 30 · 10 min read

If you are the kind of person who likes to check what’s under the hood of things, and happen to be in the process of learning the Scala programming language, this post is precisely for you. In a few short steps, we will take one of the most powerful constructs of Scala — the for-comprehension, figure out what is under the hood of it, and hopefully discover some rules that would help make our code easier to follow.


Let’s start by defining a simple class hierarchy that we will be using. If you are already somewhat familiar with Scala, you’ll quickly notice the analogy with Option: a value from this hierarchy either contains some generic value, or contains no value. Let’s also define a few values from this class hierarchy that we will need down the road:

sealed abstract class Perhaps[A]
case class YesItIs[A](value: A) extends Perhaps[A]
case object Nope extends Perhaps[Nothing]

val y3 = YesItIs(3)
val y4 = YesItIs(4)
val n = Nope

Note: Nope is a singleton instance (object) of type Perhaps[Nothing], and Nothing is a special type in Scala: it is at the bottom of the type system, meaning that it is a subtype of every other type. With a small tweak that we’ll see later, this will allow us to use Nope as a value of type Perhaps[A], for any generic type A.


Now, we will try to print a value contained in one of the objects we just created, by writing what looks very much like the classic, imperative for-loop that can be found in most programming languages, in one shape or another:

for {
  a <- y3
} println(a)

Sadly, this fails. The error message is: value foreach is not a member of YesItIs[Int]. Strange, isn’t it? We haven’t referenced foreach anywhere in our code, and yet the compiler complains that it is missing. Let’s try to implement this method in our class hierarchy and see what happens:

sealed abstract class Perhaps[A] {
  def foreach(f: A => Unit): Unit
}

case class YesItIs[A](value: A) extends Perhaps[A] {
  override def foreach(f: A => Unit): Unit = f(value)
}

case object Nope extends Perhaps[Nothing] {
  override def foreach(f: Nothing => Unit): Unit = ()
}

It is as simple as it looks: for YesItIs, it calls the function f (passed as an argument to foreach) with the value contained in our YesItIs, and for Nope, it does nothing. Function f is meant to be invoked for its side effect, which is indicated by the return type Unit (analogous to void in Java).

It turns out that the imperative kind of for is nothing more than syntactic sugar that gets “desugared” into a call of the foreach method by the Scala compiler. So our for “loop” above gets transformed into code like this:

y3.foreach(a => println(a))

We can now try and see that it started to compile, and it prints “3” as expected. And if we replace y3 with n (which is a Nope), it doesn’t print anything (since there is no value to print).
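To see that this desugaring is purely syntactic, here is a minimal sketch with a class that has nothing to do with Perhaps. Countdown and collected are hypothetical names introduced only for this illustration; the one thing the for “loop” needs from the class is a foreach method.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical class (not part of the Perhaps hierarchy): it defines foreach and nothing else.
final class Countdown(from: Int) {
  def foreach(f: Int => Unit): Unit = {
    var i = from
    while (i > 0) {
      f(i)
      i -= 1
    }
  }
}

// The for "loop" below compiles only because Countdown has a foreach method;
// the compiler rewrites it to: new Countdown(3).foreach(i => seen += i)
def collected(): List[Int] = {
  val seen = ListBuffer.empty[Int]
  for {
    i <- new Countdown(3)
  } seen += i
  seen.toList
}
```

Calling collected() returns List(3, 2, 1). Removing foreach from Countdown brings back the same kind of "value foreach is not a member" error that we saw above.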


Next, we go less imperative and a bit more functional, and explore the form of for-comprehension with a yield keyword:

for {
  a <- y3
} yield a * a

We would expect it to create a new instance of YesItIs, containing the squared value of y3. But the compiler complains again: value map is not a member of YesItIs[Int]. Let’s add map to our class hierarchy:

sealed abstract class Perhaps[+A] {
  def foreach(f: A => Unit): Unit
  def map[B](f: A => B): Perhaps[B]
}

case class YesItIs[A](value: A) extends Perhaps[A] {
  override def foreach(f: A => Unit): Unit = f(value)
  override def map[B](f: A => B): Perhaps[B] = YesItIs(f(value))
}

case object Nope extends Perhaps[Nothing] {
  override def foreach(f: Nothing => Unit): Unit = ()
  override def map[B](f: Nothing => B): Perhaps[B] = this
}

The implementation is dead simple, again: on YesItIs, map computes the given function f on the contained value, and returns the result wrapped in a new instance of YesItIs (which might be of a different generic type B, depending on the return type of f). And for Nope, it returns just that — the same singleton instance of Nope.

Effectively, our for-comprehension with a yield desugars into this more basic code:

y3.map(a => a * a)

And, after we implemented map, we can see that our for-comprehension compiles and yields a value YesItIs(9), as we intended.

Note: I had to add a + sign before the generic parameter of Perhaps[A]. I had to do this because otherwise the compilation of Nope.map would fail with an error message:

Error:(17, 55) type mismatch;
found : Nope.type
required: Perhaps[B]
Note: Nothing <: B (and Nope.type <: Perhaps[Nothing]), but class Perhaps is invariant in type A.
You may wish to define A as +A instead. (SLS 4.5)
override def map[B](f: Nothing => B): Perhaps[B] = this

This addition of + declares to the compiler that if Y is a subtype of X, then Perhaps[Y] is a subtype of Perhaps[X]. This allows us to use the instance of Nope (which extends Perhaps[Nothing]) anywhere we need a Perhaps with any generic type. I will not go into the details of covariance in this post. I’ll just say that I really appreciate a compiler error message that not only tells what is wrong, but also tells how to fix it!
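The effect of the + can be checked with a few plain assignments. This is only a sketch: it uses a reduced copy of the hierarchy with the methods left out, and the val names are made up for the illustration.

```scala
// Reduced copy of the hierarchy from this post, with the methods left out.
sealed abstract class Perhaps[+A]
case class YesItIs[A](value: A) extends Perhaps[A]
case object Nope extends Perhaps[Nothing]

// Nothing is a subtype of both Int and String, so with +A
// Perhaps[Nothing] is a subtype of Perhaps[Int] and of Perhaps[String].
val noInt: Perhaps[Int] = Nope
val noString: Perhaps[String] = Nope

// The same rule applies to any other subtype relation, e.g. Int and Any.
val yesInt: YesItIs[Int] = YesItIs(3)
val asAny: Perhaps[Any] = yesInt
```

Without the +, the three assignments to noInt, noString and asAny are rejected with a type mismatch similar to the one quoted above.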


Next, let’s try to use the for-comprehension with more than one generator (i.e. more than one <- arrow):

for {
  a <- y3
  b <- y4
} yield a * b

We would like to get a value 12 (from 3 * 4) wrapped in YesItIs, but the compiler insists that value flatMap is not a member of YesItIs[Int]. Let’s implement it:

sealed abstract class Perhaps[+A] {
  def foreach(f: A => Unit): Unit
  def map[B](f: A => B): Perhaps[B]
  def flatMap[B](f: A => Perhaps[B]): Perhaps[B]
}

case class YesItIs[A](value: A) extends Perhaps[A] {
  override def foreach(f: A => Unit): Unit = f(value)
  override def map[B](f: A => B): Perhaps[B] = YesItIs(f(value))
  override def flatMap[B](f: A => Perhaps[B]): Perhaps[B] = f(value)
}

case object Nope extends Perhaps[Nothing] {
  override def foreach(f: Nothing => Unit): Unit = ()
  override def map[B](f: Nothing => B): Perhaps[B] = this
  override def flatMap[B](f: Nothing => Perhaps[B]): Perhaps[B] = this
}

It might seem unexpected, but the implementation of flatMap on YesItIs is slightly simpler than that of map: this time, we do not have to wrap the result of function f into YesItIs explicitly, because f already returns either a value wrapped in YesItIs, or Nope. And our for-comprehension now gets desugared into something a bit more complicated:

y3.flatMap(a => y4.map(b => a * b))
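The pattern extends to any number of generators: every generator except the last one becomes a flatMap, and the last one becomes a map. Here is a small sketch with three generators. To keep it self-contained it uses the standard library’s Option rather than our Perhaps, and the val names are made up for the illustration.

```scala
val first = Option(2)
val second = Option(3)
val third = Option(4)

// Three generators in one for-comprehension...
val sugared = for {
  a <- first
  b <- second
  c <- third
} yield a * b * c

// ...correspond to two nested flatMap calls and one innermost map.
val desugared =
  first.flatMap(a => second.flatMap(b => third.map(c => a * b * c)))
```

Both sugared and desugared evaluate to Some(24).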

Note that all the generators of a single for-comprehension must be of the same type. That is, you cannot mix our Perhaps with something else, like Scala’s Option, Try, Future, or collections. There simply is no way to “fuse” them together into something meaningful in the type system. The best you can hope for is chaining more than one for-comprehension with a yield (if this is what your use case really demands), for example:

import scala.util.Try

for {
  a <- y4 // YesItIs(4)
} yield for {
  b <- Try(100 / a)
} yield s"100/$a=$b"

Which results in YesItIs(Success(100/4=25)) and is equivalent to:

y4.map(a => Try(100 / a).map(b => s"100/$a=$b"))

Note the absence of flatMap, caused by the fact that these are two for-comprehensions, each having just a single generator.
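When the generators can be converted to one common type, a single for-comprehension becomes possible again. The following is a minimal sketch under the assumption that throwing away the exception is acceptable: it uses the standard library’s Option together with Try.toOption instead of Perhaps, and describeDivision is a hypothetical helper name.

```scala
import scala.util.Try

// Both generators are Options here: Try(...).toOption turns a Success into Some
// and a Failure into None (the exception itself is discarded).
def describeDivision(divisor: Option[Int]): Option[String] =
  for {
    a <- divisor
    b <- Try(100 / a).toOption
  } yield s"100/$a=$b"
```

With this helper, describeDivision(Some(4)) gives Some("100/4=25"), while describeDivision(Some(0)) and describeDivision(None) both give None. Whether losing the failure details is a fair price for the flatter code depends on the use case.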


A really cool feature of the for-comprehension is its support for filtering (by using the if keyword):

for {
  a <- y3
  if a > 1
  b <- y4
} yield a * b

Here, we would expect to get YesItIs(12), because the filter condition a > 1 is satisfied (or to get a Nope, if it weren’t). Again, the compiler tells us what’s missing: value withFilter is not a member of YesItIs[Int]. Let’s add what will be the final touch to our class hierarchy:

sealed abstract class Perhaps[+A] {
  def foreach(f: A => Unit): Unit
  def map[B](f: A => B): Perhaps[B]
  def flatMap[B](f: A => Perhaps[B]): Perhaps[B]
  def withFilter(f: A => Boolean): Perhaps[A]
}

case class YesItIs[A](value: A) extends Perhaps[A] {
  override def foreach(f: A => Unit): Unit = f(value)
  override def map[B](f: A => B): Perhaps[B] = YesItIs(f(value))
  override def flatMap[B](f: A => Perhaps[B]): Perhaps[B] = f(value)
  override def withFilter(f: A => Boolean): Perhaps[A] =
    if (f(value)) this else Nope
}

case object Nope extends Perhaps[Nothing] {
  override def foreach(f: Nothing => Unit): Unit = ()
  override def map[B](f: Nothing => B): Perhaps[B] = this
  override def flatMap[B](f: Nothing => Perhaps[B]): Perhaps[B] = this
  override def withFilter(f: Nothing => Boolean): Perhaps[Nothing] = this
}

Now, our nice and clean for-comprehension works, and it gets desugared to this not-so-elegant code (that we normally don’t get to see) under the hood:

y3.withFilter(a => a > 1).flatMap(a => y4.map(b => a * b))

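I have to admit that I have simplified things here slightly. For instance, our withFilter applies the condition right away, whereas withFilter on the standard collections is lazy: it returns a wrapper that applies the condition only when map, flatMap or foreach is called on it. The difference between filter and withFilter has some interesting details that are worth reading up on.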
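One of those details is that withFilter on the standard collections is lazy, and this can be observed directly. The sketch below counts how many times the condition is evaluated; lazinessDemo and the local names are hypothetical, chosen only for this illustration.

```scala
// Returns: (evaluations before map, mapped result, evaluations after map).
def lazinessDemo(): (Int, List[Int], Int) = {
  var evaluations = 0

  // withFilter does not build a new list; it only remembers the condition.
  val pending = List(1, 2, 3).withFilter { x =>
    evaluations += 1
    x > 1
  }
  val before = evaluations

  // The condition is evaluated only now, while map traverses the list.
  val result = pending.map(x => x * 10)
  (before, result, evaluations)
}
```

Calling lazinessDemo() returns (0, List(20, 30), 3): nothing is evaluated by withFilter itself, and the condition runs once per element during map. With filter in place of withFilter, the first number would be 3, because filter builds the intermediate list immediately.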


Another nice feature of the Scala for-comprehension is the assignment of expressions to named values that can be referenced later within the same for-comprehension:

for {
  a <- y3
  b <- y4
  c = a * b
} yield c

This time, we do not need to add anything to our Perhaps class hierarchy. The code above works out of the box, and it gets desugared into something like this:

y3.flatMap { a =>
  y4.map { b =>
    val c = a * b
    (b, c)
  }.map { case (b, c) =>
    c
  }
}

Assigning the results of expressions (even really trivial ones) to named values can really improve the readability of your for-comprehensions, even if you refer to such value only once (and especially if you can find a descriptive name for it).


At the beginning of this post I promised that we would not only look under the hood of the for-comprehension, but also see how to use it to make our code more elegant and easier to read. During my six years of using Scala, I’ve come up with a few rules of thumb for this, and I’ll try to write them down:

1. Prefer for-comprehension over the chain of map / flatMap / filter / withFilter in most cases (unless it is a single map or filter). Not only is the code in this form easier to read, it is also easier to extend it and to move things around. I hope it is apparent from the examples above. Note: you can usually simplify a chain of filter and map into one call of collect.
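As a quick illustration of the note about collect, here is a small sketch. The list and the val names are made up for the example; filter, map and collect are the standard collection methods.

```scala
val xs = List(1, 2, 3, 4, 5, 6)

// A filter followed by a map...
val viaFilterAndMap = xs.filter(x => x % 2 == 0).map(x => x * 10)

// ...can be written as one collect with a guarded case.
val viaCollect = xs.collect { case x if x % 2 == 0 => x * 10 }
```

Both values are List(20, 40, 60).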

2. Avoid the one line syntax of for-comprehension with parens (for (x <- xs)) even in the most simple cases. It is slightly shorter, but it brings inconsistency:

// DON'T:
for (x <- xs) yield x * x
// DO:
for {
  x <- xs
} yield x * x

3. Keep the yield expression as simple as possible. If it is a non-trivial expression, extract it into a named value(-s) (or a private method) with descriptive names. Never write a block expression ({ … }) after the yield — it hurts the eyes:

// DON'T:
for {
  x <- xs
} yield {
  val f = foo(x)
  val b = bar(f)
  f + b
}
// DO:
for {
  x <- xs
  f = foo(x)
  b = bar(f)
} yield f + b

4. Keep the right hand side of <- generators as simple as possible. If there is a non-trivial expression in it, extract it into a private method with a descriptive name or find some other way to simplify it. Never use a nested for-comprehension as a generator, as it causes headaches:

// DON'T:
for {
  square <- for {
    x <- xs
  } yield x * x
  doubleSquare = 2 * square
} yield doubleSquare
// DO:
for {
  x <- xs
  square = x * x
  doubleSquare = 2 * square
} yield doubleSquare

5. Do not put a for-comprehension into parens: such a powerful language feature deserves to stand on its own. Assign its value to a val or extract it into a private method.

// DON'T:
(for {
  x <- xs
} yield x -> x * x).toMap
// DO:
val xsWithSquares = for {
  x <- xs
} yield x -> x * x
xsWithSquares.toMap

6. Use pattern matching (I’ll write more about it in my upcoming post) on the left hand side of <- generators. E.g.:

// DO:
for {
  (key, value) <- someMap
} yield s"$key maps to $value"

In general, I find short, vertically flowing lines of code easy to read, as opposed to the almighty one-liners frequently found in Scala code. Also, I try to avoid accessing the fields of a Tuple directly (e.g. t._1, t._2) which, again, hurts the eye. Instead, give them nice names with the help of pattern matching, like in the code snippet above.
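To make the comparison with tuple accessors concrete, here is a small sketch. The contents of someMap and the val names are made up for the illustration.

```scala
val someMap = Map("a" -> 1, "b" -> 2)

// Direct tuple access: it works, but t._1 and t._2 say nothing about their meaning.
val withAccessors = someMap.map(t => s"${t._1} maps to ${t._2}").toList

// Pattern matching on the left hand side of <- gives both parts a name.
val withPattern = (for {
  (key, value) <- someMap
} yield s"$key maps to $value").toList
```

Both versions produce List("a maps to 1", "b maps to 2"). The parens around the second for-comprehension are there only to keep the sketch short; in real code I would follow rule 5 and assign it to a val first.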

I understand that everyone has a different taste and you might disagree with some of these guidelines. I suggest finding some obscure, oversized for-comprehension in your code base (I believe that most long-lived projects have them) and trying to apply these guidelines to it. I would really appreciate it if you could share the result, in the form of “before” and “after” the refactoring!


Here is the gist of the whole Perhaps class hierarchy and related code. The code has been written in Scala 2.11.8. You can play around with it in the Scala Worksheet of your IDE (IntelliJ IDEA, or Scala IDE for Eclipse), or online, by using Scastie. I haven’t used Eclipse in a while, but any of these Worksheet tools should be a nice way to learn and experiment with Scala code and to get nearly instant feedback.
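For quick experiments, the hierarchy can also be assembled from the snippets in this post. The following is a condensed sketch rather than the gist itself. The val names product, filteredOut and shortCircuited are made up for the illustration, and n is given the explicit type Perhaps[Int] (an assumption that differs from the val n = Nope used earlier) so that a * b type-checks inside the last for-comprehension.

```scala
sealed abstract class Perhaps[+A] {
  def foreach(f: A => Unit): Unit
  def map[B](f: A => B): Perhaps[B]
  def flatMap[B](f: A => Perhaps[B]): Perhaps[B]
  def withFilter(f: A => Boolean): Perhaps[A]
}

case class YesItIs[A](value: A) extends Perhaps[A] {
  override def foreach(f: A => Unit): Unit = f(value)
  override def map[B](f: A => B): Perhaps[B] = YesItIs(f(value))
  override def flatMap[B](f: A => Perhaps[B]): Perhaps[B] = f(value)
  override def withFilter(f: A => Boolean): Perhaps[A] =
    if (f(value)) this else Nope
}

case object Nope extends Perhaps[Nothing] {
  override def foreach(f: Nothing => Unit): Unit = ()
  override def map[B](f: Nothing => B): Perhaps[B] = this
  override def flatMap[B](f: Nothing => Perhaps[B]): Perhaps[B] = this
  override def withFilter(f: Nothing => Boolean): Perhaps[Nothing] = this
}

val y3 = YesItIs(3)
val y4 = YesItIs(4)
val n: Perhaps[Int] = Nope // explicit type, so that a is an Int below

// Two generators: flatMap + map.
val product = for {
  a <- y3
  b <- y4
} yield a * b

// A filter condition that is not satisfied: withFilter returns Nope.
val filteredOut = for {
  a <- y3
  if a > 3
  b <- y4
} yield a * b

// Starting from Nope: flatMap never calls the function.
val shortCircuited = for {
  a <- n
  b <- y4
} yield a * b
```

Here product is YesItIs(12), while filteredOut and shortCircuited are both Nope.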

And if you have enjoyed this post, stay tuned — I will follow with a similar post about Scala pattern matching soon.

Update: I’ve published the other post that I was promising — “Scala pattern matching: apply the unapply”.

Wix Engineering

Architecture, scaling, mobile and web development, management and more, written by our very own Wix engineers. https://www.wix.engineering/

Linas Medžiūnas

Written by

Software engineer at Wix; unretired competitive programmer; 2x IOI gold medal winner; curious about Quantum algorithms.
