Two new features in Scala 3’s type system are intersection and union types. They embrace set theory more explicitly to improve the soundness of the type system, where a type can be considered a set and the instances of it are the members of the set.
December 29, 2020 and January 7, 2021. Fixed mistakes in the discussion of subtyping for intersection types and the distribution laws. Thanks to James Warren , Sean Kim, and Pascal Mengelt for catching my mistakes.
Let’s start with some definitions we’ll use as examples:
All the types derive from
M and all define or override a method
m, including calls to the parent’s
m. Let’s declare some instances in the Scala 3 REPL. If I used curly braces instead of significant indentation for the previous type definitions, they would work for Scala 2 as well. The following code works for both Scala 2 and 3:
I’ll come back to the assertions and calls to
m below. Note the types for
c21. Both are intersection types. In Scala 2,
c12 would have the type
C with T1 with T2. Note that you can’t write
val c12 = new C & T1 & T2; the
with keywords are still used.
Going back to set theory, what does
C & T1 & T2 actually mean? A valid instance, like
c12, must be a member of the set of all instances for type
C and a member of the set
T1 and also of
T2. This works like it always does in object-oriented programs. A variable of type
C can point to
c12. So can a variable of type
T1, etc. In fact, all of the following are valid:
Now here’s why I added the
m methods and invocations. In set theory, the intersection commutes,
A & B == B & A. The same is true for Scala 3 intersection types.
c12 is considered to have the same type as
c21. All the following permutations are valid:
However, the assertions above illustrate that the order of invocation of
super.m depends on the order the traits were mixed in. It’s roughly right to left preference. The full algorithm for resolving precedence over a hierarchy is called linearization and Scala 2 used it, as well. I discuss it in the book, but won’t go into more details here. The important point is that the results returned by
c12.m differ from the results of
c21.m. Despite their type equivalence, they don’t necessarily behave the same way.
Intersection types are also associative:
(A & B) & C == A & (B & C).
What about subtyping? If
A <: B, then
A & A2 <: B. for any
A2. To be clear, this means
(A & A2) <: B. Note that subtyping
A <: B means that all of
A’s instances are part of
B’s set of instances. The intersection with
A2 reduces this subset further, but doesn’t remove any of the instances from
A <: B1 and
A <: B2, then
A <: B1 & B2. Think about the examples above to see why this works.
What about covariant subtyping? Suppose that
C[A] is covariant in
Seq[A], for example). In this case,
C[A & B] <: C[A] & C[B]. Consider this example:
You can see this by thinking about set membership. Since any
T1 & T2 element is also a
list1 should make sense. Both
c21 are subtypes of
T1. Similarly for
list3 takes a bit more thought, but again this is the union of two sets, so we’re saying that a sequence assigned to
list3 must only have elements that are instances of
T1 and instances of
T2, which is true for both
Intersection types expand on and improve the formalism of the familiar mixin model from Scala 2. You still use
with to construct types but the actual type is an intersection type.
Union types are closer to the
Consider the following definitions, which use a union type in the idiomatic way that
Either[A,B] is used for returning errors on the left or successful results on the right:
As the comment says, if you declare a variable that’s initialized with a sequence of
result, but you don’t provide a type annotation, Scala infers
Seq[Object] , the normal OO behavior we’re accustomed to. (It’s a quirk of the current Scala 3 implementation that the inferred type uses
Object instead of
AnyRef; they are essentially equivalent.) However, you can now explicitly annotate
seq to say the sequence must contain instances of
Bad. This is better than
AnyRef, because it’s more specific.
How do you work with these values?
In Scala 2,
work would return
Either[Bad|Good] and the conditional would return
Note that you’ll typically need to use pattern matching to determine what you actually have. You can do this with
Either, too, but
Either also gives you operations like
flatMap, etc. which union types don’t provide. However, on the plus side, union types aren’t limited to just two types, so they will find broader use than just as an error handling idiom.
Returning to set theory, note that
work says it will return an instance from the set of all
Good instances or from all
Like intersection types, union types commute, so
Good | Bad is equivalent to
Bad | Good and they are associative,
A | ( B | C) == (A | B) | C.
Subtyping also works similarly;
A <: A|B and if
A <: C and
B <: C, then
A|B <: C.
Distributive Law for Intersection and Union Types
You might recall the distributive law for multiplication and addition:
A * (B + C) == (A * B) + (A * C).
Union and intersection types also obey distributive laws for booleans:
A & (B | C) == (A & B) | (A & C) and
A | (B & C) == (A | B) & (A | C)
Study the following declarations and convince yourself they are true, using concrete examples as needed:
Covariance vs. Contravariance
We saw above how intersection types work with covariant parameterized types, like
Seq[A]. What about union types?
If you look at the contents of
t123s, it should be clear why
t3s can’t work. They declare that they will only ever have types
T3, respectively, but clearly that’s not true of
In fact, union types work contravariantly with covariant parameterized types:
This should make more sense when you consider the examples. The last three collections can hold any of
T3, but in each case, they are assigned sequences that only have one of those three types, which is fine.
What about contravariant parameterized types?? Functions are our best examples of types that are contravariant in some of the type parameters. In this case, it’s the types for the function arguments, e.g.,
Function2[-A1, -A2, +R] == (A1, A2) => R. (I explain in the book why these types must be contravariant. It has to do with the Liskov Substitution Principle.)
So, consider the following… carefully:
f123a method does the pattern matching we discussed above, returning a
Here’s the brain bending part; the type of
f123a is identical to
f123b. The rest of the example just proves it works. (Again, if it compiles, it must be true, right??)
Here’s where it helps to think of a set as a constraint on what’s allowed to be a member. The type of
f123a says “I am able to accept an instance of either one of the three types, then I’ll return a string.” The signature of
f123b says “I’m very particular; the only function you can assign to me must be able to accept a
T1 and return a
String, and accept a
T2 and return a
String, and accept a
T3 and return a
String. The only kind of function we can write like that, which is a single function satisfying these constraints, is a function that takes an argument of type
T1 | T2 | T3, meaning it can handle instances of any of these three types.
We ended on a rather heavy note, but I hope most of this post made sense. Basing intersection and union types on set theory makes the type system more sound. There will be fewer “divergent expansions” that sometimes happened with Scala 2. I think they also provide more options for typing, yet remain mostly intuitive, when you recall the basics of set theory (or at least work through some concrete examples.)
I’ll continue our tour of new type system features in several more posts.
For an even more concise summary of most of the notable changes in Scala 3, see my new Scala 3 Highlights page.
You can start reading the rough draft of Programming Scala, Third Edition on the O’Reilly Learning Platform. The first half or so of the chapters are available. I am refining them still, but any feedback is welcome!