Scala 3: Dependent Types, Part II
In the last post (which proved popular), I introduced dependent types and discussed interesting things you can with values as types, like val onePlusOne: 1+1 = 2
. In this post, I’ll visit some more sophisticated capabilities provided by dependent typing.
January 17, 2021 update: Chris Birchall pointed out to me that it’s not necessary to use
NSize[N]
in theDTList
example.N
will do. I added another recursive type onN
example instead.
For an even more concise summary of most of the notable changes in Scala 3, see my new Scala 3 Highlights page.
You can start reading the rough draft of Programming Scala, Third Edition on the O’Reilly Learning Platform. The first half or so of the chapters are available. I am refining them still, but any feedback is welcome!
For me, the most intriguing promise of dependent typing is greater safety when performing operations that transition some state in a precise way, whether a new object is returned or the existing object is mutated. I’ll focus on the former.
Disclaimer: I’m no expert on dependent typing. There are probably lots of naive mistakes in this post. I welcome feedback!
For example, if I append an item to a three-element list, a four-element list should be returned. Also, I should not be able to call head
or tail
on an empty list. Finally, wouldn’t it be great if the list remembered the type of each element, rather than inferring a least upper bound type for all the elements?
HList
in Shapeless has implemented features like these, even for Scala 2. Tuples in Scala 3 provide similar capabilities, too. Here is a taste of what you can do with tuples in Scala 3, where a new type, EmptyTuple
plays a similar role to Nil
for lists:
I won’t discuss the details, but hopefully the comments, output, and links to documentation will help you understand what’s going on.
Instead, let’s try creating our own dependently-typed list, DTList
. It will be “dependent” on the length of the list.
We saw the two imports, scala.compiletime.S
and scala.compiletime.ops.int._
, in my previous post. S
lets us compute the natural numbers given a 0
and application of the “successor” function S
repeatedly. The ops.int._
provide operations on integer types, which means they happen at compile time.
The specific value for the list size is used as a type N
. The corresponding value can be retrieved using the method valueOf[N]
, as used in the size
method. The successor of N
is computed with S[N]
or N+1
.
Like Scala’s List
type, a sealed hierarchy is used with base trait named DTList
. and two concrete implementing types, DTNil
, for empty lists (analogous to Scala’s Nil
for empty List
s), and DTNonEmptyList
, for all other lists.
DTList
has a type parameter, N
, corresponding to the size of the list. Recall that Scala List
(and other sequences) have a type parameter for the least upper bound of the elements. Hence, DTList
actually remembers the types of each element, analogous to how tuples retain this information.
DTList
has two methods. The first is the size
, which is exactly equal to N
. It leverages the method valueOf
to return the one and only one value corresponding to the type, e.g., 0
for type 0
, 1
for type 1
, etc.
The inline
keyword tells the compiler to inline this method and not create byte code to call the method. This is necessary because valueOf
is inlined. (It is defined in Predef
.)
The second method is the familiar +:
method for constructing a new list by prepending a new head to the list. By definition, the new list must be non-empty, so a DTNonEmptyList
is returned. Notice that +:
is parameterized by the head element’s type H
. Hence, we retain the type information for each element.
The empty list, DTNil
, is parameterized with type 0
.
The non-empty DTNonEmptyList
is parameterized by N
, the head type H
, and the tail type T
, which must be a subtype of DTList
. Note carefully the numeric type argument given to the parent DTList
, it is the successor of N
, meaning N+1
. So, the N
type parameter is actually one less than the actual size of the DTNonEmptyList
instance, so S[N]
is used to get the correct size. However, this value is never specified explicitly; the compiler will always infer it.
Okay, let’s try it!!
The list
is constructed very much like regular sequences. However, the type returned is quite detailed. Think of it this way: (1:Int, ("two":String,(DTNil:DTList)))
. It resembles a nested tuple pairs.
When we ask for the size
, we get 2
as expected. When we ask for the head
element, we get a properly-typed Int
of value 1
.
When we examine the tail
, we see its size
is now 1
, its head
is a String
, "two"
, and its tail is DTNil
, of size 0
.
When we attempt to call head
and tail
on list.tail.tail
, which is actually DTNil
, these methods don’t exist! Contrast this situation with Scala sequences, which define head
and tail
for all instances. This means that exceptions must be thrown for empty sequences. Specifically, Nil.head
and Nil.tail
throw java.util.NoSuchElementException
.
Hence, this dependently-typed list implementation gives us two very nice benefits. First, it remembers the type of each element, not just the least upper bound type for all of them. Second, methods that can never succeed are not defined. DTNil
does not even have the head
and tail
methods. Attempting to call them fails at compile time.
However, there are significant disadvantages. First, this approach is harder for mortals like me to understand and use. I spent a number of hours attempting to implement additional collection methods and found it impossible (for me at least) without resorting to a more advanced implementation approach like Shapeless uses for its HList
implementation.
Second, this approach is heavily biased towards building up structures statically at compile time, as opposed to more dynamically at runtime. Consider this attempt to build up a DTList
with a fold:
I define a regular Seq
of values, then attempt to fold over them to construct a DTList
. The first attempt fails because the type of DTNil
is DTList[0]
, which is invariant in N
. You can’t make it covariant or contravariant either, which means that the compiler can’t successfully type check our attempt to generate new lists of increasing sizes with each pass through the loop, since this doesn’t happen statically at compile time.
What sort-of works is to declare a mutable list of type DTList[?]
. Then using foreach
, I could mutate this instance to construct the list. However, as the subsequent lines reveal, l
doesn’t have useful type information, unless we downcast to something we know is correct. Yuck.
In contrast, a declaration like 1 +: "two" +: 3.3 +: DTNil
can be statically checked easily with no loss of type information.
Okay, so there are definite pros and cons here. I love the promise of dependent typing, but clearly it’s not an “obvious” replacement for the old ways, at least not yet. I’m looking forward to seeing how people leverage these tools.
The new match type feature is also an idea that looks very promising, but so far I’ve only seen a few interesting examples. Time will tell…
There are other new features in Scala 3’s type system. See this Dotty page for details. I’ll cover some of them in subsequent posts, but also move on to other topics in Scala 3.
Appendix — January 17, 2021
The original version of this post used a recursive match type, a special kind of type alias, named NSize[N]
. Match types are a new Scala 3 feature. That match type was unnecessary, as using N
is sufficient. So, to replace the discussion of NSize[N]
, let’s look at another recursive match type on N
.
Here is a match type that only allows values within a range of integers, inclusive:
Bounded
is a match type, meaning the type computation and recursion, in this case, are performed at compile time. It returns the Int
subtype between Min
Match types use case
clauses and they can be recursive, as in this example. The case clauses have restrictions, because they are evaluated at compile time. For example, guards (if
clauses) aren’t allowed.
Note how the recursion works. If the MAX == MIN
, MIN
is returned (also terminating the recursion). Otherwise, the union of MAX
and the type returned by Bounded[MIN,MAX-1]
is returned.
The examples that follow show values of 1
to 5
, inclusive, are allowed, but not 0
or 6
.
Here’s an example of what you see in the Scala REPL for one of the examples:
scala> val two: Bounded[1, 5] = 2
val two: 5 | Bounded[1, 5 - 1] = 2 // Note the union typescala> val two2: 2 = two
1 |val two2: 2 = two
| ^^^
| Found: (two : (5 : Int) | Bounded[(1 : Int), (5 : Int) - (1 : Int)])
| Required: (2 : Int)
The union type and 2
are not considered compatible, even though the same value of 2
can be used for instances of both types.
Next is a regular type alias, IndexOf[N]
, which is the special case of Bounded
for zero-based indices.
You might be tempted to write Bounded
as follows, but this doesn’t work:
type Bounded[MIN <: Int, MAX <: Int] <: Int = MAX match
case MIN => MIN
case MAX => MAX
case ? => Bounded[MIN,MAX-1]
It rejects all values except for 5
. This actually makes sense, because the second clause will simply short circuit the evaluation, becase MAX match ... case MAX
is obviously always true. Hence, we never evaluate the last clause for values greater than MIN
but less than MAX
, i.e., exclusive values.
How might Bounded
or IndexOf
be useful? While I haven’t figured out how to implement it yet, a DTList[N] { def apply(index: IndexOf[N]): ? }
method would only allow values for index
that are in bounds! Similarly, consider a methodsplitAt[I <: IndexOf[N]](index: I): (DTList[I], DTList[N-I])
. The types would enforce that the two returned sublists have the correct size.
For a concise summary of the notable changes in Scala 3, see my new Scala 3 Highlights page.
You can start reading the rough draft of Programming Scala, Third Edition on the O’Reilly Learning Platform. The first half or so of the chapters are available. I am refining them still, but any feedback is welcome!