The Value of Parametric Polymorphism
I’m giving a talk at Scale by the Bay, October 28th, on Lessons Learned from 15 Years of Scala in the Wild.
Programming Scala, Third Edition is now available. It provides a comprehensive introduction to Scala 3 for experienced Scala developers, as well as a complete introduction to Scala for new Scala developers.
The other day, I tweeted the following:
A friend replied that I should provide an example, so here you go… I’ll use Scala 3 syntax, but the arguments apply equally well to Scala 2, to Java, and most other statically-typed languages. This repo has the Scala 3 and Java examples. For brevity, I’ll just discuss the Scala 3 example.
While I’ll focus on the boilerplate reduction aspect, this post ends with thoughts about the real “superpower” of parametric polymorphism, which might be less familiar to many of you.
First, what is parametric polymorphism? You probably already understand subtype polymorphism, the object-oriented notion of polymorphism, where an abstraction is defined, say with Java interface or Scala trait, then multiple concrete implementations provide the specific details that vary for each of those types. This is one of several kinds of polymorphism that I discuss in Programming Scala, Third Edition. Parametric polymorphism (also discussed at length in the book) is today’s subject, where a type parameter is the abstraction and implementations are created (by you or by the compiler deities) for specific implementations.
For example, suppose you didn’t trust the
size method for Scala collections 🤓. Here is a trivial example of a parametrically-polymorphic method alternative:
Starting scala3 REPL...
scala> def len[T](xs: Seq[T]): Int = xs.foldLeft(0)((count, _) => count + 1)
def len[T](xs: Seq[T]): Intscala> len(Seq(1,2,3,4,5))
val res0: Int = 5
We obviously don’t care about the type of elements. You could even rewrite the method signature as
def len2(xs: Seq[?]): Int and the same implementation would still work. I’ll show a more interesting example at the end, but the point here is that I specify a type parameter
T and the method works for any
Seq[T] I provide. I don’t need separate methods for
Okay, now suppose a define some messages that can be sent between subsystems, where the receiver of a message needs to handle it properly. First, here are the “incoming” and “outgoing” message types:
Why didn’t I use the new fancy
enum feature in Scala 3? For details I won’t go into here, there’s a slight disadvantage in the way typing works for
enums vs. the “classic” approach shown here. (Also, we wouldn’t be able to support adding new
IncomingMessage types with
enums). See the repo branch
with-enums and the code comments there for the details of the typing difference and the additional code required if you use
Okay, here is the first version (
v1) of how we might work with these messages. First, I need message handlers:
I start with a simple trait that defines the handler abstraction, then I define instances for each kind of
IncomingMessage. Yes, the
apply method bodies are all identical and I could have just provided a concrete
MessageHandler.apply body. Realistically, the bodies won’t be identical. I’ll come back to this below.
Now let’s use the implementation:
Processor.apply has to pattern match on the message and call the write one. It’s a bit tedious, but it is straightforward to implement and to read when you come back to the code later. That’s a virtue we shouldn’t ignore, even though we’ll eliminate the need for this pattern matching in a moment.
If you start
sbt and run it, entering
1 at the prompt, here’s what you get:
sbt:parametric-polymorphism> runMultiple main classes detected. Select one to run:
 polyglotprogramming.messaging.v4java.MainEnter number: 1
[info] running polyglotprogramming.messaging.v1.Messaging
Received message: Start(start)
Succeeded(Successfully processed Start(start))
Received message: DoWork1(dowork1)
Succeeded(Successfully processed DoWork1(dowork1))
Received message: DoWork2(dowork2)
Succeeded(Successfully processed DoWork2(dowork2))
Received message: Stop(stop)
Succeeded(Successfully processed Stop(stop))
Now let’s use parametric polymorphism to reduce the size of this code, then discover other benefits. This is
v2. The messages are the same. For the moment, I’ll go ahead and use a single implementation of
MessageHandler.apply, then return to the general case where that won’t work (in
It’s drastically smaller. We still need concrete instances for each
IncomingMessage type, but now we can use concise
Here is the updated
Processor and an identical
Processor.apply works for any type
IM that is a subtype of
message argument is typed specifically to be
IM, too. The “secret sauce” is the addition of a
using parameter list where a corresponding
MessageHandler[IM] must be in scope. We imported the
given instances we need using the second
import statement on line 3. Not only have we eliminated some boilerplate (it would be even more noticable in a less trivial example…), but we have also eliminated the need to match in
How can we handle different
MessageHandler.apply implementations? The Template Method Pattern is our friend. Here’s our
MessageHandler.apply is still concrete and I’ve now declared it to be
final, to emphasize that users shouldn’t override a concrete method, because it’s very easy to break its “contract”. Instead, the protected
process method is declared for variations in handling. There are also two protected helper methods,
given instance only needs to define what’s unique about the way it handles messages, by implementing
process. Now, I’ll assume that all handling of
DoWork2 messages fail. For all four implementations of
process, a unique “details” string is returned.
I won’t show the corresponding
v3/Main.scala, as it identical to the
v2 implementation, except for the package name and import statement.
This code uses parametric polymorphism to define uniform handling for all allowed subtypes of
IncomingMessage, while providing the hooks required for implementing variant behavior.
Finally, let’s see how a new
IncomingMessage type can be supported (
v4/MessageHandlers.scalais identical to the
v3 implementation, except for the package name. Here’s the new
Processor is unchanged. It continues to work as long as the input
handler have the correct types. A new message type
DoWork3 is declared along with a corresponding
given instance for it.
Messaging() adds an instance of
DoWork3 and everything works as before! Note the comment; if you forget to define a matching
given instance, you’ll get a compile time (vs. runtime) error.
I eliminated all the original boilerplate code using parametric polymorphism, while preserving the flexibility necessary for variant behavior, using the Template Method Pattern (perhaps my favorite Gang of Four pattern). Yes, the simple example wasn’t that verbose to begin with, but hopefully you can see how this could make a real difference in many real-world problems.
I also eliminated the pattern matching that was required in
v1, along with the need to modify that code if and when a new message type is added! In
v4, I exploited the “pluggable” nature of
Producer. All I needed besides the new
IncomingMessage type was a
given instance of a corresponding
MessageHandler. So, it is easy to add new message types, but it’s also robust against an easy mistake to make. A compile-time error is thrown if I forget to add a
given instance of a
It’s not quite as easy to do parametric polymorphism in Java and you don’t have
given instances and
using clauses to keep the code in
Messaging() as clean and concise, but you can still leverage these ideas. See the Java implementation I provided in the repo. (My Java is rusty; PRs for improvements are welcome!)
The Real “Superpower” of Parametric Polymorphism
Okay, this post was inspired by the need to keep code as concise and still robust as possible and how parametric polymorphism gives you powerful tools for this purpose. However, there is really an even more important “superpower” that parametric polymorphism gives us.
Remember the “trivial”
len method I showed at the beginning? Here’s the signature again, this time with a generic name and omitting the body:
def foo1[T](xs: Seq[T]): Int
Hmm. What if I didn’t have parametric polymorphism and I wrote something like this instead:
def foo2(xs: SeqOfInt): Int // SeqOfInt is a sequence of integers
Forget the fact we started with
len and ask yourself, what are the possible, “legal” implementations for each of these methods?
foo2 could be len, but also sum, multiple, median, and others, whereas there is really only one possible implementation for
foo1, the length of the collection.
By being more abstract in the types of the elements, we constrain the allowed implementations.
The real power of parametric polymorphism is the way it makes our type and method signatures much more precise, even while they become more flexible for use with different types. It becomes much harder to write giant hairball methods that mix all kinds of functionality, creating a morass that’s hard to understand, test, debug, and reuse. It brings greater discipline and precision to our designs.
All that said,
foo1) is kind of trivial. Here’s a more sophisticated case that still leverages the same power. What if the collection elements are “numeric”, i.e.,
Numeric[T] exists for the element type
T? Here’s how we can multiple any such sequence of elements:
scala> def multiply[T : Numeric](xs: Seq[T]): T =
| xs.reduceLeft((x1,x2) => summon[Numeric[T]].times(x1,x2))
def multiply[T](xs: Seq[T])(using evidence$1: Numeric[T]): Tscala> multiply(Seq(1,2,3,4,5))
val res0: Int = 120scala> multiply(Seq(1.1,2.2,3.3,4.4,5.5))
val res1: Double = 193.26120000000003
Now the type parameter is constrained by a context bound. (I discussed context bounds and what
summon does here.) There must exist a
Numeric[T] for any concrete type
T we use. These are predefined in the Scala library for
Double . You can define your own
Numeric[T] for your own types, like
Change the method name to
foo again and ask yourself, what are the possible implementations allowed by the method signature? In this case, it’s not just one implementation, but the defined operations that
Numeric[T] provides where the corresponding methods return
T: addition, subtraction, multiplication, etc.
This is just the tip of the Iceberg. You can find plenty of references on the web that expound upon this idea. See for example, the great talk by Rúnar Bjornasson called Constraints Liberate, Liberties Contrain and the book he cowrote with Paul Chiusano called Functional Programming in Scala. (A second edition is under development.)