Mindful Coding — Recursive Type Bound, Liskov ≠ Duck

Darren Wegdwood
7 min readFeb 2, 2020

--

In this post, we will explore one of the most complex concept in Java through First Principles. To risk a little bit of oversimplification, First Principle is a way of learning by breaking down a complex problem into its most fundamental form. At this level of reasoning, the only constraint is the limit of the language specification itself, rid of all external assumptions. It is a bottom-up approach of solving a problem by inspecting the form, instead of the behaviour of the element.

A non-FP way of learning is when you learn by applying assumptions. Sometimes the assumption manifests itself as an advice/solution from a Senior Engineer, or as a top voted answer on StackOverflow. While those are helpful when you need to fix a production bug, intentional learning is what separate a bug-fixer and someone who setup the house in such a way that bug would not exist.

This post assumes that you already have some basic knowledge of variance in generics, you are gonna need it.

Finding the Maximum Element in a Collection

This is a pretty trivial task in Java, most of us has dealt with this before. If it is a Comparable like Integer, all you have to do is call Collections.max, and you will get the maximum element.

Integer maxInt = Collections.max(integerList);

However, if we pass a class that is not a Comparable to the Max method, it simply won't work. We can test it with a simple Line class that doesn't implement Comparable.

Compiler will complain that Line is not a Comparable, since Collections.max takes a Comparable, it relies on the compareTo method of Comparable to find the maximum. We can solve this by making Line implements Comparable, and overrides compareTo method to compare using length.

Now if we run the code again, it will return the longest line in the list.

The Curious Case of Collections.max

Did you ever look intently into Java SDK or any third party library source code to see how things are implemented? Well you should, for several reasons:

  1. Knowing the implementation helps you to justify its usage, or lack thereof.
  2. Learn about best practices and how industrial standard library are designed.
  3. See practical application of textbook idioms and patterns.
  4. Expose yourself to new concept.

Curiosity makes you a better engineer. In general, curiosity makes you better at whatever that you do.

For Collections API, you don’t need to wander far before stumbling across something interesting. I pick Collections.max because it has an unnecessarily complex API, perfect for an unnecessarily long discussion. Also, it involves Comparable, which is the textbook example of Recursive Type Bound, along with Enum.

Look at this beast.

public static <T extends Object & Comparable<? super T>> T max(Collection<? extends T> coll)

If you’re not an experienced API designer, then WTF is the correct response.

But what’s more important is the reaction following the initial WTF. Are you intrigued, or do you close the tab? Most of us will dismiss it as something that doesn’t concern us. While that is mostly true, we could continue doing that and be mediocre, or we could put ourself together and do the hard thing.

Recursive Type Bound

The class declaration class Line implements Comparable<Line> is an interesting one. Implementing Comparable allows Line to be sorted/searched by Collections utility methods, but why not simply class Line implements Comparable? What's with the bounding of Line itself?

This is referred to as Recursive Type Bound — When a type is bounded by itself.

The compareTo method in Comparable shows that the argument T is the object to be compared against the calling instance.

So class Line implements Comparable<Line> is simply saying that Line is a class that can be compared with itself.

If we were to write class Line implements Comparable<String>, we are saying that Line can be compared to a String. While that would work in theory, it wouldn't make much sense. This is why when a class implements Comparable, it is usually bounded by itself. Example of such classes are boxed primitives, String, Enum, and File.

Reconstructing Collections.max

Armed with the knowledge of Recursive Type Bound, let’s further explore Collections.max.

To truly understand it, we will reconstruct Collections.max from its simplest form to build a Minimum Viable Product, then reason each component incrementally until we reach the final form.

A quick glance at OpenJDK’s implementation of Max shows that the implementation is quite trivial and straightforward. Basically, it loops through the collections, and sets the current maximum if compareTo returns a positive number. The remaining discussion will omit the implementation code and reason solely through the API, which is the method signature.

Minimum Viable Product

Following First Principle, we shall ask ourselves what is the minimum requirement that constitutes a Max method? It probably just need to take a Collection of Comparable, and return the max element.

However, due to the fact that generics are invariant, there are a few problems with this approach:

  1. The argument type is too restrictive. We can only pass in a Collection<Comparable<T>>, not even a Collection<Line> although Line is a subtype of Comparable<T>.
  2. In fact this API will not work, because we cannot compare between elements in the collection. With Comparable<T>, all we know is that it is a type that can be compared to some T. If we pass in a collection of class StyledLine implements Comparable<Line>, then they can only be compared to Line, but not with each other!
  3. For the same reason, this function can only return Comparable<T> instead of T.

To fix this, we need to assure the compiler that Max will only receive collection of a type that can be compared with itself, not just a type that can be compared with some T.

So, instead of taking a collection of Comparable<T>, Max should be taking a collection of T extends Comparable<T>. This is another fine example of Recursive Type Bound, it captures any type that can be compared with itself.

Notice that the definition of T is now moved to the type parameter declaration section, before the method return type, otherwise it won’t be valid syntax.

To recap,

  1. Comparable<T> refers to a type that can be compared with some T.
  2. T extends Comparable<T> refers to a type that can be compared with itself.

At this point, we have arrived at an MVP of the Max method. However, it is still pretty far from the original mutant that we have seen previously. It is time to explore some nuance and tragedy that made Collections.max what it is today.

Nuance — A Type that can be Compared with its Supertype

Suppose we have a new class called StyledLine. It inherits everything from Line, and adds a LineStyle field.

Since StyledLine is a subclass of Line, it is also a Comparable, however it doesn’t override compareTo because the comparison of StyledLine doesn't need to take LineStyle into consideration. For example, a dashed StyledLine, when compared to a Line or a dotted StyledLine, are considered equal as long as they have the same length.

So, we know that StyledLine can be compared with itself, or its super type, Line.

Dealing with this kind of class, our MVP version of Max method will break, because it only deals with type that can be compared with itself. If we try to pass in a List of StyledLine to Max, we will get a compilation error.

To work with classes like StyledLine, T extends Comparable<T> must be changed to T extends Comparable<? super T>.

Liskov Substitution Principle in Action

<T extends Comparable<? super T>> is contravariant. In this context, a type that can be compared with itself (Line) is a subtype of a type that can be compared with itself or its supertype (StyledLine).

Go back and read about variance if you haven’t already.

By doing this, we're applying LSP, specifically the concept of Contravariance in method argument. In another word, method argument can be replaced with its supertype. Max method is accepting more generous than it has to, it doesn't just support Line, but it goes way beyond to support StyledLine.

Counterintuitive yeah? LSP is a pretty deep philosophy with simple wording. Next time someone shows you that stupid duck picture, tell them about <T extends Comparable<? super T>>.

Tragedy— Compatibility with Pre-Generics Code

With Type Erasure, bounded generic type will be erased to its leftmost concrete type, so <T extends Comparable<? super T>> will be erased to simply Comparable. For the Max method that we have derived so far, the generated code after Type Erasure looks like this,

public static Comparable max(Collection<Comparable> coll) { ... }

However, Java before version 5 doesn't support generics, everything gets casted between type and Object. At that time, Collection was written to work with Object, it looks something like this,

public static Object max(Collection coll) { ... }

How to ensure backward compatibility?

So they fix™️it by exploiting the fact that,

  1. Generic type parameter is erased to its leftmost type, and
  2. Multiple Bound, e.g. <T extends A & B>, T will be erased to A.

By adding Object as the first bound, T will be erased to Object instead of Comparable, thus fully compatible with old code, while providing compile-time type safety check for generics.

public static <T extends Object & Comparable<? super T>> T max(Collection<T> coll) { ... }

After compilation, it will generate a signature as if it was written in 2000.

Finally

Remember PECS? In general, if a method only read from a collection, it is good practice to bind the collection type parameter with extends. Since Max is only reading from the collection, Collection<T> should be changed to Collection<? extends T>, so that Max can also work with any subtype of T.

Although, in this particular method it doesn’t really matter, because the boundary of T is very clearly defined in <T extends Object & Comparable<? super T>>. This seemingly unnecessary additions is there for the sake of API consistency.

There we have arrived at the original Collection.max signature, what a ride!

Summary

  1. Comparable<T> — a type that can be compared with some T.
  2. T extends Comparable<T> — a type that can be compared with itself.
  3. T extends Comparable<? super T> — a type that can be compared with itself, or its super type.
  4. T extends Object & Comparable<? super T> — Same as previous point, but type erased to Object for backward compatibility.
  5. Collection<? extends T> — a read-only collection.

To put it together,

public static <T extends Object & Comparable<? super T>> T max(Collection<? extends T> coll)

Collections.max accepts a read-only collection of a type that can be compared with itself, or its super type. It also preserve backward compatibility.

Follow me @ https://twitter.com/darrenbkl

--

--