Wildcards in Java generics (Part 3/3)
Contravariance: Why is Consumer<Animal> a subtype of Consumer<Cat>?
This post wraps up the blog post series on the three types of variance (invariance, covariance and contravariance). Java offers us a mechanism for distinguishing between these three cases. You can check out invariance and covariance in our previous posts, Part 1 and Part 2 respectively. This third and final part discusses contravariance.
1) Knowing what invariance, covariance and contravariance entail is helpful (Part 1 and Part 2)
2) Definition of subtype: If an object of class Animal can be seamlessly replaced with an object of class Cat, then Cat is (or at least, should be) a subclass of Animal.
3) Point #2 gets fuzzier when generics enter the picture. The whole point of variance is about what happens when we deal with complex types (types that are dependent on some other types, mostly in generics). Sometimes, GenericType<Animal> is a supertype of GenericType<Cat>. This is called covariance.
Sometimes, they have nothing to do with each other. This is called invariance.
And finally, sometimes, GenericType<Animal> is a subtype of GenericType<Cat>, however reversed that may sound. It feels really counter-intuitive, since Animal is a supertype of Cat, that some GenericType<Animal> is a subtype of GenericType<Cat>. Weird, right? Well, that’s contravariance, and you may now move on to other Medium articles if this seems blatantly obvious to you. If you want the details, continue below.
Consider the following take on cats, dogs, and animals in the evergreen subtyping example below:
All animals have a name and are able to speak. The name is a string kept in the base class Animal, and String speak() is an abstract method that is overridden in both Cat and Dog:
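A minimal sketch of such a hierarchy, assuming nothing beyond what the paragraph above describes. The wrapper class AnimalsDemo, the getName() accessor and the sounds returned by speak() are illustrative choices, not taken from the original listing.

```java
// Illustrative hierarchy: names other than Animal, Cat, Dog and speak() are assumptions.
class AnimalsDemo {

    // Base class: keeps the name and declares the abstract speak().
    static abstract class Animal {
        private final String name;

        Animal(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }

        abstract String speak();
    }

    static class Cat extends Animal {
        Cat(String name) {
            super(name);
        }

        @Override
        String speak() {
            return "Meow";
        }
    }

    static class Dog extends Animal {
        Dog(String name) {
            super(name);
        }

        @Override
        String speak() {
            return "Woof";
        }
    }

    public static void main(String[] args) {
        // A Cat can stand in wherever an Animal is expected.
        Animal animal = new Cat("Tom");
        System.out.println(animal.getName() + " says " + animal.speak()); // prints "Tom says Meow"
    }
}
```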
Contravariance is where the fun with inheritance in Java begins. Generics in Java are, by default, invariant. They can be, and commonly are, logically contravariant. Let’s imagine an inverse world where some GenericType<Animal> should be a subtype of GenericType<Cat> despite Animal being a supertype of Cat.
In Java, GenericType<Cat> and GenericType<Animal> have absolutely nothing to do with each other due to invariance. To force generic types in Java to be assignable to each other based on the generic parameter, we need to use the wildcard. But bear with me and really try to imagine an upside-down world where GenericType<Animal> is a subtype of GenericType<Cat>.
For this to happen in Java, we need to use the contravariance generic operator: <? super Cat>. And just to make things even easier, let’s use List<> as our dummy generic type of choice.
Now: A reference of List<? super Cat> can point to List<Cat>, List<Animal> or List<Object>. Really backwards. Especially when you consider the definition of subclasses:
If a reference to an Animal can be seamlessly replaced with a reference to a Dog, then Dog is a subclass of Animal, and Animal should be (and is) a superclass of Dog.
Here is the code:
Note how List<? super Cat> baseList [line 2] can be assigned a reference to List<Cat> [line 11], but also List<Animal> [line 7].
It is kind of weird, but since this compiles, it follows that List<? super Cat> is a supertype of List<Animal>. Crazy, right?
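A minimal sketch of these assignments, assuming bare-bones Animal, Cat and Dog classes. The wrapper class SuperWildcardDemo is hypothetical, and the lines that the compiler rejects are left as comments.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: which lists a List<? super Cat> reference may point at.
class SuperWildcardDemo {
    static class Animal {}
    static class Cat extends Animal {}
    static class Dog extends Animal {}

    public static void main(String[] args) {
        List<? super Cat> baseList;

        baseList = new ArrayList<Cat>();      // compiles
        baseList = new ArrayList<Object>();   // compiles: Object is a supertype of Cat
        baseList = new ArrayList<Animal>();   // compiles: Animal is a supertype of Cat
        // baseList = new ArrayList<Dog>();   // does not compile: Dog is not a supertype of Cat

        baseList.add(new Cat());              // adding a Cat is safe for every list above
        Object first = baseList.get(0);       // reading only yields Object
        System.out.println(first instanceof Cat); // prints "true"
    }
}
```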
So why is this important?
Well, through a contravariant list reference we can add any Cat (or any Cat subtype), no matter which list the reference actually points at. The “worst” that can happen is that our base list of type List<? super Cat> points at List<Object>, but that is no problem because a Cat can always be added to a List<Object>. This may be useful if we want to, let’s say, add Cats into all types of different rosters, some of which may be Cat-specific, some of which may be Animal-specific and some of which may be Object-specific.
We can generalize the code for adding Cats to any roster by using List<? super Cat> [line 1].
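As a rough sketch of that generalization: the method name enroll, the wrapper class RosterDemo and the roster variables are hypothetical, and the bracketed line reference above does not refer to this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: one method that adds a Cat to any roster able to hold it.
class RosterDemo {
    static class Animal {
        final String name;

        Animal(String name) {
            this.name = name;
        }
    }

    static class Cat extends Animal {
        Cat(String name) {
            super(name);
        }
    }

    // Accepts List<Cat>, List<Animal> and List<Object> alike.
    static void enroll(List<? super Cat> roster, Cat cat) {
        roster.add(cat);
    }

    public static void main(String[] args) {
        List<Cat> catRoster = new ArrayList<>();
        List<Animal> animalRoster = new ArrayList<>();
        List<Object> everythingRoster = new ArrayList<>();

        Cat tom = new Cat("Tom");
        enroll(catRoster, tom);
        enroll(animalRoster, tom);
        enroll(everythingRoster, tom);

        System.out.println(catRoster.size());        // prints 1
        System.out.println(animalRoster.size());     // prints 1
        System.out.println(everythingRoster.size()); // prints 1
    }
}
```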
Lists recap: If we want to have a base pointer for Lists of a type hierarchy, we should consider 2 scenarios:
a) (Covariance) We want to read/delete/sort the elements, but not add any new ones, in which case we need to use a base list pointer of type List<? extends Animal>. This is explored in detail in the previous post. This means that, in the worst-case scenario, this list contains Animals. Note how we can now be sure that we are getting an animal, but we cannot be sure if it is an instance of Animal, Cat or Dog. Moreover, we cannot be sure if this is coming from List<Cat>, List<Dog> or List<Animal>. Therefore, the compiler does not let us add elements through this reference (other than null).
b) (Contravariance) We want to add elements of a specific type (Cat, in our examples) but don’t want to read the elements. In the contravariance scenario, we only know that the list's element type is somewhere between Cat and Object. Hence, every time we need to use an object from that list, we get it back as Object and would need to downcast it to the type we hope it is. Not really useful, and, really, easier said than done.
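The two scenarios can be put side by side in one small sketch. The method copyAll and the wrapper class RecapDemo are hypothetical names used only for illustration: the source list is read through a covariant reference, the target list is written through a contravariant one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: read through "? extends", write through "? super".
class RecapDemo {
    static class Animal {
        final String name;

        Animal(String name) {
            this.name = name;
        }
    }

    static class Cat extends Animal {
        Cat(String name) {
            super(name);
        }
    }

    static void copyAll(List<? extends Animal> source, List<? super Animal> target) {
        for (Animal animal : source) { // reading: every element is at least an Animal
            target.add(animal);        // writing: the target accepts any Animal
        }
        // source.add(new Cat("Tom")); // does not compile: source might be a list of some other subtype
        // Animal a = target.get(0);   // does not compile: reads from target only yield Object
    }

    public static void main(String[] args) {
        List<Cat> cats = Arrays.asList(new Cat("Tom"), new Cat("Garfield"));
        List<Object> everything = new ArrayList<>();
        copyAll(cats, everything);
        System.out.println(everything.size()); // prints 2
    }
}
```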
The caveats of both a) and b) can be worked around if we somehow keep the knowledge of which elements are of which type in the lists, but this defeats the purpose of having both a base pointer, and the whole polymorphism thing we OO programmers all cherish and love. Moreover, contravariant lists make almost no sense. A much more useful and natural example of contravariance is the Consumer<T> interface.
So why is this important (vol2.), with an unforced example?
Quick throwback: Remember the definition of a subtype/supertype relationship? If the reference to an Animal can be seamlessly replaced with the reference to a Dog, then Dog is a subclass of Animal, and Animal should be (and is) a superclass of Dog.
Let’s translate that definition to Consumer<T> (action). In layman terms, we can map Java’s Consumer<T> closest to the expression of “something being done to T”.
A Cat is an Animal, and everything that is being done to an Animal, can also be done to a Cat*. If we have some action to do on an Animal (Consumer<Animal>) we can surely do the same action to a Cat (Consumer<Cat>) right?
That would mean that Consumer<Animal> can replace any Consumer<Cat> just fine.
Try it yourself: Anything done to an animal can be done to a cat? Remember that “done to a/an X” can be replaced with Consumer<X>. Well, according to the definition that “Consumer<Animal> can replace any Consumer<Cat> just fine”, then Consumer<Cat> is (and should be) a superclass of Consumer<Animal>. The inheritance is naturally flipped. Crazy.
Let’s get back to some examples.
If we want to print the name of all the cats in a list, we really don’t need a Consumer<Cat>. A Consumer<Animal> will do, since the name of the animal is kept in the base class.
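A small sketch of this, assuming an Animal class with a getName() accessor. The names PrintNamesDemo and collectNames are hypothetical. Iterable.forEach is declared to take a Consumer<? super T>, which is why a Consumer<Animal> is accepted for a List<Cat>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Illustrative only: a Consumer<Animal> handling a list of Cats.
class PrintNamesDemo {
    static class Animal {
        private final String name;

        Animal(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    static class Cat extends Animal {
        Cat(String name) {
            super(name);
        }
    }

    // Gathers the names of all cats using a consumer written for Animals.
    static List<String> collectNames(List<Cat> cats) {
        List<String> names = new ArrayList<>();
        Consumer<Animal> recordName = animal -> names.add(animal.getName());
        cats.forEach(recordName); // forEach takes Consumer<? super Cat>
        return names;
    }

    public static void main(String[] args) {
        List<Cat> cats = Arrays.asList(new Cat("Tom"), new Cat("Garfield"));

        Consumer<Animal> printName = animal -> System.out.println(animal.getName());
        cats.forEach(printName); // prints "Tom" then "Garfield"
    }
}
```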
Note: I stick to the very liberal use of “IS” when talking about something being a subclass. Example: Under some conditions, GenericType<Something> SHOULD BE/IS a super/sub class.
Well, that kinda depends on the language implementation. Some things that are a super/sub class in theory, may not share that inheritance relationship in a specific language.
By default, generics in Java are invariant; Consumer<Dog> has nothing to do with Consumer<Animal>. However Consumer<? super Dog> has everything to do with both Consumer<Animal> and Consumer<Dog>, and it complies with the counter-intuitive contravariance situation we have going on.
Even though logically, Consumer<Dog> SHOULD BE a base class of Consumer<Animal>, they are invariant.
Instead, in Java:
Consumer<? super Dog> IS indeed a base class of Consumer<Animal> and this is exactly what should be used when dealing with Consumer<>.
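A short sketch of the difference between the invariant and the wildcard form. The class ConsumerVarianceDemo and the method applyTo are hypothetical, and the assignment that the compiler rejects is left as a comment.

```java
import java.util.function.Consumer;

// Illustrative only: Consumer<Dog> is invariant, Consumer<? super Dog> is not.
class ConsumerVarianceDemo {
    static class Animal {
        private final String name;

        Animal(String name) {
            this.name = name;
        }

        String getName() {
            return name;
        }
    }

    static class Dog extends Animal {
        Dog(String name) {
            super(name);
        }
    }

    // Accepts Consumer<Dog>, Consumer<Animal> and Consumer<Object>.
    static void applyTo(Dog dog, Consumer<? super Dog> action) {
        action.accept(dog);
    }

    public static void main(String[] args) {
        Consumer<Animal> greetAnimal = a -> System.out.println("Hello, " + a.getName());
        Consumer<Dog> greetDog = d -> System.out.println("Good dog, " + d.getName());

        // Consumer<Dog> invariant = greetAnimal;     // does not compile: generics are invariant
        Consumer<? super Dog> flexible = greetAnimal; // compiles thanks to the wildcard

        Dog rex = new Dog("Rex");
        flexible.accept(rex);      // prints "Hello, Rex"
        applyTo(rex, greetDog);    // prints "Good dog, Rex"
        applyTo(rex, greetAnimal); // prints "Hello, Rex"
    }
}
```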
So why is this important (vol3.), with a real-life example from omni:us?
Here at omni:us, we deal with a lot of files. One of the ways we do that is via Java’s streams. Sometimes we get those streams from a file system, sometimes from a storage bucket, sometimes from a database blob. And when, let’s say, we are logging the number of processed files or just copying them from one place to another, we don’t really care where they come from. But when we are trying to count the bytes in it, decode it, or log implementation-specific performance, we really need to know the underlying implementation. In (very, very simplified) code, the concept looks like these two implementations:
A non-generic one:
A generic one:
Let’s say we need to process files with the following 3 steps (method: void ingest(String) across the 2 implementations):
1) Get the file from some storage [line 17] and save a copy [line 19]
2) Run dark AI magic [line 21]
3) Carry out post-processing stuff. Up to the caller [line 22].
As we can see, input streams come with a lot of boilerplate about resource management that we would like to abstract away. We want to keep those parts generic. We do, however, want to keep the creation of the stream and the post-processing part up to the caller. The problem is that sometimes, this step might be implementation-specific. Here are two examples of how we would use the Ingester:
So if we stick with the ClassImplementationLossIngester, we lose the option to do implementation-specific logging in the post-processing step ([line 11] does not compile).
If we, however, choose the second way with the GenericIngester, as well as the power of contravariance, we open up the possibility to post-process specific implementations [line 16], as well as generic ones [line 21].
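The idea can be sketched roughly as follows. This is a simplified illustration rather than the omni:us implementation: the field names, the constructor shape and the use of ByteArrayInputStream as the "specific" stream type are all assumptions, and the copy and processing steps are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Illustrative only: the same ingester used with a specific and a generic post-processor.
class IngesterDemo {
    public static void main(String[] args) {
        Supplier<ByteArrayInputStream> source =
                () -> new ByteArrayInputStream("abc".getBytes(StandardCharsets.UTF_8));

        // Implementation-specific: relies on the ByteArrayInputStream type.
        Consumer<ByteArrayInputStream> specific =
                s -> System.out.println("bytes left: " + s.available());
        // Generic: works for any InputStream.
        Consumer<InputStream> generic =
                s -> System.out.println("processed a " + s.getClass().getSimpleName());

        new GenericIngester<ByteArrayInputStream>(source, specific).ingest("a.txt");
        new GenericIngester<ByteArrayInputStream>(source, generic).ingest("b.txt");
    }
}

// Hypothetical sketch of a generic ingester; only the wildcard usage is the point.
class GenericIngester<T extends InputStream> {
    private final Supplier<? extends T> source;      // producer: covariant
    private final Consumer<? super T> postProcessor; // consumer: contravariant

    GenericIngester(Supplier<? extends T> source, Consumer<? super T> postProcessor) {
        this.source = source;
        this.postProcessor = postProcessor;
    }

    void ingest(String fileName) {
        System.out.println("ingesting " + fileName);
        // try-with-resources hides the stream's resource management from the caller.
        try (T stream = source.get()) {
            // Copying and processing steps are omitted in this sketch.
            postProcessor.accept(stream);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the post-processor is declared as Consumer<? super T>, the caller is free to pass either a consumer written for the concrete stream type or one written for InputStream in general.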
This marks the second GenericIngester implementation as the clear winner of how we should do this. Understanding contravariance and where to use it allows us, as a tech team, to create generic modules which can communicate with each other on different levels of abstraction, all while being type-safe. Whoever creates the consumer or supplier can decide the specific implementation of the InputStream interface they are writing it for (this has been the case even before contravariance). HOWEVER, when somebody uses their work (when creating an ingester instance), they have the same freedom. They can decide to use a specific supplier and a specific consumer, or a specific supplier and a generic consumer, or a generic supplier and a generic consumer… you get my drift. In all those cases, type-safety is preserved. No downcasts are needed.
Generics in Java are invariant by default, for the good reason of staying type-safe. You can force generics to be either contravariant or covariant using the wildcard. <? extends T> means covariant, <? super T> means contravariant.
As a rule of thumb, output-type parameters are often covariant, and input-type parameters are often contravariant. A prime example is Consumer<T>. In almost all scenarios when dealing with generic consumers, Java’s Consumer<T> should be used like this: Consumer<? super T>.
Consume input parameters with contravariance in mind and keep your dogs and cats in separate lists to stay type-safe.