Typeable — A long journey to type-safe dynamic type representation (Part 3)

Toan Nguyen
8 min readJan 5, 2019

--

Last part, Typeable was evolved into safer and more expressive type representations. However, it can still be improved. The final part discusses about how to decompose function type with TypeInType and Levity Polymorphism.

This is part 3 of Typeable series:

  1. Remove boilerplate with PolyKinds and Type Equality (GHC 7.8)
  2. Safer and more expressive type representations
  3. Decompose function type, TypeInType and Levity Polymorphism

So far, we implement new Typeable design like this:

We also need to implement dynApply , which applies a function Dynamic to an argument Dynamic. It is necessary in real-world application, e.g, send an object with a function type, say Bool -> Int, over the network.

In theory, We can use TrApp and App pattern to construct and decompose function type (->) too. The definition of this function is fairly straightforward:

However, because functions are quite ubiquitous, we should define another constructor for the sake of efficiency:

This definition wasn’t compiled before GHC 8.0. First, TrApp and TrFun is ambiguous that can be solved with explicit quantification. Second, GHC can't know that kind TypeRep :: k -> * is identical with * -> *.

Kind equality extends the idea of kind polymorphism by declaring that types and kinds are indeed one and the same. Nothing within GHC distinguishes between types and kinds. Another way of thinking about this is that the type Bool and the “promoted kind” 'Bool are actually identical.

One simplification allowed by combining types and kinds is that the type of Type is just Type. It is true that the Type :: Type axiom can lead to non-termination, but this is not a problem in GHC, as we already have other means of non-terminating programs in both types and expressions. This decision (among many, many others) does mean that despite the expressiveness of GHC’s type system, a “proof” you write in Haskell is not an irrefutable mathematical proof. GHC promises only partial correctness, that if your programs compile and run to completion, their results indeed have the types assigned. It makes no claim about programs that do not finish in a finite amount of time.

GHC User’s Guide

To enable (* ::*) axiom, you have to enable TypeInType extension, which is a deprecated alias of PolyKinds, DataKinds and KindSignatures. Its functionality has been integrated into these other extensions. With this extension, GHC can know that k ~ *, and the code can compile.

Levity Polymorphism

You may notice, there are RuntimeRep and TYPE kind signatures in TrFun constructor. They relates to Levity Polymorphism, which is implemented in GHC version 8.0.1, released early 2016.

In brief, most of types we use everyday (Int, Bool, AST, …) are boxed value, which is represented by a pointer into the heap. The main advantage of boxed types are supporting polymorphism. The disadvantage is slow performance. Most polymorphic languages also support some form of unboxed primitive values that are represented not by a pointer but by the value itself. In Haskell, unboxed types are denoted with MagicHash suffix.

Haskell also categorizes types into lifted, and unlifted. Lifted types is one that is lazy. It is considered lifted because it has one extra element beyond the usual ones, representing a non-terminating computation. For example, Haskell’s Bool type is lifted, meaning that three Bools are possible: True, False ,and . An unlifted type, on the other hand, is strict. The element does not exist in an unlifted type.

Because Haskell represents lazy computation as thunks at runtime, all lifted types must also be boxed. However, it is possible to have boxed, unlifted types.

  • Lifted — Boxed: Int, Bool,...
  • Unlifted — Boxed: ByteArray#
  • Unlifted — Unboxed: Int#, Bool#

Given these unboxed values, the boxed versions can be defined in Haskell itself; GHC does not treat them specially. For example:

Here Int is an ordinary algebraic data type, with one data constructor I#, that has one field of type Int#. The function plusInt simply pattern matches on its arguments, fetches their contents (i1 and i2, both of type Int#), adds them using (+#), and boxes the result with I#.

The issue is, like many other compilers for a polymorphic language, GHC assumes that a value of polymorphic type, such as x :: a s represented uniformly by a heap pointer, or lifted type. The compiler adopts The Instantiation Principle:

You cannot instantiate a polymorphic type variable with an unlifted type.

That is tiresome for programmers, but in return they get solid performance guarantees.

How can the compiler implement the instantiation principle? For example, how does it even know if a type is unlifted? By kinds, much the same way that terms are classified by types. Type, or * kind which we use every day is lifted. In contrast, # is a new kind that classifies unlifted types, e.g: Int#, Bool#.

In default, polymorphism functions assume kind of parameters is Type. If we attempt to instantiate it at type Float# :: #, we will get a kind error because Type and # are different kinds. The function arrow type (->) is the same. It is just a binary type constructor with kind:

But now we have a serious problem: a function over unlifted types, such as sumTo# :: Int# -> Int# -> Int#, becomes ill-kinded!.

For many years its “solution” was to support a sub-kinding relation. That is, GHC had a kind OpenKind, a super-kind of both Type and #. We could then say that:

However, there are drawbacks. The combination of type inference, polymorphism, and sub-typing, is problematic. And indeed GHC’s implementation of type inference was riddled with awkward and unprincipled special cases caused by sub-kinding. Moreover, The kind OpenKind would embarrassingly appear in error messages.

Levity polymorphism bring new idea: replace sub-kinding with kind polymorphism. New primitive type-level constant, TYPE :: RuntimeRep -> Type is introduced with the following supporting definitions:

RuntimeRep is a type that describes the runtime representation of values of a type. Type, the kind that classifies the types of values, was previously treated as primitive, but now becomes a synonym for TYPE Lifted, where Lifted :: RuntimeRep. It is easiest to see how these definitions work using examples:

Any type that classifies values, whether boxed or unboxed, lifted or unlifted, has kind TYPE r for some r :: RuntimeRep. The type RuntimeRep tells us the runtime representation of values of that type. This datatype encodes the choice of runtime value.

We can now give proper types to (->), same as TrFun. This enables polymorphism for both lifted and unlifted types.

Note: Unboxed and Levity Polymorphism types are imported in GHC.Exts module.

If you are more curious about Levity polymorphism, let’s take a look at original paper.

Compare TypeReps

It is sometimes necessary to use type representations in the key of a map. For example, Shake uses a map keyed on type representations to look up class instances (dictionaries) at runtime; these instances define class operations for the types of data stored in a collection of Dynamics. Storing the class operations once per type, instead of with each Dynamic package, is much more efficient.

More specifically, we would like to implement the following interface:

But how should we implement these type-indexed maps? One option is to use HashMap. We can define the typed-map as a map between the type representation and a dynamic value.

Notice that we must wrap the TypeRep key in an existential SomeTypeRep, otherwise all the keys would be for the same type, which would rather defeat the purpose! The insert and lookup functions can then use toDynamic and fromDynamic to ensure that the right type of value is stored with each key.

Because Map family uses balanced binary trees to achieve efficient lookup, SomeTypeRep must be an instance of Ord. This is straightforward, since TypeRep use fingerprint for unique hash, it can be compared too.

Notice that we cannot make an instance for Ord (TypeRep a): if we compare two values both of type TypeRep t, following the signature of compare, they should always be equal!

Other Changes

  • Type-indexed type representation interface is in Type.Reflection module. Data.Typeable provides type representations which are qualified over this index, providing an interface very similar to the Typeable notion seen in previous releases for backward compatibility.
  • Data.Typeable.TypeRep and Type.Reflection.TypeRep are different. Data.Typeable.TypeRep is alias of SomeTypeRep
  • TypeRep definition is replaced with record, to easier to extend parameters https://gist.github.com/a38efab80c66a1d6a83807ade3d17bf2
  • The kind of the TypeRep in each TrTyCon and TrApp constructor is cached. This is necessary to ensure that typeRepKind (which is used, at least, in deserialization and dynApply ) is cheap, because calculating the kind of type constructor and nested type applications is pricy,
  • We need to be able to represent TypeRep Type. This is a bit tricky because typeRepKind (typeRep @Type) = typeRep @Type, so if we actually cache the TypeRep of the kind of Type, we will have a loop. One simple way to do this is to make the cached kind fields lazy and allow TypeRep Type to be cyclical.

Limitation and unexplored future

Our interface does not support representations of polymorphic types, such as TypeRep (∀ a. a -> a). Although plausible, supporting those in our setting brings in a whole new range of design decisions that are as of yet unexplored (e.g. higher-order representations vs. de-Bruijn?). Furthermore, it requires the language to support impredicative polymorphism (the ability to instantiate quantified variables with polymorphic types, for instance the a variable in TypeRep a or Typeable a), which GHC currently does not. Finally, representations of polymorphic types have implications on semantics and possibly parametricity.

Similarly, constructors with polymorphic kinds would require impredicative kind polymorphism. A representation of type TypeRep (Proxy :: ∀ kp. kp -> *) would require the kind parameter k of TypeRep (a :: k) to be instantiated to the polymorphic kind ∀ kp. kp -> *. Type inference for impredicative kind polymorphism is no easier than for impredicative type polymorphism and we have thus excluded this possibility.

Concluon

Typeable take a long journey for definition, design, refactor and redesign, from runtime term level to type-safe. However, the journey doesn't stop here. GHC is still growing, better and safer, with innovation and contribution of brilliant researchers. It motivates us to study and do great software and bring toward the industry.

Acknowledgment

Thanks to Richard A. Eisenberg and his team for great research and contribution. This post take many reference in their papers.

Appendix

  • The complete code in “A reflection on types” paper: link
  • Homogeneous equality ~ Type equality: (a :~: b)
  • Heterogeneous equality ~ Kind equality: (a :~~: b)
  • Universal quantification: forall a. a -> a
  • Existentially quantified type: forall a. Show a => a -> String

References

--

--

Toan Nguyen

A simple developer who still explores Functional Programming