# Existential crisis

*Paul plays around storing and operating on collections of mixed-type values in Haskell. If you’re comfortable with **Data.Dynamic** there’s probably nothing new here, but playing around is still fun!*

One of the components in the audio collaboration system we’re building needs to handle data whose type is determined by other components at run time. Our existing implementation used a simple sum type (a tagged union) to do this, but that approach has some deficiencies. So, over Christmas, I decided to experiment with different ways of storing values with mixed types in the same data structure.

This post aims to explain some of the things I found, and should hopefully compliment the good material that others have already published.

The Haskell wiki page on heterogeneous collections suggests a number of approaches for storing values of mixed type in a data structure:

- The simple algebraic sum type that we were already using
- A “universal type” — embodied by
`Data.Dynamic`

- “Existential” types
- The
`Hlist`

strongly typed collections library.

Whilst I’ve recently been playing with Idris and have read quite a bit about dependant typing, the description of using `Hlist`

as “very advanced and a little restrictive” made me quite nervous! I did some reading about it and it looked amazing and terrifying in equal measure, so I decided to move on.

Of the other two suggested approaches from the wiki, `Dynamic`

seemed too much like black magic at first glance. So I finally settled on trying my hand with *existentials* again.

#### Why?

Before I get into exactly what I was playing around with, I should explain briefly why we want to be able to store a collection of values of different type.

The architecture for the audio collaboration tools that we are building has a publication/subscription API middleware component that deals with forwarding messages to interested parties. This component also does a lot of data validation so that the providers of APIs can focus more on the business of executing the work, rather than checking values and providing good error messages.

For this architecture to work generically, the API providers need to be able to publish a definition of the types of values they accept for any given piece of data. (That is, they need to be able to describe types.) Also, we need to be able to cache a copy of all of their data so that when a new client subscribes we can publish the current state of the data to them. (That is, we need to be able to store arbitrarily typed data.)

You can see all the components working together in this demo video.

#### Problems with using an algebraic data type

Our first implementation used a simple algebraic sum type to describe possible values:

-- | WireValues represent data that our API receives "off the wire"

data WireValue = WvInt Int | WvString String deriving (Show, Eq)

This behaves just how you’d expect. You can wrap up values of different types into a value of the sum type and put them into a container:

λ: [WvInt 3, WvString "hello"]

[WvInt 3,WvString "hello"]

However, everywhere you want to get access to the content of a `WireValue`

you have to pattern match all the possibilities:

wireValueAppend :: WireValue -> WireValue -> Maybe WireValue

wireValueAppend (WvString s1) (WvString s2) =

Just $ WvString $ s1 ++ s2

-- Bad types:

wireValueAppend _ _ = Nothing

λ: wireValueAppend (WvString "hello") (WvString "world")

Just (WvString "helloworld")

λ: wireValueAppend (WvString "hello") (WvInt 3)

Nothing

This works just fine, but starts to get cumbersome when you add more types to `WireValue`

because every function has to cope with new unpacking possibilities. The `-Wincomplete-patterns`

GHC option is really useful for spotting mistakes here, but the approach still leads to lots of cases to keep track of.

Another problem we had was that of expressing container types. If we extend `WireValue`

to express lists recursively

data WireValue = WvInt Int | WvString String | WvList [WireValue] deriving (Show, Eq)

we run into the problem that our type is too permissive — it’s possible to store a heterogeneous list of values *inside* a `WvList`

itself, so we’d have to validate the type of each and every value of the list at runtime.

#### Do existentials have the answer?

From an extremely dubious and hand-wavy point of view, I want Haskell just to “know” what type I stored in a value but hide it from my code until I need to unpack it.

We can do something like this with *existential quantification *(or “existential values”). There are some good tutorials on this already out there. The headline for me is that existential type declarations have a type variable on the RHS that *is not present on the LHS*:

{-# LANGUAGE ExistentialQuantification #-}

data WireValue = forall a. Show a => WireValue a

instance Show WireValue where

show (WireValue a) = show a

The explicit `forall a.`

on the RHS is the actual existential quantification. Notice the `Show`

class constraint, which we’ll come to in a moment.

This definition is a good start. We can put values into a `WireValue`

and GHCi can show us what they were, regardless of what the type of the contained value is:

λ: WireValue 3

3

λ: WireValue "Hello"

"Hello"

The trick is that we had to tell Haskell that values that can be held in `WireValue`

are `Show`

able, and, more annoyingly, the *only thing we can do* with a value wrapped up in `WireValue`

is `show`

it. Hmm, perhaps this isn’t going to be so useful!

As an aside, you can use `StandaloneDeriving`

to generate an instance of `Show`

for `WireValue`

automatically:

{-# LANGUAGE StandaloneDeriving #-}

deriving instance Show WireValue

#### Adding more type classes: Equality

Perhaps we can make our `WireValue`

type more useful by adding more constraints as to the kind of thing `a`

can be? Let’s try to define equality.

It’s easy to extend the definition of `WireValue`

with the `Eq`

constraint:

data WireValue = forall a (Show a, Eq a) => WireValue a

But trying to define an instance of `Eq`

for `WireValue`

itself gives us a compiler error

instance Eq WireValue where

(WireValue x) == (WireValue y) = x == y

-- Couldn't match expected type ‘a’ with actual type ‘a1’...

because we can’t guarantee that `x`

and `y`

are of the same type. (Fun fact: standalone deriving of `Eq`

runs into this exact problem too.)

What we need is a way of safely casting values to a given type.

After a bit of searching, I came across the `cast`

function from `Data.Typeable`

that does exactly what we need!

λ: :type Data.Typeable.cast

cast :: (Typeable b, Typeable a) => a -> Maybe b

With that we can define an `Eq`

instance that does what we want:

import Data.Typeable (Typeable, cast)

data WireValue = forall a. (Typeable a, Show a, Eq a) => WireValue a

instance Eq WireValue where

WireValue x == WireValue y = maybe False (x ==) $ cast y

And it works beautifully:

λ: WireValue 3 == WireValue 3

True

λ: WireValue 3 == WireValue 4

False

λ: WireValue 3 == WireValue "apple"

False

#### Wait — what is `Data.Typeable`

?

Chris Done has written a fantastic post on `Data.Typeable`

and the related `Data.Data`

package, with a view to programming generically in Haskell. The documentation for `Data.Typeable`

mentions not only generic programming but the `Data.Dynamic`

package, which I had so carefully avoided because I didn’t understand the magic. Perhaps now was the time to dig a little deeper?

`Data.Typeable`

provides the `typeOf`

function as a way to extract a representation of a Haskell type from a value:

λ: :type Data.Typeable.typeOf

Data.Typeable.typeOf :: Typeable a => a -> TypeRep

The (type-safe) `cast`

function we used earlier simply guards an `unsafeCoerce`

with a check that the types you’re coercing between have the same representation.

Looking at the definition of `Dynamic`

, we can see something familiar — it contains an existential!

data Dynamic = forall a. TypeRep a => Dynamic a

(I’ve translated the definition from the original GADT style)

The type of `Dynamic`

says that as long as you can represent the type of the value `a`

, you can hide it in the `Dynamic`

wrapper type. This looks similar to how we’ve defined `WireValue`

! It seems as if trying to understand existentials before `Data.Dynamic`

was a Good Thing™.

So, is there a practical use for being able to use types as values, other than `cast`

?

#### Orderability

If we try to define an `Ord`

instance for our `WireValue`

type in the same way we did for `Eq`

we run into a problem: We don’t know what the right answer is when the types don’t match.

data WireValue = forall a. (Typeable a, Show a, Eq a, Ord a) => WireValue a

instance Ord WireValue where

compare (WireValue x) (WireValue y) = maybe _onMismatchedTypes (compare x) $ cast y

When we had mismatched types with `==`

we could just return the default value of `False`

. In the above snippet for `compare`

, `_onMismatchedTypes`

can’t just magically produce an `Ordering`

value like `LT`

because you would get inconsistent answers depending on which order you compared the two values.

But values of `TypeRep`

*are* comparable, so, if the `cast`

fails, we can simply compare the* type representations* of the values instead:

instance Ord WireValue where

compare (WireValue x) (WireValue y) =

maybe

(compare (typeOf x) (typeOf y))

(compare x) $ cast y

This now works nicely:

λ: WireValue 3 < WireValue 4

True

λ: WireValue "hello" < WireValue 1

False

λ: WireValue 1 < WireValue "hello"

True

#### Types as values

What I’ve presented so far is a nice straightforward path through some of my holiday discoveries playing around with existentials and `Data.Typeable`

. However, in practice, getting my head around some of this stuff wasn’t easy and I took a few interesting diversions along the way.

One of those diversions was looking at how I might convert a type to a value when I *did not know* about the magic of `Data.Typeable`

. This section may be instructive if you, like me, found this all too magical!

We can define some data types that together represent types our `WireValues`

can contain:

data WireConcreteType = WctInt | WctString deriving (Show, Eq, Ord)

data WireContainerType = WctList | WctMaybe deriving (Show, Eq, Ord)|

data WireType

= WtConc WireConcreteType

| WtCont WireContainerType WireType

deriving (Show, Eq, Ord)

These constructors, even without particularly many options, can be used to represent wide variety of types:

-- Int:

t1 = WtConc WctInt

-- [String]:

t2 = WtCont WctList $ WtConc WctString

-- [Maybe Int]

t3 = WtCont WctList $ WtCont WctMaybe $ WtConc WctInt

...

To extract a type as a value at runtime, we can define the `getType`

function as follows. Note that we use a `proxy a`

value, rather than simply an `a`

value, so that you don’t actually need to build a value to extract its type. This is useful in the case of lists, as it allows us to define `getType`

recursively, avoiding having to work out what to do when given an empty list.

{-# LANGUAGE ScopedTypeVariables -#}

import Data.Proxy

class Wireable a where

getType :: proxy a -> WireType

instance Wireable Int where

getType _ = WtConc WctInt

instance Wireable a => Wireable [a] where

getType _ = WtCont WctList $ getType (Proxy :: Proxy a)

We need the `ScopedTypeVariables`

extension because we need to refer to the specific `a`

that `getType`

was called with when building the `Proxy`

in the `Wireable`

instance for `[a]`

.

Now we can get `WireType`

values from Haskell types:

λ: getType (Proxy :: Proxy Int)

WtConc WctInt

λ: getType (Proxy :: Proxy [[Int]])

WtCont WctList (WtCont WctList (WtConc WctInt))

I was dead chuffed when I first got this to work — I think it’s pretty cool.

`getType`

does an equivalent thing to the `typeRep`

function from `Data.Typeable`

. Our `WireType`

data type is fulfilling the same role as the `TypeRep`

data type from `Data.Typeable`

, although it’s much more restricted — `TypeRep`

can represent algebraic data types with multiple arguments to constructors and comes with a bunch of helper functions to unpack and inspect it.

As an aside, when doing this work I also discovered the `TypeApplications`

extension, which provides slightly less clumsy syntax for specifying the types of polymorphic functions:

λ: :set -XTypeApplications

λ: getType $ Proxy @Int

WtConc WctInt

#### So much for the existentials, where’s the crisis?

We’ve seen that, thanks to existential quantification, we can easily store values of mixed type in the same collection. We’ve also seen that, thanks to `Data.Typeable.cast`

, we can safely extract values back out to perform further processing.

So, our lives are fairly easy if our code doesn’t care at all about the type contained in a `WireValue`

, or if we know exactly which type we need to extract from it. Some difficulties start to appear when we want to act differently based on what is stored in the existential type.

As we looked at earlier, the existential quantification lets us hide a type variable on RHS of a type signature so that it is not visible on the LHS. This unlocked all the fun stuff we see above. However, the cost is that, if given a `WireValue`

, Haskell cannot make any guarantees about the type it has. That severely limits what you can do with it!

Early on in my experimentation I kept wanting to do things like

extract (WireValue x) = x

so that I could get at the value inside the `WireValue`

. But that gives us an error:

Couldn't match expected type ‘p’ with actual type ‘a’

because type variable ‘a’ would escape its scope

This (rigid, skolem) type variable is bound by

a pattern with constructor:

WireValue :: forall a. (Typeable a, Show a) => a -> WireValue,

in an equation for ‘extract’

According to this Stack Overflow thread, the word *skolem* refers to a type that is unknown but fixed — we can’t know at compile time what `x`

will be, but it will be something! This does make sense, but it really hit me sideways when I was trying to reason about my code, because as a human *I* could often see what type something would be. (I have a lot of experience with Python, where playing with this kind of interrelation between types and values is much more fluid, but completely unsafe! Hence, sometimes my intuitions in Haskell are too permissive.)

`cast`

got away with being polymorphic in its return type because it was the *caller* of the function that determined what they wanted `cast`

to return. `cast`

could simply decide not to return anything if the caller asked for the wrong type. We need a different mechanism — one where the *callee* gets to choose the type and the *caller* has to provide the polymorphism.

{-# LANGUAGE Rank2Types #-}

overWv :: WireValue -> (forall a. Typeable a => a -> b) -> b

overWv (WireValue x) f = f x

The `Rank2Types`

extension lets you write an explicit `forall`

to bind the variable `a`

at a tighter scope than variable `b`

. The type signature of `overWv`

effectively says “give me a `WireValue`

, and a function that can take a value of *any* `Typeable`

`a`

that I choose to a value of *some* type `b`

that you choose, and I’ll give you a value of that `b`

type.”

Needless to say, having to construct a generic function that can handle any `a`

conforming to a constraint can feel a little cumbersome. It’s been tempting to use the `Wireable`

class I briefly sketched out earlier as a hook where I can keep adding specific (mostly validation) functionality so that I can write things like:

class Wireable a where

myFoo :: a -> Bool

data WireValue = forall a. (Wireable a, Typeable a) => WireValue a

overWv :: WireValue -> (forall a. Wireable a => a -> b) -> b

overWv (WireValue x) f = f x

doFoo :: WireValue -> Bool

doFoo wv = overWv wv myFoo

But this couples together `Wireable`

, `WireValue`

and the code that does the validation horribly, given that the code can be split across multiple modules but all has to work together.

Hopefully the above is illustrative of the kind of issues I’ve faced trying to make this stuff work in practice. It’s been great levelling up my understanding of more complex type signatures, and I think our production code is more precise about what we mean because of having used the existential stuff. However, some of the stranger error messages and the dancing around Haskell’s (correctly enforced) constraints about how you can use the existentials has been really challenging! I keep wanting some of the advanced dependant typing features that make Idris so exciting, but I expect they bring their own set of challenges.

If this material has been useful, do let me know. It will help me target what to write more about in the future. If you’re hungry for more interesting Haskell stuff, especially about the progress of dependant types in Haskell, I recently came across this really good HaskellCast interview with Richard Eisenberg.