Pure Functional Validation in a Nutshell

Steven Syrek

Published in Blacklane Engineering

13 min read · Dec 5, 2017

There are several different ways to model errors in typed functional programming, depending on your specific needs.

`Maybe`

Maybe is a datatype that represents a computation (or an effect) that might not produce a value.

The datatype is defined this way in Haskell:

data Maybe a = Nothing | Just a

The | indicates that Maybe is a sum type. That is, a value of type Maybe a is either Nothing (for example, in the case of a null pointer or equivalent case) or Just a, with the a here indicating any kind of value whatsoever.

The type Maybe is parameterized with another polymorphic type a. This works like a function call: give an a to Maybe as an argument, and you get a fully instantiated type Maybe a. For example:

safeGet :: (Eq k) => k -> [(k,v)] -> Maybe v
safeGet _ []            = Nothing
safeGet key ((k,v):kvs)
  | key == k            = Just v
  | otherwise           = safeGet key kvs

There are quite a few things going on here, but this function should be fairly readable even without a comprehensive understanding of Haskell. We want to model a dictionary as a list of key-value pairs. We can represent this type as [(k,v)]: a pair of a key and a value, (k,v), wrapped up in a list, [].

The (Eq k) part indicates that our k value, the key, must be something equatable, i.e. we must be able to determine, for any two k values, whether they are equal.

Since we're talking about dictionaries: the Eq k constraint is actually compiled into an additional parameter to the function, a dictionary (a record of functions) holding the Eq operations for whichever type k turns out to be, so that the correct definition of (==) is available at each call. Nothing magical is going on here; it's functions all the way down. This isn't even the important part, though. That would be pattern matching.

safeGet _ [] = Nothing is an equation that states that, given an empty list of type [(k,v)], the return value of this function is Nothing. That is, when the function call matches this pattern, the function short circuits to returning Nothing. We don't even care what the k is in this case, so we just use a wildcard, _, in its place in the equation.

The second equation, a pattern matching a list that actually has data, destructures the type [(k,v)] and cases on it, returning Just v if the given key argument matches k and recursing on the rest of the list, otherwise.

Here’s what this looks like in practice:

λ> dict = [("Steve", 39)]

λ> safeGet "Steve" dict
Just 39

λ> safeGet "Joe" dict
Nothing

What this way of representing our data provides us is an interface for simple error handling, without having to write a bunch of extra code to handle edge cases. Without static type checking, the sort of thing built-in to Haskell or provided to JavaScript by Flow, we don’t get any compile-time guarantees about the values passed into these functions, but we do get more reliable protection against missing values that might blow up our programs.

We can fake this functionality in JavaScript using the Folktale library, like so:

const Maybe = require('folktale/maybe')

// dict.has distinguishes a missing key from a stored falsy value such as 0
const safeGet = key => dict => dict.has(key)
  ? Maybe.Just(dict.get(key))
  : Maybe.Nothing()

If dict has an entry for key, the stored value is wrapped in Folktale's Just object. Otherwise, where the get method would return undefined, we get Nothing instead:

> const dict = new Map()

> dict.set("Steve", 39)

> dict.get("Steve")
39

> dict.get("Joe")
undefined

> safeGet("Steve")(dict)
Maybe.Just({ value: 39 })

> safeGet("Joe")(dict)
Maybe.Nothing({  })

A cursory comparison of the Haskell and JavaScript code should prove to you that these are functionally equivalent implementations, which is a strength of the FP approach — the patterns are universal, even if the syntax differs from language to language.
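One edge case is worth checking in any Map-based lookup: a stored value can itself be falsy (0 or an empty string), so presence is better tested with has than by truthiness. The sketch below uses hand-rolled Just and Nothing stand-ins (hypothetical helpers written for this example, not Folktale's objects) so that it runs without any library. The name safeGetStrict is also made up for the illustration.

```javascript
// Hypothetical stand-ins for a Maybe type; these are not Folktale's objects.
const Just = value => ({ tag: 'Just', value })
const Nothing = () => ({ tag: 'Nothing' })

// Curried lookup that tests for the key itself, not for a truthy value.
const safeGetStrict = key => dict =>
  dict.has(key) ? Just(dict.get(key)) : Nothing()

// A dictionary whose only stored value is falsy.
const scores = new Map([['Steve', 0]])

console.log(safeGetStrict('Steve')(scores)) // { tag: 'Just', value: 0 }
console.log(safeGetStrict('Joe')(scores))   // { tag: 'Nothing' }
```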

We even managed to implement currying in JavaScript by using nested arrow functions. This allows us to do partial application of functions. For example, let's say we always want to look up "Steve", but the lookup dictionary can vary:

> const lookupSteve = safeGet("Steve")

> const dict2 = new Map().set("Joe", 40)

> lookupSteve(dict)
Maybe.Just({ value: 39 })

> lookupSteve(dict2)
Maybe.Nothing({  })

We can create a new function, which is just the name lookupSteve bound to the safeGet() function applied to "Steve". Our new function is now waiting for one more argument, a Map object, and then it will return a final value. The lookupSteve function defined above is equivalent to the following:

const lookupSteve = safeGet.apply(null, ["Steve"])

Which is not as nice.
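For comparison, here is a sketch of partial application starting from an uncurried function. The names lookup, lookupCurried, lookupSteveBound, and lookupSteveCurried are invented for this illustration; Function.prototype.bind is standard JavaScript.

```javascript
// An uncurried two-argument lookup (hypothetical helper for illustration).
const lookup = (key, dict) => dict.has(key) ? dict.get(key) : undefined

// Curried version: takes one argument at a time.
const lookupCurried = key => dict => lookup(key, dict)

// With the uncurried form, bind fixes the first argument.
const lookupSteveBound = lookup.bind(null, 'Steve')

// With the curried form, partial application is just a function call.
const lookupSteveCurried = lookupCurried('Steve')

const ages = new Map([['Steve', 39]])
console.log(lookupSteveBound(ages))   // 39
console.log(lookupSteveCurried(ages)) // 39
```

Both routes produce a function that is still waiting for its dictionary; the curried form simply needs no extra machinery.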

Since we used Maybe here, we can deliberately match every possible value that our function could return, which is the whole point of sum types. Well, it's at least one of the points.

Remember the Haskell type Maybe a = Nothing | Just a? That | is like a logical or. It's also like a +, because the total number of possible values for any type a | b is a + b, whereas a product type, like an object, struct, or other record type found in most programming languages is like a logical and and also *, because the total number of values you can construct with a product of type a b or {a, b} is a * b. Incidentally, this encoding is also possible with union types in Flow.
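The counting argument can be checked by brute force. This sketch enumerates every value of a small sum type and a small product type built from booleans; the tagged-object encoding is just one convenient assumption for the demonstration.

```javascript
// All values of a two-valued type (booleans).
const bools = [true, false]

// Sum type "Nothing | Just Bool": one value for Nothing plus one per boolean.
const maybeBools = [
  { tag: 'Nothing' },
  ...bools.map(b => ({ tag: 'Just', value: b }))
]

// Product type "pair of Bool and Bool": every combination of the two fields.
const boolPairs = bools.flatMap(a => bools.map(b => ({ first: a, second: b })))

console.log(maybeBools.length) // 3, i.e. 1 + 2
console.log(boolPairs.length)  // 4, i.e. 2 * 2
```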

Armed with sum types and pattern matching (and Folktale), we can now write this JavaScript function:

const handleSteve = dict => lookupSteve(dict).matchWith({
  Just:    ({ value }) => `Found ${value}`,
  Nothing: ()          => `Nothing found`
})

Let’s try it out:

> handleSteve(dict)
'Found 39'
> handleSteve(dict2)
'Nothing found'

Ship it.
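To make the dispatch less mysterious, here is a minimal sketch of how a matchWith method could work on hand-rolled tagged values. This is an assumption-laden toy, not Folktale's implementation, and the render function is a made-up name.

```javascript
// Hypothetical tagged values with a matchWith method; not Folktale's code.
const Just = value => ({
  tag: 'Just',
  value,
  matchWith: handlers => handlers.Just({ value })
})
const Nothing = () => ({
  tag: 'Nothing',
  matchWith: handlers => handlers.Nothing({})
})

// The caller supplies one handler per case of the sum type.
const render = maybe => maybe.matchWith({
  Just: ({ value }) => `Found ${value}`,
  Nothing: () => 'Nothing found'
})

console.log(render(Just(39)))  // Found 39
console.log(render(Nothing())) // Nothing found
```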

`Either`

Either is a datatype that represents a computation (or an effect) that might fail and produce an error. Whereas Maybe simply encodes the presence or absence of a value, Either gives us a choice between two distinct values. In Haskell, it looks like this:

data Either a b = Left a | Right b

Given two possible values, a or b, the possible values (or inhabitants) of this sum type are Left a or Right b. Traditionally, the Right value is used for computations that are correct, and Left for bad computations, or errors. Even Haskell is not above puns.

Not only is this a sum type, it is the canonical example of a sum type, since it represents a choice between two values rather than a choice between either a value or no value whatsoever. In this regard, it is the opposite, or dual, of a tuple, which is the canonical product type. They’re two ways of packaging up the same data.

We can easily extend our Haskell example from above with an error that provides information as to exactly why we failed:

data Error =
    EmptyDict
  | MissingKey
  deriving Show

safeGetE :: (Eq k) => k -> [(k,v)] -> Either Error v
safeGetE _ []            = Left EmptyDict
safeGetE key [(k,v)]     = if key == k then Right v else Left MissingKey
safeGetE key ((k,v):kvs)
  | key == k             = Right v
  | otherwise            = safeGetE key kvs

It's useful to encode errors in their own datatype rather than use strings on their own. The deriving Show clause simply tells the compiler to auto-generate a useful string representation for debugging purposes. Our new safeGetE function now provides a little more information when it fails:

λ> dict = [("Steve",39)]

λ> safeGetE "Steve" dict
Right 39

λ> safeGetE "Joe" dict
Left MissingKey

λ> safeGetE "Steve" []
Left EmptyDict

Maybe we need to know whether a value is simply missing or the dictionary we pass in is just completely empty. In fact, it would be useful to know the latter before even checking for the former. So that’s one way to enrich our error checking. Folktale provides a Result object to cover this case:

const Result = require('folktale/result')

const EmptyDict = "EmptyDict"

const MissingKey = "MissingKey"

const safeGetE = key => dict =>
    dict.size === 0
  ? Result.Error(EmptyDict)
  : dict.has(key)
  ? Result.Ok(dict.get(key))
  : Result.Error(MissingKey)

Which we can also test in the same way and with the same results:

> const dict = new Map().set("Steve", 39)

> safeGetE("Steve")(dict)
Result.Ok({ value: 39 })

> safeGetE("Joe")(dict)
Result.Error({ value: "MissingKey" })

> safeGetE("Steve")(new Map())
Result.Error({ value: "EmptyDict" })

Pattern matching works the same way as for Maybe. And now we have error handling without exceptions, encoded directly in our data. But we can still do more.
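As a sketch of why this style stops at the first failure, here are hand-rolled Ok and Err stand-ins with a chain method (hypothetical helpers written for this example, not Folktale's Result). The helper names nonEmpty, getKey, and safeGetChained are likewise invented. Once a step yields an error, later steps are skipped.

```javascript
// Hypothetical Result-like values; not Folktale's implementation.
const Ok = value => ({ tag: 'Ok', value, chain: f => f(value) })
const Err = error => {
  // An error ignores any further steps and returns itself.
  const self = { tag: 'Error', value: error, chain: _ => self }
  return self
}

const nonEmpty = dict => dict.size === 0 ? Err('EmptyDict') : Ok(dict)
const getKey = key => dict =>
  dict.has(key) ? Ok(dict.get(key)) : Err('MissingKey')

// Steps run left to right; the first Err is returned unchanged.
const safeGetChained = key => dict => nonEmpty(dict).chain(getKey(key))

console.log(safeGetChained('Steve')(new Map([['Steve', 39]])).value) // 39
console.log(safeGetChained('Steve')(new Map()).value)                // EmptyDict
console.log(safeGetChained('Joe')(new Map([['Steve', 39]])).value)   // MissingKey
```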

`Validation`

Validation is a datatype that represents a computation (or an effect) that might fail and produce an error. It's similar to Either, except it accumulates the errors instead of failing on the first bad computation. This functionality makes this type useful for form validations, i.e. collecting the results of a series of related but independent computations. Once again, let's look first at the datatype as it might be defined in Haskell:

data Validation e a = Failure e | Success a

It certainly looks the same as Either. Only the names are different. So how does it provide extra features? The secret lies in its use of the Applicative type class. Much as our safeGet functions above required Eq of their key type in order to gain access to the equality function (==), Validation implements Applicative, whose handy function (<*>) is what aggregates error values (you may pronounce it "tie fighter"):

instance Semigroup e => Applicative (Validation e) where
  pure :: a -> Validation e a
  pure = Success

  (<*>) :: Validation e (a -> b) -> Validation e a -> Validation e b
  Success f <*> Success a  = Success (f a)
  Success _ <*> Failure e  = Failure e
  Failure e <*> Success _  = Failure e
  Failure e <*> Failure e' = Failure (e <> e')

This probably looks bizarre, but if you’ve been following everything up to this point, you’re well enough prepared. What Applicative allows us to do is take a bunch of computations and concatenate their effects. It provides two functions.

pure represents a "pure" computation, that is one without any side effects. It takes a value and injects it into a computational context without performing any operation on it and without any other effect. The equation presented here, pure = Success, is in pointfree form. It is equivalent to pure x = Success x, but since we have partial application, we can factor out the argument for a neater equation.

(<*>), defined here in its usual infix form, represents function application in a context. In this case, the context is the "effect" of validation, the effect provided by the Validation datatype. It takes two arguments, and since they are both of type Validation, it must pattern match on all possible inputs.

As you can see, two arguments that each have two possible constructors give four distinct combinations. Given two Success values, the effect is to apply the function f embedded in the first Success to the value a embedded in the second. This is what happens when we get a successful validation: the result comes back embedded in a type that lets us know that the validation succeeded. The middle two patterns simply return the Failure value when a Failure is paired with a Success. The real star here is that last pattern, specifically the Failure (e <> e') part.

You may have noticed that strange word Semigroup in the definition of our Validation instance. A semigroup is any structure with an append operation—some function that takes two such structures, does something (usually on the order of combining or selecting), and returns a new structure of the same type. If you know what a monoid is, a semigroup is a monoid without an identity.
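JavaScript already ships with a couple of structures that behave this way. As a small illustration, arrays and strings both have a built-in concat, and the grouping of repeated concats does not change the result (which is the associativity a semigroup is expected to have).

```javascript
// Arrays and strings both have an associative concat operation.
const errors = ['EmptyField'].concat(['NotMinLength'])
const text = 'ab'.concat('cd')

// Associativity: grouping does not change the result.
const leftGrouped = [1].concat([2]).concat([3])
const rightGrouped = [1].concat([2].concat([3]))

console.log(errors) // [ 'EmptyField', 'NotMinLength' ]
console.log(text)   // abcd
console.log(JSON.stringify(leftGrouped) === JSON.stringify(rightGrouped)) // true
```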

What the above definition is stating is that for any value e that we want to put into our Validation type, that e must itself implement the Semigroup functionality (or, if you prefer, the Semigroup interface). It’s just another typeclass in Haskell, but it provides us a guarantee that any e will act like a semigroup when we want it to. When applying the tie fighter function to two Failure values, the effect provided by Applicative is to append the two values together, which is what the (<>) operator is for.
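The four equations translate almost line for line into JavaScript. The following is a minimal sketch with hand-rolled Success and Failure stand-ins and a free-standing ap function (all hypothetical, not Folktale's API), assuming errors are kept in arrays so that concat can serve as the append operation.

```javascript
// Hypothetical Validation stand-ins; not Folktale's implementation.
const Success = value => ({ tag: 'Success', value })
const Failure = errors => ({ tag: 'Failure', value: errors })

// ap mirrors the four equations of (<*>): apply on two successes,
// keep the single failure, or concat two failures.
const ap = (vf, va) => {
  if (vf.tag === 'Success' && va.tag === 'Success') return Success(vf.value(va.value))
  if (vf.tag === 'Success') return va
  if (va.tag === 'Success') return vf
  return Failure(vf.value.concat(va.value))
}

const inc = Success(n => n + 1)
console.log(ap(inc, Success(1)))                  // { tag: 'Success', value: 2 }
console.log(ap(inc, Failure(['e2'])))             // { tag: 'Failure', value: [ 'e2' ] }
console.log(ap(Failure(['e1']), Failure(['e2']))) // { tag: 'Failure', value: [ 'e1', 'e2' ] }
```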

That’s a bunch of new concepts at once, so we should look at a real example, first in Haskell:

import Data.Functor (($>))

-- Datatype for an unvalidated form
data Form = Form {
    email    :: String
  , password :: String
} deriving Show

-- Datatype for errors
data Error =
    EmptyField
  | NotMinLength
  deriving Show

-- Validated value types
newtype Email = Email String
  deriving Show

newtype Password = Password String
  deriving Show

-- Datatype for a validated form
data ValidatedForm = ValidatedForm Email Password
  deriving Show

type FormValidation = Validation [Error]

-- Validation functions
notEmpty :: String -> FormValidation String
notEmpty ""  = Failure [EmptyField]
notEmpty str = Success str

minLength :: String -> Int -> FormValidation String
minLength str n
  | length str >= n = Success str
  | otherwise       = Failure [NotMinLength]

minPasswordLength :: Int
minPasswordLength = 8

-- Field validations
validateEmail :: String -> FormValidation Email
validateEmail input =
  notEmpty input $>
  Email input

validatePassword :: String -> FormValidation Password
validatePassword input =
  notEmpty input *>
  minLength input minPasswordLength $>
  Password input

-- Form validation
validateForm :: Form -> FormValidation ValidatedForm
validateForm (Form email password) =
  ValidatedForm <$>
  validateEmail email <*>
  validatePassword password

-- Constructor for an unvalidated form
mkForm :: String -> String -> Form
mkForm email password = Form email password

We start with a definition for a simple datatype representing forms that we'd like to validate. We can pair this with a function mkForm, further down in the code, which acts like a constructor. Give mkForm an email and a password, and it will shove them into a new Form structure for you, albeit a form that has yet to be validated.

Once again, we use a fully-fledged sum type for our errors. This time, however, we also want to create special type wrappers for validated emails and passwords. A type like String doesn't tell us anything special about email addresses, after all, much less passwords, so we want to constrain the number of possible values this type will accept. In addition, we can also create a type to represent forms that have already been validated (built from our already-validated Email and Password types) as well as a type alias for a form validation, which isn't strictly necessary but does cut down on boilerplate and keystrokes.

The validations themselves are generic. Either notEmpty or minLength could be used for any fields, and they should look straightforward enough by this point.

Then things start to look strange. But don't panic when you see weird operators. Something like <$> is just another binary operator like +; it just happens to do something more interesting than addition, and you don't have to memorize dozens of them. This one, for example, is just a synonym for fmap, the function that maps another function over a functorial structure (like mapping a function over an array).

$> is related to fmap, except that instead of applying a function it replaces the value inside the structure with the value on its right, leaving the structure itself (here, success or failure) untouched. So it's like a "then" operation, which is exactly what we want in a function that replaces the value in a form with either an error or the validated type (in this case, Email or Password) for that value. *> is the Applicative counterpart: both of its operands are already in some kind of context, in this case the Validation context, and it combines their effects (here, accumulating any errors) while keeping only the value of the second.
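If the operators still feel abstract, the three of them can be sketched as ordinary named functions over hand-rolled Success and Failure values. The names map, replaceWith, and andThen are invented stand-ins for <$>, $>, and *> respectively; none of this is Folktale's API, and errors are assumed to live in arrays.

```javascript
// Hypothetical Validation stand-ins; not Folktale's implementation.
const Success = value => ({ tag: 'Success', value })
const Failure = errors => ({ tag: 'Failure', value: errors })

// map plays the role of <$>: transform the value inside a Success.
const map = (f, v) => v.tag === 'Success' ? Success(f(v.value)) : v

// replaceWith plays the role of $>: keep success or failure, swap the value.
const replaceWith = (v, x) => map(_ => x, v)

// andThen plays the role of *>: combine both effects, keep the second value.
const andThen = (v1, v2) => {
  if (v1.tag === 'Failure' && v2.tag === 'Failure') {
    return Failure(v1.value.concat(v2.value))
  }
  return v1.tag === 'Failure' ? v1 : v2
}

console.log(map(s => s.length, Success('abc')))     // { tag: 'Success', value: 3 }
console.log(replaceWith(Success('raw'), 'wrapped')) // { tag: 'Success', value: 'wrapped' }
console.log(andThen(Failure(['EmptyField']), Failure(['NotMinLength'])).value)
// [ 'EmptyField', 'NotMinLength' ]
```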

So — we glue together the basic data validations into more specialized field validations, validateEmail and validatePassword and then combine those into a single validateForm function that uses tie fighter power to produce a final ValidatedForm result, which will contain either the successfully validated and now properly-typed values or an accumulation of errors for us to inspect. Ta-da:

λ> form1 = mkForm "steve@email.com" "12345678"

λ> form2 = mkForm "steve@email.com" "123"

λ> form3 = mkForm "" ""

λ> validateForm form1
Success (ValidatedForm (Email "steve@email.com") (Password "12345678"))

λ> validateForm form2
Failure [NotMinLength]

λ> validateForm form3
Failure [EmptyField,EmptyField,NotMinLength]

With a bonus function, traverse, we can even validate a collection of multiple forms:

λ> traverse validateForm [form1, form2, form3]
Failure [NotMinLength,EmptyField,EmptyField,NotMinLength]

Obviously, this isn’t super useful without more informative error types, but I think you can see the utility in this approach.
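For the curious, the idea behind traverse can be sketched for arrays in a few lines: run the validating function over every element, then fold the results together, collecting successes into one array and failures into another. Everything here (append, traverse as a plain function, the positive validator) is a hypothetical illustration rather than a library API.

```javascript
// Hypothetical Validation stand-ins; not Folktale's implementation.
const Success = value => ({ tag: 'Success', value })
const Failure = errors => ({ tag: 'Failure', value: errors })

// Merge one more result into the accumulator: successes are collected
// into an array, failures are concatenated.
const append = (acc, v) => {
  if (acc.tag === 'Success' && v.tag === 'Success') return Success(acc.value.concat([v.value]))
  if (acc.tag === 'Failure' && v.tag === 'Failure') return Failure(acc.value.concat(v.value))
  return acc.tag === 'Failure' ? acc : v
}

// traverse: validate every element, then merge all the results.
const traverse = (f, items) => items.map(f).reduce(append, Success([]))

const positive = n => n > 0 ? Success(n) : Failure([`${n} is not positive`])

console.log(traverse(positive, [1, 2, 3]))         // { tag: 'Success', value: [ 1, 2, 3 ] }
console.log(traverse(positive, [1, -2, -3]).value) // [ '-2 is not positive', '-3 is not positive' ]
```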

And now we can try Folktale’s Validation in JavaScript:

const { Success, Failure, collect } = require('folktale/validation')

const EmptyField = "EmptyField"

const NotMinLength = "NotMinLength"

const mkForm = (email = '', password = '') => ({ email, password })

const notEmpty = value => value
  ? Success(value)
  : Failure([EmptyField])

const minLength = (value, n) => value.length >= n
  ? Success(value)
  : Failure([NotMinLength])

const validateEmail = input =>
  notEmpty(input)

const minPasswordLength = 8

const validatePassword = input =>
  notEmpty(input).concat(
  minLength(input, minPasswordLength))

const validateForm = form => collect([
  validateEmail(form.email),
  validatePassword(form.password)
])

The typing isn’t nearly as robust, and we have to ensure that our structures are law-abiding ourselves (or just hope that they are), but the result is similar:

> const form1 = mkForm("steve@email.com", "12345678")

> const form2 = mkForm("steve@email.com", "123")

> const form3 = mkForm("", "")

> validateForm(form1)
Validation.Success({ value: "12345678" })

> validateForm(form2)
Validation.Failure({ value: ["NotMinLength"] })

> validateForm(form3)
Validation.Failure({ value: ["EmptyField", "EmptyField", "NotMinLength"] })

One difference, obviously, is that the Success value returned by the JavaScript version of the validateForm function only contains the last successfully validated value. You could probably jury-rig a way around that limitation, if you needed to. You could also just use a different language.
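One possible workaround, sketched here with hand-rolled stand-ins rather than Folktale itself, is to lean on the same applicative trick as the Haskell version: lift a curried constructor into Success and apply it to each validated field in turn. The names ap, validatedForm, and validateFormKeepingAll are assumptions made up for this example.

```javascript
// Hypothetical Validation stand-ins; not Folktale's implementation.
const Success = value => ({ tag: 'Success', value })
const Failure = errors => ({ tag: 'Failure', value: errors })

// Applicative apply: accumulate errors, otherwise apply the function.
const ap = (vf, va) => {
  if (vf.tag === 'Success' && va.tag === 'Success') return Success(vf.value(va.value))
  if (vf.tag === 'Success') return va
  if (va.tag === 'Success') return vf
  return Failure(vf.value.concat(va.value))
}

const notEmpty = value => value ? Success(value) : Failure(['EmptyField'])
const minLength = n => value =>
  value.length >= n ? Success(value) : Failure(['NotMinLength'])

// Curried constructor for the validated form, fed one field at a time.
const validatedForm = email => password => ({ email, password })

const validateFormKeepingAll = form =>
  ap(ap(Success(validatedForm), notEmpty(form.email)), minLength(8)(form.password))

console.log(validateFormKeepingAll({ email: 'steve@email.com', password: '12345678' }).value)
// { email: 'steve@email.com', password: '12345678' }
console.log(validateFormKeepingAll({ email: '', password: '123' }).value)
// [ 'EmptyField', 'NotMinLength' ]
```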

Whether you do or not, however, I do hope this article at least demonstrates that writing code in a functional style, while providing obvious benefits, also has properties that make it universal, regardless of the language you happen to be using. The patterns are always the same.

What distinguishes Haskell from JavaScript is not that Haskell is “functional” while JavaScript is not. And it’s not that Haskell is difficult to learn while JavaScript is easy. It’s simply that Haskell makes functional programming more straightforward, because it’s the language’s native idiom. In other words, if you prefer this style or even just recognize its strengths and benefits, Haskell gives you a bunch of stuff for free that you would otherwise have to implement yourself. And for those who don’t want to abandon the JavaScript universe completely, see also PureScript.

Why should you care about typed, functional programming? The data validation example sort of makes the point on its own: if you care about validating the data that users input into your forms, why shouldn’t you care just as much about the data that gets input into your functions? If you think all these static types just add complexity, it’s worth pointing out that types are also functions. They just happen to be operating on a higher level of abstraction. So when you define a type, you’re really defining a functional shortcut for validating your data.

When you specify all of your types for all of your functions, you have everything in front of you when examining your code. You haven’t written a series of instructions for telling a machine what to do, and you therefore aren’t having to mentally track the hidden contents of various memory stores and stateful objects. Instead, you have stated what the shape of your data should be and what sorts of transformations (i.e. functions) you would like to perform on that data. You have, that is, written your program in a declarative style, and that ought to be even more compelling than the notion of a functional style.

You may also be surprised to learn that this sort of programming is far and away the most common in the world. People who work with numbers like to have pure functions—functions with predictable outputs—and they like to have all their data represented in clearly-defined, that is to say clearly-validated, types. They do not want to have to look at a screen of text and numbers and just guess what they’re supposed to mean. Millions of spreadsheet users can’t be wrong: declarative programming is the way to go.
