Pure Functional Validation in a Nutshell
There are several different ways to model errors in typed functional programming, depending on your specific needs.
Maybe
Maybe
is a datatype that represents a computation (or an effect) that might not produce a value.
The datatype is defined this way in Haskell:
data Maybe a = Nothing | Just a
The |
indicates that Maybe
is a sum type. That is, a value of type Maybe a
is either Nothing
(for example, in the case of a null pointer or equivalent case) or Just a
, with the a
here indicating any kind of value whatsoever.
The type Maybe
is parameterized by a type variable a
. This works like a function call: give an a
to Maybe
as an argument, and you get a fully instantiated type Maybe a
. For example:
safeGet :: (Eq k) => k -> [(k,v)] -> Maybe v
safeGet _ [] = Nothing
safeGet key ((k,v):kvs)
  | key == k = Just v
  | otherwise = safeGet key kvs
There are quite a few things going on here, but this function should be fairly readable even without a comprehensive understanding of Haskell. We want to create a dictionary of key-value pairs and store them in a list. We can represent this type as [(k,v)]
which is a pair of keys and values, (k,v)
wrapped up in a list, []
.
The (Eq k)
part indicates that our k
value, the key, must be something equatable, i.e. we must be able to determine, for any two k
values, whether they are equal.
Incidentally, since we’re talking about dictionaries, the Eq k
bit is actually compiled into an additional parameter to the function that looks up the k
in a dictionary of all types that implement Eq
so as to inline the correct definition. Nothing magical is going on here—it's functions all the way down. This isn't even the important part, though. That would be pattern matching.
safeGet _ [] = Nothing
is an equation that states that, given an empty list of type [(k,v)]
, the return value of this function is Nothing
. That is, when the function call matches this pattern, the function short circuits to returning Nothing
. We don't even care what the k
is in this case, so we just use a wildcard, _
, in its place in the equation.
The second equation, a pattern matching a list that actually has data, destructures the type [(k,v)]
and cases on it, returning Just v
if the given key argument matches k
and recursing on the rest of the list, otherwise.
Here’s what this looks like in practice:
λ> dict = [("Steve", 39)]
λ> safeGet "Steve" dict
Just 39
λ> safeGet "Joe" dict
Nothing
What this way of representing our data gives us is an interface for simple error handling, without having to write a bunch of extra code to handle edge cases. Without static type checking, the sort of thing built into Haskell or provided to JavaScript by Flow, we don’t get any compile-time guarantees about the values passed into these functions, but we do get more reliable protection against missing values that might blow up our programs.
We can fake this functionality in JavaScript using the Folktale library, like so:
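const Maybe = require('folktale/maybe')
// Compare against undefined so that falsy values like 0 or "" still count as found
const safeGet = key => dict => dict.get(key) !== undefined
  ? Maybe.Just(dict.get(key))
  : Maybe.Nothing()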
If dict.get(key)
returns any value whatsoever, the return value of the function is wrapped in Folktale's Just
object. Otherwise, if the get
method returns undefined
, we get Nothing
instead:
> const dict = new Map()
> dict.set("Steve", 39)
> dict.get("Steve")
39
> dict.get("Joe")
undefined
> safeGet("Steve")(dict)
Maybe.Just({ value: 39 })
> safeGet("Joe")(dict)
Maybe.Nothing({ })
A cursory comparison of the Haskell and JavaScript code should prove to you that these are functionally equivalent implementations, which is a strength of the FP approach — the patterns are universal, even if the syntax differs from language to language.
We even managed to implement currying in JavaScript by nesting arrow functions. This allows us to partially apply functions. For example, let’s say we always want to look up "Steve"
, but the lookup dictionary can vary:
> const lookupSteve = safeGet("Steve")
> const dict2 = new Map().set("Joe", 40)
> lookupSteve(dict)
Maybe.Just({ value: 39 })
> lookupSteve(dict2)
Maybe.Nothing({ })
We can create a new function, which is just the name lookupSteve
bound to the safeGet()
function applied to "Steve"
. Our new function is now waiting for one more argument, a Map
object, and then it will return a final value. The lookupSteve
function defined above is equivalent to the following:
const lookupSteve = safeGet.apply(null, ["Steve"])
Which is not as nice.
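For comparison, here is a self-contained sketch of currying and partial application with a plain function, so it runs without Folktale. The names add3, add3Curried, and curry3 are hypothetical helpers made up for this illustration.

```javascript
// An ordinary three-argument function.
const add3 = (a, b, c) => a + b + c

// A hand-written curried version: one argument at a time.
const add3Curried = a => b => c => a + b + c

// Hypothetical helper that curries any three-argument function.
const curry3 = f => a => b => c => f(a, b, c)

// Partial application: fix the first arguments now, supply the rest later.
const addTen = add3Curried(10)
const addTenAndOne = addTen(1)

console.log(addTenAndOne(5))        // 16
console.log(curry3(add3)(10)(1)(5)) // 16
```

Each intermediate function, like addTen, is an ordinary value that can be named, passed around, and reused.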
Since we used Maybe
here, we can always match all the possible values that could be returned by our function on purpose, which is the whole point of sum types. Well, it's at least one of the points.
Remember the Haskell type Maybe a = Nothing | Just a
? That |
is like a logical or
. It's also like a +
, because the total number of possible values for any type a | b
is a + b
, whereas a product type, like an object, struct, or other record type found in most programming languages is like a logical and
and also *
, because the total number of values you can construct with a product of type a b
or {a, b}
is a * b
. Incidentally, this encoding is also possible with union types in Flow.
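The arithmetic can be checked by brute force. In this sketch, two small "types" are written out as arrays of their possible values (bool and light are made-up examples), and we simply count how many values the sum and the product have.

```javascript
// Two small finite "types", written as arrays of their possible values.
const bool = [true, false]              // 2 values
const light = ['red', 'amber', 'green'] // 3 values

// Sum type: a value is EITHER a tagged bool OR a tagged light.
const sum = [
  ...bool.map(value => ({ tag: 'Left', value })),
  ...light.map(value => ({ tag: 'Right', value }))
]

// Product type: a value holds BOTH a bool AND a light.
const product = bool.flatMap(b => light.map(l => ({ b, l })))

console.log(sum.length)     // 5, i.e. 2 + 3
console.log(product.length) // 6, i.e. 2 * 3
```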
Armed with sum types and pattern matching (and Folktale), we can now write this JavaScript function:
const handleSteve = dict => lookupSteve(dict).matchWith({
  Just: ({ value }) => `Found ${value}`,
  Nothing: () => `Nothing found`
})
Let’s try it out:
> handleSteve(dict)
'Found 39'
> handleSteve(dict2)
'Nothing found'
Ship it.
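If you are curious how a matchWith method could work, here is a minimal hand-rolled sketch. It is not Folktale's actual implementation, just tagged objects that dispatch to the handler named after their own tag; describeMaybe is a hypothetical name.

```javascript
// Each constructor returns a tagged object whose matchWith picks the
// handler named after its own tag.
const Just = value => ({
  tag: 'Just',
  value,
  matchWith: handlers => handlers.Just({ value })
})
const Nothing = () => ({
  tag: 'Nothing',
  matchWith: handlers => handlers.Nothing({})
})

// The caller must supply one handler per constructor.
const describeMaybe = maybe => maybe.matchWith({
  Just: ({ value }) => `Found ${value}`,
  Nothing: () => 'Nothing found'
})

console.log(describeMaybe(Just(39)))  // Found 39
console.log(describeMaybe(Nothing())) // Nothing found
```

Unlike Haskell, nothing here checks at compile time that every case is covered, so a missing handler only shows up as a runtime error.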
Either
Either
is a datatype that represents a computation (or an effect) that might fail and produce an error. Whereas Maybe
simply encodes the presence or absence of a value, Either
gives us a choice between two distinct values. In Haskell, it looks like this:
data Either a b = Left a | Right b
Given two possible values, a
or b
, the possible values (or inhabitants) of this sum type are Left a
or Right b
. Traditionally, the Right
value is used for computations that are correct, and Left
for bad computations, or errors. Even Haskell is not above puns.
Not only is this a sum type, it is the canonical example of a sum type, since it represents a choice between two values rather than a choice between either a value or no value whatsoever. In this regard, it is the opposite, or dual, of a tuple, which is the canonical product type. They’re two ways of packaging up the same data.
We can easily extend our Haskell example from above with an error that provides information as to exactly why we failed:
data Error =
    EmptyDict
  | MissingKey
  deriving Show
safeGetE :: (Eq k) => k -> [(k,v)] -> Either Error v
safeGetE _ [] = Left EmptyDict
safeGetE key [(k,v)] = if key == k then Right v else Left MissingKey
safeGetE key ((k,v):kvs)
  | key == k = Right v
  | otherwise = safeGetE key kvs
It’s useful to encode errors in their own datatype rather than use strings on their own. The deriving Show
clause simply tells the compiler to auto-generate a useful string representation for debugging purposes. Our new safeGetE
function now provides a little more information when it fails:
λ> dict = [("Steve",39)]
λ> safeGetE "Steve" dict
Right 39
λ> safeGetE "Joe" dict
Left MissingKey
λ> safeGetE "Steve" []
Left EmptyDict
Maybe we need to know whether a value is simply missing or the dictionary we pass in is just completely empty. In fact, it would be useful to know the latter before even checking for the former. So that’s one way to enrich our error checking. Folktale provides a Result
object to cover this case:
const Result = require('folktale/result')
const EmptyDict = "EmptyDict"
const MissingKey = "MissingKey"
const safeGetE = key => dict =>
  dict.size === 0
    ? Result.Error(EmptyDict)
    : dict.has(key)
      ? Result.Ok(dict.get(key)) // return the stored value, not the key
      : Result.Error(MissingKey)
Which we can also test in the same way and with the same results:
> const dict = new Map().set("Steve", 39)
> safeGetE("Steve")(dict)
Result.Ok({ value: 39 })
> safeGetE("Joe")(dict)
Result.Error({ value: "MissingKey" })
> safeGetE("Steve")(new Map())
Result.Error({ value: "EmptyDict" })
Pattern matching works the same way as for Maybe
. And now we have error handling without exceptions, encoded directly in our data. But we can still do more.
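To show what that pattern matching might look like, here is a sketch that runs without Folktale. Ok and Err are hypothetical stand-ins for Result.Ok and Result.Error, and report is a made-up name; only the shape of the matchWith call is meant to mirror the library.

```javascript
// Hypothetical stand-ins for Result.Ok and Result.Error.
const Ok = value => ({ matchWith: handlers => handlers.Ok({ value }) })
const Err = value => ({ matchWith: handlers => handlers.Error({ value }) })

const EmptyDict = 'EmptyDict'
const MissingKey = 'MissingKey'

const safeGetE = key => dict =>
  dict.size === 0 ? Err(EmptyDict)
    : dict.has(key) ? Ok(dict.get(key))
    : Err(MissingKey)

// Every outcome, including each kind of error, is handled explicitly.
const report = result => result.matchWith({
  Ok: ({ value }) => `Found ${value}`,
  Error: ({ value }) =>
    value === EmptyDict ? 'The dictionary is empty' : 'No such key'
})

const dict = new Map([['Steve', 39]])
console.log(report(safeGetE('Steve')(dict)))      // Found 39
console.log(report(safeGetE('Joe')(dict)))        // No such key
console.log(report(safeGetE('Steve')(new Map()))) // The dictionary is empty
```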
Validation
Validation
is a datatype that represents a computation (or an effect) that might fail and produce an error. It's similar to Either
, except it accumulates the errors instead of failing on the first bad computation. This functionality makes this type useful for form validations, i.e. collecting the results of a series of related but independent computations. Once again, let's look first at the datatype as it might be defined in Haskell:
data Validation e a = Failure e | Success a
It certainly looks the same as Either
. Only the names are different. So how does it provide extra features? The secret lies in its use of the Applicative
type class. Much as our safeGet
functions above required their keys to implement Eq
in order to gain access to the equality function (==)
, there is a handy function in Haskell called (<*>)
that Validation
implements in order to aggregate error values (you may pronounce it "tie fighter"):
-- A Functor instance for Validation e is also required. Depending on your GHC
-- version and language settings, the signatures below may need InstanceSigs.
instance Semigroup e => Applicative (Validation e) where
  pure :: a -> Validation e a
  pure = Success
  (<*>) :: Validation e (a -> b) -> Validation e a -> Validation e b
  Success f <*> Success a = Success (f a)
  Success _ <*> Failure e = Failure e
  Failure e <*> Success _ = Failure e
  Failure e <*> Failure e' = Failure (e <> e')
This probably looks bizarre, but if you’ve been following everything up to this point, you’re well enough prepared. What Applicative
allows us to do is take a bunch of computations and concatenate their effects. It provides two functions.
pure
represents a "pure" computation, that is one without any side effects. It takes a value and injects it into a computational context without performing any operation on it and without any other effect. The equation presented here, pure = Success
, is in pointfree form. It is equivalent to pure x = Success x
, but since we have partial application, we can factor out the argument for a neater equation.
(<*>)
, defined here in its usual infix form, represents function application in a context. In this case, the context is the "effect" of validation, the effect provided by the Validation
datatype. It takes two arguments, and since they are both of type Validation
, it must pattern match on all possible inputs.
As you can see, two arguments drawn from a sum type with two constructors give four distinct combinations. Given two Success
values, the effect is to apply the function f
embedded in the first Success
to the value a
embedded in the second. This is what happens when we get a successful validation: the original value embedded in a type that lets us know that the validation succeeded. The middle two patterns simply return a Failure
value unchanged when it is paired with a Success
. The real star here is that last pattern, specifically the Failure (e <> e')
part.
You may have noticed that strange word Semigroup
in the definition of our Validation
instance. A semigroup is any structure with an associative append operation: some function that takes two such structures, does something (usually on the order of combining or selecting), and returns a new structure of the same type. If you know what a monoid is, a semigroup is a monoid without an identity.
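A few semigroups can be sketched in JavaScript as objects with an append function. The names arraySemigroup, stringSemigroup, maxSemigroup, and isAssociative are hypothetical, chosen only for this illustration; the first two combine their arguments and the third selects one of them.

```javascript
// Three hypothetical semigroups, each just an object with an append function.
const arraySemigroup = { append: (a, b) => a.concat(b) }
const stringSemigroup = { append: (a, b) => a + b }
const maxSemigroup = { append: (a, b) => Math.max(a, b) }

// Associativity: how the appends are grouped does not change the result.
const isAssociative = (semigroup, a, b, c) =>
  JSON.stringify(semigroup.append(semigroup.append(a, b), c)) ===
  JSON.stringify(semigroup.append(a, semigroup.append(b, c)))

console.log(arraySemigroup.append(['EmptyField'], ['NotMinLength']))
// [ 'EmptyField', 'NotMinLength' ]
console.log(isAssociative(arraySemigroup, [1], [2], [3]))  // true
console.log(isAssociative(stringSemigroup, 'a', 'b', 'c')) // true
console.log(isAssociative(maxSemigroup, 1, 5, 3))          // true
```

The array case is the one that matters for validation: appending two lists of errors gives one longer list of errors.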
What the above definition is stating is that for any value e
that we want to put into our Validation
type, that e
must itself implement the Semigroup
functionality (or, if you prefer, the Semigroup
interface). It’s just another typeclass in Haskell, but it provides us a guarantee that any e
will act like a semigroup when we want it to. When applying the tie fighter function to two Failure
values, the effect provided by Applicative
is to append the two values together, which is what the (<>)
operator is for.
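The four equations translate fairly directly into JavaScript. The sketch below uses hypothetical stand-ins for the two constructors and a made-up function name, ap, with arrays playing the role of the semigroup; it is meant to mirror the Haskell instance above, not any particular library.

```javascript
// Hypothetical stand-ins for the two Validation constructors.
const Success = value => ({ tag: 'Success', value })
const Failure = value => ({ tag: 'Failure', value })

// ap mirrors the four equations for (<*>): a function inside a Validation
// is applied to a value inside a Validation, and two failures are appended.
const ap = (vf, va) => {
  if (vf.tag === 'Success' && va.tag === 'Success') {
    return Success(vf.value(va.value))
  }
  if (vf.tag === 'Success') return va       // Success, Failure
  if (va.tag === 'Success') return vf       // Failure, Success
  return Failure(vf.value.concat(va.value)) // Failure, Failure
}

const double = Success(n => n * 2)
console.log(ap(double, Success(21)))
// { tag: 'Success', value: 42 }
console.log(ap(double, Failure(['bad input'])))
// { tag: 'Failure', value: [ 'bad input' ] }
console.log(ap(Failure(['no function']), Failure(['bad input'])))
// { tag: 'Failure', value: [ 'no function', 'bad input' ] }
```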
That’s a bunch of new concepts at once, so we should look at a real example, first in Haskell:
import Data.Functor (($>))
-- Datatype for an unvalidated form
data Form = Form {
    email :: String
  , password :: String
  } deriving Show
-- Datatype for errors
data Error =
    EmptyField
  | NotMinLength
  deriving Show
-- Validated value types
newtype Email = Email String
  deriving Show
newtype Password = Password String
  deriving Show
-- Datatype for a validated form
data ValidatedForm = ValidatedForm Email Password
  deriving Show
type FormValidation = Validation [Error]
-- Validation functions
notEmpty :: String -> FormValidation String
notEmpty "" = Failure [EmptyField]
notEmpty str = Success str
minLength :: String -> Int -> FormValidation String
minLength str n
  | length str >= n = Success str
  | otherwise = Failure [NotMinLength]
minPasswordLength :: Int
minPasswordLength = 8
-- Field validations
validateEmail :: String -> FormValidation Email
validateEmail input =
  notEmpty input $>
  Email input
validatePassword :: String -> FormValidation Password
validatePassword input =
  notEmpty input *>
  minLength input minPasswordLength $>
  Password input
-- Form validation
validateForm :: Form -> FormValidation ValidatedForm
validateForm (Form email password) =
  ValidatedForm <$>
  validateEmail email <*>
  validatePassword password
-- Constructor for an unvalidated form
mkForm :: String -> String -> Form
mkForm email password = Form email password
We start with a definition for a simple datatype representing forms that we’d like to validate. We can pair this with a function mkForm
, further down in the code, which acts like a constructor. Give mkForm
an email and a password, and it will shove them into a new Form
structure for you, albeit a form that has yet to be validated.
Once again, we use a fully-fledged sum type for our errors. This time, however, we also want to create special type wrappers for validated emails and passwords. A type like String
doesn't tell us anything special about email addresses, after all, much less passwords, so we want to constrain the number of possible values this type will accept. In addition, we can also create a type to represent forms that have already been validated (parameterized by our already-validated Email
and Password
types) as well as a type alias for a form validation, which isn't strictly necessary but does cut down on boilerplate and keystrokes.
The validations themselves are generic. Either notEmpty
or minLength
could be used for any fields, and they should look straightforward enough by this point.
Then things start to look strange. But don’t panic when you see weird operators. Something like <$>
is just another binary operator like +
; it just happens to do something more interesting than addition, and you don't have to memorize dozens of them. This one, for example, is just an infix synonym for fmap
, the function that maps another function over a functorial structure (like mapping a function over an array).
$>
is similar to fmap
, except that instead of applying a function it discards the value inside the structure and puts the value on its right in its place. So it's like a "then" operation, which is exactly what we want in a function that replaces the value in a form with either an error or the validated type (in this case, Email
or Password
) for that value. *>
is $>
for Applicative
, which means it does the same thing except that the replacement value is itself already in some kind of context, in this case the Validation
context.
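Here is roughly how these three operators behave, sketched in JavaScript with hypothetical stand-ins for the Validation constructors. The function names map, voidRight, and apSecond are made up for this illustration and correspond to <$>, $>, and *> respectively.

```javascript
// Hypothetical stand-ins for the two Validation constructors.
const Success = value => ({ tag: 'Success', value })
const Failure = value => ({ tag: 'Failure', value })

// <$> (fmap): apply a function to the value inside a Success.
const map = (f, v) => (v.tag === 'Success' ? Success(f(v.value)) : v)

// $>: keep the outcome of v, but replace a successful value with b.
const voidRight = (v, b) => map(() => b, v)

// *>: check both validations, keep the second value, accumulate failures.
const apSecond = (v1, v2) => {
  if (v1.tag === 'Failure' && v2.tag === 'Failure') {
    return Failure(v1.value.concat(v2.value))
  }
  return v1.tag === 'Failure' ? v1 : v2
}

console.log(map(s => s.length, Success('abc')))
// { tag: 'Success', value: 3 }
console.log(voidRight(Success('raw'), 'wrapped'))
// { tag: 'Success', value: 'wrapped' }
console.log(voidRight(Failure(['EmptyField']), 'wrapped'))
// { tag: 'Failure', value: [ 'EmptyField' ] }
console.log(apSecond(Failure(['EmptyField']), Failure(['NotMinLength'])))
// { tag: 'Failure', value: [ 'EmptyField', 'NotMinLength' ] }
```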
So — we glue together the basic data validations into more specialized field validations, validateEmail
and validatePassword
and then combine those into a single validateForm
function that uses tie fighter power to produce a final ValidatedForm
result, which will contain either the successfully validated and now properly-typed values or an accumulation of errors for us to inspect. Ta-da:
λ> form1 = mkForm "steve@email.com" "12345678"
λ> form2 = mkForm "steve@email.com" "123"
λ> form3 = mkForm "" ""
λ> validateForm form1
Success (ValidatedForm (Email "steve@email.com") (Password "12345678"))
λ> validateForm form2
Failure [NotMinLength]
λ> validateForm form3
Failure [EmptyField,EmptyField,NotMinLength]
With a bonus function, traverse
, we can even validate a collection of multiple forms:
λ> traverse validateForm [form1, form2, form3]
Failure [NotMinLength,EmptyField,EmptyField,NotMinLength]
Obviously, this isn’t super useful without more informative error types, but I think you can see the utility in this approach.
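For readers more at home in JavaScript, traverse over an array can be sketched as a fold. This is a simplified illustration specialized to arrays and to hand-rolled Validation objects; lift2, traverse, and this curried minLength are hypothetical names, not library functions.

```javascript
// Hypothetical stand-ins for the two Validation constructors.
const Success = value => ({ tag: 'Success', value })
const Failure = value => ({ tag: 'Failure', value })

// Combine two validations with a two-argument function, accumulating failures.
const lift2 = (f, v1, v2) => {
  if (v1.tag === 'Success' && v2.tag === 'Success') {
    return Success(f(v1.value, v2.value))
  }
  if (v1.tag === 'Failure' && v2.tag === 'Failure') {
    return Failure(v1.value.concat(v2.value))
  }
  return v1.tag === 'Failure' ? v1 : v2
}

// traverse: validate every item, then fold the results into one Validation
// holding either all validated values or every accumulated error.
const traverse = (validate, items) =>
  items.reduce(
    (acc, item) => lift2((values, value) => [...values, value], acc, validate(item)),
    Success([])
  )

const minLength = n => str =>
  str.length >= n ? Success(str) : Failure(['NotMinLength'])

console.log(traverse(minLength(3), ['abc', 'defg']))
// { tag: 'Success', value: [ 'abc', 'defg' ] }
console.log(traverse(minLength(3), ['abc', 'x', 'y']))
// { tag: 'Failure', value: [ 'NotMinLength', 'NotMinLength' ] }
```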
And now we can try Folktale’s Validation
in JavaScript:
const { Success, Failure, collect } = require('folktale/validation')
const EmptyField = "EmptyField"
const NotMinLength = "NotMinLength"
const mkForm = (email = '', password = '') => ({ email, password })
const notEmpty = value => value
  ? Success(value)
  : Failure([EmptyField])
const minLength = (value, n) => value.length >= n
  ? Success(value)
  : Failure([NotMinLength])
// Each field validator takes the raw input as its only argument
const validateEmail = input =>
  notEmpty(input)
const minPasswordLength = 8
const validatePassword = input =>
  notEmpty(input).concat(
    minLength(input, minPasswordLength))
const validateForm = form => collect([
  validateEmail(form.email),
  validatePassword(form.password)
])
The typing isn’t nearly as robust, and we have to ensure that our structures are law-abiding ourselves (or just hope that they are), but the result is similar:
> const form1 = mkForm("steve@email.com", "12345678")
> const form2 = mkForm("steve@email.com", "123")
> const form3 = mkForm("", "")
> validateForm(form1)
Validation.Success({ value: "12345678" })
> validateForm(form2)
Validation.Failure({ value: ["NotMinLength"] })
> validateForm(form3)
Validation.Failure({ value: ["EmptyField", "EmptyField", "NotMinLength"] })
One difference, obviously, is that the Success
value returned by the JavaScript version of the validateForm
function only contains the last successfully validated value. You could probably jury-rig a way around that limitation, if you needed to. You could also just use a different language.
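One possible way around that limitation, sketched here without Folktale, is to combine validations with a function that decides what to keep on success. Everything below is a hypothetical illustration: Success and Failure are hand-rolled stand-ins, and lift2 is a made-up helper rather than part of any library.

```javascript
// Hypothetical stand-ins for the two Validation constructors.
const Success = value => ({ tag: 'Success', value })
const Failure = value => ({ tag: 'Failure', value })

// Combine two validations with f on success; append the errors on failure.
const lift2 = (f, v1, v2) => {
  if (v1.tag === 'Success' && v2.tag === 'Success') {
    return Success(f(v1.value, v2.value))
  }
  if (v1.tag === 'Failure' && v2.tag === 'Failure') {
    return Failure(v1.value.concat(v2.value))
  }
  return v1.tag === 'Failure' ? v1 : v2
}

const notEmpty = value => (value ? Success(value) : Failure(['EmptyField']))
const minLength = (value, n) =>
  value.length >= n ? Success(value) : Failure(['NotMinLength'])

const validateEmail = input => notEmpty(input)
const validatePassword = input =>
  lift2((_, password) => password, notEmpty(input), minLength(input, 8))

// Unlike collect, this keeps both validated values when everything succeeds.
const validateForm = form =>
  lift2(
    (email, password) => ({ email, password }),
    validateEmail(form.email),
    validatePassword(form.password)
  )

console.log(JSON.stringify(validateForm({ email: 'steve@email.com', password: '12345678' })))
// {"tag":"Success","value":{"email":"steve@email.com","password":"12345678"}}
console.log(JSON.stringify(validateForm({ email: '', password: '' })))
// {"tag":"Failure","value":["EmptyField","EmptyField","NotMinLength"]}
```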
Whether you do or not, however, I do hope this article at least demonstrates that writing code in a functional style, while providing obvious benefits, also has properties that make it universal, regardless of the language you happen to be using. The patterns are always the same.
What distinguishes Haskell from JavaScript is not that Haskell is “functional” while JavaScript is not. And it’s not that Haskell is difficult to learn while JavaScript is easy. It’s simply that Haskell makes functional programming more straightforward, because it’s the language’s native idiom. In other words, if you prefer this style or even just recognize its strengths and benefits, Haskell gives you a bunch of stuff for free that you would otherwise have to implement yourself. And for those who don’t want to abandon the JavaScript universe completely, see also PureScript.
Why should you care about typed, functional programming? The data validation example sort of makes the point on its own: if you care about validating the data that users input into your forms, why shouldn’t you care just as much about the data that gets input into your functions? If you think all these static types just add complexity, it’s worth pointing out that types are also functions. They just happen to be operating on a higher level of abstraction. So when you define a type, you’re really defining a functional shortcut for validating your data.
When you specify all of your types for all of your functions, you have everything in front of you when examining your code. You haven’t written a series of instructions for telling a machine what to do, and you therefore aren’t having to mentally track the hidden contents of various memory stores and stateful objects. Instead, you have stated what the shape of your data should be and what sorts of transformations (i.e. functions) you would like to perform on that data. You have, that is, written your program in a declarative style, and that ought to be even more compelling than the notion of a functional style.
You may also be surprised to learn that this sort of programming is far and away the most common in the world. People who work with numbers like to have pure functions—functions with predictable outputs—and they like to have all their data represented in clearly-defined, that is to say clearly-validated, types. They do not want to have to look at a screen of text and numbers and just guess what they’re supposed to mean. Millions of spreadsheet users can’t be wrong: declarative programming is the way to go.