Using PureScript to create a domain-specific language for building forms with validation

In a previous post we showed how we use PureScript and Haskell at Lumi to improve the correctness of our code. Using languages with such powerful type systems not only helps us write more correct code, but also helps us think about domains and behavior, increasing our creativity and efficiency. However, the expressiveness of these languages is not only to be observed in their type systems, but also in their syntax, which allows us to write, and use, embedded domain-specific languages with ease.

In this post, we will go through the process of creating a small type-safe embedded domain-specific language for specifying form components with validation in PureScript, explaining all the whys and hows.

Embedded domain-specific languages

Ever wished you could use a new language perfectly tailored to solve the specific problem at hand? These are called domain-specific languages (DSLs), computer languages specialized to a particular application domain. Some examples of DSLs are: HTML, CSS, Elm, SQL, GLSL, LaTeX, and Markdown — used to write this very article.

Creating a new language for each specific domain is, however, highly impractical in modern software engineering. That’s why a common practice is to use embedded domain-specific languages (EDSLs), that is, domain-specific languages that are implemented as libraries for a general-purpose “host language”. This allows the EDSL to exploit the host language’s infrastructure, including its syntax, type system and modularity, while enabling programmers to work at a much higher level of abstraction.

Languages like Haskell and PureScript have such a rich syntax that using an EDSL in these languages can feel like writing in a different one. An example of this is what we will have by the end of this tutorial: an embedded domain-specific language for specifying form components with validation. Here is what a form written in this EDSL can look like:

To write this form, we start with simple form fields like textbox, passwordBox, numericInput or array, then add some focus to better see the parts of the complex data structure the form edits. We then add some syntactic sugar to the mixture with ado, sprinkle in some labels, section separators and validation with the validated combinator, and, voilà, we have a form! Here is the code that generates the form in the picture above:

userForm :: Form UserFormData User
userForm = ado
  section "User data"
  username <-
    label "Username" Required
      $ focus (prop (SProxy :: SProxy "username"))
      $ validated (nonEmpty "Username")
      $ textbox

  password <- wizard do
    password <- step
      $ label "Password" Required
      $ focus (prop (SProxy :: SProxy "password"))
      $ validated (nonEmpty "Password")
      $ passwordBox
    passwordConfirmation <- step
      $ label "Password confirmation" Required
      $ focus (prop (SProxy :: SProxy "passwordConfirmation"))
      $ validated
          (mustEqual "Passwords must match" (toString password))
      $ passwordBox
    pure password

  section "Personal data"
  age <-
    label "Age" Optional
      $ focus (prop (SProxy :: SProxy "age"))
      $ numericInput

  addresses <-
    label "Addresses" Neither
      $ focus (prop (SProxy :: SProxy "addresses"))
      $ array defaultAddress addressForm

  in User { username, password, age, addresses }

The reader is encouraged to return to this code snippet by the end of every section to understand how the developed concepts are used in a real example.

From reading the code above, one can already spot some advantages of using this approach (frequently called language-oriented programming). The most significant one, however, is that it allows us to exploit the host language’s syntax — in this case PureScript — to write form specifications in a clear, type-safe, and composable way. This is why EDSLs are also often called combinator libraries, as we can use them to build complex structures out of simpler, composable ones.

A language for specifying forms

In a strict sense, a form is a user interface capable of modifying a data structure. It has an input and an output (or result). A translation of this definition to a PureScript type is:

newtype Form a = Form (a -> (a -> Effect Unit) -> UI)

Here, a is the type of the data structure which is manipulated by the form, a -> Effect Unit is a callback for when the input value gets changed in the user interface (henceforth called the "change callback"), and UI is a visual representation of the interface (can be a Virtual DOM, React's JSX, GTK widgets, or even text).

A better definition of Form, however, must differentiate between input and output. One reason for this is that different UI representations might yield different input types; for example, a user interface library might use a specific data type for a checkbox's checked state instead of a Boolean, which is the desired output for such an input field. Larger forms might also produce an output that is more condensed than the set of fields they contain. For these reasons, we can rewrite our definition of the Form type as:

newtype Form i a =
  Form (i -> { ui :: (i -> Effect Unit) -> UI, result :: a })

The reason for including the change callback in the ui field of the result record is that it is considered a part of the user interface, and should not be a requirement for obtaining an output from the form.
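To make the definition concrete, here is a minimal sketch of a single-field form. It assumes a purely textual UI (an array of lines), which is only one of the possible representations mentioned earlier. The UI alias, the runForm helper and this particular textbox are illustrative assumptions, not the definitions used at Lumi.

```purescript
module FormSketch where

import Prelude

import Effect (Effect)

-- Assumption: the UI is rendered as plain text, one line per field.
type UI = Array String

newtype Form i a =
  Form (i -> { ui :: (i -> Effect Unit) -> UI, result :: a })

-- Hypothetical helper: unwrap a form and run it on an input value.
runForm
  :: forall i a
   . Form i a
  -> i
  -> { ui :: (i -> Effect Unit) -> UI, result :: a }
runForm (Form f) = f

-- A text field: it displays the current value and returns it unchanged.
-- A textual UI cannot react to events, so the change callback is ignored.
textbox :: Form String String
textbox =
  Form \value ->
    { ui: \_ -> [ "[ " <> value <> " ]" ]
    , result: value
    }

-- The result is available without ever supplying a change callback.
currentResult :: String
currentResult = (runForm textbox "hello").result
```

Note how currentResult is computed without touching the ui field, which is exactly why the change callback lives inside it.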

Combining forms

According to the definition of Form i a, a form can be any user interface capable of manipulating a data structure of type i and yielding a result of type a. In particular, single input fields such as text boxes and check boxes may be treated like simple forms themselves.

In order to build larger forms from many fields, we might try to build a Form by hand each time. However, we would prefer to find a way of combining these single-input forms into larger, more useful forms. In order to do this, we can make use of a few PureScript type classes, notably the Applicative type class.

In order to instantiate the Applicative type class for Form, we must also have instances of the Functor and Apply type classes. When instantiated to Form, these three classes provide us with the following three functions:

  • Functor: map :: (a -> b) -> Form i a -> Form i b
  • Apply: apply :: Form i (a -> b) -> Form i a -> Form i b
  • Applicative: pure :: a -> Form i a

The map function from the Functor type class allows us to change the result type of a form, while the pure function from Applicative allows us to create an empty form that produces a specific, pre-defined result.

In this case, however, the most important of the three functions is apply, from the Apply type class. When used together with map, the apply function allows us to combine multiple forms (both their UIs and their result types), so that we can create a larger form out of multiple, individual ones.

The Applicative type class is often associated with side effects. In this context, we can think of the combination of form UIs as a side effect of composing forms and their results, much like the Writer monad, which accumulates a monoidal value on the side.
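The three instances can be sketched as follows. This is one possible implementation, assuming that the UI type is a monoid (here an array of text lines) so that two UIs can be concatenated; the field and runForm helpers are hypothetical and exist only to try the instances out.

```purescript
module FormInstances where

import Prelude

import Effect (Effect)

-- Assumption: the UI is a monoid, so two UIs can be appended with <>.
type UI = Array String

newtype Form i a =
  Form (i -> { ui :: (i -> Effect Unit) -> UI, result :: a })

-- map changes only the result and leaves the UI untouched.
instance functorForm :: Functor (Form i) where
  map g (Form f) =
    Form \i ->
      let
        r = f i
      in
        { ui: r.ui, result: g r.result }

-- apply runs both forms on the same input, concatenates their UIs
-- and applies the first result to the second.
instance applyForm :: Apply (Form i) where
  apply (Form ff) (Form fa) =
    Form \i ->
      let
        rf = ff i
        ra = fa i
      in
        { ui: \onChange -> rf.ui onChange <> ra.ui onChange
        , result: rf.result ra.result
        }

-- pure is the empty form: no UI and a fixed result.
instance applicativeForm :: Applicative (Form i) where
  pure a = Form \_ -> { ui: \_ -> mempty, result: a }

-- Hypothetical helper: unwrap a form and run it on an input value.
runForm
  :: forall i a
   . Form i a
  -> i
  -> { ui :: (i -> Effect Unit) -> UI, result :: a }
runForm (Form f) = f

-- Hypothetical read-only field, used only to try the instances out.
field :: String -> Form String String
field name =
  Form \value -> { ui: \_ -> [ name <> ": " <> value ], result: value }

both :: Form String String
both = (\a b -> a <> "/" <> b) <$> field "first" <*> field "second"
```

Running both on the input "x" yields the result "x/x" together with a UI made of the two lines "first: x" and "second: x", in that order.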

Considering that <$> is the infix operator for map, and <*> is the one for apply, an example of this is:

personForm =
  (\firstName lastName { address, city } -> firstName <> " " <> lastName <> " lives at " <> address <> ", " <> city)
    <$> firstNameForm
    <*> lastNameForm
    <*> addressForm

Or, using PureScript’s ado-notation, syntactic sugar that lets us write applicative expressions (such as the example above) in a syntax inspired by do-notation:

personForm = ado
  firstName <- firstNameForm
  lastName <- lastNameForm
  { address, city } <- addressForm
  in firstName <> " " <> lastName <> " lives at " <> address <> ", " <> city

This is the basis of our DSL for specifying form components: a Form data type with input and output types, a current value, and a user interface (composed of a change callback and a UI representation); and (polymorphic) functions that let us combine multiple Forms.

Editing a data structure with a form

The goal of this DSL is to combine multiple small forms (as single input fields are also forms) to obtain a more complex one that can be used to edit a large data structure. Consider the case of the previous example, where we could have the following data structure:

type PersonFormData =
  { firstName :: String
  , lastName :: String
  , address ::
      { address :: String
      , city :: String
      }
  }

In this case, each single form field (a Form value itself) must have PersonFormData as its input type, and this means that we wouldn't be able to, for example, use a general text box for the firstName field, as this text box would have the type:

textbox :: Form String String

Fortunately, it’s possible to fix this problem by using lenses.

Also called functional references, lenses are composable functions that let us edit some part of a data structure. Lenses are a natural fit for forms, whose purpose is to visually edit a large structure in terms of its smaller parts.
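As a small illustration of what lenses offer outside of forms, the sketch below views and updates a nested field. It assumes the purescript-profunctor-lenses API from the SProxy era used throughout this post (newer releases use Proxy instead); the Person type and all values are made up for the example.

```purescript
module LensSketch where

import Prelude

import Data.Lens (Lens', set, view)
import Data.Lens.Record (prop)
import Data.Symbol (SProxy(..))

-- Hypothetical data structure, used only for this example.
type Person = { name :: String, address :: { city :: String } }

_address :: Lens' Person { city :: String }
_address = prop (SProxy :: SProxy "address")

_city :: forall r. Lens' { city :: String | r } String
_city = prop (SProxy :: SProxy "city")

-- Lenses compose with ordinary function composition.
_personCity :: Lens' Person String
_personCity = _address <<< _city

alice :: Person
alice = { name: "Alice", address: { city: "Oslo" } }

-- view reads the part the lens points to.
cityOfAlice :: String
cityOfAlice = view _personCity alice

-- set returns an updated copy; the rest of the structure is unchanged.
movedAlice :: Person
movedAlice = set _personCity "Bergen" alice
```

Here cityOfAlice is "Oslo", while movedAlice has the city "Bergen" and still carries the name "Alice".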

The purescript-profunctor-lenses library provides a comprehensive set of types and composable functions for updating, viewing and setting values within nested data structures. Given that the type Lens' s a describes a way of viewing or transforming a value of type a that always exists somewhere inside a structure of type s, a function that changes a form to focus on a small part of a larger structure would have type:

focus :: forall i j a. Lens' i j -> Form j a -> Form i a

Knowing that the functions view and set (from the purescript-profunctor-lenses library) allow us to, respectively, view and set a value to which the specified lens points, we can implement the focus function as:

focus :: forall i j a. Lens' i j -> Form j a -> Form i a
focus l (Form f) =
  Form \i ->
    let
      { ui, result } = f (view l i)
    in
      { ui: \onChange -> ui (onChange <<< flip (set l) i)
      , result
      }

With focus, we are now able to edit complex data structures with Form. An example is the implementation of personForm below. Note that the prop function creates a lens that focuses on a specific field of a record (given an SProxy to the field's label):

personForm :: Form PersonFormData String
personForm = ado
  firstName <- focus (prop (SProxy :: SProxy "firstName")) textbox
  lastName <- focus (prop (SProxy :: SProxy "lastName")) textbox
  { address, city } <- focus (prop (SProxy :: SProxy "address")) addressForm
  in firstName <> " " <> lastName <> " lives at " <> address <> ", " <> city

-- The result is the record itself, so the output type is a record, not a String.
addressForm
  :: Form
       { address :: String, city :: String }
       { address :: String, city :: String }
addressForm = ado
  address <- focus (prop (SProxy :: SProxy "address")) textbox
  city <- focus (prop (SProxy :: SProxy "city")) textbox
  in { address, city }

Notice how the lenses used in the definition of addressForm compose with the one used in personForm when applying focus to addressForm. This is the power of composable lenses, and how they allow us to edit a large, complex data structure with our (so far) simple DSL.

Adding validation to the language

Validated forms are nothing more than forms whose output depends on some validation criteria. Let’s recall the definition of our Form type:

newtype Form i a = Form (i -> { ui :: (i -> Effect Unit) -> UI, result :: a })

In this definition we can see that, if we provide an (unvalidated) input value to the form, we will always get back a user interface and an output value.

In order to have a validated form, then, we must simply change the return type of the function wrapped in the Form constructor. If a validated form does not always return a valid output, then it's natural to change this definition to:

newtype Form i a =
  Form (i -> { ui :: (i -> Effect Unit) -> UI, result :: Maybe a })

Note that, with this change, the Functor, Apply and Applicative instances of Form must also be changed to take the optional return value into account. If a form yields Nothing as its result, it is invalid, as we cannot retrieve a value of type a by running the form. If it yields a Just value, however, it is valid.

This, in turn, changes the way the forms behave. That is, although all the form UIs get concatenated in an applicative chain (with ado-notation or with <$> and <*>), the final result is only available if all the intermediary forms yield a valid result.

Handling validation errors

The choice of the type constructor used to wrap the form’s output value may be contested. Maybe was chosen because we don't want to handle validation errors outside the form. Other valid choices for this type constructor could be Either or the V type, from the purescript-validation library. The latter would allow us to accumulate validation errors if more than one form field is invalid.

With Maybe, however, we chose to encapsulate the handling of validation errors in the user interface itself. In this sense, we need a way of transforming a form in order to display validation errors. But first we must define what validation actually is.

In the context of validated forms, a value can be either valid or invalid. If invalid, we would like to know what happened, that is, we would like to have a validation error. If it is valid, however, we would probably like to transform it into an output value of a different type. For example, a required text field might have its input encoded as a String, whereas its output value should be a NonEmptyString.

Thus, we define a validator as:

type Validator i a = i -> Either String a

Some examples of validators are:

nonNull :: String -> Validator (Maybe a) a
nonNull name = maybe (Left (name <> " is required.")) Right

nonEmpty :: String -> Validator String NonEmptyString
nonEmpty name s =
  case NonEmptyString.fromString s of
    Just nes -> Right nes
    Nothing -> Left (name <> " is required.")

-- The error message comes first, matching how mustEqual is called in the forms.
mustEqual :: forall a. Eq a => String -> a -> Validator a a
mustEqual error value1 value2 =
  if value1 == value2
    then Right value1
    else Left error
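Because a Validator is just a function into Either, validators can be chained with Kleisli composition, so that the second one only runs if the first succeeds. The sketch below illustrates this; minLength and passwordValidator are hypothetical additions that do not appear elsewhere in this post.

```purescript
module ValidatorSketch where

import Prelude

import Data.Either (Either(..))
import Data.Maybe (Maybe(..))
import Data.String as String
import Data.String.NonEmpty (NonEmptyString)
import Data.String.NonEmpty as NonEmptyString

type Validator i a = i -> Either String a

nonEmpty :: String -> Validator String NonEmptyString
nonEmpty name s =
  case NonEmptyString.fromString s of
    Just nes -> Right nes
    Nothing -> Left (name <> " is required.")

-- Hypothetical validator, not part of the original set.
minLength :: Int -> Validator NonEmptyString NonEmptyString
minLength n nes =
  if String.length (NonEmptyString.toString nes) >= n
    then Right nes
    else Left ("Must have at least " <> show n <> " characters.")

-- Kleisli composition: minLength only runs if nonEmpty succeeds.
passwordValidator :: Validator String NonEmptyString
passwordValidator = nonEmpty "Password" >=> minLength 8

-- passwordValidator ""       == Left "Password is required."
-- passwordValidator "secret" == Left "Must have at least 8 characters."
```

Only the first failure is reported, which is consistent with using Either (rather than V) as the result of a validator.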

The one thing left to properly integrate validation into our DSL is a combinator that uses a Validator. Given a Validator, this combinator must change a Form to display validation errors if its result is invalid, and change the result type:

validated :: forall i a b. Validator i b -> Form i a -> Form i b
validated validator (Form f) =
  Form \v ->
    let
      -- The input is a plain value of type i at this point.
      { ui } = f v
      err = validator v
    in
      { ui: \onChange -> displayValidationError err (ui onChange)
      , result: hush err
      }

That’s it! Given a function displayValidationError that appends a validation error to the form, we can use a function as simple as validated to validate our forms and display errors.

The validated function as defined above doesn't, however, take the state of the form fields into account when displaying validation errors. That is, if a form field is unmodified but is validated as a non-empty string, the error message will be displayed anyway. This is not a good user experience, and, to fix it, we must track the state of a form field when it is being validated.

Tracking form field state for validation messages

Consider a simple change to the behavior of our form: validation errors should only be displayed if a form field has been modified.

As with all other combinators in our DSL, if we want to change the behavior of a form based on its input, we should create a new combinator that acts on it, possibly changing the input type. This is exactly what we want in this case, that is, to embellish the input of a form field with an additional state: whether or not it has been modified. This can be achieved with a simple type:

type Validated a = { value :: a, modified :: Boolean }

The Validated type defines the state of a form field on which validation is performed. It contains a value and can be either a fresh or a modified field. Another possibility for this is to change the definition of Form to encapsulate the input type in Validated, but we chose the first option for the sake of simplicity.

The only change left now is to effectively track this state on validated fields. For this, we need to change the result of the validated function to be a Form whose input is of type Validated i. Notice, also, how modified is set to true when building the resulting user interface:

validated
  :: forall i a b
   . Validator i b
  -> Form i a
  -> Form (Validated i) b
validated validator (Form f) =
  Form \v ->
    let
      { ui } = f v.value
      err = validator v.value
    in
      { ui: \onChange ->
          let
            field = ui (onChange <<< { value: _, modified: true })
          in
            -- Only display the error once the field has been modified.
            if v.modified
              then displayValidationError err field
              else field
      , result: hush err
      }

With this simple change, every time the value of a form field is changed, its modified flag is set to true. As a consequence, in order for this mechanism to work as expected, we must initialize all form fields of type Validated i with the modified flag set to false.
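Initializing the fields can be sketched with a small helper. The fresh function and the LoginFormData type below are hypothetical names introduced only for illustration.

```purescript
module ValidatedState where

type Validated a = { value :: a, modified :: Boolean }

-- Hypothetical helper: wrap an initial value as an unmodified field.
fresh :: forall a. a -> Validated a
fresh value = { value, modified: false }

-- Hypothetical input type for a form with two validated fields.
type LoginFormData =
  { username :: Validated String
  , password :: Validated String
  }

-- Every field starts out unmodified, so no errors are shown initially.
initialLoginFormData :: LoginFormData
initialLoginFormData =
  { username: fresh ""
  , password: fresh ""
  }
```

A helper like this keeps the initial state in one place, so a field cannot accidentally start out as modified.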

Note that we can include all sorts of different states in the Validated type. Some interesting possibilities include asynchronous validation, debouncing, validation based on the focus state, and much more.

Monadic validation and wizards

One interesting fact about the DSL we have so far is that it admits a Monad instance. This is another advantage of choosing Maybe for the form's output value (over V, for example), as Maybe has a lawful Monad instance.

What this means in terms of behavior is that, in a monadic chain, subsequent form fields are only displayed if all the previous ones produce valid results, just as in a wizard.

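The problem, however, is that this Monad instance is incompatible with the Apply instance of Form: the Apply instance always displays every field, whereas an apply derived from the Monad instance would hide all the fields after the first invalid one. In order to remedy this situation, we can define a newtype that simply encapsulates a Form and adds this wizard-like behavior in a compatible way: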

newtype Wizard i a = Wizard (Form i a)

wizard :: forall i. Wizard i ~> Form i
wizard (Wizard form) = form

step :: forall i. Form i ~> Wizard i
step = Wizard

The Functor and Applicative instances for Wizard can be newtype-derived. The Apply instance, however, must be defined in terms of ap, so that it is compatible with the Monad instance, which is left as an exercise for the reader.
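For readers who want to check their answer, here is one possible solution to the exercise. It is a sketch under the assumption that the UI type is a monoid (an array of text lines here); the runWizard helper and the FormResult alias are hypothetical, and the Functor and Applicative instances are written by hand only to keep the block self-contained.

```purescript
module WizardSketch where

import Prelude

import Data.Maybe (Maybe(..))
import Effect (Effect)

-- Assumption: the UI is a monoid, so two UIs can be appended with <>.
type UI = Array String

type FormResult i a = { ui :: (i -> Effect Unit) -> UI, result :: Maybe a }

newtype Form i a = Form (i -> FormResult i a)

newtype Wizard i a = Wizard (Form i a)

-- Hypothetical helper: run a wizard on an input value.
runWizard :: forall i a. Wizard i a -> i -> FormResult i a
runWizard (Wizard (Form f)) = f

instance functorWizard :: Functor (Wizard i) where
  map g w =
    Wizard $ Form \i ->
      let
        r = runWizard w i
      in
        { ui: r.ui, result: map g r.result }

-- Defined via ap so that it agrees with the Monad instance.
instance applyWizard :: Apply (Wizard i) where
  apply = ap

instance applicativeWizard :: Applicative (Wizard i) where
  pure a = Wizard $ Form \_ -> { ui: \_ -> mempty, result: Just a }

-- The next step is only built, and its UI only displayed, once the
-- current step has produced a valid result.
instance bindWizard :: Bind (Wizard i) where
  bind w k =
    Wizard $ Form \i ->
      let
        r = runWizard w i
      in
        case r.result of
          Nothing -> { ui: r.ui, result: Nothing }
          Just a ->
            let
              next = runWizard (k a) i
            in
              { ui: \onChange -> r.ui onChange <> next.ui onChange
              , result: next.result
              }

instance monadWizard :: Monad (Wizard i)
```

The Nothing branch is what produces the wizard-like behavior: when a step is invalid, only the UI accumulated so far is returned and the continuation is never run.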

Besides providing us with the wizard-like behavior, the greatest advantage of having a Monad instance is that we're now able to have sequential dependencies on validated values.

One good illustration of this is a form containing fields for password and password confirmation, where the password confirmation is validated with mustEqual over the valid password:

passwordForm :: Form _ NonEmptyString
passwordForm = wizard do
  password <- step
    $ focus (prop (SProxy :: SProxy "password"))
    $ validated (nonEmpty "Password")
    $ passwordBox
  passwordConfirmation <- step
    $ focus (prop (SProxy :: SProxy "passwordConfirmation"))
    $ validated
        (mustEqual "Passwords must match" (toString password))
    $ passwordBox
  pure password

Here’s what it looks like in action (the field labels are a different combinator, not defined here as it depends on the choice for UI):

Fetching remote data with Wizard

The convenience introduced by the Monad instance of Wizard begets its exploitation. For example, if the UI representation allows for side-effects to be performed when it is rendered, then it is possible to have a dummy form field that fetches some remote data that is used within the form:

fetch :: Aff a -> Form (Maybe a) a
fetch = ...

countryStateForm :: Form _ Address
countryStateForm = wizard do
  countries <- step
    $ focus (prop (SProxy :: SProxy "countries"))
    $ fetch loadAllCountries

  country <- step
    $ focus (prop (SProxy :: SProxy "country"))
    $ validated (nonNull "Country")
    $ select countries

  states <- step
    $ focus (prop (SProxy :: SProxy "states"))
    $ fetch (loadStatesForCountry country)

  step ado
    state <-
      focus (prop (SProxy :: SProxy "state"))
        $ validated (nonNull "State")
        $ select states
    postalCode <-
      focus (prop (SProxy :: SProxy "postalCode"))
        $ validated (nonEmpty "Postal code")
        $ textbox
    in Address { country, state, postalCode }

The step combinator transforms a Form into a Wizard, and, due to Wizard's Monad instance, steps that follow each other in a monadic chain are sequential, that is, the next step is only available if all the previous steps are valid and have produced an output value. This happens because Monad introduces sequential dependency on values. One can notice this sequential dependency in the example above, as the country field depends on the value countries produced by the first form field.

The code above generates this simple, yet interesting form.

Conclusion

The embedded domain-specific language described in this post for specifying and generating type-safe form UIs with validation has proven itself a successful experiment at Lumi. It has allowed us not only to build our forms in a type-safe way, avoiding errors and enforcing constraints on the data structures and on the behavior of the forms, but also to build new forms much faster and more consistently, with a small language that can produce complex and reusable forms that are nevertheless easy to understand and maintain.

The types and functions described here are also a simplification of what we actually use at Lumi. Since we use purescript-react-basic to render our user interfaces, the UI type used throughout this post is different. Based on our UI guidelines for forms, we have also decided to use a more specific UI type that takes into account the input fields' labels, validation errors and nested fields. Other improvements worth mentioning are the inclusion of a props parameter in the Form type, which is used as an argument to the function wrapped by Form (besides the input value of type i), and a type class that generates default data for empty forms.

Some alternative approaches to the problem are worth mentioning, most notably David Peter’s purescript-flare, and Formlets, as described in the paper The Essence of Form Abstraction by Cooper et al., which served as inspiration for the approach presented here.

If you have any questions, ideas, or any other kind of feedback, please leave a comment below.

And if you’re interested in using PureScript to solve real-world problems like this one, then get in touch — we’re hiring at Lumi!