Announcing ‘conformance’

Nick Van den Broeck
Casper Association R & D
Dec 20, 2023

This is a cross-publication of a post by our team member Tom Sydney Kerckhove, originally published here.

This post announces the conformance library factored out from ical. It implements RFC 2119 in order to help you write implementations for other specifications. The conformance library exists to let you write a single parser that you can then run in multiple modes: strict and lenient.

Robustness and testing

If you have ever implemented a specification, you have probably seen a section like this:

The key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL in this document are to be interpreted as described in RFC 2119.

This section makes it clear how common terms like MUST should be interpreted. It is to be read in the context of the robustness principle:

Be conservative in what you do, be liberal in what you accept from others.

Let’s say you’ve implemented a specification. Usually that means that your implementation does several things: it produces and renders data, performs API calls, parses data, and serves API calls. Our goal in implementing specifications is to test that

  • you can parse the data that you produce strictly;
  • you can parse others’ data leniently.

Approached naively, this would involve two implementations of a parser: a strict one and a lenient one. The conformance library exists to let you write a single parser and run it in multiple modes: strict and lenient.
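To make the two testing goals concrete, here is a minimal sketch that uses only base and does not use the conformance library. The `Mode` type, `render`, and `parse` are hypothetical names invented for this illustration, and the toy format (an upper-case boolean) is an assumption, not something from a real specification.

```haskell
import Data.Char (toUpper)

-- | Hypothetical mode flag, for this sketch only.
data Mode = Strict | Lenient
  deriving (Show, Eq)

-- | Be conservative in what you produce: always render upper-case.
render :: Bool -> String
render True = "TRUE"
render False = "FALSE"

-- | One parser, two modes: lenient mode ignores case, strict mode does not.
parse :: Mode -> String -> Either String Bool
parse mode s =
  case normalise s of
    "TRUE" -> Right True
    "FALSE" -> Right False
    _ -> Left ("Not a boolean: " ++ show s)
  where
    normalise = case mode of
      Strict -> id
      Lenient -> map toUpper

main :: IO ()
main = do
  -- Goal 1: everything we produce parses strictly.
  print (all (\b -> parse Strict (render b) == Right b) [True, False]) -- True
  -- Goal 2: sloppier data from others parses leniently, but not strictly.
  print (parse Lenient "true") -- Right True
  print (parse Strict "true") -- Left "Not a boolean: \"true\""
```

The point of the sketch is only that a single parser definition serves both tests; the library generalises this idea so that the mode is chosen when running the parser rather than threaded through as an argument.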

Robustness in the face of violations

There is a second problem that the conformance library solves: Powerful implementers violating the specification in externally fixable ways. For example, take the case of the Production Identifier (PRODID) property of calendars in the Internet Calendaring (RFC 5545) specification. The specification says:

Conformance: The property MUST be specified once in an iCalendar object.

And we know from RFC 2119 what MUST means:

MUST: This word, or the terms REQUIRED or SHALL, mean that the definition is an absolute requirement of the specification.

The specification is very clear: Any implementation is supposed to reject a calendar without the PRODID property. Right?

Along comes Apple, which exports this calendar file without any PRODID property (as of this post’s publication date):

BEGIN:VCALENDAR
VERSION:2.0
X-WR-CALNAME:with Syd
X-APPLE-CALENDAR-COLOR:#ff2d55
END:VCALENDAR

Assuming that we don’t really need the PRODID for our application, we are faced with two options when implementing the iCal specification:

  • Reject this calendar, and never allow users to integrate with Apple.
  • Accept this calendar, and break the specification, contributing to this problem.

We could of course attempt to contact Apple, but that doesn’t help in the meantime, and we don’t know how long the meantime will last. This problem is also not exclusive to Apple: we have found invalid calendar files from Google Calendar, Microsoft Outlook, Booking.com, and Fastmail as well. Instead, we are forced to at least give users the option of accepting invalid data. The conformance library solves this by providing an “extra lenient” mode for parsers, in which fixable errors like these are repaired by guessing a fix.
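As a rough illustration of "guessing a fix" for the missing PRODID, here is a base-only sketch that does not use the conformance library or ical. The names `Mode`, `parseProdId`, and `guessedProdId`, as well as the placeholder value itself, are assumptions made up for this example; the real ical implementation may guess differently.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

-- | Hypothetical mode flag, for this sketch only.
data Mode = Strict | Lenient
  deriving (Show, Eq)

-- | Hypothetical placeholder used when the PRODID has to be guessed.
guessedProdId :: String
guessedProdId = "-//unknown//unknown//EN"

-- | Look for the PRODID among a calendar's content lines.
-- Returns the value together with notes about any fixes that were applied.
-- Simplification: property parameters and line folding are ignored.
parseProdId :: Mode -> [String] -> Either String (String, [String])
parseProdId mode contentLines =
  case mapMaybe (stripPrefix "PRODID:") contentLines of
    [prodId] -> Right (prodId, [])
    [] -> case mode of
      Strict -> Left "PRODID is missing."
      Lenient -> Right (guessedProdId, ["PRODID is missing; guessed a placeholder."])
    _ -> Left "PRODID is specified more than once."

-- | The calendar from the example above, one content line per element.
appleCalendar :: [String]
appleCalendar =
  [ "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "X-WR-CALNAME:with Syd",
    "X-APPLE-CALENDAR-COLOR:#ff2d55",
    "END:VCALENDAR"
  ]

main :: IO ()
main = do
  print (parseProdId Strict appleCalendar)
  -- Left "PRODID is missing."
  print (parseProdId Lenient appleCalendar)
  -- Right ("-//unknown//unknown//EN",["PRODID is missing; guessed a placeholder."])
```

Returning the notes alongside the value means the caller can still tell the user that the input was invalid, even though parsing succeeded.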

API Overview

The core of the conformance library is the ConformT monad transformer. It has these type parameters:

newtype ConformT ue fe w m a
                 ^  ^  ^ ^- The underlying parsing monad.
                 |  |  \-- The type for "SHOULD", "SHOULD NOT", and "OPTIONAL" warnings.
                 |  \---- The type for fixable errors, in case "MUST" or "MUST NOT" is violated in a fixable way.
                 \------- The type for unfixable errors, in case "MUST" or "MUST NOT" is violated in an unfixable way.

If any of the three type parameters are not necessary, you can use Void for them. For example, if you don’t use the warnings:

type MyParser a = ConformT Error FixableError Void UnderlyingParser a

You can then write your parser as you normally would, using lift wherever you do the underlying parsing:

lift :: P a -> ConformT ue fe w P a

You can emit warnings, fixable errors, and unfixable errors:

emitWarning :: W -> ConformT ue fe W P ()
emitFixableError :: FE -> ConformT ue FE w P ()
unfixableError :: UE -> ConformT UE fe w P a

Whether these halt execution depends on the strictness of the mode you run the parser in. Strict mode turns any warnings into errors, whereas normal mode lets warnings pass, and lenient mode fixes the fixable errors using the guessing strategies you implemented. In code:

runConformTStrict ::
  ConformT ue fe w P a
  -> P (Either (Either ue ([fe], [w])) a)
runConformT ::
  ConformT ue fe w P a
  -> P (Either (Either ue fe) (a, [w]))
runConformTLenient ::
  ConformT ue fe w P a
  -> P (Either ue (a, ([fe], [w])))

Lastly, you can even choose which fixable errors you want to fix and which you don’t, at runtime, by supplying a predicate that decides which fixable errors to fix:

runConformTFlexible ::
  (fe -> P Bool) ->
  ConformT ue fe w P a ->
  P (Either (Either ue fe) (a, ([fe], [w])))
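One way to picture how such a predicate could drive the result is the toy model below. It is not the library's implementation: `Toy`, `toyWarning`, `toyFixable`, `toyUnfixable`, `Fix`, and `code` are all hypothetical names, the underlying monad is dropped, and the predicate is pure (`fe -> Bool`) instead of `fe -> P Bool`. It only mirrors the shape of the result type shown above.

```haskell
import Control.Monad (unless)
import Data.Char (isUpper, toUpper)

-- | Toy stand-in: given a predicate that decides which fixable errors to fix,
-- either halt or return a value plus the fixed errors and warnings collected.
newtype Toy ue fe w a = Toy
  {runToyFlexible :: (fe -> Bool) -> Either (Either ue fe) (a, ([fe], [w]))}

instance Functor (Toy ue fe w) where
  fmap f (Toy g) = Toy $ \p -> fmap (\(a, notes) -> (f a, notes)) (g p)

instance Applicative (Toy ue fe w) where
  pure a = Toy $ \_ -> Right (a, ([], []))
  tf <*> ta = tf >>= \f -> fmap f ta

instance Monad (Toy ue fe w) where
  Toy m >>= k = Toy $ \p -> do
    (a, (fes1, ws1)) <- m p
    (b, (fes2, ws2)) <- runToyFlexible (k a) p
    pure (b, (fes1 ++ fes2, ws1 ++ ws2))

-- | Warnings never halt in this model; they are only recorded.
toyWarning :: w -> Toy ue fe w ()
toyWarning w = Toy $ \_ -> Right ((), ([], [w]))

-- | A fixable error is recorded if the predicate accepts it, and halts otherwise.
toyFixable :: fe -> Toy ue fe w ()
toyFixable fe = Toy $ \p ->
  if p fe then Right ((), ([fe], [])) else Left (Right fe)

-- | An unfixable error always halts.
toyUnfixable :: ue -> Toy ue fe w a
toyUnfixable ue = Toy $ \_ -> Left (Left ue)

-- | Hypothetical fixable errors for a two-character code.
data Fix = FirstNotUpper | ExtraCharacters
  deriving (Show, Eq)

code :: String -> Toy String Fix String (Char, Char)
code (c1 : c2 : rest) = do
  -- Fix: ignore anything after the second character.
  unless (null rest) $ toyFixable ExtraCharacters
  c1' <-
    if isUpper c1
      then pure c1
      else toyFixable FirstNotUpper >> pure (toUpper c1)
  unless (isUpper c2) $ toyWarning "The second character is not upper-case."
  pure (c1', c2)
code _ = toyUnfixable "Fewer than two characters."

main :: IO ()
main = do
  -- Only case problems may be fixed; extra characters still halt.
  let onlyCase fe = fe == FirstNotUpper
  print (runToyFlexible (code "ab") onlyCase)
  -- Right (('A','b'),([FirstNotUpper],["The second character is not upper-case."]))
  print (runToyFlexible (code "abc") onlyCase)
  -- Left (Right ExtraCharacters)
  print (runToyFlexible (code "abc") (const True))
```

In this model, passing `const True` corresponds to the lenient mode and `const False` to the normal mode, which suggests why a single predicate-based runner is enough to express the others.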

Example

Suppose we have a simple language specification that includes the RFC 2119 notice about how to interpret words like MUST:

A code is defined as two characters.
The characters MUST be alphabetic characters.
The first character MUST be upper-case.
The second character SHOULD be upper-case.

Examples:
AB
De

You can now implement a parser like this:

{-# LANGUAGE LambdaCase #-}

module Example where

import Conformance
import Control.Monad
import Data.Char as Char

myParser :: String -> Conform String String String (Char, Char)
myParser = \case
  [c1, c2] -> do
    -- Non-alphabetic characters cannot be repaired, so they are unfixable.
    let checkAlpha c =
          if Char.isAlpha c
            then pure ()
            else unfixableError $ "Not an alphabetic character: " ++ show c
    checkAlpha c1
    -- MUST violation that we can repair: upper-case the first character.
    c1' <-
      if Char.isUpper c1
        then pure c1
        else do
          emitFixableError "The first character is not upper-case."
          pure $ Char.toUpper c1
    checkAlpha c2
    -- SHOULD violation: only a warning.
    when (not (Char.isUpper c2)) $ emitWarning "The second character is not upper-case."
    pure (c1', c2)
  _ -> unfixableError "Did not specify exactly two characters."

Here we used an unfixable error for violations that we cannot fix, such as not having exactly two characters or encountering non-alphabetic characters. Any violations we can fix become fixable errors, such as the first character not being upper-case. We can fix that by making the character upper-case with toUpper. (Note that this only works because the characters are alphabetic.) Lastly, we use a warning for a SHOULD in the spec: the second character is not upper-case.

We can then run our parser on some examples:

-- Strictly
ghci> runConformStrict $ myParser "AB"
Right ('A','B')
ghci> runConformStrict $ myParser "Ab"
Left (Right ([], ["The second character is not upper-case."]))
ghci> runConformStrict $ myParser "aa"
Left (Right (["The first character is not upper-case."], []))
ghci> runConformStrict $ myParser "A1"
Left (Left "Not an alphabetic character: '1'")

-- Normally
ghci> runConform $ myParser "AB"
Right (('A','B'),[])
ghci> runConform $ myParser "Ab"
Right (('A','b'),["The second character is not upper-case."])
ghci> runConform $ myParser "aa"
Left (Right "The first character is not upper-case.")
ghci> runConform $ myParser "A1"
Left (Left "Not an alphabetic character: '1'")

-- Leniently
ghci> runConformLenient $ myParser "AB"
Right (('A','B'), ([], []))
ghci> runConformLenient $ myParser "Ab"
Right (('A','b'), ([], ["The second character is not upper-case."]))
ghci> runConformLenient $ myParser "aa"
Right (('A','a'), (["The first character is not upper-case."], ["The second character is not upper-case."]))
ghci> runConformLenient $ myParser "A1"
Left "Not an alphabetic character: '1'"

Conclusion

The conformance library allows you to write specification-compliant parsers while also letting you test that your own output is strictly specification-compliant. It can be found on Hackage and GitHub. For usage examples, check out my ical implementation.