Parsing DSLs with predicates

This article illustrates how data specification can be used to validate and parse DSLs. We’ll use Clojure to create a data-driven DSL for a subset of CSS, and the clojure.spec library to define and parse its syntax.

clojure.spec is a specification library that provides tools (predicates and composition rules) to describe what data should look like, to check whether a value conforms to that description, and to explain the cause when it does not.

(require '[clojure.spec.alpha :as s])

(s/def ::name string?)
(s/def ::age pos-int?)
(s/def ::profile (s/keys :req [::name ::age]))

(s/valid? ::profile {::name "Roman" ::age 25.9})
;; false

(s/explain ::profile {::name "Roman" ::age 25.9})
;; In: [:specs.profile/age]
;; val: 25.9 fails spec: :specs.profile/age
;; at: [:specs.profile/age] predicate: pos-int?

When data is conformed, the returned value is destructured according to the spec. This is very useful for parsing DSLs, because the output clearly describes the structure of the syntax. Think of it as a way to produce an AST, which is much easier to analyze and compile than the raw input. We’ll see this later.

As an example, we’ll create a syntax for the value of the CSS transform property.

[[:translate-x 120 :px]
[:translate-y 150 :px]]
;; -> translateX(120px) translateY(150px)
[[:translate 120 :px]]
;; -> translate(120px)
[[:translate 120 :px 150 :px]]
;; -> translate(120px, 150px)

Let’s start by looking at the W3C CSS Transforms specification. It defines the value of the transform property as either none or <transform-list>.

This translates into the code in a straightforward way:

(s/def ::transform                 ;; transform
  (s/alt                           ;; is either
    :something ::transform-list    ;; <transform-list>
    :nothing #{:none}))            ;; or `none`

The spec defines <transform-list> as one or more <transform-function> tokens.

In code this can be represented using clojure.spec’s regex operators which operate on sequences.

(s/def ::transform-list                ;; <transform-list>
  (s/cat                               ;; is a sequence
    :fns (s/+ ::transform-function)))  ;; of one or more <transform-function>

<transform-function> is one of those CSS transform functions which you may be familiar with, e.g. translateX(x), translate(x, y).

(s/def ::transform-function      ;; <transform-function>
  (s/or                          ;; is either
    :translate-x ::translate-x   ;; translateX
    :translate-y ::translate-y   ;; translateY
    :translate ::translate))     ;; or translate

From the signatures of these functions we can see that they accept a value of type <length-percentage>, which is either a <length> (px, em, etc.) or a <percentage> (%).

(s/def ::length-percentage       ;; <length-percentage>
  (s/alt                         ;; is either
    :length ::length             ;; <length>
    :percentage ::percentage))   ;; or <percentage>

According to the syntax we specified at the beginning, a value with a unit is a sequence of two elements: a number and a keyword that names the unit.

(s/def ::length
  (s/cat
    :value number?
    :unit #{:px}))

<percentage> has the same syntax, just with a different unit, :%.

(s/def ::percentage
  (s/cat
    :value number?
    :unit #{:%}))

Let’s see how destructuring with spec works for ::length-percentage specification.

(s/conform ::length-percentage [100 :px])
;; [:length {:value 100, :unit :px}]

The returned value here is based on the spec that we’ve provided and the missing spots are filled in with values that are parsed out of the provided data. This representation is basically an AST.
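To make the tagging behaviour concrete, here is a small self-contained sketch that repeats the three specs above and conforms a percentage and an unsupported unit. Because s/alt tags its result with the name of the branch that matched, the first element of the conformed value tells us which kind of value was parsed. The :em input is only an illustration of a unit this article's ::length does not accept.

```clojure
(require '[clojure.spec.alpha :as s])

;; Same specs as above, repeated so the snippet runs on its own.
(s/def ::length (s/cat :value number? :unit #{:px}))
(s/def ::percentage (s/cat :value number? :unit #{:%}))
(s/def ::length-percentage
  (s/alt :length ::length
         :percentage ::percentage))

;; The tag in front of the map names the branch of s/alt that matched.
(s/conform ::length-percentage [50 :%])
;; [:percentage {:value 50, :unit :%}]

;; :em is not in any of the unit sets, so conforming fails.
(s/conform ::length-percentage [50 :em])
;; :clojure.spec.alpha/invalid
```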

Now that we have a spec for function parameters, let’s describe what the syntax of those functions looks like.

(s/def ::translate-x               ;; translate-x function
  (s/cat                           ;; is a sequence
    :function #{:translate-x}      ;; of the :translate-x keyword
    :value ::length-percentage))   ;; followed by a ::length-percentage value

;; translate-y is similar
(s/def ::translate-y
  (s/cat
    :function #{:translate-y}
    :value ::length-percentage))

The translate function is more interesting, because it has two parameters and the second one is optional.

(s/def ::translate                           ;; translate fn
  (s/cat                                     ;; is a sequence
    :function #{:translate}                  ;; of the :translate keyword
    :value (s/cat                            ;; followed by a sequence
             :x ::length-percentage          ;; of ::length-percentage
             :y (s/? ::length-percentage)))) ;; and an optional second ::length-percentage
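The effect of s/? is easiest to see by conforming the same spec with one and with two arguments. The sketch below repeats the specs it depends on so that it runs on its own, and assumes the nested sequence is stored under a :value key. When the optional part is absent, its key is simply left out of the conformed map.

```clojure
(require '[clojure.spec.alpha :as s])

;; Specs repeated from the article so the snippet is self-contained.
(s/def ::length (s/cat :value number? :unit #{:px}))
(s/def ::percentage (s/cat :value number? :unit #{:%}))
(s/def ::length-percentage
  (s/alt :length ::length
         :percentage ::percentage))

(s/def ::translate
  (s/cat :function #{:translate}
         :value (s/cat :x ::length-percentage
                       :y (s/? ::length-percentage))))

;; One argument: no :y key in the result.
(s/conform ::translate [:translate 120 :px])
;; {:function :translate,
;;  :value {:x [:length {:value 120, :unit :px}]}}

;; Two arguments: both :x and :y are present.
(s/conform ::translate [:translate 120 :px 50 :%])
;; {:function :translate,
;;  :value {:x [:length {:value 120, :unit :px}],
;;          :y [:percentage {:value 50, :unit :%}]}}
```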

Finally let’s test the entire thing.

(s/conform ::transform [[:translate -190 :px 80 :px]])
;; [:something
;;  {:fns
;;   [[:translate
;;     {:function :translate,
;;      :value
;;      {:x [:length {:value -190, :unit :px}],
;;       :y [:length {:value 80, :unit :px}]}}]]}]

From here, compilation comes down to traversing the tree and translating each node into proper CSS.
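As a rough illustration of that traversal, here is a minimal sketch of a compiler for the conformed tree. The names css-fn-names, compile-value, compile-fn and compile-transform are hypothetical helpers invented for this example, not part of clojure.spec or of the original code. The sketch joins functions with spaces, as the CSS transform property expects, and assumes the optional translate arguments live under :value as :x and :y.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

;; Specs from the article, condensed so the snippet is self-contained.
(s/def ::length (s/cat :value number? :unit #{:px}))
(s/def ::percentage (s/cat :value number? :unit #{:%}))
(s/def ::length-percentage
  (s/alt :length ::length :percentage ::percentage))
(s/def ::translate-x
  (s/cat :function #{:translate-x} :value ::length-percentage))
(s/def ::translate-y
  (s/cat :function #{:translate-y} :value ::length-percentage))
(s/def ::translate
  (s/cat :function #{:translate}
         :value (s/cat :x ::length-percentage
                       :y (s/? ::length-percentage))))
(s/def ::transform-function
  (s/or :translate-x ::translate-x
        :translate-y ::translate-y
        :translate ::translate))
(s/def ::transform-list (s/cat :fns (s/+ ::transform-function)))
(s/def ::transform
  (s/alt :something ::transform-list :nothing #{:none}))

;; Hypothetical mapping from DSL keywords to CSS function names.
(def css-fn-names
  {:translate-x "translateX"
   :translate-y "translateY"
   :translate   "translate"})

(defn compile-value
  "Turns [:length {:value 120 :unit :px}] into \"120px\"."
  [[_tag {:keys [value unit]}]]
  (str value (name unit)))

(defn compile-fn
  "Turns one conformed <transform-function> node into a CSS string."
  [[tag {:keys [value]}]]
  (let [args (if (= tag :translate)
               (keep value [:x :y]) ; :y is missing when omitted
               [value])]
    (str (css-fn-names tag)
         "("
         (str/join ", " (map compile-value args))
         ")")))

(defn compile-transform
  "Conforms data against ::transform and compiles the result to CSS.
   Throws with the explain-data attached when the syntax is invalid."
  [data]
  (let [ast (s/conform ::transform data)]
    (if (s/invalid? ast)
      (throw (ex-info "Invalid transform syntax"
                      (s/explain-data ::transform data)))
      (let [[tag node] ast]
        (case tag
          :nothing   "none"
          :something (str/join " " (map compile-fn (:fns node))))))))

(compile-transform [[:translate-x 120 :px] [:translate-y 150 :px]])
;; "translateX(120px) translateY(150px)"
(compile-transform [[:translate 120 :px 150 :px]])
;; "translate(120px, 150px)"
```

Dispatching on the tags produced by s/alt and s/or keeps the compiler free of any parsing logic: it only ever sees data that has already been validated.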

When conforming fails, meaning that there is an error somewhere in the syntax, we can ask for the reason it failed, in the form of data.

(s/conform ::transform [[:translate -190 :px 80 :p]])
;; :clojure.spec.alpha/invalid
(s/explain-data ::transform [[:translate -190 :px 80 :p]])
;; {:clojure.spec.alpha/problems
;;  ({:path [:something :fns :translate :value :y :length :unit],
;;    :pred #{:px},
;;    :val :p,
;;    :in [0 4]}),
;;  :spec :specs.css/transform,
;;  :value [[:translate -190 :px 80 :p]]}

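We can see that the value :p in [[:translate -190 :px 80 :p]] failed the predicate #{:px} while being checked against the specification :specs.css/transform. The :path shows where in the spec the failure happened (the :unit of the :y argument of :translate), and :in shows where in the data the offending value sits. As you might have guessed, this data can be used to report an error in a human-readable format.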
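One possible way to do that is sketched below. explain-problems is a hypothetical helper written for this example; it relies only on the :clojure.spec.alpha/problems key of the explain data and on the :val, :in and :pred keys of each problem. To keep the snippet short it is tried on ::length alone. A spec built from s/alt or s/or may report one problem per branch, so the helper returns a sequence of messages rather than a single string.

```clojure
(require '[clojure.spec.alpha :as s])

;; Spec repeated from the article so the snippet is self-contained.
(s/def ::length (s/cat :value number? :unit #{:px}))

(defn explain-problems
  "Returns one message per problem reported by s/explain-data,
   or an empty sequence when data conforms to spec."
  [spec data]
  (for [{:keys [val in pred]}
        (:clojure.spec.alpha/problems (s/explain-data spec data))]
    (str "Unexpected value " (pr-str val)
         " at position " (pr-str in)
         ", expected " (pr-str pred))))

;; Print each message on its own line.
(doseq [msg (explain-problems ::length [80 :p])]
  (println msg))

;; Valid data produces no messages.
(explain-problems ::length [80 :px])
;; ()
```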

For more real-world examples see the code that I’ve written for parsing and compiling syntax for CSS Media Queries.

To learn more about clojure.spec check out the rationale and overview.