Data Validation — (schema spec)

flyboarder
degree9
May 31, 2017

Recently we needed a small data validation function that would let us compare arbitrary data structures. More specifically, we needed to compare sets of data in a database against incoming web requests. We don’t know where this data is coming from or what is stored within the db.

This gives us an opportunity to try out the Alpha features of clojure.spec in the newer versions of Clojure.

We are using `clojure.spec` throughout this post. Newer versions of Clojure use `clojure.spec.alpha`.

Let’s get started. At a high level, clojure.spec validates data against a spec built from functions and macros. The popular library `schema` validates based on existing data. Let’s see if we can merge the two concepts and simplify the codebase by only targeting ClojureScript … because reasons.

Anything and Nothing.

To begin, we define two of the most basic specs possible.

(spec/def ::any any?)
(spec/def ::nil nil?)

Data Structures.

Next we define specs for the basic data structures.

(spec/def ::number number?)
(spec/def ::string string?)
(spec/def ::char char?)
(spec/def ::keyword keyword?)
(spec/def ::symbol symbol?)
(spec/def ::bool boolean?)

Collections.

Complex data structures consist of collections of basic data structures.

(spec/def ::list list?)
(spec/def ::vector vector?)
(spec/def ::map map?)
(spec/def ::set set?)

Functions.

Finally, the last kind of data that could appear in our schema is a function.

(spec/def ::fn fn?)

Now that we have all our data types spec’ed, we can start doing basic type checking. While Clojure gives us the ability to create complex specs using multimethods, there is already another language feature that can dispatch based on argument type: protocols.
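As a rough illustration of what basic type checking with these specs looks like, here is a minimal sketch. It assumes the newer `clojure.spec.alpha` namespace rather than the `clojure.spec` alias used in this post, and the namespace name `example.basic` is made up for the example.

```clojure
(ns example.basic
  (:require [clojure.spec.alpha :as spec]))

;; Register a few of the specs from above under this (hypothetical) namespace.
(spec/def ::number number?)
(spec/def ::string string?)
(spec/def ::nil nil?)

;; valid? returns a boolean telling us whether the data satisfies the spec.
(def number-ok? (spec/valid? ::number 42))   ; 42 is a number
(def string-ok? (spec/valid? ::string 42))   ; 42 is not a string
(def nil-ok?    (spec/valid? ::nil nil))     ; nil satisfies nil?

;; conform returns the conformed value, or an "invalid" marker that
;; invalid? recognises.
(def conformed (spec/conform ::number 42))
(def rejected? (spec/invalid? (spec/conform ::string 42)))
```

Here `number-ok?`, `nil-ok?` and `rejected?` are true, `string-ok?` is false, and `conformed` is simply 42, because a plain predicate spec conforms a value to itself.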

Protocols.

Let’s define our schema-based protocol.

(defprotocol SchemaSpec
  "Provides an abstraction for validating data using clojure.spec
  based on a data schema."
  (assert [schema data] "See clojure.spec/assert.")
  (conform [schema data] "See clojure.spec/conform.")
  (explain [schema data] "See clojure.spec/explain.")
  (validate [schema data] "See clojure.spec/valid?.")
  (spec [schema] "Returns related spec for `schema`."))

Now that we have our protocol, let’s create some cases based on data type.

This is our default protocol implementation. In ClojureScript, `default` is used as a generic case for when a more specific case cannot be found.

(extend-protocol SchemaSpec
  default
  (assert [schema data]
    (spec/assert (spec schema) data))
  (conform [schema data]
    (spec/conform (spec schema) data))
  (explain [schema data]
    (spec/explain (spec schema) data))
  (validate [schema data]
    (spec/valid? (spec schema) data)))

You should notice that almost all of the protocol methods are implemented. `spec` is the only type-specific method we need to implement.
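To make the dispatch mechanism concrete, here is a small standalone sketch of a protocol with a `default` fallback. It is ClojureScript only (on the JVM you would extend classes such as `String` instead), and the `Describe` protocol and the `example.dispatch` namespace are hypothetical names used purely for illustration.

```clojure
(ns example.dispatch)

;; A hypothetical one-method protocol, unrelated to SchemaSpec.
(defprotocol Describe
  (describe [x] "Returns a keyword naming the kind of `x`."))

(extend-protocol Describe
  nil
  (describe [_] :nil)
  number
  (describe [_] :number)
  string
  (describe [_] :string)
  ;; Any type without a more specific implementation lands here.
  default
  (describe [_] :unknown))

(def results
  [(describe nil) (describe 42) (describe "soup") (describe :some-keyword)])
```

`results` is `[:nil :number :string :unknown]`: the keyword has no implementation of its own, so it falls through to `default`, which is the same role the `default` case plays for `SchemaSpec` above.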

JavaScript Types.

Let’s go ahead and extend some more JavaScript types.

nil
(spec [schema]
  (spec* schema ::nil))
number
(spec [schema]
  (spec* schema ::number))
char
(spec [schema]
  (spec* schema ::char))
string
(spec [schema]
  (spec* schema ::string))
boolean
(spec [schema]
  (spec* schema ::bool))

This introduces our helper function `spec*`, which returns a spec from the registry using the schema as the key, or a default spec when nothing is registered.

(defn spec*
  "Returns default spec based on `schema`."
  [schema & [spec]]
  (or (spec/get-spec schema) spec))
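The registry-or-fallback idea can be tried in isolation. The sketch below assumes `clojure.spec.alpha`; the `lookup` helper, the `::port` spec and the `example.registry` namespace are hypothetical stand-ins, not part of the original code.

```clojure
(ns example.registry
  (:require [clojure.spec.alpha :as spec]))

;; A spec registered under a (hypothetical) keyword.
(spec/def ::port pos-int?)

(defn lookup
  "Returns the registered spec for `k`, or `fallback` when none exists."
  [k fallback]
  (or (spec/get-spec k) fallback))

;; ::port is registered, so the permissive fallback any? is ignored.
(def registered-wins? (not (spec/valid? (lookup ::port any?) -1)))

;; ::unregistered has no entry, so get-spec returns nil and string? is used.
(def fallback-used? (spec/valid? (lookup ::unregistered string?) "soup"))
```

Both `registered-wins?` and `fallback-used?` are true. This is why a schema value that happens to be a registered spec name gets validated by that spec, while any other value falls back to the type-based default.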

ClojureScript Types.

Next we extend some ClojureScript types.

Keyword
(spec [schema]
  (spec* schema ::keyword))
Symbol
(spec [schema]
  (spec* schema ::symbol))

Schema Functions.

Our first implementation specific question now arrives.

How do we handle the presence of a function within a spec?

For our implementation, functions have no special value.

We return a spec which validates that the data is a function and that it is equal to the function in our schema.

function
(spec [schema]
  (spec/and ::fn
            (schema-equal schema)))

This introduces another helper function, `schema-equal`, which takes the value of a schema and compares it to the value of our data.

(defn schema-equal
  "Returns a spec where `data` equals `schema`."
  [schema]
  (fn [data]
    (= schema data)))
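One consequence worth seeing in practice: functions compare equal only when they are the very same function object. The sketch below repeats `schema-equal` so it can run on its own; the `example.fn-equality` namespace name is made up.

```clojure
(ns example.fn-equality)

;; Copy of the helper above, repeated so this snippet is self-contained.
(defn schema-equal
  "Returns a predicate that is true when `data` equals `schema`."
  [schema]
  (fn [data]
    (= schema data)))

(def same-as-inc? (schema-equal inc))

(def same-fn      (same-as-inc? inc))   ; the identical function object
(def different-fn (same-as-inc? dec))   ; a different function

;; Two anonymous functions with identical bodies are still distinct objects.
(def lookalikes ((schema-equal (fn [x] x)) (fn [x] x)))
```

`same-fn` is true, while `different-fn` and `lookalikes` are both false. In other words, a function in the schema only validates data that holds a reference to that exact function.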

Schema Lists, Vectors and Sets.

We now need to decide how lists will be handled, as well as how to handle extra data that is not within our schema.

For lists we would like to make sure our data conforms to the schema provided. This means extra items will fail validation.

List
(spec [schema]
  (spec/and ::list
            (spec/coll-of
              (spec/and
                (schema-spec schema)
                (schema-contains schema)))))
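The shape of that spec, a collection-type check combined with a per-item check, can be explored with plain predicates first. This is only a sketch assuming `clojure.spec.alpha`; `vector-of-numbers` and the `example.collections` namespace are hypothetical names.

```clojure
(ns example.collections
  (:require [clojure.spec.alpha :as spec]))

;; First check the collection type, then check every item in it.
(def vector-of-numbers
  (spec/and vector?
            (spec/coll-of number?)))

(def all-numbers (spec/valid? vector-of-numbers [1 2 3]))
(def bad-item    (spec/valid? vector-of-numbers [1 "two" 3]))
(def wrong-type  (spec/valid? vector-of-numbers '(1 2 3)))
```

`all-numbers` is true; `bad-item` is false because one item fails the item predicate, and `wrong-type` is false because a list is not a vector. The schema version above swaps `number?` for predicates derived from the schema itself.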

Another helper function appears: `schema-spec`, which returns a spec from the schema, using data as the key, or using the key of the pair when data is in the format of [key val].

(defn schema-spec
  "Returns a spec from `schema` or based on `data` type."
  [schema]
  (fn [data]
    (let [k (if (vector? data) (key data) data)
          v (if (vector? data) (val data) data)]
      (spec* (get schema k v)))))

Enter our next helper function, `schema-contains`. This takes a schema list and verifies that the data is also found within the list.

(defn schema-contains
  "Returns a spec where items within `schema` contain `data`."
  [schema]
  (fn [data]
    (some #{data} schema)))
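To see what this predicate returns, here is a small sketch that repeats the helper so it runs on its own (the `example.contains` namespace and the sample schema are made up). Note that `some` returns the matching item itself rather than `true`, and that looking up `false` or `nil` can never succeed because the match is falsy.

```clojure
(ns example.contains)

;; Copy of the helper above, repeated so this snippet is self-contained.
(defn schema-contains
  "Returns a predicate that is truthy when `data` is an item of `schema`."
  [schema]
  (fn [data]
    (some #{data} schema)))

(def in-schema? (schema-contains ['fun :go "have"]))

(def found   (in-schema? 'fun))    ; the matching item itself
(def missing (in-schema? 'soup))   ; nil, no item matches

;; false is present in the schema, but the set lookup returns false,
;; which `some` treats as "no match".
(def falsy-item ((schema-contains [false nil]) false))
```

`found` is the symbol `fun`, while `missing` and `falsy-item` are both nil. If schemas may contain `false` or `nil` items, a check based on `contains?` over a set would be one way to sidestep this, at the cost of changing the helper.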

Our vectors and sets are handled identically.

PersistentVector
(spec [schema]
  (spec/and ::vector
            (spec/coll-of
              (spec/and
                (schema-spec schema)
                (schema-contains schema)))))
PersistentHashSet
(spec [schema]
  (spec/and ::set
            (spec/coll-of
              (spec/and
                (schema-spec schema)
                (schema-contains schema)))))

Schema Maps.

Our final implementation is for maps.

PersistentArrayMap
(spec [schema]
  (spec/merge ::map
              (spec/keys :req-un (keys schema))
              (spec/coll-of
                (spec/or
                  :spec (schema-spec schema)
                  :kv (schema-kv schema)))))

This checks that the keys within our data are also within our schema and uses our helper `schema-kv`, which returns a spec where schema is equal to data when data is a key, or where data is in the format of a [key val] pair.

(defn schema-kv
  "Returns a spec from `schema` where key `data`."
  [schema]
  (fn [data]
    (let [k (if (vector? data) (key data) data)
          v (if (vector? data) (val data) data)]
      (= v (get schema k v)))))

We have reached the end of our little tutorial. From here you can try validating data using data.

(explain {:go "have" :some ['fun]} {:go "have" :some ['soup]})

If you would rather have a nice little gist, you can do that too.