A Powerful F# Library Shows How S-Expressions Might Be Superior to XML and Json
First, a little historical context…
People used to love XML, especially those in the Microsoft camp. XML represented a powerful enabling technology for programmers of that day — data-driven programming without plain text files. Out of this love for XML came many technologies on which we still rely. For example, XML has been used -
As a Data Description Format -
Often good for serializing objects at run-time, XML enabled the following storage solution -
For Visual Studio project files -
In the late ’90s and early 2000s, Microsoft was quite enamored with XML as a solution to many long-standing problems…
As a DSL, such as XAML -
Microsoft also saw XML as a solution to its newer and more interesting problems -
XML is Dead! Long Live… Json?
Over time, the industry has come to know XML as a technology with many disadvantages. The first of its problems is verbosity. Tag names are duplicated everywhere, and angle brackets are all over the place. Attributes save some space, but their syntax is irregular, requiring programmers to special-case their interpreters to deal with the alternative structure. The second problem is that XML is ‘stringly’ typed; that is, the only type it can explicitly represent is a string. This adds an additional parsing phase to any interpreter that wants to pull out numbers or dates. The third is the many issues caused by special characters. For those who are familiar, I need elaborate no further.
As far as advantages go, Json does support limited type information. Json can, for example, tell the difference between a string and a number. Unlike with XML, however, you’d be hard-pressed to put together a DSL like XAML in Json. Without something like XML’s attributes, it’s just not sufficiently information-dense. While XML does a reasonably good job at capturing DSLs, Json does comparatively poorly.
Code is Data Too!
The area where both XML and Json fall down is in enabling scripting languages when you find out that you need them. If you’ve ever seen someone try to implement an ‘if’ form or a ‘foreach’ loop using XML or Json syntax, you will know what I mean. Both of these languages were designed to be data languages, but neither of them composes well when the data being described is program behavior (that is, code). It is pretty damning that there are important forms of data that neither format represents well, even if that particular type of data is code.
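To make the point concrete, here is a minimal sketch of how an ‘if’ form falls out naturally once code is represented as nested lists. This is not Prime’s or AMSL’s implementation: the Sexpr type, the eval function, and the encoding of booleans as 1 and 0 are all assumptions made purely for illustration.

```fsharp
// A hypothetical, minimal s-expression type: numbers, atoms, and nested lists.
type Sexpr =
    | Number of int
    | Atom of string
    | Items of Sexpr list

// Evaluate a tiny language with numbers, `+`, `<`, and `if`.
// Booleans are encoded as 1 (true) and 0 (false) to keep the sketch short.
let rec eval (expr : Sexpr) : int =
    match expr with
    | Number n -> n
    | Items [Atom "+"; a; b] -> eval a + eval b
    | Items [Atom "<"; a; b] -> if eval a < eval b then 1 else 0
    | Items [Atom "if"; cond; thenBranch; elseBranch] ->
        // Only the selected branch is evaluated.
        if eval cond <> 0 then eval thenBranch else eval elseBranch
    | _ -> failwithf "Cannot evaluate: %A" expr

// The tree for the text [if [< 1 2] [+ 10 5] 0].
let program =
    Items [
        Atom "if";
        Items [Atom "<"; Number 1; Number 2];
        Items [Atom "+"; Number 10; Number 5];
        Number 0 ]

printfn "%d" (eval program) // prints 15
```

The control construct is just another list whose first element names the operation, so adding a new form means adding one more pattern rather than inventing new syntax.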
Looking at the trade-offs of each format, you might be best having your data storage as Json, your DSLs in XML, and your scripting languages as whatever ad-hoc syntax you come up with. But the real question is — why would you want a separate format for your data, your DSLs, and your scripting languages? And why would you want to have to write a custom parser and a custom interpreter for each and every scripting language you need? Why not use a library that provides a single format that solves all of these problems? And rather than spending weeks or months hand-crafting a scripting language, why not use an existing scripting language whose semantics can be extended with a simple plug-in?
(At the time of this article, there’s actually one reason not to use Prime: it is currently implemented mostly for F# data structures such as algebraic data types and the functional List, Map, and Set. The idea is sound in all types of languages, so I also have a partial port of Prime to C# here - https://github.com/bryanedds/Sigma)
Using Prime as an Automatic Serialization Solution
First, let’s take a look at how Prime automatically serializes and deserializes your types in F#. Take the following Person type -
You can construct a Person, serialize it to a string, and write it out to a file with the following code -
To deserialize said person, all you need to do is -
As you can see, there are only two novel functions you need to know about for serialization and deserialization — scstring and scvalue. It really is that simple.
So what does the data look like when serialized?
Compared to XML and Json, this is a very succinct and lightweight format!
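To get a feel for why a positional format is so compact, here is a rough sketch of rendering a record as a bracketed list of bare values. It is not Prime’s implementation and does not call scstring: the Sexpr type, the render and personToSexpr functions, the exact shape of the Person record, and the age of 30 are all hypothetical.

```fsharp
// A hypothetical, minimal s-expression type.
type Sexpr =
    | Number of int
    | Text of string
    | Atom of string
    | Items of Sexpr list

// Render a tree as bracketed text. Quotes inside strings are not escaped in this sketch.
let rec render (expr : Sexpr) : string =
    match expr with
    | Number n -> string n
    | Text s -> "\"" + s + "\""
    | Atom a -> a
    | Items items -> "[" + String.concat " " (List.map render items) + "]"

// An assumed shape for the Person record discussed in the text.
type Person =
    { Name : string
      Age : int
      FavoritePetOpt : string option }

// Positional encoding: values only, in field order, with no field names.
let personToSexpr (person : Person) : Sexpr =
    let pet =
        match person.FavoritePetOpt with
        | Some name -> Items [Atom "Some"; Text name]
        | None -> Atom "None"
    Items [Text person.Name; Number person.Age; pet]

let john = { Name = "John R."; Age = 30; FavoritePetOpt = Some "Scruff E." }

printfn "%s" (render (personToSexpr john))
// prints ["John R." 30 [Some "Scruff E."]]
```

Every character in the output is either a value or a bracket; there are no closing tags and no repeated keys.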
However, you may notice one immediate trade-off — because there are no name tags for each element, the order of fields is important. You can’t, say, put the Name after the Age — it will raise a ConversionException since it expects a string for the first value. This is a slight disadvantage in some cases, but is a huge boon for succinctness. However, if you do actually need property names written out along with their values, you can simply attribute the type like so -
And it will be written out like this -
[[Name "John R."]
[FavoritePetOpt [Some "Scruff E."]]
With this approach, you can also put the fields in any order you like.
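The trade-off between the two layouts can be sketched with a pair of hand-written readers over an already-parsed tree. Again, this is only an illustration under stated assumptions: the Sexpr type, the two-field Person record, and both reader functions are hypothetical, and where Prime raises a ConversionException this sketch simply returns None.

```fsharp
// A hypothetical, minimal s-expression type.
type Sexpr =
    | Number of int
    | Text of string
    | Atom of string
    | Items of Sexpr list

// A simplified, assumed Person record with just two fields.
type Person = { Name : string; Age : int }

// Positional form: a field is identified only by where it sits in the list.
let readPositional (expr : Sexpr) : Person option =
    match expr with
    | Items [Text name; Number age] -> Some { Name = name; Age = age }
    | _ -> None

// Named form: each field is a [FieldName value] pair, so order does not matter.
let readNamed (expr : Sexpr) : Person option =
    match expr with
    | Items pairs ->
        let fields =
            pairs
            |> List.choose (fun pair ->
                match pair with
                | Items [Atom key; value] -> Some (key, value)
                | _ -> None)
            |> Map.ofList
        match Map.tryFind "Name" fields, Map.tryFind "Age" fields with
        | Some (Text name), Some (Number age) -> Some { Name = name; Age = age }
        | _ -> None
    | _ -> None

// The same data with Age written before Name in each layout.
let swappedPositional = Items [Number 30; Text "John R."]
let swappedNamed =
    Items [Items [Atom "Age"; Number 30]; Items [Atom "Name"; Text "John R."]]

printfn "%b" (Option.isSome (readPositional swappedPositional)) // prints false
printfn "%b" (Option.isSome (readNamed swappedNamed))           // prints true
```

The named reader pays for its flexibility with a field name on every value, which is exactly the succinctness cost described above.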
Using Prime for DSLs (Domain-Specific Languages)
Oftentimes you want to encode some data that will later be executed by a little interpreter within your program. Consider this F# type which is used to implement special effects in an existing game engine -
Using an attribute, we can declare the keywords that are used to define an effect with text. This data is used for syntax highlighting, auto-completion, as well as determining pretty-printing behavior.
Here is a screenshot of this DSL being used to construct special effects at run-time in the world editor -
Editing these types of constructs at run-time can be essential, and Prime provides fantastic facilities for enabling these types of external DSLs when you need them!
Your Very Own Scripting Language — For Free!
The above features are very powerful. But sometimes you need additional super-powers, such as being able to write program control constructs at run-time. Fortunately, Prime offers an extensible scripting language called AMSL (A Modular Scripting Language) which is built on top of the above features.
Let’s take a look at some example AMSL code from the Prelude file where its standard functions are defined -
You might be able to pull off something like this in XML or Json with a lot of hacks and a lot of syntactic compromises.
But let’s look at some more involved AMSL code…
Out of the box, the scripting language includes the full lambda calculus, functional data structures, dynamic polymorphic functions for user-defined data structures, and much more. Don’t even think about doing this type of thing with XML or Json! With Prime, it’s all based on the same code, all built on the same functions, and all inter-compatible.
One downside that I must mention, however, is that there isn’t yet much documentation for AMSL. Most of what you can learn has to be gleaned from the full Prelude.amsl file here - https://github.com/bryanedds/Nu/blob/master/Prime/Prime/Prelude.amsl. Work on the language is still in progress, and the documentation phase has yet to get under way.
Hopefully we can now see how s-expressions solve some of our most common programming problems in a consistent, succinct, and coherent way. Once we put together a good standard based on s-expressions, XML and Json become, IMO, technically obsolete. And please don’t be confused by the use of F# — these techniques are just as applicable in an imperative / Object-Oriented code base as they are in a functional one. So much so that I don’t know why this approach hasn’t been in use for decades…
In the next article, we’ll look at how to add custom semantics to AMSL with an F# plug-in. For those who want to just see an example RIGHT NOW, have a look here, here and here. Until then, please let me know your thoughts, feedback, and gripes in a comment below!