Composed serialization

or writing your own DSL for marshaling

Egon Elbre
2 min read · May 4, 2017

Warning: code ahead!

One of the annoying issues when handling bad or legacy data formats is getting the marshaling to work with your own nice structures. The thing you want to read in might be a complicated mess of SOAP, but you want something nicer.

Let’s see the problem in action, here’s a “simple” response from a SOAP endpoint.

SOAP *some data redacted

So, how do we handle this mess?

Obviously we need to somehow reflect this structure in our code:

Defining how to marshal our structures.

This looks quite nice already… but we still need to implement the soap package.

Implementing the core.

Here we define a Node for parsing arbitrary xml structures. We have Spec types that can encode and decode from this Node structure. Strictly speaking Node isn’t actually required. We could just as well implement the Spec types as Marshalers. In this case having a separate Node tree made things easier.
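To make the Node/Spec split concrete, here is a minimal sketch of what such a core could look like on top of encoding/xml. It is not the code from the linked repository: the field layout of Node, the method set of Spec and the Parse helper are all assumptions made for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Node captures an arbitrary XML element: its name, attributes,
// character data and child elements. (Field layout is an assumption.)
type Node struct {
	XMLName xml.Name
	Attrs   []xml.Attr `xml:",any,attr"`
	Content string     `xml:",chardata"`
	Nodes   []Node     `xml:",any"`
}

// Spec is a hypothetical interface: something that can read itself
// from a Node and write itself back into one.
type Spec interface {
	Decode(n *Node) error
	Encode(n *Node) error
}

// Parse turns raw XML into a Node tree, whatever the element names are.
func Parse(data []byte) (*Node, error) {
	root := &Node{}
	if err := xml.Unmarshal(data, root); err != nil {
		return nil, err
	}
	return root, nil
}

func main() {
	root, err := Parse([]byte(`<Envelope><Body><Name>Alice</Name></Body></Envelope>`))
	if err != nil {
		panic(err)
	}
	name := root.Nodes[0].Nodes[0]
	fmt.Println(root.XMLName.Local, name.XMLName.Local, name.Content)
}
```

With a tree like this in hand, every Spec only has to deal with names and text, not with the XML tokenizer.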

Here are two types, TagSpec and StringSpec: one for walking the Node tree and the other for marshaling a string.

Implementing basic types.

We have TagSpec for pattern matching on names and StringSpec for parsing into a string. Notice how the StringSpec writes to a string pointer, rather than a string.

Finally to wire all of this together:

https://github.com/egonelbre/exp/tree/master/spec

The basic idea is to create a separate spec structure that has pointers to the target structure and then let the “spec” type handle all the marshaling/parsing, but write the result into the “target” structure.

This of course can be made to handle very complicated structures:

https://github.com/egonelbre/exp/tree/master/ber

Build your own

This approach gives us an easy way to write different DSLs for marshaling data. You could imagine this being used for binary protocols or handling multiple formats with a single spec type.

General rule for implementing the spec structure.

Use a pointer to the type you want to capture.

For example, soap.TagSpec didn’t want to capture the input… hence it doesn’t contain pointers. soap.StringSpec wanted to capture a string and so the spec contained a *string. So, if you want to capture a *UserInfo then the spec type for it should contain **UserInfo.
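The **UserInfo case can be sketched as follows. OptionalUserSpec, the &lt;User&gt; element name and the Node layout are hypothetical; the point is only that the spec needs a pointer to the caller's pointer, so that decoding can either allocate a new UserInfo or leave nil behind.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Node is a minimal generic XML tree (an assumed shape).
type Node struct {
	XMLName xml.Name
	Content string `xml:",chardata"`
	Nodes   []Node `xml:",any"`
}

// UserInfo is the target type we want to end up with.
type UserInfo struct {
	Name string
}

// OptionalUserSpec captures a *UserInfo, so it holds a **UserInfo:
// decoding must be able to replace the caller's pointer.
type OptionalUserSpec struct {
	Value **UserInfo
}

func (s OptionalUserSpec) Decode(n *Node) error {
	*s.Value = nil // stays nil when there is no <User> child
	for i := range n.Nodes {
		if n.Nodes[i].XMLName.Local == "User" {
			*s.Value = &UserInfo{Name: n.Nodes[i].Content}
			break
		}
	}
	return nil
}

// decode parses data and runs the spec against the root element.
func decode(data string) (*UserInfo, error) {
	var root Node
	if err := xml.Unmarshal([]byte(data), &root); err != nil {
		return nil, err
	}
	var user *UserInfo
	err := OptionalUserSpec{&user}.Decode(&root)
	return user, err
}

func main() {
	present, err := decode(`<Response><User>alice</User></Response>`)
	if err != nil {
		panic(err)
	}
	absent, err := decode(`<Response></Response>`)
	if err != nil {
		panic(err)
	}
	fmt.Println(present.Name, absent == nil) // alice true
}
```

Had the spec held only a *UserInfo, it could fill in fields of an existing value but could never express "no user at all".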

PS: The examples here were meant as a proof of concept. You really should handle errors properly, with meaningful messages, and depending on your application your results will vary.

Conclusion

Spec types give a nice way of handling different complicated formats at the cost of some performance. They are flexible in their capabilities and can be composed quite nicely.

As a final exercise you can try writing a nested handler:
