Composed serialization
or writing your own DSL for marshaling
Warning: code ahead!
One of the annoying issues when handling bad or legacy data formats is getting the marshaling to work with your own nice structures. The thing you want to read in might be a complicated mess of SOAP, but you want something nicer.
Let’s see the problem in action: here’s a “simple” response from a SOAP endpoint.
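As a stand-in for such a response, the sketch below uses a hypothetical envelope (the element names, the urn:example:users namespace and the value are invented for illustration) and decodes it the direct way with encoding/xml. Note how the struct has to mirror every wrapper level just to reach a single string.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// response is a hypothetical SOAP reply; the element names and the
// service namespace are made up for illustration.
const response = `<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetUserResponse xmlns="urn:example:users">
      <GetUserResult>
        <User>
          <Name>Alice</Name>
        </User>
      </GetUserResult>
    </GetUserResponse>
  </soap:Body>
</soap:Envelope>`

// envelope mirrors every wrapper element of the response.
type envelope struct {
	Body struct {
		Response struct {
			Result struct {
				User struct {
					Name string `xml:"Name"`
				} `xml:"User"`
			} `xml:"GetUserResult"`
		} `xml:"GetUserResponse"`
	} `xml:"Body"`
}

// decodeName extracts the user name by walking the mirrored wrappers.
func decodeName(data string) (string, error) {
	var env envelope
	if err := xml.Unmarshal([]byte(data), &env); err != nil {
		return "", err
	}
	return env.Body.Response.Result.User.Name, nil
}

func main() {
	name, err := decodeName(response)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(name)
}
```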
So, how do we handle this mess?
Obviously we need to somehow reflect this structure in our code:
This looks quite nice already… but we still need to implement the soap package.
The soap package defines a Node type for parsing arbitrary XML structures. Spec types can encode to and decode from this Node structure. Strictly speaking, Node isn’t actually required: we could just as well implement the Spec types as Marshalers. In this case, having a separate Node tree made things easier.
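A minimal sketch of what such a Node and Spec pair could look like, assuming encoding/xml does the parsing. The field names and the Parse and Find helpers are assumptions made for this sketch, not necessarily the layout of the original package.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Node is a generic XML element: name, attributes, text and children.
type Node struct {
	XMLName xml.Name
	Attrs   []xml.Attr `xml:",any,attr"`
	Text    string     `xml:",chardata"`
	Nodes   []Node     `xml:",any"`
}

// Spec describes how one piece of the Node tree maps to a Go value.
type Spec interface {
	Encode() *Node
	Decode(n *Node) error
}

// Parse reads an arbitrary XML document into a Node tree.
func Parse(data []byte) (*Node, error) {
	root := &Node{}
	if err := xml.Unmarshal(data, root); err != nil {
		return nil, err
	}
	return root, nil
}

// Find returns the first direct child with the given local name, or nil.
func (n *Node) Find(local string) *Node {
	for i := range n.Nodes {
		if n.Nodes[i].XMLName.Local == local {
			return &n.Nodes[i]
		}
	}
	return nil
}

func main() {
	root, err := Parse([]byte(`<a><b id="1">hello</b><c/></a>`))
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	if b := root.Find("b"); b != nil {
		fmt.Println(b.XMLName.Local, len(b.Attrs), b.Text)
	}
}
```

Because Node keeps only local names, attributes, text and children, the Spec types never have to deal with the XML tokenizer directly.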
We need two types, TagSpec and StringSpec: one for walking the Node tree and the other for marshaling a string.
TagSpec pattern matches on element names, and StringSpec parses the content into a string. Notice how StringSpec writes to a string pointer rather than a string, so that decoding can store the result directly in the target.
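The two types could be sketched as follows. This shows only the decoding half, and the details are assumptions: here TagSpec.Decode searches the direct children of the node it is given, Node is trimmed to the fields the specs need, and elem is a hypothetical helper for building a tree by hand.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Node is a generic XML element (trimmed to what the specs below need).
type Node struct {
	XMLName xml.Name
	Text    string `xml:",chardata"`
	Nodes   []Node `xml:",any"`
}

// Spec is the decoding half of the interface; encoding is omitted here.
type Spec interface {
	Decode(n *Node) error
}

// TagSpec walks the tree: it finds a child element by local name and
// passes it to Child. It captures nothing, so it holds no pointers.
type TagSpec struct {
	Name  string
	Child Spec
}

// Tag is a constructor that keeps spec definitions terse.
func Tag(name string, child Spec) *TagSpec {
	return &TagSpec{Name: name, Child: child}
}

// Decode looks for the first direct child of n called t.Name.
func (t *TagSpec) Decode(n *Node) error {
	for i := range n.Nodes {
		if n.Nodes[i].XMLName.Local == t.Name {
			return t.Child.Decode(&n.Nodes[i])
		}
	}
	return fmt.Errorf("element %q not found", t.Name)
}

// StringSpec captures text content through a pointer to the target string.
type StringSpec struct {
	Value *string
}

// String is the constructor counterpart of Tag.
func String(v *string) *StringSpec { return &StringSpec{Value: v} }

// Decode stores the element's trimmed text in the pointed-to string.
func (s *StringSpec) Decode(n *Node) error {
	*s.Value = strings.TrimSpace(n.Text)
	return nil
}

// elem builds a Node by hand so the sketch needs no XML input.
func elem(name, text string, children ...Node) Node {
	return Node{XMLName: xml.Name{Local: name}, Text: text, Nodes: children}
}

func main() {
	doc := elem("", "", elem("User", "", elem("Name", " Alice ")))

	var name string
	spec := Tag("User", Tag("Name", String(&name)))
	if err := spec.Decode(&doc); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(name)
}
```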
Finally, to wire all of this together:
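An end-to-end sketch under the same assumptions as before: Unmarshal, UserInfo and the sample response are hypothetical names and data chosen for this example, and only decoding is shown. The spec holds a pointer into the target structure, so decoding the spec fills in the target.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// Node is a generic XML element.
type Node struct {
	XMLName xml.Name
	Text    string `xml:",chardata"`
	Nodes   []Node `xml:",any"`
}

// Spec decodes part of a Node tree into whatever it points at.
type Spec interface{ Decode(n *Node) error }

// TagSpec matches a direct child by local name and delegates to Child.
type TagSpec struct {
	Name  string
	Child Spec
}

func Tag(name string, child Spec) *TagSpec { return &TagSpec{Name: name, Child: child} }

func (t *TagSpec) Decode(n *Node) error {
	for i := range n.Nodes {
		if n.Nodes[i].XMLName.Local == t.Name {
			return t.Child.Decode(&n.Nodes[i])
		}
	}
	return fmt.Errorf("element %q not found", t.Name)
}

// StringSpec writes the element text through a string pointer.
type StringSpec struct{ Value *string }

func String(v *string) *StringSpec { return &StringSpec{Value: v} }

func (s *StringSpec) Decode(n *Node) error {
	*s.Value = strings.TrimSpace(n.Text)
	return nil
}

// Unmarshal parses data into a Node tree and runs spec against it.
func Unmarshal(data []byte, spec Spec) error {
	var root Node
	if err := xml.Unmarshal(data, &root); err != nil {
		return err
	}
	// Wrap the root so a TagSpec can match the document element as a child.
	return spec.Decode(&Node{Nodes: []Node{root}})
}

// UserInfo is the "nice" target structure (hypothetical).
type UserInfo struct{ Name string }

// response is a made-up SOAP reply used only for this sketch.
const response = `<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetUserResponse><Name>Alice</Name></GetUserResponse>
  </soap:Body>
</soap:Envelope>`

func main() {
	var info UserInfo
	// The spec points into info; decoding the spec fills in the target.
	spec := Tag("Envelope", Tag("Body", Tag("GetUserResponse",
		Tag("Name", String(&info.Name)))))
	if err := Unmarshal([]byte(response), spec); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", info)
}
```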
The basic idea is to create a separate spec structure that has pointers to the target structure and then let the “spec” type handle all the marshaling/parsing, but write the result into the “target” structure.
This of course can be made to handle very complicated structures:
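One way to scale the idea up, sketched with assumptions of my own: Tag takes several child specs, an IntSpec sits next to StringSpec, and the Order structure and sample document are invented for illustration.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strconv"
	"strings"
)

type Node struct {
	XMLName xml.Name
	Text    string `xml:",chardata"`
	Nodes   []Node `xml:",any"`
}

type Spec interface{ Decode(n *Node) error }

// TagSpec matches a child by name and runs every child spec against it.
type TagSpec struct {
	Name     string
	Children []Spec
}

func Tag(name string, children ...Spec) *TagSpec {
	return &TagSpec{Name: name, Children: children}
}

func (t *TagSpec) Decode(n *Node) error {
	for i := range n.Nodes {
		if n.Nodes[i].XMLName.Local != t.Name {
			continue
		}
		for _, child := range t.Children {
			if err := child.Decode(&n.Nodes[i]); err != nil {
				return fmt.Errorf("%s: %w", t.Name, err)
			}
		}
		return nil
	}
	return fmt.Errorf("element %q not found", t.Name)
}

type StringSpec struct{ Value *string }

func String(v *string) *StringSpec { return &StringSpec{Value: v} }

func (s *StringSpec) Decode(n *Node) error {
	*s.Value = strings.TrimSpace(n.Text)
	return nil
}

// IntSpec parses the element text as a decimal integer.
type IntSpec struct{ Value *int }

func Int(v *int) *IntSpec { return &IntSpec{Value: v} }

func (s *IntSpec) Decode(n *Node) error {
	v, err := strconv.Atoi(strings.TrimSpace(n.Text))
	if err != nil {
		return err
	}
	*s.Value = v
	return nil
}

// Unmarshal parses data and decodes spec against a synthetic parent node.
func Unmarshal(data []byte, spec Spec) error {
	var root Node
	if err := xml.Unmarshal(data, &root); err != nil {
		return err
	}
	return spec.Decode(&Node{Nodes: []Node{root}})
}

// Order is a hypothetical nested target structure.
type Order struct {
	ID       int
	Customer struct{ Name, City string }
}

const response = `<Envelope><Body><Order>
  <ID>42</ID>
  <Customer><Name>Alice</Name><City>Tallinn</City></Customer>
</Order></Body></Envelope>`

func main() {
	var o Order
	spec := Tag("Envelope", Tag("Body", Tag("Order",
		Tag("ID", Int(&o.ID)),
		Tag("Customer",
			Tag("Name", String(&o.Customer.Name)),
			Tag("City", String(&o.Customer.City)),
		),
	)))
	if err := Unmarshal([]byte(response), spec); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", o)
}
```

The shape of the spec expression follows the shape of the document, while the target structure stays free to look however the application wants.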
Build your own
This approach gives us an easy way to write different DSLs for marshaling data. You could imagine this being used for binary protocols or for handling multiple formats with a single spec type.
The general rule for implementing a spec structure: use a pointer to the type you want to capture.
For example, soap.TagSpec doesn’t capture the input, hence it doesn’t contain pointers. soap.StringSpec captures a string, so the spec contains a *string. So, if you want to capture a *UserInfo, the spec type for it should contain a **UserInfo.
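The pointer rule can be sketched for the *UserInfo case like this. UserInfoSpec, its User constructor and the elem helper are hypothetical names, and the tree is built by hand to keep the example short.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

type Node struct {
	XMLName xml.Name
	Text    string `xml:",chardata"`
	Nodes   []Node `xml:",any"`
}

type Spec interface{ Decode(n *Node) error }

// TagSpec captures nothing, so it holds no pointers.
type TagSpec struct {
	Name  string
	Child Spec
}

func Tag(name string, child Spec) *TagSpec { return &TagSpec{Name: name, Child: child} }

func (t *TagSpec) Decode(n *Node) error {
	for i := range n.Nodes {
		if n.Nodes[i].XMLName.Local == t.Name {
			return t.Child.Decode(&n.Nodes[i])
		}
	}
	return fmt.Errorf("element %q not found", t.Name)
}

// StringSpec captures a string, so it holds a *string.
type StringSpec struct{ Value *string }

func String(v *string) *StringSpec { return &StringSpec{Value: v} }

func (s *StringSpec) Decode(n *Node) error {
	*s.Value = strings.TrimSpace(n.Text)
	return nil
}

// UserInfo is a hypothetical target type.
type UserInfo struct{ Name string }

// UserInfoSpec captures a *UserInfo, so it holds a **UserInfo.
type UserInfoSpec struct{ Value **UserInfo }

func User(v **UserInfo) *UserInfoSpec { return &UserInfoSpec{Value: v} }

// Decode builds a fresh UserInfo from n and only then publishes it
// through the double pointer, so a failed decode leaves *Value untouched.
func (s *UserInfoSpec) Decode(n *Node) error {
	info := &UserInfo{}
	if err := Tag("Name", String(&info.Name)).Decode(n); err != nil {
		return err
	}
	*s.Value = info
	return nil
}

// elem builds a Node by hand so the sketch needs no XML input.
func elem(name, text string, children ...Node) Node {
	return Node{XMLName: xml.Name{Local: name}, Text: text, Nodes: children}
}

func main() {
	doc := elem("", "", elem("User", "", elem("Name", "Alice")))

	var user *UserInfo // stays nil until the spec decodes a user
	if err := Tag("User", User(&user)).Decode(&doc); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(user.Name)
}
```

Composing the existing Tag and String specs inside UserInfoSpec.Decode keeps the new spec small; the double pointer is what lets it replace the caller's nil pointer with a freshly allocated value.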
PS: The examples here are meant as a proof of concept. You really should handle errors properly, with meaningful messages, and depending on your application your results will vary.
Conclusion
Spec types give a nice way of handling different complicated formats at the cost of some performance. They are flexible in their capabilities and can be composed quite nicely.
As a final exercise, you can try writing a nested handler: