Json Schema Type Provider: Why We Built and Abandoned It
This post is a follow-up on one of Jet’s GitHub projects, Json Schema Type Provider.
Rationale and history
Four years ago, when we had just started building out the Jet Partner API, we faced the massive task of validating and (de)serializing all the incoming JSON messages. We also needed to build documentation for Partners to help them with integrations.
The classic approach would be to build custom logic for each endpoint to validate each field and produce meaningful error messages. We felt there was a better way to do this, with a minimal code footprint, so we searched for alternatives.
We chose JSON Schema, which was at Draft 3 back then. It gave us the ability to constrain JSON payloads in a declarative manner. For example, instead of having a piece of code validating the length of the product title field, we could have something like
{
  "minLength": 5,
  "maxLength": 500,
  "description": "Short product description 5 to 500 alphanumeric characters"
}
in a schema.json file, and then use the excellent Newtonsoft library (version 6.0.3 back then) to do all the mundane validation for us. Of course, there were many validations that still had to be done manually, like verifying messages against stored state. Nevertheless, it immensely cut down the lines of code we had to write, test, and maintain.
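To make the declarative idea concrete, here is a minimal sketch in Python (not the F# and Newtonsoft code we actually ran). It understands only the two length keywords from the fragment in the text; the helper name check_string and the wording of the messages are made up for illustration, and a real validator covers many more keywords.

```python
import json

# The schema fragment from the text, written as a complete JSON object.
TITLE_SCHEMA = json.loads("""
{
  "minLength": 5,
  "maxLength": 500,
  "description": "Short product description 5 to 500 alphanumeric characters"
}
""")


def check_string(value, schema):
    """Return a list of error messages; an empty list means the value passed.

    Hypothetical helper: it understands only minLength and maxLength.
    """
    if not isinstance(value, str):
        return ["expected a string"]
    errors = []
    min_len = schema.get("minLength")
    max_len = schema.get("maxLength")
    if min_len is not None and len(value) < min_len:
        errors.append(f"shorter than minLength {min_len}")
    if max_len is not None and len(value) > max_len:
        errors.append(f"longer than maxLength {max_len}")
    return errors
```

The limits live in data rather than in code, so changing them means editing the schema file and nothing else.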
Luckily, our team included outstanding product managers who had no problem translating payload requirements directly into schema files, freeing developers to focus on other tasks. It was the dream of a domain language shared between business and development come true.
Moreover, with minimal coding we could turn these schemas into API documentation and deploy it to the Azure API Management platform. Even though we have since abandoned this approach in favor of a richer API documentation site, it allowed us to move with incredible speed when it was critical.
Still, there was all this boring code to write that would represent JSON in F#, plus (de)serialization. That’s when we decided it might be a good idea to have a type provider generate POCOs for us directly from JSON schemas.
Reasons for retirement
While the implementation wasn’t particularly hard, there were several reasons this type provider was ultimately abandoned in favor of a more traditional approach:
- POCOs are no fun to use from F#. We had to deal with nulls on attributes, and this alone was enough to make us take a second look at the solution. Back then Newtonsoft didn’t have much support for F#, and even if it had, we were limited in what types we could use for properties in the type provider implementation.
- Visual Studio used an older version of Newtonsoft internally. That meant we couldn’t pass types defined in Newtonsoft out of the type provider, because it was built against a version different from the one it could bind to in the Visual Studio environment. This made the implementation rather inefficient: we had to parse the schema on every validation.
- Relying on Newtonsoft for validation limited our ability to customize the schema for our needs.
Because of all that, we kept looking for more F#-friendly ways of dealing with JSON. The JsonValue library from the FSharp.Data project suited us well. At that time we were running fsx scripts as microservices, and the need to reference both design-time and run-time libraries made it somewhat awkward. We ended up forking the relevant part of FSharp.Data directly into our code base. This also created an opportunity to build custom JSON Schema validation and abandon Json.NET completely.
At that point it also became clear that while validation is necessary for the Partner API, (de)serialization is not. A lot of messages coming into the API are passed to downstream systems without applying any business logic other than basic validation, so dealing with JsonValue records directly proved to be much more efficient than supporting translation to and from F# record types.
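The validate-then-forward flow can be sketched roughly as follows. This is an illustration in Python, not our F# code: validate_and_forward, required_fields, and send_downstream are hypothetical names, and a required-property check stands in for full schema validation.

```python
import json


def validate_and_forward(raw_body, required_fields, send_downstream):
    """Validate a raw JSON message and pass it on without mapping it to a type.

    required_fields and send_downstream are hypothetical stand-ins for real
    schema validation and for a real downstream client.
    """
    try:
        message = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(message, dict):
        return ["expected a JSON object"]
    errors = [
        f"missing required property '{name}'"
        for name in required_fields
        if name not in message
    ]
    if not errors:
        # Forward the original text untouched: properties the service does
        # not know about survive, and no record type has to be kept in sync.
        send_downstream(raw_body)
    return errors
```

Because nothing is mapped to an intermediate type, the downstream system receives exactly what the partner sent.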
It is easy to miss the impedance mismatch between JSON and .NET types. (De)serialization by itself is a form of validation, and as such it imposes limitations that are foreign to JSON. Those who remember the famous blog post The Vietnam of Computer Science by Ted Neward, which described the challenges around Object-Relational Mapping, can spot similarities here, but that’s a topic for another post.
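One small instance of that mismatch, sketched in Python as a stand-in for .NET: a typed representation with a nullable field cannot tell a missing property from an explicit null, although the two JSON documents are different. The Product type and the round_trip helper are made up for this illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Product:  # hypothetical typed representation of a message
    sku: str
    title: Optional[str] = None


def round_trip(raw):
    """Deserialize into Product, then convert back to a JSON-compatible dict."""
    data = json.loads(raw)
    product = Product(sku=data["sku"], title=data.get("title"))
    return asdict(product)


# Two different JSON documents...
absent = round_trip('{"sku": "A1"}')
explicit_null = round_trip('{"sku": "A1", "title": null}')
# ...are indistinguishable once they have passed through the typed form.
assert absent == explicit_null
```

Working on the parsed JSON directly keeps that distinction available to validation.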
Having upfront validation based on a schema allows the Partner API to avoid leaking implementation details to customers, because error messages are driven by publicly available schemas and not by an internal .NET-specific representation of the same data. While schema-backed validation is limited to individual properties, it eliminates a lot of boilerplate. Of course, there’s still more business-specific custom validation that needs to be done, including validating a message against existing state. Interestingly, it proved rather efficient to do that by manipulating JsonValue directly rather than using intermediate F# types.
JSON Schema creates a powerful opportunity to build types with a well-defined structure that (de)serialize in a transparent manner. I believe that Json Schema Type Provider can be rewritten to operate on JsonValue types directly and use custom schema validation.