Continuous Integration with Type Providers
One of the most underrated features of F# is Type Providers: the “magical” tool that creates types on the fly using information that the compiler has pulled from a data source.
Why Type Providers?
For example, imagine you need to consume a REST endpoint that returns JSON. Typically, in a statically typed language, you’d have to manually inspect the JSON returned from the endpoint and then create classes that correspond to the schema. After that, you can make the call and deserialize the response into the types that you created.
This is a lot of work just to consume a data source. In a dynamic language, those steps are avoided altogether and you just access properties directly. The downside is that you don’t have any guarantees or safety around the code that calls that endpoint. Add to that the fact that IDEs and tools can’t help you here, and you just have to trust that you wrote the code correctly. If not, you’ll find out at runtime.
Type Providers give the best of both worlds. The compiler does all the creation of classes and also does the deserialization for you. That way, you can just access data the same way dynamic languages do but still have the type safety that comes with static languages.
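As a minimal sketch of what that looks like, here is the JSON Type Provider from the FSharp.Data package in an F# script. The sample JSON and the `User` type name are made up for illustration; the provider infers the `Id` and `Name` properties from the sample at compile time.

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Illustrative sample only: the provider infers Id (int) and Name (string) from it.
type User = JsonProvider<"""{ "id": 1, "name": "Ada" }""">

// Parse a response with the same shape; no hand-written classes or deserializer.
let user = User.Parse("""{ "id": 42, "name": "Grace" }""")

// Property access is checked by the compiler, so a typo such as user.Nmae fails the build.
printfn "%d %s" user.Id user.Name
```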
The Problem — Continuous Integration
Type Providers are awesome until you want to do continuous integration, where a build server builds your code in isolation. It may or may not have access to your database or API endpoint. Even if you could get that to work somehow, it would violate one of the key principles of Continuous Integration: Repeatable Builds.
“Repeatable Builds” is the idea that given the same source code and build environment, the build output/outcome should remain the same.
This becomes a problem because our compiler just used data from somewhere other than our source code in order to build our code.
Luckily, there is a workaround for simpler type providers like the JSON Type-Provider where you can copy a sample output to a JSON file and then point the type-provider to that one instead.
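A rough sketch of that workaround, assuming the FSharp.Data package and a sample response saved to a hypothetical `samples/user.json` next to the script (the path, the `User` name and the URL are all placeholders):

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Hypothetical path: a JSON response saved once and checked into source control.
[<Literal>]
let SamplePath = __SOURCE_DIRECTORY__ + "/samples/user.json"

// Types are inferred from the checked-in file, so the build needs no network access.
type User = JsonProvider<SamplePath>

// At run time the data still comes from the live endpoint (placeholder URL).
let loadUser () = User.Load "https://example.com/api/user/1"
```

The sample file now lives alongside the source, so the same commit should always produce the same generated types.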
However, it becomes tricky when dealing with a database. How do you encode the database schema into a file? I had this problem when I set up continuous integration for one of my projects. In the project, I used the SqlDataConnection Type Provider which is based on LINQ to SQL behind the scenes.
After inspecting the Type Provider API, I noticed that it accepts a DBML (Database Markup Language) file, which is a description of the database in XML.
The new challenge is now generating a DBML file because there’s no way I’m crafting it by hand.
Well, it turns out there is a tool called SQLMetal that comes with the .NET Framework SDK tools.
Step 1: Copy SQL Metal
Go to “C:\Program Files (x86)\Microsoft SDKs\Windows\v10.0A\bin\NETFX 4.7.1 Tools” or a similar directory, copy sqlmetal.exe into a folder in your project, and check it into source control. In the case of the Dumia project, I put it in “./utils/sqlmetal.exe”.
Step 2: Invoke it during migrations
Because it dumps the schema of the database as a DBML file, you need to invoke it whenever the schema changes. Luckily, I use FluentMigrator, so I added a call to SQL Metal to my migration script, to run after the migrations.
The call dumps the DBML file into the Infrastructure project.
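The exact call from the Dumia migration script is not reproduced here, but a minimal sketch of such a step could look like the following. It uses SqlMetal’s `/conn` and `/dbml` options; the helper names and the DBML output path are assumptions for illustration, and only `./utils/sqlmetal.exe` comes from the step above.

```fsharp
open System.Diagnostics

/// Builds the SqlMetal command line: connect with /conn, write the schema with /dbml.
let sqlMetalArgs (connectionString: string) (dbmlPath: string) =
    sprintf "/conn:\"%s\" /dbml:\"%s\"" connectionString dbmlPath

/// Runs sqlmetal.exe and fails loudly if it does not exit cleanly.
let dumpSchema (sqlMetalPath: string) (connectionString: string) (dbmlPath: string) =
    let startInfo =
        ProcessStartInfo(sqlMetalPath, sqlMetalArgs connectionString dbmlPath,
                         UseShellExecute = false)
    use proc = Process.Start startInfo
    proc.WaitForExit()
    if proc.ExitCode <> 0 then
        failwithf "sqlmetal exited with code %d" proc.ExitCode

// Hypothetical usage, called right after the migrations have been applied:
// dumpSchema "./utils/sqlmetal.exe" connectionString "./src/Infrastructure/Schema.dbml"
```

Failing on a non-zero exit code keeps a stale DBML file from slipping into source control unnoticed.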
Step 3: Point the Type Provider to the schema file
Next, we need to make our type provider point to the schema file (DBML) that we’ve generated so that it takes our types from that file. This means that whenever we change the file, the compiler uses that version of the file to build, which pretty much ensures that we have repeatable builds.
To do this, we simply pass the file to the type provider as an extra static parameter.
And voilà, you are done! Now when the application is built on a build server, it will use the DBML file; at run-time, however, it will use the connection string instead.
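For reference, here is a sketch of what such a declaration can look like with the `LocalSchemaFile` and `ForceUpdate` static parameters of `SqlDataConnection`. It assumes the project references FSharp.Data.TypeProviders and System.Data.Linq on .NET Framework; the connection string, the `Schema.dbml` file name and the `Db` type name are placeholders, not the Dumia project’s actual values.

```fsharp
// Assumes references to FSharp.Data.TypeProviders and System.Data.Linq (.NET Framework).
open FSharp.Data.TypeProviders

// Placeholder design-time connection string; not contacted when ForceUpdate = false.
[<Literal>]
let DesignTimeConnection = "Data Source=.;Initial Catalog=MyDb;Integrated Security=SSPI;"

// Placeholder path to the DBML file generated by SqlMetal, relative to the project.
[<Literal>]
let SchemaFile = "Schema.dbml"

// ForceUpdate = false tells the provider to build types from the local schema file
// instead of requiring a live database connection at compile time.
type Db = SqlDataConnection<DesignTimeConnection, LocalSchemaFile = SchemaFile, ForceUpdate = false>

// At run time, the real connection string is supplied when creating the data context.
let getContext (runtimeConnectionString: string) =
    Db.GetDataContext runtimeConnectionString
```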
The full source code is available at the Dumia project GitHub repo.
This might feel like you are losing some of the flexibility of Type Providers but not necessarily. First of all, you can easily generate the schema data whether by dumping a JSON file or generating a DBML. Secondly, changes to the schema file are much simpler than changing raw source code.
Ultimately, if you are experimenting and exploring data, then you will get the most out of using Type Providers the way they are out of the box. But if you are writing production-level code that needs to be reliable, then you need some way of codifying the schema, because it should be part of your source control and built by your CI/CD pipeline.