Scala as a configuration language
You can use Scala as a configuration language - no libraries, no special config language, no macros - just a coding style. This article describes how and why you may want to do this. In summary this style provides more safety, more convenience and more flexibility than traditional config files. Sounds good?
Obviously configuration is not the largest part of your application and you may have bigger fish to fry. But it is one of the places where things break and where applications are often untested, and there is a better way to do it that is also more convenient.
The traditional way: config files
Most of us use HOCON (Typesafe config), YAML, JSON, TOML, .ini files, etc. to configure timeouts, verbosity and error reporting, or to connect to different database servers during testing and production. We’ll use the database host and port as the running example here.
To use any of these languages for configuration we need to import a library that comes with an appropriate parser and a way to access the values. We have to decide where to read the config file from and how to parse out the values. It is probably a good idea to have the application fail early if something is missing, so we don’t crash much later in production.
And we should probably read the values out into a typed structure at a single place in our app so it won’t fail in multiple places. We can read them into a case class to allow our code to access our configuration values in a well-typed way throughout our program. We also need to document somehow which configuration values exist, or at least point developers to the source file that reads them, so they can see for themselves. Here is what this traditional way can look like:
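A minimal sketch of this approach, using java.util.Properties from the standard library as a stand-in for a configuration library such as Typesafe config. The key names db.host and db.port, the ConfigLoader helper and the "<environment>.properties" file naming are assumptions made for illustration, not something a particular library prescribes:

```scala
import java.io.StringReader
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}
import java.util.Properties

// Typed view of the configuration, filled once at startup.
final case class Config(dbHost: String, dbPort: Int)

object ConfigLoader {
  // Parses properties text and fails early if a key is missing or malformed.
  def parse(source: String): Config = {
    val props = new Properties()
    props.load(new StringReader(source))

    def required(key: String): String =
      Option(props.getProperty(key)).getOrElse(
        throw new IllegalArgumentException(s"missing configuration key: $key")
      )

    Config(
      dbHost = required("db.host"),
      dbPort = required("db.port").trim.toInt // NumberFormatException if not a number
    )
  }
}

object Application {
  def main(args: Array[String]): Unit = {
    // args(0) names the environment, e.g. "test" or "production".
    val path = Paths.get(s"${args(0)}.properties") // hypothetical file naming scheme
    val source = new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
    val config = ConfigLoader.parse(source)
    println(s"connecting to ${config.dbHost}:${config.dbPort}")
  }
}
```

Note that every check in this sketch happens at runtime: a misspelled key or a non-numeric port is only discovered when the application starts.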
Assuming you compiled your application into an application.jar fat jar, this is how you start it up:
java -cp application.jar Application test
java -cp application.jar Application production
The above is obviously just one example of how to do it. It reads all configuration during startup, which is nice because you find out early if you forgot something or provided the wrong type. In reality every application handles this differently, and how things fail when values are missing or wrong depends on how you choose to use your configuration library.
So that works. Is there a way to improve over this? Yes! What if
- we could check if something is wrong at compile time rather than runtime?
- people who wrote new configuration were assisted in knowing which values to provide?
- we got away without having to import and use a configuration library?
More fun for less work
We can just use Scala. And here is how. The Scala compiler already comes with a parser and with a type system on top that allows checking if values exist even earlier than at program startup time. There are different ways to do this in detail and we’ll consider a number of variants.
Configuration managed by the application
There is a way to keep your code very similar to the original config-library based example, but switch to pure Scala configuration. The nice thing is that the similarity makes it easy to understand with the traditional config-file mindset. Ultimately we can improve even further, but let’s start with this and move on to a different solution later.
So let’s convert the original example:
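A minimal sketch of what such a conversion could look like. The host names, the port and the method name `named` are assumptions chosen for illustration; the essential point is that the known configurations are ordinary Scala values and `args(0)` only selects among them:

```scala
final case class Config(dbHost: String, dbPort: Int)

object Config {
  // All known configurations live in the code base and are type-checked at compile time.
  val test: Config = Config(dbHost = "localhost", dbPort = 5432)
  val production: Config = Config(dbHost = "db.example.com", dbPort = 5432)

  // The only remaining runtime check: is the requested name one we know?
  def named(name: String): Config = name match {
    case "test"       => test
    case "production" => production
    case other        => throw new IllegalArgumentException(s"unknown configuration: $other")
  }
}

object Application {
  def main(args: Array[String]): Unit = {
    val config = Config.named(args(0))
    println(s"connecting to ${config.dbHost}:${config.dbPort}")
  }
}
```

Forgetting `dbPort` or passing a string where an `Int` is expected now fails the build instead of the startup.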
No need for a configuration library or separate config files. Everything is plain Scala. We get compile time checking. This is super easy to implement and deploy. Just bundle your whole application into a jar and deploy it.
Does this have limitations? Yes. It is not extensible. You need to know all configuration when you build your program. This may or may not be sufficient for you. There are ways around this limitation while still following this structure, but it gets complicated. For example, you could place the configuration objects in separate jars and load the class with a specific name at runtime. Or you could ship with a dependency on the Scala compiler as a library, compile the configuration objects from source at runtime and load the resulting classes dynamically. People have done this, but it’s complicated and has a lot of error cases to handle. So let’s not go there.
Also, if I had to nitpick, I would say that a user of this has to know where to look in the code base to write new configuration. They also have to understand that args(…) is used to pass the decision about which configuration to use through the program, and that it is eventually used in the Config object to decide.
Can we still get the safety and familiarity of Scala code while having easy extensibility and requiring less internal knowledge from configuration authors? Here is how:
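A minimal sketch of this structure, assuming hypothetical method names `describe` and `run` for the application's typed interface and made-up host names. The object names TestConfig and MainConfig match the ones used in the commands further down:

```scala
final case class Config(dbHost: String, dbPort: Int)

// The application exposes only a typed interface and has no main method.
object Application {
  def describe(config: Config): String =
    s"connecting to ${config.dbHost}:${config.dbPort}"

  def run(config: Config): Unit = println(describe(config))
}

// Each configuration is its own entry point and starts the application.
object TestConfig {
  val config: Config = Config(dbHost = "localhost", dbPort = 5432)
  def main(args: Array[String]): Unit = Application.run(config)
}

object MainConfig {
  val config: Config = Config(dbHost = "db.example.com", dbPort = 5432)
  def main(args: Array[String]): Unit = Application.run(config)
}
```

Writing a new configuration only requires knowing the `Config` case class and `Application.run`; the compiler reports any missing or mistyped field.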
This is basically the same as the earlier pure-Scala version but turned inside out. Instead of passing the decision about which configuration to use through our program as a string, we let each configuration start the application.
Now our Application does not have a main method of its own. It only has a typed Scala interface. It’s the configuration’s job to satisfy it and expose it to the command line via a main method. This might look unusual, but it is actually very nice because of the modularity and extensibility. Our application is not tied to particular configurations or configuration file formats. (We might even call the application from a totally different application as a library via its Scala interface.)
So let’s say we write an application that follows this style. How do we handle deployments and selecting a configuration based on the environment? There are several alternatives:
1. Bundle configuration with your application
We can put configurations straight into our application. Of course this only works for configuration we know at compile time, but often enough we do, and then this is nice.
If we put it all into the same jar we could choose a configuration like this:
java -cp application.jar TestConfig
java -cp application.jar MainConfig
We can also put this class name into an environment variable if we want to run a different configuration on our staging server compared to our production server.
2. Bundle your configuration separately
If you want to create new configurations after you already compiled your application, you can of course also compile configuration separately. Maybe you deploy your application to a Maven-style repository, then you depend on it in one of possibly many configuration projects, which only include e.g. your TestConfig.scala or MainConfig.scala.
Eventually you can bundle everything into a fat jar and run it like before, or alternatively link the jars dynamically, e.g. like this:
java -cp application.jar:test_config.jar TestConfig
java -cp application.jar:main_config.jar MainConfig
application.jar here includes the classes Application and Config.
test_config.jar includes class TestConfig.
main_config.jar includes class MainConfig.
3. Compile configuration ad-hoc
You don’t even need to compile and bundle your configuration before deployment. You can compile it right where you use it. In this case again your application.jar includes only the classes Application and Config.
And you start up the application like this:
scala -cp application.jar MainConfig.scala
scala -cp application.jar TestConfig.scala
That means you would compile your configuration on your production server. You would deploy your configuration .scala files together with your pre-compiled jars.
This also enables you to change configuration directly on the production server without a formal re-deployment process. Not that I would encourage that, but in the first months of your startup, bending the rules may be part of your job.
Compiling your configuration on the server may seem unusual. But when you think about it, it is not that different from having your application depend on a library that contains a configuration file parser. Parsing JSON or parsing Scala is not that much of a difference conceptually, but with Scala you get all the niceties like static type-checking and the common, understood way of how errors are communicated back to the developer.
For this use case the Scala compiler will likely be a bit slower than the parser of your configuration library of choice. So this particular way of doing things may not be viable for interactive tools where startup time matters, but for most Scala applications we deploy, this is completely viable and easy.
4. Interoperating with other languages
Scala for configuration is great, but what if other languages need to access configuration, too? Some of your applications may be written in Python.
One way to deal with this is enabling the configuration classes to write configuration out into another format that other languages can consume.
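A minimal sketch of this idea. The `toJson` method here is a hand-written stand-in that handles only this two-field case class, and the host name and the `run` method are assumptions for illustration. The object name ProductionConfig and the `serialize` argument follow the command shown below:

```scala
final case class Config(dbHost: String, dbPort: Int) {
  // Hand-written stand-in for a JSON library; it escapes only backslashes and double quotes.
  def toJson: String = {
    val host = dbHost.replace("\\", "\\\\").replace("\"", "\\\"")
    s"""{"dbHost":"$host","dbPort":$dbPort}"""
  }
}

object Application {
  def run(config: Config): Unit =
    println(s"connecting to ${config.dbHost}:${config.dbPort}")
}

object ProductionConfig {
  val config: Config = Config(dbHost = "db.example.com", dbPort = 5432)

  def main(args: Array[String]): Unit = {
    if (args.headOption.contains("serialize")) {
      println(config.toJson) // export the configuration for other languages
    } else {
      Application.run(config) // normal startup
    }
  }
}
```

In a real project the JSON encoding would better come from a library, since escaping and nested structures are easy to get wrong by hand.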
toJson can be provided by libraries like circe, play-json, spray-json, etc.
To turn your safe Scala configuration into something Python can consume, just run:
scala -cp ... ProductionConfig serialize > production_config.json
Of course this approach is biased towards Scala being the place where configuration is maintained. This may or may not work in your workplace, but it should be easy enough even for non-Scala developers to change a file that simply initializes a case class via named arguments.
5. Interoperating with configuration services
When configuration is managed by a service like Apache ZooKeeper you obviously can’t put your config into Scala files. So you are forced to use a library to read your configuration values rather than just Scala code.
You can still follow the nice “decoupled configuration” code structure described earlier though, and thereby have an application that is not coupled to one particular way of configuring things. This makes it easy to e.g. use ZooKeeper in production, while using Scala as a configuration language for development.
6. Mixing dynamic and static configuration
Nobody stops you from using a combination of static Scala configuration with dynamic configuration if that turns out to be most convenient for you. This is similar to how configuration libraries allow overriding configuration values via system properties or environment variables.
You can just make one of your configurations discover those values dynamically and optionally provide a static default value:
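A minimal sketch, assuming hypothetical environment variable names DB_HOST and DB_PORT and made-up default values. The environment is passed in as a plain Map so the lookup logic can be exercised without touching the real process environment:

```scala
final case class Config(dbHost: String, dbPort: Int)

object Application {
  def run(config: Config): Unit =
    println(s"connecting to ${config.dbHost}:${config.dbPort}")
}

object DynamicConfig {
  // Reads overrides from the given environment and falls back to static defaults.
  def fromEnv(env: Map[String, String]): Config =
    Config(
      dbHost = env.getOrElse("DB_HOST", "localhost"),
      dbPort = env.get("DB_PORT").map(_.trim.toInt).getOrElse(5432)
    )

  def main(args: Array[String]): Unit =
    Application.run(fromEnv(sys.env)) // sys.env is the real process environment
}
```

The dynamic part stays confined to this one configuration object; the application itself still only sees a fully typed `Config`.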
Scala for configuration is totally viable. It may be unusual, but it does not require hacks or jumping through hoops. It is actually simpler to do. It encourages a modular application design that exposes your configuration keys in the external interface of your application rather than hiding them somewhere in your application and injecting them via side-effects.