Configuration envy

Doug Tangren
7 min read · Oct 16, 2018

How software is configured has always fascinated me. Not because configuration itself is interesting, but because of the heavily engineered solutions our field keeps producing for a seemingly simple problem. This is a post about a happy medium I’ve found for Rust applications and an itty bitty crate I published a while back that enables it.

👩‍🎨 Some applications have a strong desire for artisanally crafted configurations. Looking back in history, whole markup languages were invented just to satisfy these configuration desires. In some limited cases this is warranted, but in many other cases, not so much. Like any technical problem, these off-the-conventional-path approaches come with tradeoffs. Configuration is a user interface. Users of your software must learn and understand your configuration interface to use your software effectively. Inventing new configuration models and frameworks can be an interesting exercise for library developers, but users of your software more often than not benefit more when they can reuse their existing knowledge of existing configuration tools and languages. That’s not to say configuration languages are bad per se, but please consider users the next time you decide to invent a new configuration language 🙅‍♀️.

My view on configuration

Configuration has a purpose. Understanding purpose is important when attempting to understand why something exists. The purpose of configuration is to change the behavior of software based on a dynamic set of inputs.

This is interesting to me because the way I look at layers of software has a strong influence on how I view configuration.

Given any abstract programming language, here’s how I see a function.

fn foo(args) { impl.. }

Here’s how I see a larger abstraction.

class EnclosingType(args) { impl.. }

Here’s how I see a program’s execution.

shell program(args) { impl... }

Do you see the pattern? Each level of abstraction is configurable via parameterization, including the program process. One way to describe this is currying. You can think of a program’s execution path as a currying of parameterized arguments that backtracks all the way to the shell that executes your program. There’s probably a Lisp pun in there somewhere about all programs just being lists of parameterized functions surrounded by parentheses, but in the abstract this is my mental model for configuration: parameterization of inputs at the program process level.

In the early 2000s there was a popular term that sold many books called “dependency injection”, which described the ability to change the behavior of software based on the wiring of its components. At some point the wiring became the sole focus of interest and people all but forgot the underlying purpose of the wiring. I can’t recall the original source or wording for a little bit of wisdom I picked up over the years, but paraphrased, I’ve also heard dependency injection described as this.

Dependency injection is a fancy phrase for “passing arguments” - unknown

An external observer might ask, “Sooo, the reason why software engineers invent new configuration management protocols to pass configuration along to their programs is because one does not already exist, right?” Well, no. One does exist. It’s called the env. I like it. Here’s why.

It’s universally available to all programs on all platforms and all OSes through a mostly uniform interface and a mostly uniform protocol. This means it doesn’t matter what kind of application you write, where you run it, what language you write it in or what language you’re reading its usage in. You will always have access to the env and be able to understand its semantics. That’s a pretty amazing property in software, and one that dates all the way back to 1979! Its interface is simple and easy to understand. You have named keys and, given a key, you can ask the env for a value. For many use cases that’s all you actually need.
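
To make "given a key, ask for a value" concrete, here is a minimal sketch using Rust's standard library. The `lookup` helper and the `FOO` key are placeholders of mine, not part of any particular API.

```rust
use std::env;

/// Ask the process env for the value stored under `key`.
/// Returns `None` when the variable is unset or is not valid Unicode.
fn lookup(key: &str) -> Option<String> {
    env::var(key).ok()
}

fn main() {
    // `FOO` is a placeholder key; substitute whatever your program expects.
    match lookup("FOO") {
        Some(value) => println!("FOO={}", value),
        None => println!("FOO is not set"),
    }
}
```

Running it as `FOO=bar ./program` takes the first branch; running it with `FOO` unset takes the second.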

So then the external observer may ask, “Sooo, why would anyone invent a new language and protocol just for program configuration if one already exists, and exists everywhere?”

Software engineers… like to engineer things. Despite the liberating qualities of the program env, some can feel constrained by its limited feature set.

One question often asked about using the env for configuration is how you store it (for later use) and how you can use that source consistently. Since env variables exist in the realm of the software that runs your program, you have to find a way to ensure those variables actually get set prior to process execution. Some engineers like using configuration files to solve just this problem. Here’s one of my favorite env configuration management tricks.

$ cat .env
FOO=bar
BAZ=123
$ env $(cat .env | xargs) program

What just happened? I’ve captured a collection of env variables in a file called .env so I can pack it up and take it on my next camping trip. I then ran env(1) to set up my program’s execution env, using cat(1) to get the lines of that file and xargs(1) to join them into a single line of KEY=value pairs, then ran my program with the env configured. Feel free to use this trick and take your env camping too. No framework required! Interestingly, there is actually a whole software ecosystem that’s emerged around reading in configuration of this format. So feel free to pack a framework if you’d like as well :)
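
For a sense of what those .env-reading libraries do at their core, here is a minimal sketch of the same trick in Rust. It assumes plain `KEY=value` lines with no quoting or escaping rules, and `parse_dotenv` is a hypothetical helper of mine rather than any library's API. `program` is the same placeholder as in the shell example.

```rust
use std::fs;
use std::process::Command;

/// Parse `KEY=value` lines, skipping blank lines and `#` comments.
/// Simplified on purpose: no quoting, escaping or variable expansion.
fn parse_dotenv(contents: &str) -> Vec<(String, String)> {
    contents
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}

fn main() -> std::io::Result<()> {
    // Read the pairs from `.env` in the current directory.
    let pairs = parse_dotenv(&fs::read_to_string(".env")?);
    // Run the placeholder `program` with those variables added to its env.
    let status = Command::new("program").envs(pairs).status()?;
    println!("program exited with {}", status);
    Ok(())
}
```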

Another complaint that usually comes along with using env for configuration is that the env knows nothing of the types programs expect their values to be in. This limitation often comes at the expense of runtime failure at the point the env variables are parsed, which may sit deep within the application. This is often used to justify creating a configuration system that describes types as part of the configuration. And truth be told, some configuration values are very difficult to describe with simple types, thus new configuration languages are born 👶. Another approach may be to reconsider flattening configuration structures to fit a more standard approach. Place the burden on the library author, not the user.
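
Here is a minimal sketch of that failure mode. The env only ever hands back strings, so the typed interpretation happens wherever the value is finally parsed. The `PORT` variable and the `parse_port` helper are hypothetical names for illustration.

```rust
use std::env;

/// Interpret a raw env string as a port number, describing the failure if it is not one.
fn parse_port(raw: &str) -> Result<u16, String> {
    raw.parse::<u16>()
        .map_err(|err| format!("PORT must be a number from 0 to 65535, got {:?}: {}", raw, err))
}

fn main() {
    // `PORT` is a hypothetical variable; fall back to a default when it is unset.
    let raw = env::var("PORT").unwrap_or_else(|_| String::from("8080"));
    // If this parse lives deep inside the application, a bad value only
    // surfaces once execution finally reaches this point.
    match parse_port(&raw) {
        Ok(port) => println!("listening on port {}", port),
        Err(message) => eprintln!("{}", message),
    }
}
```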

Freedom of expression is important to software engineers, but it comes with tradeoffs. Consider this: when a program speaks its own unique configuration language, it’s harder to integrate that program into a larger system of coordinated programs. Speaking a common existing configuration language really helps with inter-orchestration politics.

Ideally every program could play together and compromise on at least a few universal configuration languages.

Best of both worlds

Rust, like many other programming languages, provides a std library interface for interacting with the program env.

Rust is a programming language that doesn’t need an optometrist’s prescription to tell types apart, which introduces some disheartenment when working with raw env variables for configuration. Ideally a program’s configuration could just be represented as any other typed input, like a struct. Structs have fields, which can hold other statically typed data. This also has the nice property of making your application more unit testable, as you then have complete control over providing that type as an input to your tests.
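
Here is a minimal sketch of the testability point. The `Config` struct, its fields and the `listen_address` function are hypothetical names for illustration.

```rust
// `Config` and its fields are hypothetical, for illustration only.
#[derive(Debug)]
struct Config {
    port: u16,
    debug: bool,
}

/// Application logic takes the typed config as an ordinary argument,
/// so it never reads the process env itself.
fn listen_address(config: &Config) -> String {
    let host = if config.debug { "127.0.0.1" } else { "0.0.0.0" };
    format!("{}:{}", host, config.port)
}

fn main() {
    // A test can build the struct by hand; no env setup is required.
    let config = Config { port: 8080, debug: true };
    println!("{}", listen_address(&config));
}
```

Because the function depends only on the struct, a unit test can cover each configuration it cares about without touching the real env.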

Luke: Use the ecosystem

Rust has an amazing library ecosystem. In particular, a library for taking some arbitrary input and deserializing (or serializing, if you care to) type-safe outputs exists and has been well adopted in the community. It’s called serde. If you are doing anything remotely related to serialization in Rust, trust me, you need this. Want to store your application’s configuration in a JSON file? You’re covered. YAML? We got you. TOML? Yep. There are probably half a dozen other data formats you could store your configuration in and you’d be set with Rust’s existing serde ecosystem.

However, for many of my use cases for running Rust applications, storing configuration in a file is less attractive, as I’m typically running Rust inside Docker containers and container orchestrators typically encourage the use of standard interfaces like the env for configuration. So I pondered 🤔, “What if I could treat my program’s env parameterization with the same level of typing I treat my functions and enclosing types with, while getting everything one gets from using serde for free?” That would be the bee’s knees 🐝.

Enter: envy. Here’s what it looks like

Note a few things. Config is not a special type. It’s your type. Call it whatever you like. It just needs to derive serde’s familiar Deserialize trait, which serde makes embarrassingly easy. Your process brings the env… and that’s it. That’s kind of neat! No awkward and error-prone string parsing. No awkward intermediary types. Idiomatic error handling.

Envy is just an env deserializer for serde. Envy assumes the snake_cased field names of your struct map to SCREAMING_SNAKE_CASE env variable names provided by your process env. That’s it. Serde is my configuration framework on top of the OS-provided program env. I haven’t really needed anything more. Since this program uses only the program’s env for configuration, it can be easily configured on basically any platform. Since my configuration is represented as a simple struct, it’s pretty straightforward to reason about, and it encodes the expectations my application has about its env so that I can fail the application as early as possible when those expectations are not met.

Using the env for configuration will not likely cover all of your use cases, but it’s covered many of mine. As mentioned above, all solutions come with tradeoffs. This one gives you strongly typed configuration but limits what you can express. You can currently only express simple types, optionals, and simple vectors. I’ve found you can get very far with these by thinking creatively and giving yourself limits. Try giving yourself some constraints. You may realize that you’ve packed much more than you need in your camper without realizing it.

A keen observer may have noticed in the illustration above that I’ve included program args alongside the env. This post focused mainly on my typical use case: server side applications. I’ve had much greater success with program args as configuration with command line applications. In those cases amazing crates like clap and structopt fill the gap from wishfully typed to statically typed inputs for me. Another point in favor of Rust’s amazing library ecosystem.

👋 In a follow-up post I’ll walk through some work I’ve recently done on a new crate which leverages envy and its simple approaches, applied to a different context.
