Moirai: A Feature Flag Library for the JVM

Stephen Duncan Jr
Jan 14, 2019 · 5 min read


Software is complicated. A lot can go wrong, especially when you’re working with cloud-based, distributed systems. You might accidentally introduce a bug into your logic while refactoring. You might update your code and inadvertently create performance problems in your service. You might make a small change that performs well in your service but creates performance problems in another service. Or you might make an update that seems compatible but actually causes problems for clients of your service.

Here I will discuss some of the techniques we have at our disposal to limit the risk of negative impact to our consumers. Then I will introduce you to a library I created in order for my team (and others) to implement feature flagging: Moirai.

Minimizing Risk

We can write tests to run against our code in isolation, like unit tests. But these tests don’t cover a lot of the integration problems we see in practice.

We can deploy our service to a test environment and run tests against that system. But it’s expensive and difficult to replicate a realistic environment and — even when we try — it’s never a perfect match for production. Even if we do get close, there are still a lot of problems that we can’t predict with our tests.

To limit the scope of impact for a change, we deploy new versions to take a fraction of the traffic before taking 100% of the traffic. This is good for catching widespread problems that will show up in our monitoring tools, but it’s not great for problems that show up more subtly.

Feature Flagging

With feature flagging, we make our change and enable it by some condition (or flag), so that we can control how the new code is exposed. Also called “feature toggles,” feature flags can be used for many purposes and in many ways: Release Toggles, Experiment Toggles, Ops Toggles and Permission Toggles are all categories described in this Feature Toggles article. Of particular interest are Ops Toggles, which the article describes as follows:

These flags are used to control operational aspects of our system’s behavior. We might introduce an Ops Toggle when rolling out a new feature which has unclear performance implications so that system operators can disable or degrade that feature quickly in production if needed.

We use feature flags to mitigate the risk of deploying changes. By placing conditions on exposing changes in our code, instead of relying on deployment mechanisms, we gain more fine-grained control over who is affected and by which changes. For example, we can set a condition so that our feature is only used by internal alpha testers on our team, or one that opens the feature to a wider beta testing group. We can also roll out our feature to a percentage of our users, at a rate we feel comfortable with, in order to gauge the performance impact.

Moirai

When my team wanted to start using feature flags as part of our development and deployment process, I briefly looked for existing open-source solutions for the JVM (we primarily use Scala for our services as well as some Java). Most of what I found seemed heavy-weight with a lot of assumptions that didn’t fit our needs. For instance, I found solutions that are Java Servlet-based or database-backed, and some that rely on thread-locals that interact poorly with reactive code patterns where logic runs on many different threads. I couldn’t find anything that quite fit our needs.

As a result, I created my own library — one meant to be light-weight (usable with minimal dependencies for any JVM project) and flexible (functionality composable to meet a wide variety of usage patterns). I call it “Moirai” (the Fates from Greek mythology). Moirai has been used in dozens of services by multiple teams and is now part of Nike’s open source contributions.

Moirai consists of a few main features. The first is a ResourceReloader for periodically fetching updated configuration from some source. This allows us to adjust the settings for our feature flags within a minute, instead of requiring a whole new deployment. The use of the ResourceReloader is optional; you may prefer to stick with making changes only through deployments.
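
To make the periodic-reload idea concrete, here is a minimal sketch using only the JDK. It is not Moirai's ResourceReloader: the class name PeriodicReloader and its methods are hypothetical, and the loader is any function that fetches and parses your configuration. The point it illustrates is that readers always see the last successfully loaded value, even if a later fetch fails.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical illustration of periodic reloading; not Moirai's actual class.
final class PeriodicReloader<T> implements AutoCloseable {
    private final AtomicReference<T> current;
    private final Supplier<T> loader;
    private final ScheduledExecutorService scheduler;

    PeriodicReloader(Supplier<T> loader, long period, TimeUnit unit) {
        this.loader = loader;
        // Load once up front so getValue() never returns null.
        this.current = new AtomicReference<>(loader.get());
        // Daemon thread, so the reloader does not keep the JVM alive.
        this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "periodic-reloader");
            t.setDaemon(true);
            return t;
        });
        this.scheduler.scheduleWithFixedDelay(this::reload, period, period, unit);
    }

    // Fetches a new value; on failure the previous value is kept.
    void reload() {
        try {
            current.set(loader.get());
        } catch (RuntimeException e) {
            // Swallowing the exception also keeps the scheduled task alive;
            // a real implementation would log the failure here.
        }
    }

    // Returns the most recent successfully loaded value.
    T getValue() {
        return current.get();
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

Catching the exception inside reload matters for more than error handling: a task scheduled with scheduleWithFixedDelay stops running after its first uncaught exception, so one bad fetch would otherwise end reloading for good.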

Second, Moirai contains modules for configuration sources and formats. Currently, it has Amazon S3 as a configuration source, and it supports HOCON via Typesafe Config as a format. We deploy our config file through a Jenkins job that uploads the file to S3 after verifying that it can be parsed. Contributions for other sources or formats are welcome.

Finally, Moirai supports some common patterns for deciding if a feature flag should be enabled. Currently, it supports an explicit list of user identifiers that should have the feature enabled, as well as a proportion of users that should have the feature enabled, chosen using a modulo of the hash of the user identifier. These are expressed as predicates, which you can then combine to fit your requirements. We typically combine them with a logical "or" to make them additive (as shown in the Moirai README). It’s also easy to add custom logic. For example, you can toggle a feature on or off completely rather than per user, or decide based on some other aspect of your data (for instance, the particular entity being requested instead of the user making the request).
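
As a rough sketch of how the two built-in patterns described above can be written as composable predicates, here is a version built on java.util.function.Predicate. The class FlagPredicates and its method names are hypothetical and are not Moirai's API (see the README for the real predicate classes); the bucket count of 10,000 is also an arbitrary choice for the illustration.

```java
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helpers for illustration; not Moirai's actual predicate classes.
final class FlagPredicates {
    private FlagPredicates() {}

    // Enabled for an explicit list of user identifiers.
    static Predicate<String> userIdIn(Set<String> enabledUserIds) {
        return enabledUserIds::contains;
    }

    // Enabled for roughly the given proportion (0.0 to 1.0) of users,
    // chosen by a modulo of the hash of the user identifier.
    static Predicate<String> proportionOfUsers(double proportion) {
        final int buckets = 10_000;
        return userId -> {
            // floorMod keeps the bucket non-negative even for negative hash codes.
            int bucket = Math.floorMod(userId.hashCode(), buckets);
            return bucket < proportion * buckets;
        };
    }

    // Custom logic: switch the feature fully on or off, ignoring the user.
    static Predicate<String> always(boolean enabled) {
        return userId -> enabled;
    }
}
```

Combining them additively is then a single call, for example FlagPredicates.userIdIn(Set.of("8675309", "1234")).or(FlagPredicates.proportionOfUsers(0.01)). Because String.hashCode is specified by the Java platform, a given user identifier lands in the same bucket on every instance of a service, so a user who is in the first 1% stays enabled as the proportion grows.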

Putting it all together, a typical configuration for your project might look like this:

moirai {
  data.useNewService {
    enabledUserIds = [
      8675309
      1234
    ]
    enabledProportion = 0.01
  }
}

This lets you test in production with some specific users and then start roll-out to a percentage of users over time, so you can monitor impact on performance and stability while minimizing the risk of a major negative impact.

In your code, checking for the feature flag would look like:

public int getData(String userIdentity) {
    if (featureFlagChecker.isFeatureEnabled("data.useNewService", FeatureCheckInput.forUser(userIdentity))) {
        return dataFromNewService();
    } else {
        return dataFromOldService();
    }
}

Moirai is designed to have minimal dependencies. The core module only depends on SLF4J, and the other modules add the dependencies for the particular technology they use. It’s also designed to let you flexibly combine different sources, formats and decision patterns. See the README for all the details. You can even use the ResourceReloader to reload any kind of configuration or data that you need, not just feature flags.

Final Thoughts

We have found Moirai, and feature flags in general, to be a very useful addition to our toolbox for mitigating risk. We’ve used it to roll out new implementations of features for performance, to control the load from back-calculating data for our whole user base, and even to add whole new services into our data-processing flow. Multiple times, this has let us find problems and fix them with little or no impact on our user base.

Nike Engineering

Nike’s software engineers create the future of sport. They innovate retail experiences, connect athletes to the brand and create powerful moments of distinction through the Nike Digital ecosystem.

Written by Stephen Duncan Jr, Principal Software Engineer, Nike