Feature Flags — A Deeper Look

Natarajan Venkataraman
5 min read · Oct 24, 2022


Overview

A feature flag, also known as a feature toggle, is a technique used in building SaaS applications that lets organizations deploy new features behind an ON/OFF switch and enable them selectively. As such, we can think of feature flags as an alternative to A/B testing or canary deployment, since the same objective can be achieved by selectively turning the feature(s) on or off.

The feature can be turned ON or OFF based on a percentage (i.e. similar to canary), or based on the user or some other context.
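The two kinds of rule mentioned above can be sketched roughly as follows. This is a minimal illustration, not how any particular product implements it: the function names `bucket` and `is_enabled`, the `allowed_users` parameter and the choice of SHA-256 hashing are all assumptions made for the example.

```python
import hashlib


def bucket(flag_key: str, context_key: str) -> int:
    """Map a (flag, context) pair to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{flag_key}:{context_key}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(flag_key: str, user_id: str, percentage: int = 0,
               allowed_users: frozenset = frozenset()) -> bool:
    """Return True if the flag is ON for this user (hypothetical rule set)."""
    # Context-based rule: explicitly targeted users always get the feature.
    if user_id in allowed_users:
        return True
    # Percentage-based rule: ON for roughly `percentage` percent of users.
    return bucket(flag_key, user_id) < percentage


if __name__ == "__main__":
    print(is_enabled("feature-x", "user-42", percentage=10))
    print(is_enabled("feature-x", "user-42", allowed_users=frozenset({"user-42"})))
```

Hashing the user ID, rather than rolling a random number on each check, is a design choice in this sketch: the same user gets the same answer every time the flag is evaluated.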

For feature flags to work, we need to embed the feature flag checks in code. Note that the feature flag value (ON or OFF) needs to be determined every time the check runs. It can be ON for one user and OFF for another. In this respect, feature flags differ from config knobs (or ENV variables), which are static in nature and apply universally rather than selectively. Also, any change to a config value could require a restart of the component.

The diagram below illustrates how feature flags work.

Figure: Feature Flags at a Glance

Obviously, in every component and file where the Feature-X changes are present, the flag check and if-else condition (with identical context) need to be present and must evaluate to the same value. A percentage roll that is evaluated independently in each component may not give the same answer everywhere. In such a case, a percentage-based feature flag implementation will only be of value if the feature can be enabled or disabled at a single component alone.

Incorporating Feature Flags

There are many solutions for incorporating and managing feature flags. Examples of such solutions are LaunchDarkly, Harness, Flagship and CloudBees.

A typical solution provides a controller, which is a repository of feature flags and the rules that activate each flag. The solution also provides an SDK (for various languages) with which the application can evaluate those rules at runtime, at each point where the feature flag needs to be checked.

The developer incorporates the SDK, invokes the specific feature-flag check and provides a context. The context could be a user ID, customer ID, etc. The SDK then processes the rule for the context provided and returns the ON or OFF decision for the feature concerned. The code thus includes a conditional statement that handles both the active and the inactive decision for the feature.
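In code, the flow just described might look something like the sketch below. `FlagClient` is a hypothetical stand-in for a vendor SDK client and does not correspond to any real product's API. The flag key `new-invoice-layout` and the `customer_id` context field are likewise made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FlagClient:
    """Hypothetical stand-in for a vendor SDK client; not a real API."""
    # Maps flag key -> set of customer IDs for which the flag is ON.
    rules: dict = field(default_factory=dict)

    def is_enabled(self, flag_key: str, context: dict) -> bool:
        # Unknown flags default to OFF, so the existing code path stays in effect.
        return context.get("customer_id") in self.rules.get(flag_key, set())


def render_invoice(client: FlagClient, context: dict) -> str:
    # The flag is evaluated on every call, using the caller's context.
    if client.is_enabled("new-invoice-layout", context):
        return "new layout"   # feature active
    return "old layout"       # feature inactive


if __name__ == "__main__":
    client = FlagClient(rules={"new-invoice-layout": {"acme"}})
    print(render_invoice(client, {"customer_id": "acme"}))
    print(render_invoice(client, {"customer_id": "globex"}))
```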

The developer MUST NOT read the value once and store it in a global variable, since that defeats the purpose: the flag would no longer be evaluated per user or per context. However, it may be OK to store it for the duration of a single API invocation, so that request handling that checks the feature in multiple places can be a bit more efficient.
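One way to keep the decision for a single API invocation without making it global is a small per-request cache, sketched below. The class `RequestFlags`, the function `handle_request` and the `evaluate` callback (standing in for the SDK's evaluation call) are hypothetical names used only for this example.

```python
from typing import Callable, Dict


class RequestFlags:
    """Caches flag decisions for the lifetime of one API invocation only."""

    def __init__(self, evaluate: Callable[[str, dict], bool], context: dict):
        self._evaluate = evaluate      # hypothetical SDK evaluation function
        self._context = context
        self._cache: Dict[str, bool] = {}

    def is_enabled(self, flag_key: str) -> bool:
        # Evaluate the rule at most once per request, then reuse the decision.
        if flag_key not in self._cache:
            self._cache[flag_key] = self._evaluate(flag_key, self._context)
        return self._cache[flag_key]


def handle_request(evaluate: Callable[[str, dict], bool], context: dict) -> list:
    flags = RequestFlags(evaluate, context)   # new object per request, never global
    steps = []
    if flags.is_enabled("feature-x"):
        steps.append("validate-x")
    if flags.is_enabled("feature-x"):         # served from the per-request cache
        steps.append("persist-x")
    return steps
```

Because a fresh `RequestFlags` object is created for every request, a change to the rule is picked up by the next request, and two users served by the same process can still get different decisions.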

For cloud native applications based on Kubernetes, some of the solutions provide a relay, packaged as a Docker container, which can run either inside or outside the Kubernetes cluster. The various applications that use the solution can reach out to the relay for faster access to the feature flag rules and state. The relay keeps itself in sync with the controller.
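Conceptually, the relay is a local copy of the controller's flag state that applications query instead of the controller itself. The sketch below illustrates only that idea. The `Relay` class, its `sync` and `get` methods and the `fetch_from_controller` callback are hypothetical, and a real relay would also handle networking, authentication and streaming updates.

```python
from typing import Callable, Dict


class Relay:
    """Keeps a local snapshot of flag state fetched from the controller."""

    def __init__(self, fetch_from_controller: Callable[[], Dict[str, bool]]):
        self._fetch = fetch_from_controller   # hypothetical controller call
        self._state: Dict[str, bool] = {}

    def sync(self) -> None:
        # Replace the local snapshot with the controller's current state.
        self._state = dict(self._fetch())

    def get(self, flag_key: str, default: bool = False) -> bool:
        # Applications read from the local snapshot; no controller round trip.
        return self._state.get(flag_key, default)


if __name__ == "__main__":
    controller_state = {"feature-x": False}
    relay = Relay(lambda: controller_state)
    relay.sync()
    print(relay.get("feature-x"))
    controller_state["feature-x"] = True   # rule changed at the controller
    relay.sync()                           # relay catches up on the next sync
    print(relay.get("feature-x"))
```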

The solution also provides a UI through which the SaaS operations team can enter the rules that specify when the feature should be enabled.

Benefits

The primary benefit is that the selection can be restrictive initially. As the feature proves its worth (or proves to be highly reliable), the SaaS operations team can widen the selection until it is ON for all users.

This enables the team to release new features faster, gain feedback and confidence in the feature, iron out any issues and then open the feature out widely.

Thus, this is similar to canary deployment or A/B testing, but with a single set of images and less complexity (in a sense) than canary and the like. Also, the company developing the application can avoid having a separate cluster for pre-production or alpha testing. You can turn the new feature on for just (say) internal users (i.e. eat your own dog food) or the company's alpha/beta customers, and later for others. This can lead to lower costs and operational overheads.
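The gradual widening described in this section (internal users first, then alpha/beta customers, then everyone) can be pictured as a sequence of rule sets that the operations team steps through. The sketch below is only an illustration: the `Stage` type, the `ROLLOUT` list, the group names and the 25 percent step are assumptions, not taken from any product.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    groups: frozenset   # user groups that always get the feature
    percentage: int     # share of all other users, 0..100


# Hypothetical rollout plan; each stage is at least as wide as the one before.
ROLLOUT = [
    Stage("internal", frozenset({"internal"}), 0),
    Stage("beta", frozenset({"internal", "beta"}), 0),
    Stage("partial", frozenset({"internal", "beta"}), 25),
    Stage("everyone", frozenset({"internal", "beta"}), 100),
]


def is_enabled(stage: Stage, user_id: str, group: str) -> bool:
    """Decide ON/OFF for a user under the currently active stage."""
    if group in stage.groups:
        return True
    # Stable bucket per user, so widening the percentage only ever adds users.
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < stage.percentage
```

Moving from one stage to the next is a rule change at the controller, so the same images keep running throughout the rollout.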

Also, rollback of a feature is as simple as turning OFF the feature flag via the UI. It takes effect immediately, with no redeployment.

This mechanism also avoids the need to work with Kubernetes services, an API gateway, a service mesh, etc. to get canary or A/B testing capabilities.

Drawbacks

Firstly, we cannot leave a whole bunch of feature flags in code forever. Once the feature is well baked, the feature flag for that feature MUST be removed from the code. This means code changes and retesting are going to be required. Some refactoring of the feature flag ‘if’ and ‘else’ code blocks may be needed to remove code duplication.
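As a small illustration of the cleanup step, the sketch below shows the same function while the feature is being vetted and after the flag has been removed. The pricing logic, the function names and the `new_discount_enabled` parameter are invented for the example. Amounts are in integer cents to keep the arithmetic exact.

```python
def price_with_flag(amount_cents: int, new_discount_enabled: bool) -> int:
    """While the feature is being vetted: both code paths live side by side."""
    if new_discount_enabled:
        return amount_cents * 90 // 100   # new behaviour, behind the flag
    return amount_cents * 95 // 100       # old behaviour


def price_after_cleanup(amount_cents: int) -> int:
    """After the feature is proven: the flag check and old branch are deleted."""
    return amount_cents * 90 // 100
```

Even in a case this small, the change touches the function signature and every caller that passed the flag value, which is why the removal needs its own round of testing.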

Further, other code changes (for other features) or bug fixes that touch code under a feature flag need to be evaluated against both the ON and the OFF path, adding to the development challenges.

When many features are in the ‘vetting’ stage in production, the code will be peppered with feature flags, making code readability an issue. There may even be scenarios where feature flags are nested.

Obviously, any DB changes needed for a new feature may require a DB schema update. If the feature turns out to be ‘bad’ and needs to be removed, the DB change may also need to be reverted, and feature flags per se may not help with that. However, DB schema changes and their reversal are a challenge for any feature deployment method, e.g. canary or A/B testing.

Likewise, the UI framework also needs to support feature flags, so that the UI for a new feature is shown only when the backend is also going to be activated. This can pose a challenge.

In real life, a large customer may ask for specific fixes, and having feature flags introduces a temptation to put a ‘customer specific’ customization under a feature flag and keep it ON forever, only for that customer. Obviously, this is more of a product development discipline issue, and the temptation to misuse feature flags for such scenarios must be resisted.

Note that each feature flag if-else condition evaluates a set of rules to determine whether the feature should be enabled. Thus, too many such conditions can have performance consequences that one should be mindful of. (Note: This may not be a major issue for cloud native applications based on Kubernetes, where the pods are horizontally scalable, stateless components.)

Any solution that comprises hardware or software at the customer end along with SaaS-based controls would not be able to leverage feature flags for features that require support at both ends.

Conclusion

In summary, effective use of feature flags requires architecture controls, carefully designed APIs and modular microservices, so that the costs associated with feature flags in code are minimal. In many cases, introducing feature flags into an existing application will require some refactoring, which is a good thing anyway. Introducing feature flags merely because of the hype, without this groundwork, will turn out to be detrimental.

The way to think of feature flags is as another option alongside deployment strategies such as canary deployment or A/B testing. Organizations need to weigh the accompanying overheads described above and the maintenance costs, as well as the maturity of their development and product organizations, before adopting this tool.
