Feature Flags are not a Silver Bullet

Context in which this applies to me

The company I work for runs iOS app development projects using Agile Scaled Scrum. As a result, a number of scrum teams work on the same app code base. Our release train runs to a regular cadence, and there may be several long-running features spanning several sprints. Read on with this context in mind.

What is a feature flag? OR How to identify that a thing is a feature flag?

A feature flag IS a single thing, related to a single feature, which can turn that feature ON. As you can see, this definition is not prescriptive regarding app versions or whether the flag should reside in the app or elsewhere.

It is NOT just a thing in the mobile app. With scrum development being end to end, it’s a thing across the whole development stack! Woah! Giving you the heebie-jeebies now, aren’t they!

It is NOT a feature TOGGLE, which can also turn a feature OFF. The reason is that you don’t want a user who has seen the feature one day to not see it the next, just because someone decided to turn it OFF. Turning a live feature off is usually the role of a kill switch, which implements correct messaging to the user in case it needs to be exercised.
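
The distinction between a one-way flag and a kill switch can be sketched in Swift roughly as follows. This is only an illustration: the type and function names (FeatureFlag, KillSwitch, FeatureState, state(of:killSwitch:)) are hypothetical and do not come from any library.

```swift
// Hypothetical names throughout; a sketch of the distinction, not a real API.

/// A feature flag in the sense used here: it can only go from OFF to ON.
struct FeatureFlag {
    let name: String
    private(set) var isOn = false

    init(name: String) { self.name = name }

    /// Turning ON is the only transition offered; there is deliberately no turnOff().
    mutating func turnOn() { isOn = true }
}

/// A kill switch is a separate concern: it suspends a live feature and
/// carries the message the user should see while it is suspended.
struct KillSwitch {
    let featureName: String
    var isEngaged = false
    var userMessage: String
}

enum FeatureState: Equatable {
    case hidden             // flag not yet ON: the user never sees the feature
    case available          // flag ON, kill switch not engaged
    case suspended(String)  // flag ON but killed: show the message to the user
}

func state(of flag: FeatureFlag, killSwitch: KillSwitch) -> FeatureState {
    guard flag.isOn else { return .hidden }
    return killSwitch.isEngaged ? .suspended(killSwitch.userMessage) : .available
}
```

Keeping the two as separate types means nobody can quietly reuse the flag as a toggle: switching a released feature off has to go through the path that carries a user-facing message.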

The problem it tries to solve

Some features require a long time to develop, but agile scrums need to continually test the app increments and be able to release shippable increments at a regular cadence. Flags allow agile scrums to run stories in long-running features, spanning several sprints, without affecting the user’s experience in the app being regularly released.

The problems that it introduces

1. False Perception (the biggest problem!) of power and ease of use that feature flags present to stakeholders:

  • Any number of long-running features can run in parallel for however long.
  • Flags soon start being misconstrued as toggles that can switch a feature both ON and OFF.
  • Flags become an instrument for stakeholder indecision: “I have multiple features going on that use flags only because I cannot decide which is my priority ONE for the next release!”
  • Stakeholders fall out of the habit of grooming story dependencies and sequencing features across scrum teams. This grooming is especially needed when multiple feature flags being implemented in multiple teams touch the same set of code. It is also important for finding the smallest set of stories that constitutes a logical shippable point.
  • Flags become a Market Release tool that can turn on any combination of features. So when multiple features are completed, stakeholders may decide to release only a few while keeping others dormant.

2. How many older versions of the app should be supported:

  • Think about the effect that switching on a feature flag will have on older versions of the app, which may contain code for a partial implementation of the same feature.
  • Also consider how many older versions the company should continue to support after the feature release.

3. Do you need to support the older version of the API alongside the new API version, OR is the new API backward compatible:

  • While the feature flag is not yet turned ON, you may need to keep using the older APIs to support the existing app functionality. Once the feature is ON, you may need to use a newer version of the same API. So you need to think about how the old API/mapping code will reside alongside the new feature’s API/mapping code.
  • Even when the API moves to a new version for the new feature, the older version may need to remain to support older app versions still in production. So you will also have to plan how to decommission the old API version appropriately, e.g. by forcing app users to upgrade to the newer app version.

4. Accessibility & Localisation (’nuf said!): Since accessibility is often implemented in code, consider how it will co-exist in areas of code fenced by a feature flag and those outside it. The same goes for localisation.

5. Code integration complexities

  • When multiple scrum teams check in their work, Git tries a best-effort integration of the diffs. Yet this is still a text comparison, not a semantic one, so a merge may introduce undesirable code paths. Before merging the feature branch, rebasing it on develop helps ensure that the feature is being built on the current state of the flags being introduced by other scrums.
  • After doing the above, running ALL tests on the feature branch itself ensures that the flag you’re using in your feature still behaves as expected alongside any other flags for features coded till then.
  • Sometimes in cases of conflict, the developers for the conflicting feature flags need to ‘pair review’ to determine the LOCs to choose.

6. Testing all combinations of feature flags: All hell breaks loose, as the number of acceptance criteria permutations to test grows exponentially. Each flag doubles it, so n independent flags give 2^n ON/OFF combinations.

7. Release schedule: Once the feature is completed as a whole, it’s technically shippable. At this point stakeholders should concentrate their effort on guaranteeing that it reaches customers in the next release. This ensures that the feature-flagged code does not need to be maintained by the team for too long. The time-to-release depends linearly on the time needed to complete all test/approval stages.

8. Continuous feature code cleanup: Once the feature is in customers’ hands, the feature flags should be removed from the app codebase to keep up code quality. Doing this requires another round of regression testing to ensure the removal did not break anything.

If you have automated regression tests, this is easy but when organisations rely on manual testing, this is an overload.

9. Being OK with non-production code shipped with the app: Stakeholders, including Security teams, need to accept the fact that until a feature is actually turned on, the released production code will contain dormant code for that feature. So the app could potentially be disassembled to discover what new features/API end points are going to be introduced.
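
Point 3 above (old and new API code living side by side) is the one that most directly shapes the code. A minimal sketch, assuming a made-up account API whose v2 renames one field: both versions sit behind one protocol and the flag is consulted in a single factory function. The endpoint paths, type names and field names are all hypothetical.

```swift
// Hypothetical endpoint paths, type names and fields; none come from a real API.

struct Account: Equatable {
    let id: String
    let displayName: String
}

/// Both API versions are hidden behind one protocol, so the rest of the app
/// does not care which one is in use.
protocol AccountService {
    var endpoint: String { get }
    func map(_ json: [String: String]) -> Account?
}

/// Existing behaviour: stays in place while the flag is OFF, and for as long
/// as older app versions in production still call v1.
struct AccountServiceV1: AccountService {
    let endpoint = "/v1/account"
    func map(_ json: [String: String]) -> Account? {
        guard let id = json["id"], let name = json["name"] else { return nil }
        return Account(id: id, displayName: name)
    }
}

/// New feature: v2 renamed a field, so its mapping code lives separately.
struct AccountServiceV2: AccountService {
    let endpoint = "/v2/account"
    func map(_ json: [String: String]) -> Account? {
        guard let id = json["id"], let name = json["display_name"] else { return nil }
        return Account(id: id, displayName: name)
    }
}

/// The flag is consulted in exactly one place.
func makeAccountService(newAccountFeatureOn: Bool) -> AccountService {
    if newAccountFeatureOn {
        return AccountServiceV2()
    } else {
        return AccountServiceV1()
    }
}
```

With this shape, the cleanup described in point 8 amounts to deleting AccountServiceV1 and the factory, rather than hunting for flag checks scattered through the mapping code.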

All of the above points can be streamlined and made easier when product owners prioritise features correctly, refrain from changing priorities frequently and work with stakeholders to smooth the execution path for scrum teams.

What features can be flagged

  1. Prime candidates are features that have a single entry point and can be architected as close to vertical slices/microservices across the development stack as possible.
  2. Next best candidates are those that can be modelled using HATEOAS APIs, i.e. link relations like links/rel/href/next/action tags. The app relies on the API response to drive when the feature is exercised, so the app becomes an engine of application state sent via the backend APIs.
  3. When the app has a concept of version, the feature release can be tied to a particular app version that can be injected via an external config API.
  4. Views in Xibs are visual and can be easily flagged. Views in code present a difficulty: the view creation and layout code has to be identified for flagging. This becomes harder with reusable views. Subclassing means having the flag regulate code at various levels in the hierarchy, which is a maintenance nightmare.
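
To make the HATEOAS idea in point 2 concrete, here is a minimal sketch in which the app holds no flag at all: a feature is available exactly when the backend includes its link relation in the response. The response shape, the "transfer" relation and the type names are assumptions made for illustration.

```swift
import Foundation

// Hypothetical response shape and relation names, for illustration only.
struct Link: Decodable {
    let rel: String
    let href: String
}

struct AccountResponse: Decodable {
    let balance: String
    let links: [Link]

    /// The feature is "on" for this user exactly when the backend sends its link.
    func link(for rel: String) -> Link? {
        return links.first(where: { $0.rel == rel })
    }
}

/// Decodes a JSON string into the response model above.
func decodeAccount(_ json: String) throws -> AccountResponse {
    return try JSONDecoder().decode(AccountResponse.self, from: Data(json.utf8))
}
```

A view controller would then show its "Transfer" button only when link(for: "transfer") returns a value, and use the returned href as the destination, so turning the feature on becomes purely a backend decision.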

What cannot be flagged

Any feature that has its tentacles across the app cannot be feature flagged.

Strategies for flags that I’ve seen

  1. Configuration over code: A config API/file dictates which features are turned on. The thinking is to treat the app as a white-label app where combinations of features can be turned on, e.g. stack views can show/hide views and features, Storyboards/storyboard references can turn on whole workflows, and replacement views in Xibs (awakeAfterUsingCoder:) can swap in whole different views.
  2. The API response contains state that exercises the feature, effectively turning it on.
  3. Features are tied to a version of the app defined in an external config API/file, e.g. turn on when version > XYZ, so that the feature is turned ON when the version of the app matches that in the config. When doing this, due consideration needs to be given to cases where multiple features are to be released in different app versions.
  4. Compile time flags: Adopting these usually means an explosion in the types of builds created for testing different combinations of the features. The only advantage is that you never ship code for features you do not plan to release in the next app version. Still, it is too much trouble.
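
Strategy 3 (version-tied flags) can be sketched as below. This assumes a hypothetical config that maps each feature name to the first app version in which it is ON, and it treats "at or above that version" as ON; the config shape, the function names and that comparison rule are my assumptions, not a prescribed format.

```swift
// Hypothetical config shape: feature name -> first app version in which it is ON.
struct FeatureConfig: Decodable {
    let minimumVersions: [String: String]
}

/// Splits "2.10.1" into [2, 10, 1]; non-numeric parts count as 0.
func components(of version: String) -> [Int] {
    return version.split(separator: ".").map { Int($0) ?? 0 }
}

/// True when `version` is at least `minimum`, comparing component by component
/// so that 2.10 is treated as newer than 2.9, and 2.0 equals 2.0.0.
func isVersion(_ version: String, atLeast minimum: String) -> Bool {
    let lhs = components(of: version)
    let rhs = components(of: minimum)
    for i in 0..<max(lhs.count, rhs.count) {
        let l = i < lhs.count ? lhs[i] : 0
        let r = i < rhs.count ? rhs[i] : 0
        if l != r { return l > r }
    }
    return true
}

/// A feature missing from the config is treated as OFF.
func isFeatureOn(_ name: String, appVersion: String, config: FeatureConfig) -> Bool {
    guard let minimum = config.minimumVersions[name] else { return false }
    return isVersion(appVersion, atLeast: minimum)
}
```

Comparing numeric components rather than raw strings matters here: a plain string comparison would rank "2.9" above "2.10" and switch the feature on in the wrong release.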

Engineering Practices that support it

  1. Git for source control: feature-rich, with mature merge/conflict resolution tooling. Lends itself to GUI tools like SourceTree/Tower and visual merge tools like Kaleidoscope/FileMerge.
  2. Git flow: Process for feature development. Handles feature branches, merges to develop, branch deletion and release branches automatically.
  3. Rebase the feature branch onto develop to ensure that the feature is developed on the latest checked-in code, so that the project compiles and conflicts are resolved in feature branches before the feature hits the CI.
  4. Automation tests that run at each check-in to CI and cover all combinations of feature flags.
  5. Automation BDD tests fed with SBE (specification by example) data, e.g. JBehave/Calabash.
  6. CI setup, like Bamboo, that automates build creation and runs tests at each check-in. It can be set up with rules that force code review before feature branches are integrated into develop. It can also implement traceability, e.g. with the story in Jira.
  7. Enterprise build to distribute for Alpha testing, e.g. Hockey.
  8. TestFlight for faster Beta testing.
  9. Developer discipline to maintain traceability of code to the story identifier, e.g. Jira/Git integration.

Scrum Stories

  1. Should visually show dependencies between stories of each feature, so that stakeholders can iron these out across scrum teams.
  2. Acceptance criteria should be created for when the feature flag is off (regression) and on (progression).

Plan for Feature Flag Testing

  1. Create the right environments to develop and test. This should be a small number so that it’s easy to set up, deploy and test the stack just in time.
  2. Ensure tests cover all ON/OFF combinations of the features.
  3. Have data/end points for when each flag is on and off.
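
To see how quickly the test matrix grows, here is a small sketch that enumerates every ON/OFF combination for a list of flag names, assuming each flag is a simple boolean. The function name is made up for illustration; n flags produce 2^n combinations.

```swift
/// Every ON/OFF combination of the given flag names: 2^n dictionaries for n flags.
func allCombinations(of flags: [String]) -> [[String: Bool]] {
    var result: [[String: Bool]] = [[:]]
    for flag in flags {
        // Each existing combination splits in two: one with the flag OFF, one with it ON.
        result = result.flatMap { combo -> [[String: Bool]] in
            var off = combo
            off[flag] = false
            var on = combo
            on[flag] = true
            return [off, on]
        }
    }
    return result
}
```

A test suite could loop over allCombinations(of:) and run the acceptance checks once per combination. Three flags already mean 8 runs and ten flags mean 1024, which is the practical argument for keeping the number of live flags small.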

Release Plan

  1. Release feature blocks to production as soon as a feature is done.
  2. As different strategies can be applied for different feature flags, have a clear plan on how the feature will be turned on and in which app version and share these with stakeholders/testers/production support teams.

How to convince stakeholders to take decisions to smoothen the execution of flagged feature

Now that you have a good understanding of the need for feature flags, the problems in managing and developing with them, and the maintenance nightmares they introduce into your beautiful app, it helps if extraneous conditions affecting the development and release process are smoothed out. Some things that can help:

  1. Get Stakeholders to appreciate the increase in delivery problems with addition of each new feature flag and keep the number at a minimum.
  2. Get Stakeholders to plan the priorities of features so that scrum teams can really work as they are supposed to by swarming to complete top priority features first.
  3. Try to keep the app pristine by endeavouring to keep it as a function of some external input/API response.
  4. Get the scrum teams to identify inter/intra feature dependencies so that the effects of the flag on related stories can be visually depicted.
  5. Keep a small number of scrum teams that cross-pollinate knowledge of the app. Value their opinion when deciding when to use a flag. Standing up a new scrum team for each new feature is an overhead on existing teams/processes.
  6. Designers and dev teams should work together to create the app/stack almost like a vertical slice which can help institute flags with minimum effort.


In conclusion, feature flags are one option for managing the release of long-running features. I do hope it’s clear from the above that they have to be applied smartly. Even then, they will take a significant toll on testing and future maintenance. Of all the available options, in my opinion, keeping the app responsive to config/application state works best, because this keeps your app pristine for the future!