The risks that did and didn’t pay off when building “throw away” prototypes
In the Financial Times Apps team we’ve spent a large part of the year in rapid feature experimentation, or something we call ‘discovery’. Our product, design and user research people had been investigating what might make our apps ‘stickier’ and encourage more FT subscribers to develop a daily habit. With a long list of potential feature ideas, we wanted to test them quickly and gather real user insights to add to the prior user research. As a result, we learnt a lot about building ‘throw away’ prototypes and deploying them in our existing iOS and Android apps.
We deviated from our usual working processes
To validate all these feature ideas we needed to make fast releases and run a lot of A/B tests. We decided that our implementations had to be quick, dirty, and easy to remove.
We knew that moving at such speed could cause issues. To mitigate this, as a team we agreed:
- To keep experimental code as separate as possible. If the code is not production quality we need to limit the potential blast radius of features and try to avoid them accidentally becoming untested permanent parts of the codebase.
- To document what we would improve if given more time, the conscious shortcuts we had taken, and the specifics we intended to add if a feature was productionised.
- To allow time to productionise the implementation if we decide to keep a feature.
- To have engineers working on discovery 100% focused on discovery.
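Keeping experimental code “as separate as possible” can be as simple as routing every experiment through a single flag check, so the only touch on production code is one conditional and removal later is a one-line change plus deleting the experiment’s module. A minimal sketch in JavaScript (the flag names and render API are illustrative, not the FT’s actual code):

```javascript
// Illustrative experiment flags — names are hypothetical.
const EXPERIMENTS = {
  newsfeed: true,
  continueReading: false,
};

function isExperimentOn(name) {
  return EXPERIMENTS[name] === true;
}

// The only touch on production code is a single check at the
// integration point; deleting the experiment means deleting the
// flag entry and the module it points to.
function renderHomeScreen(render) {
  if (isExperimentOn('continueReading')) {
    render('continue-reading-overlay');
  }
  render('home');
}
```

Centralising the flags in one object also gives reviewers a single place to see which experiments are live.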
We took risks to minimise engineering effort
- Not to add unit or end-to-end tests. We would rely heavily on our two quality assurance testers.
- To have one code reviewer looking for negative side effects (not optimal code). Normally we require two reviewers for all code changes.
- Not to fully groom tickets, and instead to hold a dedicated discovery stand-up every morning so that all members of the multidisciplinary team (engineering, design, quality assurance, product owner and business analysts) could agree on approaches on the fly. Generally we would be guided by what the engineer thought was quickest and easiest when in the weeds of the code.
So far we have released 13 features
The experiments include new pages, signposts directing users to particular content, and onboarding for new and existing features.
Newsfeed — a real time list of all news content published by the FT.
Version one of the Newsfeed was a list of articles with headlines and a summarising sentence, but initial user feedback suggested some confusion around the purpose of the page. To emphasise that it was displaying the very latest news in real time, we added more context with hour and date markers.
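Hour and date markers like these can be derived by walking the time-ordered list and emitting a marker whenever the hour bucket changes between consecutive articles. A sketch of one possible approach (not the actual FT implementation; the field names are assumed):

```javascript
// Interleave hour/date markers into a list of articles ordered by
// publication time. `publishedAt` is an assumed ISO timestamp field.
function withTimeMarkers(articles) {
  const out = [];
  let lastBucket = null;
  for (const article of articles) {
    const d = new Date(article.publishedAt);
    // Bucket by calendar date plus hour, e.g. "Tue Mar 05 2024 10:00".
    const bucket = `${d.toDateString()} ${d.getHours()}:00`;
    if (bucket !== lastBucket) {
      out.push({ type: 'marker', label: bucket });
      lastBucket = bucket;
    }
    out.push({ type: 'article', ...article });
  }
  return out;
}
```

The UI layer then only needs to render two item types, marker and article, rather than computing time context per row.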
The design was intended to be an infinitely scrolling list, but for speed of development we limited the page to 100 items via a single API call. As a brand new page, the Newsfeed was easily isolated from the rest of the app: it had separate data-fetching logic and a new API endpoint, and it reused existing UI components. This meant it was quick to build and simple to implement the redesign.
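The single capped call that replaced infinite scroll might look something like this (the endpoint path and response shape are hypothetical, not the FT’s real API):

```javascript
const NEWSFEED_LIMIT = 100;

// One request, no cursor or offset handling to build, test, or tear
// down later — the shortcut that made the prototype fast to ship.
async function fetchNewsfeed(fetchImpl = fetch) {
  const res = await fetchImpl(`/api/newsfeed?limit=${NEWSFEED_LIMIT}`);
  if (!res.ok) throw new Error(`Newsfeed request failed: ${res.status}`);
  const { articles } = await res.json();
  return articles.slice(0, NEWSFEED_LIMIT); // hard cap, no "load more"
}
```

Injecting `fetchImpl` keeps even throw-away code trivially swappable in a manual test harness.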
Continue Reading — an overlay displayed on app startup to remind users about the last article they started but did not finish reading.
Positioning Continue Reading at the bottom of the screen meant it overlapped with some existing pop-up messages. For speed we decided to add a CSS class name to the page body while Continue Reading was in the DOM. We used the CSS class to prevent any other pop-ups being shown at the same time. This was an effective shortcut until later experiments also needed to be rendered in the same area, with more complex conditional logic. Trying to be quick, we repeated a mixture of setTimeouts and CSS classes to avoid overlays clashing. But this became more and more brittle, and we struggled to cover edge cases.
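The gating trick can be sketched like this (the class name and pop-up API are illustrative; a plain `Set` stands in for `document.body.classList` so the sketch runs outside a browser — real code would call `classList.add`/`classList.remove`):

```javascript
// Minimal stand-in for document.body so this runs in Node.
const body = { classList: new Set() };

const OVERLAY_OPEN = 'continue-reading-open'; // hypothetical class name

function showContinueReading() {
  body.classList.add(OVERLAY_OPEN);
  // ...render the overlay into the page...
}

function dismissContinueReading() {
  // ...remove the overlay from the DOM...
  body.classList.delete(OVERLAY_OPEN);
}

// Every other pop-up must remember to check the marker class first.
// This implicit coupling is exactly what became brittle once several
// experiments competed for the same area of the screen.
function canShowPopUp() {
  return !body.classList.has(OVERLAY_OPEN);
}
```

Because the coordination lives in a shared mutable marker rather than an explicit overlay manager, each new overlay multiplies the conditional logic — the brittleness described above.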
Taking only a day and a half to build, Continue Reading was one of our fastest prototypes to deploy. However, some early users reported that they couldn’t make it go away. As a fix, we added more dismiss gestures (assuming the initial gesture was not intuitive enough) and a setting to turn the feature off altogether.
Some risks paid off and some didn’t
Over the months, we iterated on our approach. We started with a single engineer (me) and rotated engineers a little, then increased the number of people working on the features. Ultimately it transitioned from two-week sprints with one engineer building a fixed list of features to a mini subteam following kanban. Also, the more we built, the more we realised that some unit tests really would be necessary when interacting heavily with existing parts of the codebase.
What went well?
- We released a lot of things!
- We iterated on three of the experiments and re-released them.
- Deleting completed experiment code has been quick and easy (so far).
- We were often able to reuse experimental code and existing production code across discovery features. Using consistent design elements meant we moved quickly. For example, we built a feedback button that was used on all new pages, and we reused the existing carousel component.
- Some features were very quick to implement, most notably those that had minimal interaction with the existing codebase or UI and didn’t rely on an existing data source.
- I found the initial approach of working in a fixed two week sprint really rewarding (even if it required a very intense cadence). Realistically this is most sustainable in short bursts.
- We worked even more closely with colleagues in product, UX, QA, BA, and user research. Even though the Apps team is already an integrated cross-functional team, this was more collaborative still. As an engineer, discussing the feasibility of requirements in an ongoing way was an opportunity to be more involved in what we built, as well as how we built it. Changing requirements and ambiguity might be uncomfortable for some, but I found this, and the emphasis on the engineer proposing a quick solution, fun.
What went less well?
- We introduced bugs. Moving at such speed, bugs were to be expected, particularly when using existing parts of the codebase. For example, Continue Reading leant on an untested module that was written more than 7 years ago (before our git history starts). In the effort to move fast we trusted this interface, but it led to multiple issues that were tricky to pinpoint.
- We impacted the overall codebase health. Our discovery development continued for many months, so we have added a lot of untested, less scrutinised code. It’s taking a number of months to draw conclusions on all the features implemented, so while we’ve deleted some, many are still in situ. Naive approaches I took for a feature implemented months earlier have ended up hindering new discovery work. Having a rough idea of the timeframes for each experiment would help us engineer the feature in a more sustainable way.
- Your ‘craft’ as an engineer can feel compromised after months of quick and dirty work. Four weeks of rushing code I expected to be deleted soon was fine. Moving tickets quickly and giving product rapid features was satisfying. Eight months of that sub-optimal code in the codebase makes me feel code shame. Is that code my legacy in the repository? Is that code good enough for paying FT subscribers? Is that code going to trip me and my colleagues up when we come to touch that area of the codebase again?
- Working in this way for a prolonged period reduces technical learning. There was less access to the wider team’s collective expertise because we worked as a subteam. From a personal development point of view, a key way I learn is through giving and receiving code reviews, especially those in technologies and areas I’m not experienced in. Working on discovery for a long time meant that I did not have the usual access to in-depth code reviews from more knowledgeable colleagues and was not responsible for reviewing non discovery work.
On balance, doing less might be more productive
Today, analysis is ongoing but we’ve paused building new experiments. We’re still considering the future of some of our completed experiments; we’ve refined tickets to productionise some, and deleted a few more. But here is what I suggest we do next time:
- Limit the number of experiments in the codebase at any one time. We started to trip over previous experiments when implementing new ones, e.g. overlays designed to appear on the same section of the screen (but in different situations or in different A/B tests).
- Include really specific real user monitoring requirements for tracking each feature. As we were iterative with design and behaviour requirements, our tickets started with more ambiguity than our non-discovery work. But with tracking, any room for interpretation led to time spent debating approaches and a few mistakes. We would benefit from making sure engineers had an agreed, consistent implementation for tracking, and that the Product Owner knew what could be tracked easily without any re-architecting. This might sound easy, but we had differing opinions, largely due to different understandings of how the analysis could and should be done. Tiny mistakes or inconsistencies can have a big cost: if a test does not reach statistical significance, we’re back at square one.
- Have engineers limited to a few weeks on discovery. It’s easy to feel motivated for a quick and dirty sprint. But spending months working without tests, deploying code that you know is a long way from your best, and feeling like you can’t stop and improve things starts to feel like you’re having a negative impact on your codebase.
- Have each engineer see their discovery features through to completion. Our team retros revealed a concern that discovery features would be thrown over the wall for engineers outside of the subteam to maintain. Those on discovery kept building and building without pause, so those not on discovery became the de facto maintainers. Instead, whoever implements an experiment should wait for the results, then iterate on it, remove it, or productionise it.
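One way to give engineers the agreed, consistent tracking implementation suggested above is a single shared helper with a fixed event schema that fails loudly when a field is missing, so every experiment produces comparable data. A hypothetical sketch (the field names and categories are illustrative, not the FT’s actual tracking schema):

```javascript
// Hypothetical shared tracking helper: one fixed schema for every
// discovery experiment, so analysis doesn't special-case features.
function trackExperimentEvent(send, { experiment, variant, action }) {
  if (!experiment || !variant || !action) {
    // Fail loudly in development rather than silently ship events
    // that can't be analysed — inconsistent events risk the test
    // never reaching statistical significance.
    throw new Error('Tracking event missing a required field');
  }
  send({
    category: 'discovery-experiment',
    experiment, // e.g. 'newsfeed'
    variant,    // e.g. 'control' or 'test'
    action,     // e.g. 'view' or 'dismiss'
    timestamp: Date.now(),
  });
}
```

Passing the transport as `send` keeps the schema check independent of whichever analytics backend is in use.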
We’d love to hear about your experiences building prototypes and releasing feature experiments within your production apps. Did you adjust your ways of working? Did you make compromises?
For more about discovery at the FT, see this post to find out how the Product team have been thinking about discovery, and how it has changed. See this post for the Product focused learnings from our Apps team discovery work.