Gousto Software Engineering team: Best of 2021
As we approach our 10-year anniversary in 2022, we reflect on another busy year at Gousto in which we’ve doubled our team size to nearly 2000 and increased our number of monthly meals delivered to 8 million. In tech, we now have 18 squads across 6 tribes, which means we’ve released loads of features, functionality and experiments over the last year. Here’s a round-up of our favourites:
> Gousto iOS Widget
The north star objective for the Activate tribe is to ‘help more UK households to establish a habit around a better way of eating dinner’. The development of a widget to live on the home-screen of iOS devices aligned neatly with our north star. The widget presents customers with their latest order status — serving as a helpful reminder of the number of days left to order, and enabling quick access to the Gousto app.
An important piece of context is that the widget began as a project on which a group of colleagues across Tribes and Squads collaborated during their Tech 10% time. The feature was delivered by Rockets after a demo of the MVP revealed the project team’s hard work to be production-worthy.
> Early-life subscriber frequency incentivisation
It perhaps isn’t surprising that the customer behaviour which correlates most strongly with long-term retention is the frequency with which the customer places orders within their first three months from sign-up. We therefore ran an experiment within Rockets to provide customers with a target number of orders to be placed within their second month — based on the number of orders within their first month — in return for a discount applicable to orders within their third.
As part of the design of the UI we looked to leverage gamification to present the customer with their target, and to generate a feeling of excitement by visualising progress throughout their second month. Overall the experience was extremely well received by the treatment group of customers, with the metrics attributed to the experiment surpassing all expectations. Most importantly we’ve been able to build our confidence in an experience which influences a behaviour correlating strongly with retention.
> Tertiary menu
One of Gousto’s main goals, both this year and next, is to improve OTIF (On Time and In Full), which means ensuring that customers get their recipe boxes when they expect them and in a good state. In the Discover tribe, we contributed by improving forecasting: we now offer users the opportunity to order their recipes 3 weeks in advance rather than just 2. Knowing 3 weeks in advance what our customers have ordered gives us more time to ensure we have the necessary ingredients available in the right quantities and our picking lines set up in the right way. Delivering this project by Q4 involved collaboration from squads across all tribes. We learnt a lot, and will continue learning, about how to effectively run initiatives beyond any tribe or squad boundaries. And users loved it!
One of our big initiatives this year was to fully roll out the ability to add side dishes to our meals. We’ve gone through several iterations of this initiative, and de-scoped the original idea of allowing the user to add a side when choosing the recipe. Instead, after a lot of workshopping, we delivered two iterations of the project: a list of sides that the customer can consider before checkout and a list of sides which match up with their chosen recipes. We are learning a lot about how and when users want to choose their sides.
> Order API
Our current order system is not allowing us to deliver the diversity of choice we want to give to our customers. We need to build something better that can scale with the number of users and can change to give them way more options.
All Radishes engineers have done a tremendous amount of work to deliver an API that can grow with the business. But this is only the first step: we will continue working on the next steps to deliver a much better system, fit for the times and the goals of Gousto.
The Turnips are responsible for the personalisation of our menu, which currently translates to ensuring that our customer recommendations are the best they can be.
> Early box recommendations
After building and releasing a brand new recommendations engine in the first half of 2021, Turnips have extended this algorithm by using state-of-the-art technology to give our customers even more personalised menus for their first five boxes. Previously our recommendations relied on a customer having a backlog of orders that we could use to generate said recommendations, which was great for established customers but not so much for those new to the Gousto product.
Turnips now serve recommendations to all customers, regardless of longevity, by passing previous order data to our engine as soon as they place an order, effectively generating recommendations in real time.
> Recipe Battles
In order to provide any meaningful recommendations for early box customers, we need to find out what they like, which is fairly difficult when it’s their first time ordering a box. We therefore devised a Recipe Battle interaction (similar to a This or That style feature on Instagram) which prompts the customer to select their preferred recipes in a series of battles. The idea is that we take these preferences and feed them to our algorithm to support the early box recommendations mentioned above. Our workshopping process involved various designs and concepts before we eventually settled on This or That. It’s the first time we’d ever tried ‘gamification’ in the menu space, so we really enjoyed being able to experiment with animations. After speaking to customers, however, it turned out that our initial iteration of 15 recipe combinations was a bit too much, so we’ve since reduced the process to just 5 pairs.
> Numerical Measure of Customer Satisfaction
Creating menus that wow a large proportion of our customers and leave them salivating before their box arrives requires a deep understanding of customer preferences, especially as Gousto menus become larger each year.
Therefore, in early 2021 our Data Science team built and validated an algorithm that gives a numerical measure of customer satisfaction with menu variety. Its name is Percy. After a successful validation it became our number one asset in the treasure box of menu optimisations, and it is now used for creating data-driven, relevant menus for our customers.
Outside of menu optimisations, the new score is also valuable for analysts and product teams across Gousto. We are always thrilled to see Percy “in the wild” used by teams to segment customer analysis or target users for qualitative research purposes.
> Brand New Menu Planning Tool
Before 2021, whilst we already had our menu creation algorithm up and running, our Food team was managing menus post-creation in spreadsheets. This meant that when a change to a planned menu had to be made (let’s say because we couldn’t source enough courgettes) the Food team had to manually find replacements for courgette recipes from the available recipe library, while trying to keep various menu KPIs on target.
With the complexity of our menu optimisations increasing and menus becoming larger, spreadsheets weren’t fit for purpose anymore. So we formed a new cross-functional squad with the addition of 5 software engineers and a product designer, and built a smart Menu Planning Tool that enables the Food team to maintain key menu metrics throughout the turbulent menu change process.
In the Peas we work on optimising Gousto’s ‘Network’, which means looking at both expected and actual customer orders and deciding which of our multiple sites should fulfil each order to get the best outcome for the customer, Gousto, and the planet.
> Automated Order Allocation
In order to build an optimal network it’s really important to build confidence in the demands placed on each site on any given day. In particular, we want to know how many orders there will be and what ingredients will be required to fulfil them.
When we opened our second factory in 2020, the decision on which site an order would be fulfilled from was taken quite late (3, 2 or sometimes just 1 day before). This meant we couldn’t optimise the network and plan as easily as we’d have hoped. Shortly after launching we introduced a concept called ‘Advanced Routing’, where we’d make a rudimentary decision on where to send an order much further in advance, and that helped us increase our certainty in the future.
In 2021 we wanted to take Advanced Routing a step further, so we built Automated Order Allocation, or AOA. AOA followed three principles:
- Modularisation: AOA consists of multiple ‘modules’, each of which looks at a specific metric. For example, one module gives a high score to the site which would have the best courier option, while another gives a high score to the site which has the most space. Once each module has produced its score, a weighting is used to combine them into a final outcome. This approach gives us much more control over the network. It also helps us plan our discovery and delivery work better, enabling us to come up with a roadmap of future modules.
- Testability: As you can imagine, lots of different variables go into making a routing decision. We didn’t have realistic copies of these variables in our test environments. For AOA we’ve fixed that, adopting a bulkhead pattern with extensive tooling that allows us to generate scenarios and run AOA on them. This has massively reduced the risk of development.
- Comprehensive data exhaust: Making a decision doesn’t just need a lot of data, it also produces a lot. For example, you want to know how many orders were allocated to other factories when the decision was made, what recipes the customer had chosen, and what the history of decisions has been. For AOA we worked with our Data Engineering team to build a brand new pipeline using Databricks that lets our data science and analytics teams access all the variables used in previous executions of AOA in a cost-effective and straightforward way.
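To make the modular scoring idea concrete, here is a minimal sketch of how independent modules and a weighting could combine into one allocation decision. It is an illustration only, not the real AOA code: the `Module` type, the `allocate` function, the site names, the scores and the weights are all made up. The returned breakdown hints at the "data exhaust" principle, since every per-module score behind a decision is kept alongside the outcome.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Everything below is hypothetical: module names, scores and weights are
# invented for illustration and are not taken from the real AOA system.
ScoreFn = Callable[[dict, str], float]  # (order, site) -> score in [0, 1]


@dataclass(frozen=True)
class Module:
    name: str
    weight: float
    score: ScoreFn


def allocate(
    order: dict, sites: List[str], modules: List[Module]
) -> Tuple[str, Dict[str, dict]]:
    """Pick the site with the highest weighted score and keep the full breakdown."""
    breakdown: Dict[str, dict] = {}
    for site in sites:
        # Each module scores the site independently of the others.
        per_module = {m.name: m.score(order, site) for m in modules}
        # The weighting combines the module scores into one number per site.
        total = sum(m.weight * per_module[m.name] for m in modules)
        breakdown[site] = {"total": total, "modules": per_module}
    best_site = max(sites, key=lambda s: breakdown[s]["total"])
    return best_site, breakdown


# A fixed, made-up scenario to run the decision against.
COURIER_QUALITY = {"site_a": 0.9, "site_b": 0.5}
FREE_SPACE = {"site_a": 0.2, "site_b": 0.6}

modules = [
    Module("courier", 0.6, lambda order, site: COURIER_QUALITY[site]),
    Module("space", 0.4, lambda order, site: FREE_SPACE[site]),
]

site, breakdown = allocate({"order_id": 1}, ["site_a", "site_b"], modules)
print(site, breakdown[site])
```

Because each module is just a scoring function, adding a new one means appending to the list and choosing a weight, and changing the weights alone can change which site wins.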
For 2022 we have lots of ideas for new modules and new directions to take AOA, and lots of data to start digging into!
The Care Tribe within Gousto is dedicated to providing our customers with industry-defining customer care.
> Big winners of the European Contact Centre and Customer Service Awards
One of the highlights of 2021 was being the big winners of the European Contact Centre and Customer Service Awards (ECCCSAs), ‘THE’ award for all things customer service. We won “Small Contact Centre of the Year” for the 3rd year in a row and “Most Effective Digital Customer Experience”, and we were runners-up for “Best Use of Business Intelligence”.
It was a really proud moment to be recognised by industry peers as providing an industry-leading customer care experience. We’ve worked really hard to minimise the number of issues our customers experience and to empower customers to easily self-serve when they do experience issues.
> Improving and making internal tooling more secure
In 2021 engineers in the Care Tribe were tasked with stabilising our internal tooling and eliminating potential security vulnerabilities.
Our internal systems are used across the business by both engineers and customer care agents, but they haven’t received as much attention as others. In 2021 this changed.
Initially, we worked on the first phase: stabilising the technologies we use internally. We upgraded Ubuntu, enabled some of our core tools to build as an AMI (Amazon Machine Image) and upgraded a handful more dependencies, including Gulp. This allowed our engineers to be more confident when releasing changes.
The second half of this work was centred around making our internal tools more secure. Every 6 months we run pen-tests for our applications in order to discover any security issues before they become a problem and to mitigate harm to the business.
In Q4, based on recommendations from the Pen Test report, we were able to fix the two vulnerabilities outlined. This work resulted in Gousto’s services being more secure.
In the Pumpkins our mission is to improve how we plan, to ensure customers get exactly what they ordered across our growing choice of recipes. Over 2021 we have increased the number of recipes we offer, which makes this task even more complicated. Here’s how we are using technology to help improve our planning so that it scales for the future:
> Projecting our Stock Position
Over this year, the Pumpkins have designed and started to build a system that can report what stock we have at each of our sites and how that might change in the future. We are able to combine data such as the forecast of what our customers want, the stock we currently hold and what is incoming over the next few days to work out how much stock we will have later on. We’ve built a scalable system which revolves around being event-driven and only updating when something meaningful has happened — taking into account only what has changed. We know that accidentally dropping some tomatoes won’t affect how many cucumbers we have! Another part we are proud of is building this with automated acceptance tests running directly in the cloud to simulate a production-like environment as closely as possible. This not only helps to uncover bugs and provide robustness but also gives us the confidence to make changes at speed.
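As a rough illustration of the idea (not the Pumpkins’ actual system), the sketch below projects stock per ingredient as on-hand stock plus incoming deliveries minus forecast demand, and recalculates only the ingredient named in each event. The `StockProjection` class, the event shapes and all quantities are assumptions invented for this example.

```python
from collections import defaultdict


class StockProjection:
    """Hypothetical event-driven stock projection for a single site."""

    def __init__(self, horizon_days: int) -> None:
        self.horizon = horizon_days
        self.on_hand = defaultdict(int)
        self.incoming = defaultdict(lambda: [0] * horizon_days)
        self.demand = defaultdict(lambda: [0] * horizon_days)
        self.projected = {}  # ingredient -> projected stock at the end of each day

    def handle(self, event: dict) -> None:
        """Apply one event, then re-project only the ingredient it touches."""
        ingredient = event["ingredient"]
        kind = event["type"]
        if kind == "stock_adjusted":
            self.on_hand[ingredient] += event["quantity"]
        elif kind == "delivery_scheduled":
            self.incoming[ingredient][event["day"]] += event["quantity"]
        elif kind == "forecast_updated":
            self.demand[ingredient][event["day"]] = event["quantity"]
        else:
            raise ValueError(f"unknown event type: {kind}")
        self._reproject(ingredient)

    def _reproject(self, ingredient: str) -> None:
        level = self.on_hand[ingredient]
        levels = []
        for day in range(self.horizon):
            level += self.incoming[ingredient][day] - self.demand[ingredient][day]
            levels.append(level)
        self.projected[ingredient] = levels


projection = StockProjection(horizon_days=3)
for event in [
    {"type": "stock_adjusted", "ingredient": "tomato", "quantity": 100},
    {"type": "forecast_updated", "ingredient": "tomato", "day": 0, "quantity": 30},
    {"type": "delivery_scheduled", "ingredient": "tomato", "day": 1, "quantity": 50},
    {"type": "forecast_updated", "ingredient": "tomato", "day": 2, "quantity": 40},
    {"type": "stock_adjusted", "ingredient": "cucumber", "quantity": 20},
]:
    projection.handle(event)

print(projection.projected["tomato"])    # [70, 120, 80]
print(projection.projected["cucumber"])  # [20, 20, 20]
```

Dropping ten tomatoes would arrive as a `stock_adjusted` event with a quantity of -10, and only the tomato projection would be recalculated.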
> Planning our Transfers of Ingredients
We also spent time improving the component that tells us where and when to move our ingredients between factories, allowing our Gousto boxes to be filled up correctly and without delay. With the help of Dom (one of our Data Scientists), we can now do this over many more sites and can even optimise to minimise the number of trips.
iOS Foundations are responsible for all the things that the feature teams aren’t, from low-level architecture and features, to managing our CI stack, to building tools for the rest of the iOS guild and wider company. We have a lot going on, but here are some of our 2021 highlights!
> Feature Apps
A lot of 2020 was spent breaking down and modularising the iOS codebase to allow us to scale the team effectively, but 2021 allowed us to finally take advantage of that work!
This year we have added the ability for squads to easily build out feature apps by supplying them with a SwiftUI based toolkit which includes sign in functionality as well as feature flag and analytics debugging tools. These feature apps allow squads to concentrate on their individual features without worrying about building the entire app which in turn speeds up development and testing as well as allowing the wider squad to get insights into how new features are progressing without having to download the entire app.
> Build Cache/Conditional Testing
Another advantage of modularising our codebase throughout 2020 was that this year we could effectively define and build out our module dependency tree for the iOS app.
In an effort to speed up our CI we took advantage of this new understanding by running jobs to cache all of our modules. Because the modules are cached, we only need to invalidate the parts of the dependency tree that are affected by a change in a pull request. This means our CI only needs to build what’s changed, or what’s affected by a change further up the dependency tree. The result is much faster CI builds: not only do we no longer need to build those cached modules, but we don’t need to test them either!
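The invalidation step can be sketched as a graph walk: start from the modules touched by a pull request and follow the dependency tree to everything that depends on them. This is a minimal sketch written in Python rather than in our CI tooling, and the module names and the `affected_modules` helper are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set


def affected_modules(
    dependencies: Dict[str, List[str]], changed: Iterable[str]
) -> Set[str]:
    """Return the changed modules plus every module that depends on them."""
    # Invert "module -> what it depends on" into "module -> who depends on it".
    dependents = defaultdict(set)
    for module, deps in dependencies.items():
        for dep in deps:
            dependents[dep].add(module)

    affected = set(changed)
    queue = deque(affected)
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical module graph: each module lists the modules it depends on.
DEPENDENCIES = {
    "App": ["Menu", "Checkout", "Analytics"],
    "Menu": ["Networking", "DesignSystem"],
    "Checkout": ["Networking"],
    "Networking": [],
    "DesignSystem": [],
    "Analytics": [],
}

to_rebuild = affected_modules(DEPENDENCIES, {"DesignSystem"})
from_cache = set(DEPENDENCIES) - to_rebuild
print(sorted(to_rebuild))  # ['App', 'DesignSystem', 'Menu']
print(sorted(from_cache))  # ['Analytics', 'Checkout', 'Networking']
```

Everything in `from_cache` can be restored from the build cache and skipped by the test jobs, which is where the time saving comes from.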
> Code Health Metrics
In an effort to give squads and management more insight into the health of the iOS codebase we introduced the concept of code health metrics. Code health metrics are a bunch of data points gathered about the state of the codebase which are generated for every release of the app (every 2 weeks). These metrics could be to do with unit test and UI test coverage, documentation coverage, PR sizes, file sizes, use of deprecated code and much more.
These data points are gathered together and displayed in a web dashboard to see how these metrics change over time. This not only gives feature squads an indication of areas they need to work on, but also gives management teams high level insights.
The Gousto Platform Engineering team’s mission is to improve developer productivity and set the foundations to allow Gousto to scale as we add more engineers.
Over 2021 we have achieved a massive amount! Here are some of our highlights.
> Reducing team distractions
At the start of 2021, the team was spending about 20 hours a week supporting our Product engineers. This ranged from debugging failed builds and sharing platform and AWS knowledge to being asked to support requests that the team is not responsible for.
With our hard work over 2021 we have managed to get support down to less than 2 hours per week, freeing up a full-time platform engineer from managing our support channels. We’ve made this improvement whilst increasing the number of product squads and engineers that we support across Gousto.
How did we do it? Firstly, we measured which areas were causing distractions. Then, focussing on the areas that took up the most support time, we made a series of targeted improvements. We removed legacy software to improve stability of our builds, introduced a production incident process for our internal tooling and enabled “budgie” deployments of our EC2, ECS and Lambda services for early detection of platform issues. We also nurtured a close community of product engineers called Baby Potatoes, with whom we share additional knowledge and enhanced permissions. They provide a rapid response to their squads as well as giving us valuable feedback and helping us serve their squad’s needs better.
> Simplifying Platform
Throughout 2021 we have focussed on simplifying the Platform tooling to enable Gousto to scale, enabling more engineers to contribute improvements and encouraging more inner sourcing. Complexity that we have removed has included unused features, duplicate ways to deploy Lambda functions and a legacy custom SQS2Lambda solution, which we have migrated to use the AWS native SQS-Lambda integration.
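For readers unfamiliar with the native integration, a Lambda function attached to an SQS queue receives batches of messages under a `Records` key and needs no custom polling code. The handler below is a minimal sketch, not one of our functions: `process` and the `order_id` field are hypothetical, and the partial-failure response shape assumes `ReportBatchItemFailures` is enabled on the event source mapping.

```python
import json


def process(payload: dict) -> None:
    """Hypothetical business logic: reject messages without an order_id."""
    if "order_id" not in payload:
        raise ValueError("missing order_id")


def handler(event: dict, context: object) -> dict:
    """Process a batch of SQS messages delivered by the native integration."""
    failures = []
    for record in event.get("Records", []):
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Report only this message as failed so that the rest of the
            # batch is not retried with it.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


sample_event = {
    "Records": [
        {"messageId": "1", "body": json.dumps({"order_id": 42})},
        {"messageId": "2", "body": json.dumps({"unexpected": True})},
        {"messageId": "3", "body": "not json"},
    ]
}
print(handler(sample_event, None))
```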
At the same time we have upgraded the open-source tooling that underpins our platform. This includes version upgrades for Ubuntu, MySQL and Ansible and the removal of Python 2 and legacy Node.js and Lambda functions. These changes have had a massive impact on the reliability of the tooling used by our product engineers, thus improving the developer experience, reducing the number of things the Platform team has to manage, support and maintain, and enabling us to scale up to 2024 and beyond.
> Foundations for the Future
As we approach 2022, the team has been focussing on strengthening our foundations for continuing to scale up. Two areas which were becoming increasingly challenging as we have grown were AWS identity and access management and the overuse of resources in our non-production AWS account.
To improve our AWS user management, the Platform team introduced an AWS Single Sign On solution with Okta. This is a centralised solution for managing access to AWS which is more scalable and secure than managing IAM manually, as well as being easier to understand and make changes to. It has been successfully rolled out to all engineers who access our multiple AWS accounts.
The growth of product teams has meant that our non-production AWS account is bursting at the seams, so we have implemented changes to our tooling which allow us to quickly and reliably provide dedicated AWS accounts to individual squads. This alleviates a significant bottleneck in AWS and is a major step towards our 2024 goals. It also dovetails nicely with our Okta work as the new accounts will be accessed using SSO from the outset.
“Provide capabilities to measure, improve and hold Gousto accountable for the stability and reliability of our tech, to enable a magical customer experience”
As a new and small team, we couldn’t just take on everything in that space and go after it all at once, so we chose to focus on 4 things:
> Promote observability
It was surprising to us how often people were relying on log files to detect and diagnose problems. We checked out some vendor solutions, looked at what we could improve within the existing environment, and formed an Observability Guild to help share and extend the work of all observability enthusiasts at Gousto. It’s an ongoing journey, but several squads now have actionable information about their services which they didn’t have before, and we’re looking forward to making life better for everyone.
> Care about our production environment and the people who look after it
Our 24x7 on-call coverage is provided by a small team of volunteers, who were often having to swap their shifts due to vacations and home commitments. We also noticed that some incidents weren’t documented well enough to be handled by the on-call team. So we wrote alerts to detect incomplete playbooks, helped teams improve existing playbooks and escalation policies, and scheduled on-call shifts to automatically avoid holidays and other conflicting calendar events. Result: much happier on-call engineers, more folks volunteering.
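One simple way to picture conflict-aware scheduling is a greedy rota: for each shift, pick the available engineer who has covered the fewest shifts so far. The sketch below is an assumption-laden illustration rather than our actual scheduler. It uses daily shifts, made-up names, and a hand-written `unavailable` mapping standing in for real calendar data.

```python
from datetime import date, timedelta
from typing import Dict, List, Set


def schedule_on_call(
    engineers: List[str],
    unavailable: Dict[str, Set[date]],
    start: date,
    days: int,
) -> Dict[date, str]:
    """Assign one engineer per day, skipping anyone with a calendar conflict."""
    shifts_taken = {engineer: 0 for engineer in engineers}
    rota: Dict[date, str] = {}
    for offset in range(days):
        day = start + timedelta(days=offset)
        candidates = [e for e in engineers if day not in unavailable.get(e, set())]
        if not candidates:
            raise ValueError(f"nobody is available on {day}")
        # Prefer whoever has covered the fewest shifts; ties go to list order.
        chosen = min(candidates, key=lambda e: shifts_taken[e])
        shifts_taken[chosen] += 1
        rota[day] = chosen
    return rota


# Hypothetical volunteers and holidays.
rota = schedule_on_call(
    engineers=["asha", "ben", "chen"],
    unavailable={"ben": {date(2022, 1, 4)}},
    start=date(2022, 1, 3),
    days=4,
)
for day, engineer in rota.items():
    print(day.isoformat(), engineer)
```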
> Learn as much as possible from all our incidents
Every outage is an unscheduled investment in learning for the company. So we review each and every one, look for wider themes, hold teams accountable for following up, and try to help share the knowledge acquired every time. We’re already seeing fewer repeat incidents and more constructive discussions about avoiding or resolving future incidents.
> Automate ourselves out of a job (and into a more interesting one)
We made the time for reviewing incidents in detail by taking the existing reporting process and turning it from a day-long copy/paste exercise into a handful of scripts that gather all the data in a few minutes.
Any time we are faced with a manual task, we challenge ourselves to automate it. The more routine work we can delegate to the robots, the more time we have left for interesting engineering.
So there we have it, 2021 in one (hopefully not too long) blog post. Another year of online collaboration, growth and great initiatives.
If you want to read about what our data teams have been up to this year, head over to this post.