As a CTO at one of Europe’s largest technology businesses, I look after around 200 engineers, product managers, data scientists, UX designers and researchers.
One of my main responsibilities is to create an environment where our engineers can do their best work and help the business on its ambitious mission to become the global leader in Beauty and Wellness (as well as a number of other pursuits; just search “The Hut Group” in Google News).
Recently, I took on a side project coding an internal tool to help our Service Operations team reach my engineers when there is a potentially major problem out-of-hours. As our core platform drives over $1B of revenue each year, even small disruptions to services can be very expensive, so minimising the time to restore service is important.
This is the story of that project: how working with Jesu, who heads up Service Operations, was the best experience I’ve had, and how it has changed the way I think about the engineering experience.
For at least 10 years, THG has had a Service Desk (SD), a team of people who work shifts 24/7 to continually monitor and respond to problems. This isn’t an engineering team, although many people in it develop software skills and have gone on to become successful engineers in THG.
When the SD team discovers a potential issue, they look for evidence that customers (in this case, any of the c. 500 million people who visit our site each year) are getting an experience below the standards that we set. If there is any chance of impact, the SD team will contact the out-of-hours engineer to investigate and fix the problem.
Occasionally, an issue needs multiple teams to fix it, in which case several people need to be called out. During the incident, the SD team work with the engineer to make sure customers, customer service, and the business and tech teams are kept informed of progress.
When we had a multi-team issue, Jesu and I noticed that it was taking a long time to get all the right people online and talking together to resolve it. Frustratingly, the SD team were both writing the communication messages and doing the call-outs, so while they were making calls they weren’t writing communications, and vice versa. We were often stuck with a trade-off between telling the business and customers what was happening and getting the engineers online so they could solve the problem.
It was at this point that we started talking about building a tool that would automate the phone calls. This would allow the SD team to stay in the incident chat room, writing communications, rather than being distracted by making calls.
A further advantage would be that these calls could automatically be logged so it was easily visible who had been called so far. This is important as it normally takes 5–15 minutes for an on-call engineer to get online, so knowing someone has been reached means you know they will be online soon.
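As an illustration only (this is not the actual THG tool), a call log of this kind could be sketched in Python as follows. All class and method names are hypothetical, the 5 to 15 minute window is the only figure taken from the text above, and the sketch records call outcomes rather than placing real calls.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# From the text: an on-call engineer normally takes 5-15 minutes to get online.
MIN_ONLINE = timedelta(minutes=5)
MAX_ONLINE = timedelta(minutes=15)


@dataclass
class CallOut:
    """One logged call-out attempt (hypothetical structure)."""
    engineer: str
    team: str
    called_at: datetime
    reached: bool

    def expected_online_window(self):
        """Return (earliest, latest) expected online time, or None if not reached."""
        if not self.reached:
            return None
        return (self.called_at + MIN_ONLINE, self.called_at + MAX_ONLINE)


@dataclass
class CallOutLog:
    """Keeps every call attempt visible to the whole incident room."""
    entries: list = field(default_factory=list)

    def record(self, engineer, team, reached, called_at=None):
        """Log a call attempt; defaults the timestamp to now."""
        entry = CallOut(engineer, team, called_at or datetime.now(), reached)
        self.entries.append(entry)
        return entry

    def reached_teams(self):
        """Teams with at least one engineer successfully reached."""
        return {e.team for e in self.entries if e.reached}

    def still_to_reach(self, required_teams):
        """Teams needed for the incident that nobody has answered for yet."""
        return set(required_teams) - self.reached_teams()
```

With a log like this, anyone in the incident chat room could ask which teams are still outstanding (`still_to_reach`) and when a reached engineer is likely to appear (`expected_online_window`), without interrupting whoever is writing communications.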
I decided to take this on as an evening and weekend project with Jesu being the client and stakeholder.
The first thing that made this experience different was that Jesu and I had the same goals for the project. It affected both our teams, so we had shared skin in the game.
I wanted my engineers to be called out quickly and accurately as that would reduce the length of any downtime and also get my engineers back into bed sooner. Jesu wanted his team to spend more time writing clearer and faster communication and to reduce the time for any incidents to be resolved.
The second thing Jesu did well was to prioritise the feature requests by their impact on the users. Many people will give you ideas for improvements, but what really helped was that he went through each idea from the point of view of the two user groups (the SD team and the engineers) and looked at which idea would make the biggest positive impact on their experience.
The final and most important thing Jesu did was that he made the time to give feedback, early and often. I felt like with every deployment, I was giving him a new tool, and as this was an evenings and weekends project, I often got the feedback from him late at night or early in the morning.
Here are the lessons I learnt on how to make the engineer experience better:
- Pick stakeholders that will be engaged in the project. Help engineering managers to set goals for the project that align with the stakeholders, which will help not just with speed of delivery but also product-market fit.
- Product manage with the end user in mind and be focused when deciding what to build (don’t build slick animations when what you really need is a bigger button that works better on mobile).
- Establish feedback loops regularly and make sure testers either are the end users or sympathise deeply with the end user.
At the start of 2019 we built out a new function within Tech called Pre-Engineering, which covers stakeholder engagement, product management, and user experience design and research.
By being intentional about the importance of this for developers, we are helping to improve the effectiveness and enjoyment of every project.
If you are passionate about building things and are interested in a role in Manchester, England, please check out our vacancies: THG Tech Careers.