Optimising for the customer

Ryan Cormack
Just Eat Takeaway-tech
4 min read · Nov 1, 2021

Every decision we make while writing our software impacts somebody. If it didn’t, we wouldn’t write it. That’s why it’s important to know who your customer is and make sure you meet their requirements. Sometimes these requirements change. They don’t all come in the form of a User Story and the customer isn’t always a person looking at a screen.

At Just Eat Takeaway.com we have an HTTP endpoint that serves our Menu data. At first, the requirements were that we serve the data to backend processes where there isn’t a human user at the other end waiting on the asynchronous call. To get started quickly the team used our default go-to combination of .NET and AWS Lambda functions to process the requests through an API Gateway. We had a good template and it was really fast for us to get the API shipped.

.NET Lambda Max durations

Soon after this, we started onboarding more consumers to our API, some of which were serving UIs to end-users. The cold start times on our .NET Lambda were starting to have a negative impact on our user experience, sometimes hitting 4 seconds. At that point, our requirements changed and we had to re-evaluate our technology choice. We decided to re-write (re-writes always need to be fully considered, as they can often be expensive and time-consuming) a single function inside the Lambda so that it ran on the Node.js runtime. We received our Service Level Objective (SLO) and decided that optimising the .NET Lambda was not going to be enough to achieve it. The particular function is also really small: it does a small amount of validation and returns a signed S3 URL to the caller. We pressed ahead and had a Node function, written in TypeScript, up and running in only a few hours, and it was passing all of the existing integration tests.
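To give a feel for how little code a function like this needs, here is a minimal sketch of its general shape. It is not the production code. The event shape, the `menuId` path parameter, the ID pattern, the object key layout and the injected `signUrl` helper are all assumptions made for illustration. In a real Lambda, the signer would most likely wrap the AWS SDK's presigned URL support.

```typescript
// Hypothetical request and response shapes, loosely modelled on an
// API Gateway proxy integration. The real contract is not shown here.
interface MenuRequest {
  pathParameters?: Record<string, string | undefined> | null;
}

interface HttpResponse {
  statusCode: number;
  body: string;
}

// The signer is injected so the sketch has no SDK dependency.
// Assumption: it returns a time-limited URL for the given object key.
type UrlSigner = (key: string, expiresInSeconds: number) => Promise<string>;

// Assumed ID format: 1 to 64 letters, digits, underscores or hyphens.
const MENU_ID_PATTERN = /^[A-Za-z0-9_-]{1,64}$/;

// Returns the ID if it matches the assumed format, otherwise null.
function validateMenuId(raw: string | undefined): string | null {
  if (raw === undefined) {
    return null;
  }
  return MENU_ID_PATTERN.test(raw) ? raw : null;
}

// Builds the handler: validate the input, then return a signed URL.
function createHandler(signUrl: UrlSigner, expiresInSeconds = 60) {
  return async (event: MenuRequest): Promise<HttpResponse> => {
    const menuId = validateMenuId(event.pathParameters?.menuId);
    if (menuId === null) {
      return { statusCode: 400, body: JSON.stringify({ error: "Invalid menuId" }) };
    }
    // Hypothetical key layout for the stored menu document.
    const url = await signUrl(`menus/${menuId}.json`, expiresInSeconds);
    return { statusCode: 200, body: JSON.stringify({ url }) };
  };
}
```

Passing the signer in as a parameter also makes the validation path easy to exercise with a fake signer in tests, without touching S3.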

The dramatic decrease in times after a TypeScript rewrite

We released the function and immediately saw the benefits. The upper 99 minimum times remained about the same, but over a 24 hour weekend period the max time was almost 3x faster and the average almost 10x faster. Not only did this improve the load times for customers, but the lower invocation times also meant the function was less costly to run.

There are several ways we could have solved this problem to improve the experience for the end-user. The team I’m on tend towards Lambda because it reduces the operational toil that we experience when having to support a persistent compute model. It was also significantly quicker to write the function in TypeScript than to port the .NET code to a container or EC2. It did, however, require some learning for the team, as this was our first TypeScript code running in Production. The syntax and type system are very similar to C#, and the fact that Lambda lets us focus only on our specific business logic meant we didn’t have to learn too much about Node or handling HTTP requests on Node. Lambda takes care of all that plumbing for us. I’ve explored combining .NET and Node functions in a single Lambda here.

The API has been running in production for several months now, without issue. All our upstream clients get a much more stable response time (shown in the upper 99 charts below), and what we learned has made us much more confident about running other TypeScript Lambda functions, not just in this team but across the company.

.NET upper 99 response times over a 24 hour period
Node upper 99 response times over a similar 24 hour period

Here we’ve looked at the part that our technology choices play in how our customers perceive our services. There’s always a tradeoff between familiarity bias and doing something new, especially when it involves things like a rewrite of working code. But in order to achieve the SLO set out, we decided that rewriting the function in TypeScript was the correct thing to do. In this case, the function was small, it only took about a day of effort, and the results have had a really positive impact on our customers, as well as helping the teams become more comfortable with a new technology choice that we can use in the future.

Ryan Cormack
Just Eat Takeaway-tech

Serverless engineer and AWS Community Builder working on event driven AWS solutions