Embracing Serverless — Part I

Code in the cloud, without servers (Source: Unsplash)

Editor’s Note:

In this multi-part blog series about the journey of adopting serverless architecture, Salvian begins by sharing two major pain points of managing traditional server infrastructure, followed by the compelling benefits that serverless architecture provides.

Salvian Reynaldi is a Software Engineer in the Backend Infra team, whose responsibilities include, among others, improving backend-related application development processes.

Traveloka turned 10 this year. As a software-driven company, we’ve been using Java in our backend since 2012. While using multiple programming languages (polyglot) seems cool and would probably be ideal for certain use cases and circumstances, we think it’s not what’s best for our teams yet. Sounds a bit boring, huh?

We are also a heavy user of Amazon Web Services (AWS), because our backend teams manage hundreds of microservices that need a powerful platform. As with polyglot, we’re not going multi-cloud or cloud-agnostic. Apart from increasing our cloud cost, it would add unnecessary complexity and take significantly more resources to set up, maintain, and debug, which would ultimately impair engineering productivity.

Traveloka started out using the good old EC2 (VM) in 2012 to run backend applications. Since then, we’ve been improving our development process and architecture, for example by migrating to Auto Scaling Groups in 2018 and using more ECS Fargate (containers) since 2020. While these initiatives solved some engineering problems, some major ones remain to this day.

The “Easy” Problem

(because EC2 and ECS are “easy” 😉)

The first major issue is server maintenance. We find that most of our software engineers would prefer to build products, such as Mart, our recent online grocery service, rather than maintain servers. Our incidents are often caused by horizontal scaling that responds too late during huge events such as the EPIC sales. Despite the name, an AWS EC2 Auto Scaling Group’s scaling policy needs to be carefully crafted and maintained so that automatic scaling works reliably and efficiently all the time. In essence, our software engineers choose not to be server experts, yet their services often deal with fluctuating workloads.

If only there were an auto “auto scaling”.

The second major issue is cloud utilization. Our EC2 resource utilization was extremely low, even more so for our internal applications, and some services are basically idle at night. Our teams spawn EC2/ECS instances, whose cost accrues based on how long they run, to serve applications that handle just a few requests daily. I’ve seen services that are productive less than 10% of the time. That is very inefficient.

Hmm... aren’t those two issues related to cloud usage? Surely AWS has a solution for them. Well, it does. It’s Lambda, a serverless, event-driven compute platform by AWS, which has been around since 2014.

Less Servers, More Business Value

Unlike the term wireless, where devices really do connect to networks without wires, the term serverless can be misleading, since applications still run on servers (microVMs, in the case of Lambda). The three main attributes of serverless offerings from major cloud providers are usually pay-per-use pricing, truly automatic scaling, and servers that are managed more fully by the provider. An O’Reilly survey in 2019 indicated that many organizations agree on these benefits of serverless.

Figure 9 of the O’Reilly 2019 serverless survey report

What’s the antonym for serverless? Serverful? Servermore? Servered? Or probably, servermuch?

The Catch

One common concern about serverless is vendor lock-in. For example, the serverless computing platforms of the major cloud providers, AWS Lambda included, are not interoperable with one another. However, if we look more carefully, using Lambda doesn’t mean that we’re locked in, just coupled. We’ll always be coupled, to some degree, to the tech products and services that we use, including open source software. As long as we understand our application and the overall tech landscape, migrating to an alternative wouldn’t be the hardest task, should it become necessary. There’s a good article about this: Don’t get locked up into avoiding lock-in.

Another popular concern is cloud cost. It’s true that, on paper, Lambda is twice as expensive as EC2 for a similar memory-vCPU configuration. However, since Lambda instances aren’t always running the way EC2/ECS instances are, migrating idle services to serverless could in fact reduce the cloud cost.
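To make the cost argument concrete, here is a minimal back-of-the-envelope sketch. The prices are hypothetical placeholders, not actual AWS rates, and the model ignores per-request charges, free tiers, and discounts. The only input taken from the text above is the assumption that Lambda costs twice as much as an always-on instance per unit of busy time.

```python
# Hypothetical numbers for illustration only; these are not real AWS prices.
ALWAYS_ON_COST_PER_HOUR = 1.0   # assumed cost unit for an EC2/ECS instance
LAMBDA_PRICE_MULTIPLIER = 2.0   # "twice as expensive" for a similar config
HOURS_PER_MONTH = 730


def always_on_monthly_cost() -> float:
    """An always-on instance is billed for every hour, busy or idle."""
    return ALWAYS_ON_COST_PER_HOUR * HOURS_PER_MONTH


def pay_per_use_monthly_cost(busy_fraction: float) -> float:
    """Pay-per-use compute is billed only for time spent handling requests."""
    if not 0.0 <= busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be between 0 and 1")
    busy_hours = HOURS_PER_MONTH * busy_fraction
    return ALWAYS_ON_COST_PER_HOUR * LAMBDA_PRICE_MULTIPLIER * busy_hours


def break_even_busy_fraction() -> float:
    """Busy fraction at which both options cost the same."""
    return 1.0 / LAMBDA_PRICE_MULTIPLIER


if __name__ == "__main__":
    for busy in (0.10, 0.50, 0.90):
        print(f"busy {busy:.0%}: always-on={always_on_monthly_cost():.0f}, "
              f"pay-per-use={pay_per_use_monthly_cost(busy):.0f}")
```

Under these assumptions, a service that is busy 10% of the time costs a fifth of the always-on price, and the break-even point sits at 50% busy time. Above that, the always-on instance is the cheaper option.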

One of Traveloka’s 10 principles is Think Bigger. Let’s see the bigger picture of our engineering costs. Besides the explicit cloud cost, there are also costs in engineering time for operating, maintaining, and troubleshooting servers. These tasks keep engineers from producing real business value at a time when time to market is key. We could hire a dedicated team to maintain the servers, who would work closely and scale along with the engineering teams. Or, we could outsource server maintenance to our cloud provider and devote our engineers to building and improving our own products to generate even more revenue.

Some operational work still remains after switching to serverless, though. For example, with Lambda we still need to plan the “provisioned concurrency”, especially for high-traffic services. Lambda also doesn’t work well with monolithic applications, so we might need to re-architect them first. We’ll share more serverless gotchas in the upcoming posts.
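One common starting point for planning provisioned concurrency is to estimate how many requests are in flight at the same time: average requests per second multiplied by average request duration. The sketch below applies that estimate. The function name, the traffic numbers, and the headroom factor are assumptions for illustration, not figures or tooling from our services.

```python
import math


def estimate_concurrency(requests_per_second: float,
                         avg_duration_seconds: float,
                         headroom: float = 1.5) -> int:
    """Estimate how many concurrent executions a service needs.

    in-flight requests = arrival rate * average time per request.
    `headroom` is an assumed safety multiplier for traffic spikes.
    """
    if requests_per_second < 0 or avg_duration_seconds < 0:
        raise ValueError("rate and duration must be non-negative")
    if headroom < 1.0:
        raise ValueError("headroom must be at least 1.0")
    in_flight = requests_per_second * avg_duration_seconds
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(in_flight * headroom, 9))


if __name__ == "__main__":
    # Hypothetical service: 500 requests/second, 200 ms per request.
    print(estimate_concurrency(500, 0.2))
```

For example, 500 requests per second at 0.2 seconds each keeps about 100 requests in flight, so the provisioned figure would be that number scaled by whatever headroom the team is comfortable with. Real planning would also look at peak rather than average traffic.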

The Serverless Ecosystem

The first component of the whole serverless ecosystem is AWS Lambda. My team has been using Lambda to run scripts in the cloud, such as in our deployment pipelines. As we also want teams to start running Java applications on Lambda in production, we need proper engineering tools and processes for it, similar to what we already have for EC2/ECS. We also need to ensure Lambda gains mindshare among engineering teams.

Serverless isn’t just about Lambda. Backend applications often need to persist state, which could take the form of binary objects, key-value documents, good old relational tables, and so on. Pairing serverless compute instances with non-serverless datastores through long-lived connections might not work well in some cases. A serverless datastore is preferable here.

There are many other serverless components that can help us develop hyperscale applications. For example, Amazon SQS can be used as the source or destination of our applications. By using managed services, we gain benefits such as reliability and security out of the box. Plus, we would write and maintain less code, or even none at all, thanks to AWS direct serverless integrations. The fastest and cheapest Lambda functions are likely the ones with less code. Less code is also the engineering productivity nirvana.
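As an illustration of SQS as an event source, below is a minimal sketch of a Lambda handler, written in Python for brevity even though our backend is Java. It relies on the documented shape of SQS events delivered to Lambda (a `Records` list whose items carry `messageId` and `body`) and on the partial batch response format (`batchItemFailures`), which only takes effect when `ReportBatchItemFailures` is enabled on the event source mapping. The `process_order` function and the `order_id` field are hypothetical.

```python
import json


def process_order(order) -> None:
    """Hypothetical business logic; a real service would do its work here."""
    if not isinstance(order, dict) or "order_id" not in order:
        raise ValueError("message is not an order with an order_id")


def handler(event, context):
    """Entry point invoked by Lambda with a batch of SQS messages."""
    failures = []
    for record in event.get("Records", []):
        try:
            # json.JSONDecodeError is a subclass of ValueError.
            process_order(json.loads(record["body"]))
        except (ValueError, KeyError):
            # Report only the failed message so the rest of the batch
            # is not retried.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}


if __name__ == "__main__":
    sample_event = {
        "Records": [
            {"messageId": "1", "body": '{"order_id": 42}'},
            {"messageId": "2", "body": "not json"},
        ]
    }
    print(handler(sample_event, None))
```

Notice that the function contains no polling or message deletion logic. The managed integration between SQS and Lambda takes care of those parts, which is the kind of code we no longer have to write.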

That’s all for now. In the next post of this serverless series, we’ll dive deeper into these AWS services and look into one of our backend serverless architectures.

End of Part I

We don’t claim that going serverless is the right answer for everybody, or even for us. We’re just at the beginning of our serverless journey. We still need to learn more about it, see its long-term impact, and refine our decisions from time to time. The O’Reilly survey mentioned above suggests that this journey can pay off. We want to find out whether this is true for us as well.

Many companies, like us and some of the MANGA, rely on cloud providers. Still, there are non-cloud-provider companies trying to build their own cloud, which is the opposite of our serverless path. On this, I’d like to borrow one of Ray Dalio’s principles: be radically open-minded. Don’t just chase that shiny “anti lock-in” tech everyone’s raving about, or blindly follow a blog post from a lifestyle superapp in Southeast Asia. We should always find what works for our own use cases and circumstances. Remember, engineering is about making the right tradeoffs.

If you’re interested in experiencing and improving our backend engineering, check out the opportunities on our careers page. Come join us on this serverless journey!
