Scaling with Serverless Framework on AWS

Serdar Gokay Kucuk
Published in HeyJobs Tech
5 min read · Apr 6, 2021

Intro

HeyJobs started experimenting with serverless technologies quite recently. We were already using Lambdas here and there and had an internal message queue for relaying job updates between our systems. The main limitation that kept us from building larger-scale serverless applications was the database, because we believe our products benefit from a relational database for analytical purposes. After the release of AWS Aurora Serverless, we realised we were ready to run some tests on the stack. An on-demand scalable database was the last puzzle piece that led us to consider serverless architecture for larger-scale applications.

We decided to try out the scaling-up features of serverless systems. HeyJobs processes hundreds of thousands of job details daily, and this workload is quite spiky. At specific hours, while processing large data volumes, we need a database cluster with over 100GB of memory and tens of parallel workers to consume all available data. For the rest of the time, though, the same cluster doesn't need to process more than a few analytical SQL queries per hour. A perfect match!

Development

After some tests and thoughtful analysis, we decided to go with the Serverless Framework (https://www.serverless.com/, or for a more development-focused resource, https://serverless-stack.com/). The strong points we saw in the Serverless Framework compared to other options (Chalice etc.) were:

  • It has a strong focus on creating modular components and re-using them.
  • There is no vendor lock-in (more on this later).
  • It is software language agnostic.
  • For AWS deployment, by utilising CloudFormation, it has built-in declarative infrastructure as code.
  • It has an active community and an extensive repository of plugins.
  • Clear separation between serverless infrastructure and application code.

We liked the idea of heavily parallelising our workloads to lower our total processing time, and paying only for what we use at peak times was attractive.

The development process was mostly smooth and without surprises. We chose Python as our language due to libraries like pandas, which give us lots of flexibility in data processing. Boto3 is also a great library for interacting with the AWS resources we deploy with the Serverless Framework. We used seed.run to manage our deployment pipeline. Seed.run gave us the ability to configure and use pull-request applications that cost nothing to run, and its features made testing and QA easier than ever. But of course, not everything worked out of the box.
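To make this concrete, here is a minimal sketch of what one of our Python handlers could look like. The event shape follows the standard SQS-to-Lambda record format; the field names in the job payload (`id`, `status`) and the handler name are illustrative assumptions, not our actual schema.

```python
import json

import pandas as pd


def handler(event, context):
    """Hypothetical Lambda handler: each SQS record carries a JSON job update."""
    # Pull the job payloads out of the standard SQS event structure.
    jobs = [json.loads(record["body"]) for record in event["Records"]]

    # pandas gives us flexible, vectorised processing of the whole batch.
    df = pd.DataFrame(jobs)
    active = df[df["status"] == "active"]

    return {"received": len(df), "active": len(active)}
```

Because the handler is a plain function taking a dict, it can be unit-tested locally with a hand-built event before ever being deployed.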

One of the issues we encountered was the 250MB limit on unzipped project content for Lambda functions. While it's possible to deploy Lambda functions individually, builds took too much time, and we had to limit our library usage from time to time. A recent addition to AWS now lets us run Lambda functions from Docker images pulled from ECR, with a new limit of 10GB on total project size. For us, and for many others whose AWS region supports it, this enables heavier workloads on Lambda.

Persistent Storage

Whenever we needed to work on well-defined data sets, whether feeds or data dumps from other systems, we went with S3. Whenever a certain file gets dropped into S3, a handler in our serverless project is triggered to work on it. For data exports, we just used S3 buckets as destinations due to their predictability and lower cost. For our OLAP requirements, however, we had to reach for a more recent technology.
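An S3-triggered handler along these lines might look like the sketch below. The event shape is the standard S3 notification format Lambda receives; the handler name is hypothetical, and the actual object fetch is only hinted at in a comment.

```python
def handle_s3_drop(event, context):
    """Hypothetical handler triggered whenever a file lands in a bucket."""
    results = []
    for record in event["Records"]:
        # Standard S3 notification layout: bucket name and object key.
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # In the real handler we would fetch and process the object, e.g.:
        #   import boto3
        #   body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
        results.append((bucket, key))
    return results
```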

We used the first version of Aurora Serverless with PostgreSQL compatibility. Aurora Serverless is the critical component for serverless applications that need relational data storage. Our main processing pipeline is composed of SQS and Lambda functions. For SQS event sources, Lambda scales up by triggering 60 additional parallel invocations each minute, and this behaviour works well with Aurora Serverless v1's scaling characteristics. Luckily for us, Aurora Serverless v2 has already entered preview with MySQL, with a PostgreSQL version probably coming later. Scaling will be even more in sync with Lambda-based workloads when we switch from v1 to v2.
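The ramp-up described above can be sketched as a small back-of-the-envelope helper. The 60-invocations-per-minute figure is the SQS/Lambda scaling behaviour mentioned in the text; the initial batch size and account concurrency limit are illustrative assumptions only.

```python
def lambda_concurrency(minutes, initial=5, ramp_per_minute=60, account_limit=1000):
    """Approximate concurrent Lambda invocations for an SQS event source.

    `initial` and `account_limit` are illustrative assumptions; the ramp of
    60 additional parallel invocations per minute matches the SQS scaling
    behaviour described above.
    """
    return min(initial + ramp_per_minute * minutes, account_limit)
```

A helper like this is handy for estimating how many database connections the pipeline might open at its peak, which is exactly the quantity that has to stay in sync with Aurora's own scaling.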

Domain Driven Design

We also came up with a way of implementing some components of domain-driven design on top of AWS services. In particular, creating a message bus out of SQS and SNS gave us lots of integration potential for the future. We decided to use these components in the following ways:

  • Events: Some messages we just want to put out, without caring whether or not they are processed by any number of consumers. We implemented our events on top of SNS, allowing different parts of the company to attach to them. Whoever wants to consume these events just has to deploy an SQS queue between itself and the topic. In this setup, we have a single producer but multiple consumers.
  • Commands: Commands are a bit different, and for them we utilised SQS directly. With commands, we want to make sure they are processed successfully. Whenever we deploy a command onto our message bus, there is always a single consumer with one or more producers at the other end of an SQS queue.
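The split above could be sketched with boto3 roughly as follows. The envelope shape, function names, and topic/queue identifiers are hypothetical illustrations, not our actual message contract; only the SNS `publish` and SQS `send_message` calls are standard boto3 API.

```python
import json
import uuid
from datetime import datetime, timezone


def build_envelope(message_type, name, payload):
    """Wrap a domain message in a minimal envelope (hypothetical shape)."""
    return {
        "id": str(uuid.uuid4()),
        "type": message_type,  # "event" or "command"
        "name": name,
        "occurred_at": datetime.now(timezone.utc).isoformat(),
        "payload": payload,
    }


def publish_event(topic_arn, name, payload):
    """Fan an event out via SNS; any team can subscribe a queue to the topic."""
    import boto3  # imported lazily so the pure helper above is testable offline

    sns = boto3.client("sns")
    sns.publish(TopicArn=topic_arn,
                Message=json.dumps(build_envelope("event", name, payload)))


def send_command(queue_url, name, payload):
    """Send a command straight to its single consumer's SQS queue."""
    import boto3

    sqs = boto3.client("sqs")
    sqs.send_message(QueueUrl=queue_url,
                     MessageBody=json.dumps(build_envelope("command", name, payload)))
```

The design choice is visible in the signatures: events go to a topic ARN (producer doesn't know its consumers), while commands go to a specific queue URL (producer targets exactly one consumer).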

It's hard to fully describe in words, but these two services made our architecture really easy to extend and integrate with. After deployment, whenever we need to integrate with a different team, it's usually just a matter of connecting another SQS queue to our SNS topic and letting the other team process the events in their own domain. The same goes the other way around: whenever another team needs to use a component we have, they just get access to our queues and publish messages there to trigger actions in our domain. The ability to expose our interfaces in an event-based manner while keeping most of our internal systems hidden makes everything easier to reason about.

Conclusion

In summary, we managed to lower both our development and running costs by switching to a serverless architecture for this data processing workload. But this isn't the end of the story. While serverless architecture worked quite well for our scaling-up needs, it has another strong point: scaling down, meaning workloads that don't require many invocations. This is what HeyJobs has started developing, and we see lots of cost-saving potential there too.

Interested in joining us? We are hiring in this team and also in other areas!
