Using Event-Driven Architecture to Break Apart a Monolith

Viktor Bubanja
Published in HeyJobs Tech
Oct 6, 2022

So this is yet another blog post on event-driven architecture 😒 🎉

However, instead of covering the theory, I’ll walk through the steps my team and I took when we recently introduced event-driven architecture into our system to break a piece of functionality out of a monolith. I hope this gives you useful insight into what building something like this looks like in a real-world scenario.

Some Context

My company, HeyJobs, matches essential talent with employers. Our goal is to be the #1 Career Platform for essential talent in Europe.

We have a bunch of different codebases and services; however, like many growing companies, a large portion of our business logic lives in one big, main monolith. And, like most growing companies, we are starting to feel the strain of our growing user base and developer team, so we have started the process of slowly deconstructing it. Most recently, we used event-driven architecture to split a chunk of business logic out of the monolith into a separate service.

One of the responsibilities of my team is tailoring job recommendations to our users, and this is the functionality that we wanted to extract out of the monolith and into our shiny new Job Recommendations Service.

We base our job recommendations on which jobs a particular user has applied to, bookmarked, or a bunch of other trade secrets…

The key takeaway, however, is that we need this data from the monolith in our service to generate recommendations. If only we could replicate the data from the monolith to our service with eventual consistency…

Event-Driven Architecture — quick definition

Event-Driven Architecture is a software architecture paradigm where events are used to trigger and communicate between decoupled services (commonly used among microservices). An event here means any change of state.

Event-Driven Architecture — in our case

Whenever a record we rely on to recommend jobs is created or updated in the monolith, we send an event and consume it (save it) in the Job Recommendations Service.

Steps We Took

(Note: we use AWS, and in this article I’ll refer to specific AWS services for simplicity, but these steps can easily be reproduced with another cloud provider.)

Created an Event Schema Registry

When implementing an event-driven architecture, it is important to ensure all events have a valid and consistent structure so consumers can process them successfully. We achieved this by storing JSON event schemas in a centralised registry that all event publishers and subscribers could access. We used the AWS EventBridge schema registry to store our schemas.

We created a GitHub repository for managing the schema files. Its CI/CD pipeline automatically ran scripts that checked each schema was valid JSON and backwards-compatible with previous versions before uploading it to AWS. We also configured a CODEOWNERS file so that teams subscribing to specific events were automatically requested for review whenever a pull request proposed changes to a schema they depended on.
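The backwards-compatibility check in that pipeline can be sketched as follows. This is a minimal Python illustration, not our actual scripts: the function name and the two rules it enforces (no new required fields, no removed properties) are simplifying assumptions; a real pipeline would cover more of the JSON Schema vocabulary.

```python
def is_backwards_compatible(old_schema: dict, new_schema: dict) -> bool:
    """A new schema version is backwards-compatible if it adds no new
    required fields and removes no existing properties, so payloads
    that satisfied the old version still satisfy the new one."""
    old_required = set(old_schema.get("required", []))
    new_required = set(new_schema.get("required", []))
    old_props = set(old_schema.get("properties", {}))
    new_props = set(new_schema.get("properties", {}))
    # The new version may only relax requirements and add optional fields.
    return new_required <= old_required and old_props <= new_props

# Adding an optional field is compatible; dropping a field is not.
v1 = {"required": ["user_id"], "properties": {"user_id": {}, "job_id": {}}}
v2 = {"required": ["user_id"], "properties": {"user_id": {}, "job_id": {}, "source": {}}}
```

A check like this runs on every pull request, so an incompatible change is rejected before it can reach the registry and break a consumer.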

We then created JSON event schemas for each of the events we needed.
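To make this concrete, here is what such a schema and a minimal validation against it might look like. The event name and fields below are invented for illustration, and the validator only checks required fields; in practice a full JSON Schema validator (such as the jsonschema package) would be used.

```python
# Simplified JSON Schema for a hypothetical "job_application.created" event.
JOB_APPLICATION_CREATED_V1 = {
    "type": "object",
    "required": ["event_id", "user_id", "job_id", "applied_at"],
    "properties": {
        "event_id": {"type": "string"},
        "user_id": {"type": "string"},
        "job_id": {"type": "string"},
        "applied_at": {"type": "string", "format": "date-time"},
    },
}

def missing_fields(event: dict, schema: dict) -> list:
    """Return the required fields absent from the event (empty list = valid)."""
    return [f for f in schema["required"] if f not in event]
```

Both the publisher in the monolith and the consumer in the new service can validate against the same registry copy of the schema, which is what keeps the two sides in agreement.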

Prepared our new service to ingest events

This involved creating the database to store the records (obviously), creating an SQS queue where the events would eventually arrive, and implementing an AWS Lambda function for each event that checked the event contained all the necessary data, extracted and structured that data, and then stored it.
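A consumer Lambda of this shape can be sketched as below. The field names and the `store` stub are assumptions for illustration; the SQS batch format and the partial-batch `batchItemFailures` response are standard for Lambda-over-SQS, and with SNS-to-SQS delivery each SQS body wraps the actual event in an SNS envelope whose `Message` field holds the JSON payload.

```python
import json

REQUIRED_FIELDS = ["event_id", "user_id", "job_id"]  # illustrative fields

def store(record: dict) -> None:
    """Persist the structured record; stub standing in for a database write."""
    ...

def handler(event: dict, context=None) -> dict:
    """Consume a batch of SQS records, validate each payload, and store it.
    Invalid records are reported back so SQS can retry or dead-letter them."""
    failures = []
    for sqs_record in event["Records"]:
        body = json.loads(sqs_record["body"])
        # Unwrap the SNS envelope if the message came via SNS -> SQS.
        payload = json.loads(body["Message"]) if "Message" in body else body
        if any(f not in payload for f in REQUIRED_FIELDS):
            failures.append({"itemIdentifier": sqs_record["messageId"]})
            continue
        store({f: payload[f] for f in REQUIRED_FIELDS})
    return {"batchItemFailures": failures}
```

Returning per-message failures (rather than raising) means one bad event does not force the whole batch back onto the queue.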

Published the events

We created an SNS topic for each event we wanted to publish, then implemented functionality in the monolith so that whenever a record we cared about was created or updated, an event would be sent to the corresponding SNS topic. Before an event was sent, it was validated against its event schema to ensure our events had a correct and consistent structure.
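The publishing side amounts to "validate, then publish", as in this sketch. It is a Python illustration (the monolith may well be in another language), the schema here is reduced to its required-fields list, and boto3 is imported lazily inside `publish` so the validation logic works without AWS credentials.

```python
import json

def build_message(payload: dict, schema: dict) -> str:
    """Validate the payload against its schema, then serialize it for SNS."""
    missing = [f for f in schema["required"] if f not in payload]
    if missing:
        raise ValueError(f"event is missing required fields: {missing}")
    return json.dumps(payload)

def publish(topic_arn: str, payload: dict, schema: dict) -> None:
    """Send a validated event to its SNS topic (needs boto3 and AWS creds)."""
    import boto3  # lazy import: keeps build_message testable offline
    boto3.client("sns").publish(
        TopicArn=topic_arn,
        Message=build_message(payload, schema),
    )
```

Failing fast in the publisher is the point: an event that does not match its schema never leaves the monolith, so consumers can trust whatever reaches the queue.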

Our event pipeline was complete; we were ingesting events and replicating the records in our service with eventual consistency.

Backfilled the existing data

Finally, once we had records from the monolith being successfully replicated in our service, we needed to do a one-off backfill to get the existing records.
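A backfill like this can be sketched as a simple paging loop that pushes every existing record through the same publish path the live events use, so the historical data is validated exactly like new events. The function names, the offset-based pagination, and the batch size are illustrative assumptions.

```python
def backfill(fetch_batch, publish, batch_size=500):
    """One-off backfill: page through existing records and publish each one.

    fetch_batch(offset, limit) returns the next page of monolith records
    (empty when exhausted); publish(record) is the same code path used
    for live events. Returns the number of records published."""
    offset = 0
    published = 0
    while True:
        batch = fetch_batch(offset, batch_size)
        if not batch:
            return published
        for record in batch:
            publish(record)
            published += 1
        offset += batch_size
```

Reusing the live publish path (rather than copying tables directly) means the backfilled rows arrive through the exact pipeline that will maintain them afterwards.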

Ops + Maintenance

Monitoring + Alerting

Since the underlying infrastructure of the service that was extracted out of the monolith was totally changed, it was vital to create comprehensive monitoring for our system. We used Datadog to create alerts for a range of metrics; some essential monitors we can recommend are:

  • Number of messages on the Dead Letter queues
  • Freeable memory, CPU utilisation, and read & write IOPS of the database
  • Number of SNS notifications (if this number becomes unusually low there may be a problem with the publisher)

Deployment

To deploy our service, we used a canary deployment strategy: we deployed our API, then gradually routed an increasing share of user requests to it. This had several advantages:

  1. Since getting infrastructure capacity right the first time is difficult, gradually increasing the load allowed us to find bottlenecks early and adjust our system before it was toppled by the traffic
  2. Testing in production (you can’t beat this level of testing accuracy)
  3. Testing our alerting. When creating our alerts, we chose most of the thresholds more or less arbitrarily; our deployment strategy allowed us to fine-tune them over time.

Lessons Learnt

Supporting the release

Not long after completing our release, our system started to strain under the traffic. In hindsight, this isn’t surprising, but we failed to account for it when planning our capacity for the period after the release. As a result, other sprint items had to be de-prioritized mid-sprint so we could work on re-configuring our infrastructure. Next time, we will make sure to keep some capacity free for supporting a release like this.

Conclusion

We have already started experiencing the benefits of owning our own micro-service and aim to keep building on top of this architecture by adding more events for other teams to utilize. I hope these steps can be useful for you and your team if you want to split up your monolith and benefit from this architecture too.

Interested in joining our team? Browse our open positions or check out what we do at HeyJobs.
