The Last Guide on Event Driven Architecture You’ll Ever Need

Start saying goodbye to your tightly coupled spaghetti monolith monster 👋

Javion Cai

Published in

ELMO Software

7 min read · Oct 5, 2021

A team of engineers gather in a meeting room.

“Okay guys, what needs to happen when a new client signs up to our service?”

Easy.

Just get their databases ready, send them a welcome email, bill them for their subscription, and while we’re at it let’s also send out a notification to the CEO saying that the new product is doing well.

Then, one of the engineers starts to say something. “But um…”

Oh. No.

“In order to send out an email, service X needs to talk to service Y and fetch a token from <REDACTED TECHNICAL JARGON> and BLAH BLAH can’t do XYZ when BLAH BLAH is doing ABC so <MORE REDACTED TECHNICAL JARGON> will need to…”

“CI/CD?”

“KUBERNETES!!”

Does this meeting feel familiar? As if the problem the meeting set out to solve has just revealed so many other problems that need solving? As if there are so many problems that need solving that there’s no time to solve any? As if the software itself had turned into a plate of spaghetti so tightly wound together that we could no longer pull anything apart?

We all start scratching our heads.

From which moment did our beautiful software turn into the spaghetti monster it is today?

Recognising Spaghetti Systems

Spaghetti system · [ spuh-get-ee sis-tuhm ]
noun
a white, starchy pasta of bad decision making that is lumped together in the form of tightly coupled services, repetitive code, hacks, and served with any of a variety of bugs, downtime, and/or other disasters.

Some people equate monolithic systems with spaghetti. This is definitely not true: a well-built, well-abstracted monolith can outperform a poorly built distributed system any day of the week. However, it takes great engineering discipline to build one without it becoming a big ball of mud.

Another blog post on Monolithic vs. Microservices Architecture maps out the common pitfalls of poorly designed monolithic systems quite well.

- Application is too large and complex to fully understand and to make changes fast and correctly.
- You must redeploy the entire application on each update.
- Impact of a change is usually not very well understood, which leads to extensive manual testing.
- A bug in any module (e.g. memory leak) can potentially bring down the entire process.
…

A tightly coupled system where everything depends on everything else is a common recipe for disaster. The natural context boundaries in event-driven systems make decoupling easier and allow independent services to stay independent and keep running regardless of changes in other parts of the system.

Event-Driven Architecture at ELMO

Here at ELMO we’ve been lucky enough to understand the pain that deeply coupled and synchronous systems bring to us. This has allowed us to seize an opportunity to bring to life our own event driven system.

Before we get started on the juicy parts, if you’re not already familiar with event-driven architecture, here are a few reading resources that I would highly recommend to anyone who wants to get their hands dirty:

A key element of event notification is that the source system doesn’t really care much about the response. Often it doesn’t expect any answer at all, or if there is a response that the source does care about, it’s indirect.
- What do you mean by “Event-Driven”? (Martin Fowler)

At ELMO, we create cloud solutions for HR & Payroll. One of the exciting new products that we were building turned out to have some quite complex requirements. Welcome to the world of building enterprise software.

For this particular project, we built a service that recorded account balances for employees. A lot of different services would talk to this service to provide information and updates, and the service then created a record of transactions for every balance change.

The catch was that this service also had to be able to present a snapshot of a balance in the past, present, or future. On top of that, the past could also be edited retrospectively. 😱

This was crazy! Not only did we need some sort of mechanism to make sure that the balances could somehow be projected and “forecast-able”, we also needed to save periodic snapshots of every balance tracked, while being able to create alternate history timelines so that history could be edited!?
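To make the requirement concrete, here is a minimal sketch of one way such a balance could be modelled: an append-only list of dated transactions, with the balance for any date derived by summing everything effective on or before it. This is an illustration only, not ELMO's actual implementation, and the names (`Transaction`, `Ledger`, `balance_as_of`) are hypothetical. Future-dated entries act as a forecast, and a backdated entry changes every later balance, which is exactly what makes stored snapshots hard to keep valid.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import List


@dataclass(frozen=True)
class Transaction:
    """One balance change. All names here are hypothetical."""
    effective_on: date   # when the change applies (past or future)
    amount: Decimal      # positive = accrual, negative = deduction
    reason: str


@dataclass
class Ledger:
    """Append-only record of transactions for a single balance."""
    transactions: List[Transaction] = field(default_factory=list)

    def record(self, tx: Transaction) -> None:
        # Backdated and future-dated entries are both allowed.
        self.transactions.append(tx)

    def balance_as_of(self, day: date) -> Decimal:
        # Derive the balance instead of storing it, so an edit to the
        # past is reflected automatically on every later date.
        return sum(
            (tx.amount for tx in self.transactions if tx.effective_on <= day),
            Decimal("0"),
        )


ledger = Ledger()
ledger.record(Transaction(date(2021, 1, 1), Decimal("10"), "annual accrual"))
ledger.record(Transaction(date(2021, 6, 1), Decimal("-3"), "leave taken"))
ledger.record(Transaction(date(2021, 12, 1), Decimal("5"), "planned accrual"))

print(ledger.balance_as_of(date(2021, 3, 1)))   # a past date: 10
print(ledger.balance_as_of(date(2021, 7, 1)))   # mid-year: 7
print(ledger.balance_as_of(date(2022, 1, 1)))   # forecast: 12

# A retrospective edit: leave that was actually taken back in February.
ledger.record(Transaction(date(2021, 2, 1), Decimal("-2"), "backdated leave"))
print(ledger.balance_as_of(date(2021, 3, 1)))   # the past is now 8
```

Recomputing from the full history on every read is the simplest option; the periodic snapshots mentioned above would be an optimisation on top of it, and they would need to be invalidated whenever a backdated entry arrives.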

A Process Of Evolution

Historically, most things at ELMO have been synchronous. Synchronous workflows are relatively easy to understand, straightforward to test, and most importantly, they work.

So that's the approach we took in our first iteration. However, we soon started to find ourselves needing to create tech debt and workarounds because of a lot of the unnecessary coupling that we had built into our growing spaghetti festival.

“But um, we can only do this if we also did that”

“But um, we can’t do this, because of that, so we got to do it the other way”

We knew we couldn’t let this “almost too familiar” scenario spiral out of control. We started considering how we could start moving towards an event driven system and help ourselves escape the growing logic spaghetti that was being created.

An Event Driven Dream

Event-driven systems are not a new concept in the software engineering world, but they were still relatively new to ELMO when the idea was first brought up here. It required a whole new way of thinking about your code. Things moved from being imperative to reactive, talking became listening, and most importantly “API contracts” became one-way instead of two-way.

Adopting a full blown event driven system doesn’t happen overnight.

But it does start with a mindset change, and that can definitely happen overnight. Once everyone on the team starts thinking about events with a more reactive mindset, the discussions will subtly change from:

“After doing this, then that should happen”

to:

“If this happens, then that should happen.”

You might be wondering “but those two sentences sound virtually the same?”

In an event-driven system the “this should happen” and “that should happen” actually become the concerns of two separate pieces of software. One piece of the software would be responsible for the “this should happen” and it will notify the world at large that “this” did in-fact happen. Then other pieces of software can react accordingly and make “that” happen.
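The difference is easier to see in code. The sketch below uses a tiny in-process event bus; the names (`EventBus`, `client.signed_up`, and the handler functions) are made up for illustration, and a real system would publish through a message broker rather than a Python dictionary.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, str]
Handler = Callable[[Event], None]


class EventBus:
    """Hypothetical in-memory stand-in for a real message broker."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_name: str, handler: Handler) -> None:
        self._handlers[event_name].append(handler)

    def publish(self, event_name: str, event: Event) -> None:
        # The publisher does not know who is listening and expects
        # no return value from any of the listeners.
        for handler in self._handlers[event_name]:
            handler(event)


bus = EventBus()
actions: List[str] = []   # collected so the effects are easy to inspect


# "That": each reaction lives in its own piece of software.
def send_welcome_email(event: Event) -> None:
    actions.append(f"email sent to {event['client']}")


def start_billing(event: Event) -> None:
    actions.append(f"billing started for {event['client']}")


bus.subscribe("client.signed_up", send_welcome_email)
bus.subscribe("client.signed_up", start_billing)


# "This": the sign-up code only announces that the sign-up happened.
def sign_up(client: str) -> None:
    bus.publish("client.signed_up", {"client": client})


sign_up("acme")
print(actions)
# ['email sent to acme', 'billing started for acme']
```

Adding a third reaction, such as notifying the CEO, means registering one more subscriber; `sign_up` itself does not change.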

Where you originally had one complex, tightly scripted sequence of how things should happen, you now have pieces that work together in a Lego-block fashion. This leads to smaller, more manageable components (microservices) that make your system more “evolvable”, as each building block is smaller and independently managed.

With this change in mental model, your team has already started the journey towards building a flashy new event-driven system.

An Analysis on Event Driven Systems

This is where things get a lot more technical as I get into the nitty-gritty of event-driven systems and how they exist in the practical world. Enjoy!

Event driven systems are good because:

- They are reactive, and reactive systems are resilient and more scalable. (Why reactive?)
- They create loosely coupled systems, which speeds up development and enables autonomous product teams.
- They increase loggability. Your events provide a single source of truth for knowing exactly what your system is doing.

They make life better because:

- High scalability - With asynchronous operations we’re able to max out our hardware utilisation, because we no longer need to wait until a blocking operation has completed.
- More robust information - Since events are recorded as they occur, software has access to all the data and context it needs to make the best decisions.
- Forecasting functionality - Event-driven systems are particularly well suited to real-time analytics. This means that patterns can be identified from system events, giving software a certain level of “pre-emptivity” and drastically enhancing the typical customer experience.
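The scalability point can be sketched with Python's `asyncio`. The three handlers below are hypothetical and use `asyncio.sleep` as a stand-in for slow I/O such as sending an email. The point is that all of them start before any of them finishes, so the waiting overlaps instead of adding up.

```python
import asyncio
from typing import List

log: List[str] = []


async def handle(name: str, seconds: float) -> None:
    """Hypothetical handler; the sleep stands in for a slow I/O call."""
    log.append(f"start {name}")
    await asyncio.sleep(seconds)   # yields control instead of blocking
    log.append(f"done {name}")


async def main() -> None:
    # Run all three reactions to the same event concurrently.
    await asyncio.gather(
        handle("email", 0.03),
        handle("billing", 0.02),
        handle("notify", 0.01),
    )


asyncio.run(main())
print(log[:3])
# ['start email', 'start billing', 'start notify']
```

This only shows overlapping waits inside one process. Spreading handlers across separate services and machines is a broader design choice that this sketch does not cover.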

The Road to Our First Event-Driven System

Before we had the capability to adopt one for ourselves, we had to make sure that we had the necessary building blocks in place.

First of all, we needed a solid naming convention that guides engineers on how to label things. If everyone just called everything by different names and used different interpretations how could we even start thinking with events?

Although a ubiquitous language (UL) and domain-driven design (DDD) were important ideals that we aspired to, we didn’t need to go that far for the sake of realising our event-driven system.

Luckily, at the same time an official ELMO Naming Conventions Guide was being brewed up in our Architecture Handbook. It was the perfect opportunity to include what we needed for our event-driven dream. This included:

- the metadata required around events
- the definitions of global identifiers
- enumeration values
- considerations around name-spacing and uniqueness

These conventions, guides and mindset changes have now become ELMO’s new event protocol.
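As a rough illustration of what such conventions might pin down, here is a sketch of an event envelope. Every field name, the `payroll` namespace, and the enumeration values are assumptions invented for this example; the actual ELMO protocol is not described in this post.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Dict


class EventType(Enum):
    """Enumeration values agreed up front (hypothetical examples)."""
    CREATED = "created"
    UPDATED = "updated"
    DELETED = "deleted"


@dataclass(frozen=True)
class EventEnvelope:
    """Hypothetical event shape: metadata wrapped around a payload."""
    namespace: str                 # e.g. the owning service or domain
    entity: str                    # the thing the event is about
    event_type: EventType
    payload: Dict[str, Any]
    # Metadata generated for every event:
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    occurred_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @property
    def name(self) -> str:
        # Namespacing keeps names unique across teams and services.
        return f"{self.namespace}.{self.entity}.{self.event_type.value}"


event = EventEnvelope(
    namespace="payroll",
    entity="balance",
    event_type=EventType.UPDATED,
    payload={"employee_id": "emp-123", "delta": "-3"},
)
print(event.name)   # payroll.balance.updated
```

Using an `Enum` rather than free-form strings means a misspelt event type fails when the event is created, not later in some consumer.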

First of all, I want to thank everyone that has made it this far. Hopefully you’ve had as much fun reading as I had writing it.
As an official favour for my first official Medium blog post, please leave a clap 👏 and also a follow 🙏🙏🙏! Part 2 is coming soon!

Stay tuned for Part 2 - The Last Guide on Event Driven Architecture You’ll Ever Need where I talk about:

- More on naming conventions
- Breaking your business into events
- Formalising event shapes
- Implementing a standard event transport using SNS and SQS
- Asynchronous communication between apps
- Event sourcing and CQRS