Business events in a world of independently deployable services
In a previous article, I highlighted the simplicity of Martin Fowler’s definition of domain events. In this article, I’d like to dig deeper into what it means to define domains for analytics and, as a corollary, how the resulting enriched events ease downstream analytics, enabling authoritative business events without sacrificing the organizational velocity promised by independently deployable microservices.
Analytical Processing History
First, a brief history.
If you follow analytics, the above picture is likely familiar to you.
The rest of this article is really about this picture as it applies to the emergence of microservice- and stream-oriented architectures with respect to analytics.
Historically, organizations operated this way. Front-end teams would develop apps against one or more front-end databases. OLTP database engineers would be the stewards of those respective domains. ETL teams would copy this data, transform it, and load it into analytic systems, which would then drive entire armies (even industries) of business intelligence via a host of analytical tools, systems, and ecosystems. Such processes would take days (or hours at best) and had evolved into a reliable, stable structure that served the industry well for years.
Then, several mega-trends happened.
- The surge of mobile device use over desktop
- The proliferation of ephemeral infrastructure as a service via the adoption of cloud compute, storage and networking
- The explosion of data
Along with these mega-trends came the pressure to reapply time-tested analytical methods to this emerging world of mobile first, elastic compute, cheap storage, and ever-increasing mountains of data.
In his blog post, Ben Stopford introduces the notion of the data dichotomy.
The Data Dichotomy
Data systems are about exposing data.
Services are about hiding it.
Microservices typically are domain driven, independently deployable bounded contexts.
- Microservices usually serve as an abstraction making data on the inside seem smaller on the outside.
- Data services usually function by making data on the inside seem larger on the outside.
How does one resolve the tension between these two use cases?
Answer: Rely on shared streams of data.
These shared streams typically consist of commands, domain events, platform events, infrastructure events, enriched events and, of course, business events that ultimately enable queries on key aspects of the business.
Business Events — a definition
What’s the difference between a domain event and a business event?
Put simply, a business event is a domain event whose domain is critical for running the business.
So what is a domain that is “critical for running the business”? Aren’t all domains critical? Yes, they are. However, because of microservices, product domains are usually focused on some operational or transactional concern. They solve for speed of development and, as an organization scales, optimize toward independently deployable microservices. Meanwhile, as the organization scales, analytical product teams naturally focus on delivering, well, analytics, so their analytics domains usually end up being some set of business events.
These business events are often sourced from one or more enriched upstream events (typically upstream domain events). This is because business events are usually not “observing” any state change or responding to some “command” directly; that is typically the responsibility of upstream domain events and the corresponding product teams (which own the corresponding aggregates). Business events instead take facts about what has already happened and enrich them. This makes business events ideal for optimizing analytics and reporting.
Within a company, there typically exists a handful of key performance indicators or business events that are used to describe the health of the business. Sometimes these streams get rolled up into a single metric (e.g., Netflix SPS). Other times, these streams are a set of several independent facts that together describe the collective health of a company.
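As a sketch of the “single rolled-up metric” idea, here is a hypothetical consumer that folds a stream of booking business events into one running KPI. The event shape and field names here are assumptions for illustration, not an actual schema:

```python
from typing import Iterable


def total_booking_value(events: Iterable[dict]) -> float:
    """Roll a stream of booking business events up into a single KPI."""
    return sum(e["amount"] for e in events)


# A toy stream of business events (illustrative only).
stream = [
    {"bookingId": "b-1", "amount": 250.0},
    {"bookingId": "b-2", "amount": 400.0},
]

print(total_booking_value(stream))  # 650.0
```

In a real deployment this fold would run continuously over a stream (e.g., a Kafka topic) rather than an in-memory list, but the shape of the computation is the same.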
Business Events — a case study
To further clarify the distinction between business events and domain events, consider the following vacation rental example.
Let’s assume that we’ve already come up with a domain via some method (such as event storming).
The booking domain event might look like this:
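(The original event payload isn’t reproduced here, but a minimal sketch of what a Booking domain event might carry could look like the following. All field names are illustrative assumptions, not the actual schema.)

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Booking:
    """Hypothetical Booking domain event emitted by the booking service."""
    eventId: str            # unique id of this event instance
    bookingRequestId: str   # id of the originating BookingRequested command event
    travelerId: str         # reference to the Traveler aggregate
    propertyId: str         # reference to the Property aggregate
    invoiceId: str          # reference to the Invoice aggregate
    occurredAt: str         # ISO-8601 timestamp of when the booking occurred


booking = Booking(
    eventId="evt-123",
    bookingRequestId="req-456",
    travelerId="trav-789",
    propertyId="prop-001",
    invoiceId="inv-002",
    occurredAt="2021-06-15T12:30:00Z",
)
```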
The bookingRequestId is an id referencing the originating booking request.
The BookingRequested event is an example of a command event that triggers the Booking domain event to “spring” into existence.
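Similarly, a sketch of what the BookingRequested command event might look like (again, field names are assumptions for illustration):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookingRequested:
    """Hypothetical BookingRequested command event that triggers a Booking."""
    bookingRequestId: str  # id later referenced by the Booking domain event
    travelerId: str
    propertyId: str
    checkIn: str           # requested check-in date (ISO-8601)
    checkOut: str          # requested check-out date (ISO-8601)
    requestedAt: str       # when the traveler issued the request


request = BookingRequested(
    bookingRequestId="req-456",
    travelerId="trav-789",
    propertyId="prop-001",
    checkIn="2021-07-01",
    checkOut="2021-07-08",
    requestedAt="2021-06-15T12:29:00Z",
)
```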
Great. So we have a domain event and a command event.
Here is where business events come in.
What if, as a business, you want to track the total booking value at any given moment? This may be a metric critical to the business. (Whether it actually is should be determined by business stakeholders; for the purposes of this post, let’s assume it is.)
These are some of the questions you might want to answer:
- What is the amount of booking value for the last day? Last month? YTD?
- What is the amount of booking value broken down by country? By traveler last name? Etc.?
These types of questions have historically been answered by analytics teams that are well versed in creating fact tables, dimension tables, and data systems that scale well for asking these types of questions.
In a scalable, microservice-oriented, stream-based world, these types of questions become very simple to answer.
Enter the business event.
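(The business event example itself isn’t reproduced here, but based on the observations that follow, a sketch of the enriched payload might look like this. Field names are illustrative assumptions.)

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookingBusinessEvent:
    """Hypothetical enriched business event for analytics on bookings."""
    bookingId: str
    bookingRequestId: str  # denormalized from the BookingRequested command event
    firstName: str         # denormalized from the Traveler domain event stream
    lastName: str          # denormalized from the Traveler domain event stream
    propertyState: str     # denormalized from the Property domain event stream
    propertyCountry: str   # denormalized from the Property domain event stream
    amount: float          # denormalized from the Invoice domain event stream
    year: int              # pre-calculated from the booking timestamp
    quarter: int
    month: int
    day: int


event = BookingBusinessEvent(
    bookingId="b-1",
    bookingRequestId="req-456",
    firstName="Ada",
    lastName="Lovelace",
    propertyState="TX",
    propertyCountry="US",
    amount=650.0,
    year=2021, quarter=2, month=6, day=15,
)
```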
At first glance, it looks a lot like “old-school” facts in a star-schema table. Technically, a business event enriches the upstream domain events to make queries and analytics efficient. In a sense, like traditional fact tables in a data warehouse, business events “de-normalize” data to optimize queries.
Here are some observations of this sample business event.
- It enriches the Booking domain event
- It is denormalizing fields of the BookingRequested command event.
- It is denormalizing fields from the Traveler domain event stream (not posted here), namely firstName and lastName.
- It is denormalizing fields from the Property domain event stream (not posted here), namely propertyState and propertyCountry.
- It is denormalizing fields from the Invoice domain event stream (not posted here), namely amount.
- Finally, it has some “pre-calculated” fields such as year, quarter, month, and day. These make it efficient for downstream processing and analytics to “roll up” values according to reporting requirements.
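The pre-calculated fields can be derived once, at enrichment time, from the event’s timestamp. A minimal sketch in Python:

```python
from datetime import datetime


def precalc_date_fields(occurred_at: str) -> dict:
    """Derive year/quarter/month/day rollup fields from an ISO-8601 timestamp."""
    ts = datetime.fromisoformat(occurred_at.replace("Z", "+00:00"))
    return {
        "year": ts.year,
        "quarter": (ts.month - 1) // 3 + 1,  # months 1-3 -> Q1, 4-6 -> Q2, ...
        "month": ts.month,
        "day": ts.day,
    }


print(precalc_date_fields("2021-06-15T12:30:00Z"))
# {'year': 2021, 'quarter': 2, 'month': 6, 'day': 15}
```

Computing these once, on write, is the classic denormalization trade-off: a few redundant bytes per event in exchange for never re-deriving them in every downstream query.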
The biggest takeaway is that the business event, unlike the source domain and command events, really exists to enrich the Booking domain event for the purposes of analytics.
Specifying Business Events
So, what goes into (and doesn’t go into) a business event?
The answer is: it depends.
One of the best ways to start is to identify the analytics use case. Whatever the use case, whether it is programmatic, analytical, data science, or machine learning, it typically requires some set of similar queries. If you have several different types of queries, you may end up with either a “chain of business events” or a “fan-out of business events”. The actual specifics will vary from use case to use case.
For example, maybe the downstream use case above is a generic index to look up all bookings by amount. In such a case, you may not want to pre-calculate “year, quarter, month, day”. You could just index everything into Elasticsearch, since it has “year”, “month”, and “day” rollups built in via date histogram aggregations.
However, you might want to put this into a real-time, time-series cube like Druid. Adding in “year, quarter, month, day” there makes that system incredibly fast and useful for slicing and dicing in real time.
Therefore, what goes into the business events will vary from use case to use case.
- A business event is a domain event whose domain is critical for running the business.
- Business events enrich one or more source events and are optimized for their respective analytic use cases.