Real-time analytics from event sourcing
This post was written alongside Juan López López
At Clevergy, we have always followed one strategy: to ensure that any event on the platform can provide raw and processed real-time information that can be visualized, transformed, and managed by everyone in the company (tech, business, product, etc.). This strategy also means avoiding any development work just to make data available: we delegate transport and persistence to the data platform, and we avoid the heavy classical ETL processes that require knowledge of third-party applications.
In many companies there is a data team processing information, but this team often lacks context about data changes and is therefore prone to mistakes. To solve this problem, and to ease the flow of information among everyone in the company, at Clevergy we decided to adopt a strategy of data democratization.
Through a model based on the Data Mesh idea, each domain exports every creation or modification event for its entities from the operational plane (the application's operational services) to the analytical plane, making all of the platform's raw information available in the data warehouse.
Our short but intense story about domain events
We have focused on agile development from the very beginning of the platform's design, which lets us ship business functionality as quickly as possible, accepting the technical debt generated along the way in exchange for testing features with users early. We have always asked ourselves when each piece of software should be refactored, weighing the platform's growth and projected needs. Regarding event design, the application started with an entity-action event design, which we recently refactored into an entity-based model with an operation field.
Event-based architectures can become unwieldy and difficult to manage when dealing with large amounts of data or complex systems. By switching to an entity-based design, it may be easier to organize and analyze the data, implement changes and updates, and generate insights and analytics. In addition, entities can be separated for security reasons and published to different topics, allowing for modularity and reuse of components across the system. High-throughput events can also be published to dedicated topics for optimal performance.
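To illustrate the difference, here is a minimal Python sketch (the entity, field, and operation names are hypothetical, not Clevergy's actual schemas): an entity-action design defines one event type per entity/action pair (`UserCreated`, `UserUpdated`, ...), while the entity-based design keeps a single event type per entity and carries the action in an `operation` field.

```python
from dataclasses import dataclass, field
from enum import Enum

class Operation(Enum):
    CREATED = "CREATED"
    UPDATED = "UPDATED"
    DELETED = "DELETED"

# Entity-based design: a single event type per entity. All changes to
# an entity flow through one topic and land in one analytics table,
# instead of multiplying event types and topics per action.
@dataclass
class UserEvent:
    user_id: str
    operation: Operation
    payload: dict = field(default_factory=dict)

event = UserEvent(user_id="u-123", operation=Operation.UPDATED,
                  payload={"email": "new@example.com"})
```

With this shape, adding a new action to an entity means adding an enum value, not a new event type, topic, and table.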
On the analytics side, we have always tried to dump every single event on the platform into an analytics database, so that the full history of each business entity is available. With the current entity-based topic design, information is naturally grouped by domain entity, making it more easily accessible.
Domain-specific information is preserved across the entire pipeline, from the operational plane to the analytical plane. However, the system must remain resilient to change, so we use Protobuf schemas as the contract for each event. This allows us to version the models and decouple the infrastructure layers.
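In our setup the contract is a Protobuf schema; purely to illustrate the idea of a versioned, self-describing event, here is a JSON envelope sketch (the field names and version tag are assumptions):

```python
import json

SCHEMA_VERSION = "v1"  # hypothetical version tag for the contract

def to_envelope(entity: str, operation: str, payload: dict) -> bytes:
    # A versioned envelope lets consumers route on entity and version
    # without parsing the payload, which decouples the transport and
    # storage layers from changes in the domain models.
    return json.dumps({
        "schema_version": SCHEMA_VERSION,
        "entity": entity,
        "operation": operation,
        "payload": payload,
    }).encode("utf-8")

message = to_envelope("user", "UPDATED", {"email": "new@example.com"})
```

A consumer on an old schema version can keep reading old fields while new producers add fields under a new version, which is exactly the evolution guarantee Protobuf gives us in production.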
Data-informed decision making
A data-informed approach enables organizations to make decisions based on data insights rather than relying purely on intuition or assumptions, but also without letting the data alone dictate every choice. We use data to understand user behavior, improve user experience, optimize performance, and create better product offerings. In short, Clevergy treats data as a critical input for decision-making and development.
Every time a domain entity changes, an event is published to the platform and dumped into an analytics database. Each event ends up in a table in our data warehouse, organized by domain entity. On top of this, we have created real-time dashboards for every important business metric. User conversion, application usage metrics, module usage metrics, user behavior patterns, and more have been turned into dashboards by the business users themselves, allowing us to measure the impact of changes and set guidelines for future feature development.
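As a sketch of the kind of metric these entity tables make trivial to compute (the rows and the CREATED/ACTIVATED operations are hypothetical, not our real schema):

```python
def conversion_rate(events: list[dict]) -> float:
    # events: rows dumped from an entity table in the warehouse, each
    # carrying the operation that produced it. The hypothetical metric
    # compares user sign-ups against activations.
    created = sum(1 for e in events if e["operation"] == "CREATED")
    activated = sum(1 for e in events if e["operation"] == "ACTIVATED")
    return activated / created if created else 0.0

rows = [
    {"operation": "CREATED"}, {"operation": "CREATED"},
    {"operation": "CREATED"}, {"operation": "ACTIVATED"},
]
rate = conversion_rate(rows)  # 1 activation over 3 creations
```

In practice this is a one-line SQL aggregation over the entity table, which is why business users can build these dashboards themselves.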
Free IT Crowd
Data is a valuable asset that can provide critical insights into customers, markets, and operations. However, analyzing and interpreting data can be a time-consuming and complex task, which is a challenge for small businesses with limited resources. Our solution is to empower the whole organization to do its own data analysis by learning SQL. That way, the always-busy tech staff is no longer a blocker, which improves the overall efficiency of the organization.
One of the primary advantages of giving product staff autonomy over data analysis is the ability to leverage their domain expertise. Business and product people are often the ones who understand the data best, as they have direct experience with users, markets, and operations. By empowering them to analyze the data, a business gains more accurate and relevant insights that can drive informed decision-making.
When everybody is able to analyze data, they are more likely to share their insights and collaborate with colleagues across departments. This can lead to new ideas and innovations that can drive business growth and competitiveness.
Real-time analytics as part of our observability strategy
Observability refers to the ability to measure, monitor, and understand the internal state of a system or application from its external outputs and behaviors. An exception or an error in the code is not the only sign of trouble: if no new users are arriving, or no users are performing transactions at a given time, the platform is probably experiencing a problem too.
For us, the value of real-time data is not real-time analysis of features (features need some time before you can tell whether they work well for users). The value of real-time data is in telling us whether the system is working properly.
The next step is to build an alerting system that notifies us when business events go missing, just as we are alerted on code errors. For now, we check this manually.
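Such a check could be as simple as counting recent events against a threshold. A minimal sketch, with hypothetical names and thresholds:

```python
from datetime import datetime, timedelta

def missing_events_alert(timestamps, window_minutes=60, min_events=1,
                         now=None):
    # Fire an alert when fewer than `min_events` business events arrived
    # within the last `window_minutes`: the *absence* of events can
    # signal a broken pipeline even when no exception is thrown.
    now = now or datetime.utcnow()
    cutoff = now - timedelta(minutes=window_minutes)
    recent = [t for t in timestamps if t >= cutoff]
    return len(recent) < min_events

now = datetime(2023, 1, 1, 12, 0)
quiet = [datetime(2023, 1, 1, 10, 0)]     # last event two hours ago
healthy = [datetime(2023, 1, 1, 11, 30)]  # event within the last hour
alert_on_quiet = missing_events_alert(quiet, now=now)      # True
alert_on_healthy = missing_events_alert(healthy, now=now)  # False
```

The real thresholds would vary per event type: a "user created" silence of an hour might be normal at night but alarming at noon.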
GCP as a key part of the strategy
Google Cloud Platform (GCP) is a cloud computing service that provides a wide range of tools for building, deploying, and managing applications and services. Two of the most valuable services offered by GCP are Pub/Sub and BigQuery, which allow users to store, process, and analyze large amounts of data in real time. The combination of these services can provide significant advantages for storing and analyzing event data produced by changes in the domain entities.
Any good, agile data strategy relies on a data platform that facilitates the work and avoids extra development. In our case, we use Google's cloud infrastructure to deploy our services and connect Pub/Sub topics to an analytics database in BigQuery via subscriptions. We version the models behind the topics and link each event schema to its BigQuery table. Every update to a domain model is available in the table of the same name, and we keep the nomenclature consistent between topics and tables to ease maintenance.
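A sketch of the kind of naming convention this implies (the prefixes are assumptions, not our actual resource names): the topic and the table share the entity's name, so the mapping needs no lookup table and no glue code.

```python
def topic_for(entity: str) -> str:
    # Pub/Sub topic carrying an entity's domain events.
    return f"domain-events.{entity}"

def table_for(entity: str) -> str:
    # BigQuery table fed by the topic's subscription. Reusing the
    # entity name makes the topic-to-table mapping self-evident.
    return f"analytics.{entity}"

topic = topic_for("user")  # "domain-events.user"
table = table_for("user")  # "analytics.user"
```

With Pub/Sub's BigQuery subscriptions, each topic writes straight into its table, so the convention above is the only "configuration" a new entity needs.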
Conclusions
At Clevergy, we have sought and tested a way to obtain data from the platform that allows us to make business and development decisions.
To summarize the main ideas we have talked about:
- Entity-based design models with a topic by entity have better scalability.
- A data-informed decision strategy is perfect for building a product in an agile way, but, obviously, you need the data. And, ideally, democratized data.
- SQL is an incredible weapon to empower everybody to understand how the software is working and to think of a good product strategy.
- Real-time data is a key part of the observability strategy.
- GCP has an incredible suite to work with data.
We have tested this strategy over the last year, and we are proud to share this way of working, which can improve the performance and speed of similar projects that want to adopt it.