EventSync: the missing piece of event-driven management.

Event-driven architecture has become very popular in recent years, especially on cloud platforms with their elasticity, led by serverless products.

Even though there are many different event management products on Google Cloud, one piece is missing. But before getting to the pain point, I would like to recap the basics.

Actually, events are pretty simple:

  • Something occurs somewhere

The event production

On Google Cloud, services like Cloud Storage or Firestore can generate native events.

Many others cannot, but Eventarc was designed and released to listen to the audit logs and to generate events when a pattern matches.

The event consumption

Before reaching the consumer, the events are transported through a channel. The medium used on Google Cloud is Cloud PubSub.

Cloud PubSub can deliver messages, the events in this case, by HTTP push. Therefore Cloud Run, Cloud Functions, App Engine and, in fact, any other HTTP server can receive and consume the events.
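
As an illustration, here is a minimal Go sketch, not part of EventSync, of an HTTP service that accepts PubSub push deliveries; the envelope follows the documented PubSub push format, while the /event path and port handling are just placeholders for a Cloud Run deployment.

```go
// main.go — a minimal sketch of an HTTP consumer for PubSub push deliveries,
// deployable on Cloud Run or any other HTTP runtime.
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"os"
)

// pushEnvelope mirrors the JSON envelope PubSub wraps around pushed messages.
type pushEnvelope struct {
	Message struct {
		Data       []byte            `json:"data"` // base64 string, decoded by encoding/json
		Attributes map[string]string `json:"attributes"`
		MessageID  string            `json:"messageId"`
	} `json:"message"`
	Subscription string `json:"subscription"`
}

func handleEvent(w http.ResponseWriter, r *http.Request) {
	var env pushEnvelope
	if err := json.NewDecoder(r.Body).Decode(&env); err != nil {
		http.Error(w, "bad push envelope", http.StatusBadRequest)
		return
	}
	log.Printf("event %s received: %s", env.Message.MessageID, string(env.Message.Data))
	// A 2xx response acknowledges the message; any other status triggers a redelivery.
	w.WriteHeader(http.StatusNoContent)
}

func main() {
	port := os.Getenv("PORT") // Cloud Run injects the port to listen on
	if port == "" {
		port = "8080"
	}
	http.HandleFunc("/event", handleEvent)
	log.Fatal(http.ListenAndServe(":"+port, nil))
}
```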

In case of high throughput, it's recommended to pull the messages from PubSub: developers get finer control over the message flow.
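
For comparison, here is a minimal pull consumer sketch using the cloud.google.com/go/pubsub client; the project and subscription IDs, and the flow control values, are placeholders to adjust to your own throughput.

```go
// pull.go — a sketch of a pull consumer with explicit flow control.
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub"
)

func main() {
	ctx := context.Background()
	client, err := pubsub.NewClient(ctx, "my-project")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	sub := client.Subscription("my-subscription")
	// Flow control: cap how many messages (and bytes) are processed concurrently.
	sub.ReceiveSettings.MaxOutstandingMessages = 100
	sub.ReceiveSettings.MaxOutstandingBytes = 10 * 1024 * 1024
	sub.ReceiveSettings.NumGoroutines = 4

	err = sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
		log.Printf("event received: %s", string(m.Data))
		m.Ack() // acknowledge once processed; Nack() to ask for a redelivery
	})
	if err != nil {
		log.Fatal(err)
	}
}
```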

The fan-out pattern

One strength of PubSub is the fan-out, i.e. the ability to duplicate messages to several destinations.

In detail, the messages, or events, are published on a PubSub topic.
Then, on a topic, one or several subscriptions can be created.
The consumers get the messages from their subscription.

Every message published in a topic is duplicated in each subscription.

Like that, the fan-out pattern is achieved!
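
To make the pattern concrete, here is a small Go sketch, with placeholder project, topic and subscription IDs, that creates one topic and two subscriptions: a single published message is then delivered to both subscriptions.

```go
// fanout.go — a sketch of the fan-out pattern: one topic, several subscriptions,
// every published message duplicated in each subscription.
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub"
)

func main() {
	ctx := context.Background()
	client, err := pubsub.NewClient(ctx, "my-project")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// One topic...
	topic, err := client.CreateTopic(ctx, "orders")
	if err != nil {
		log.Fatal(err)
	}

	// ...and several subscriptions attached to it.
	for _, id := range []string{"billing-sub", "shipping-sub"} {
		if _, err := client.CreateSubscription(ctx, id, pubsub.SubscriptionConfig{Topic: topic}); err != nil {
			log.Fatal(err)
		}
	}

	// A single publish is duplicated: billing-sub and shipping-sub each get their own copy.
	res := topic.Publish(ctx, &pubsub.Message{Data: []byte(`{"orderId": 42}`)})
	if _, err := res.Get(ctx); err != nil {
		log.Fatal(err)
	}
}
```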

The missing fan-in pattern

However, the opposite pattern, i.e. the ability to wait for several events and to generate a new one once they are synchronized, is missing.

So, how do you synchronize several events on Google Cloud?

The event orchestration option

One solution is not to take the events on the fly, but to orchestrate all the operations, i.e. to always keep control of each step and each event in the runtime environment.

This solution implies managing the fan-out AND the fan-in in the same pipeline. That way, the pipeline knows exactly what it has created and therefore what it has to wait for.

You can find 2 great videos made by Martin Omander: one with Mete Atamel about “Choreography” and one with Guillaume Laforge about “Orchestration”.

Orchestration does not fit all use cases

As always, one size does not fit all, and it's the same with orchestration: that solution doesn't always work.
I will take a real world example from my company.

The Carrefour group operates in several countries.
Each country has its own local data platform.
The group also has its own data platform that aggregates the data from the different countries.

One of our use cases is to aggregate the products sold in our stores and to group them by supplier, to track which products have been sold, where, when, how many,…
The suppliers like this kind of aggregate and are ready to buy it!

Because our stores do not close at the same time, especially since they are in different time zones (between France and Brazil for example), we would like to perform the aggregation only once, when all the local data platforms have been updated with the daily data.

Here, the solution is to wait for an event from each local data platform, and, when all the events have been received, to trigger a new one to start the aggregation processing.

You can easily imagine that it is not possible to set up a global orchestrator that will run all the data ingestion in all countries, and then wait for completion!

Here is the missing piece; and EventSync fills the gap!

The EventSync Project

To solve the fan-in pattern, I always implement the same solution, based on Cloud Run and Firestore.

  • I log the events received on a dedicated endpoint in Cloud Firestore

The EventSync project is simply a generic, customizable and ready-to-use version of that implementation that I have repeated several times.
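
To give an idea of that recurring implementation, here is a simplified Go sketch of the fan-in pattern, not the actual EventSync code: incoming events are logged in Firestore, a check verifies that every expected source has reported within the observation period, and a sync message is then published on PubSub. Collection, topic and source names are illustrative.

```go
// fanin.go — a sketch of the fan-in pattern: log events, check completeness, publish a sync event.
package main

import (
	"context"
	"log"
	"time"

	"cloud.google.com/go/firestore"
	"cloud.google.com/go/pubsub"
	"google.golang.org/api/iterator"
)

var expectedSources = []string{"france", "brazil", "spain"} // one source per local data platform

// recordEvent logs an incoming event for a given source.
func recordEvent(ctx context.Context, fs *firestore.Client, source string, payload []byte) error {
	_, _, err := fs.Collection("events").Add(ctx, map[string]interface{}{
		"source":     source,
		"payload":    string(payload),
		"receivedAt": time.Now(),
	})
	return err
}

// allSourcesReported checks whether every expected source sent at least one event
// within the observation period.
func allSourcesReported(ctx context.Context, fs *firestore.Client, observationPeriod time.Duration) (bool, error) {
	cutoff := time.Now().Add(-observationPeriod)
	seen := map[string]bool{}
	iter := fs.Collection("events").Where("receivedAt", ">", cutoff).Documents(ctx)
	defer iter.Stop()
	for {
		doc, err := iter.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			return false, err
		}
		if s, ok := doc.Data()["source"].(string); ok {
			seen[s] = true
		}
	}
	for _, s := range expectedSources {
		if !seen[s] {
			return false, nil
		}
	}
	return true, nil
}

// publishSync emits the synchronization event that downstream consumers wait for.
func publishSync(ctx context.Context, ps *pubsub.Client) error {
	res := ps.Topic("event-sync").Publish(ctx, &pubsub.Message{Data: []byte(`{"status":"all sources ready"}`)})
	_, err := res.Get(ctx)
	return err
}

func main() {
	ctx := context.Background()
	fs, err := firestore.NewClient(ctx, "my-project")
	if err != nil {
		log.Fatal(err)
	}
	ps, err := pubsub.NewClient(ctx, "my-project")
	if err != nil {
		log.Fatal(err)
	}

	// On each incoming event (e.g. from an HTTP handler):
	if err := recordEvent(ctx, fs, "france", []byte(`{"status":"daily load done"}`)); err != nil {
		log.Fatal(err)
	}
	ready, err := allSourcesReported(ctx, fs, 6*time.Hour)
	if err != nil {
		log.Fatal(err)
	}
	if ready {
		if err := publishSync(ctx, ps); err != nil {
			log.Fatal(err)
		}
	}
}
```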

Event synchronization use case

The first purpose is to wait for different types of events and when all are received, an event sync message is generated.

The events must all be received within a defined time window (named observationPeriod).

Get events over a period use case

Sometimes, you don’t want an automatic trigger, but a tool that aggregates the events over an observationPeriod and delivers them on demand.

That’s the second use case. The /trigger API generates an event sync message on demand, even if the trigger conditions are not met.

How does it work?

The application is designed to run on Cloud Run. The container is publicly available here: gcr.io/gblaquiere-dev/eventsync
You can also use a specific version if you need one; refer to the GitHub tags section.

You must provide a JSON config in the CONFIG environment variable to define the event sources that you expect (also called Endpoints) and other app options.
Each endpoint's configuration contains an eventKey that determines the URL path to be reached by that event source, typically /event/<eventKey>.
The security is ensured by the Cloud Run platform and the IAM service.
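
As an illustration only, here is a Go sketch of reading such a config from the CONFIG environment variable; apart from eventKey and observationPeriod mentioned in this article, the field names and units below are assumptions, so refer to the project documentation for the real schema.

```go
// config.go — a sketch of parsing a CONFIG environment variable into endpoints.
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"
)

type endpoint struct {
	EventKey string `json:"eventKey"` // the event source calls /event/<eventKey>
}

type config struct {
	ObservationPeriod int        `json:"observationPeriod"` // assumed to be in seconds
	Endpoints         []endpoint `json:"endpoints"`
}

func main() {
	var cfg config
	if err := json.Unmarshal([]byte(os.Getenv("CONFIG")), &cfg); err != nil {
		log.Fatalf("invalid CONFIG: %v", err)
	}
	// One URL path per declared event source.
	for _, e := range cfg.Endpoints {
		fmt.Printf("expecting events on /event/%s\n", e.EventKey)
	}
}
```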

Be sure to grant the correct permissions to the Cloud Run runtime service account (Firestore, Firestore admin and PubSub publisher), then deploy the app.

More details in the project documentation

To send events to the endpoints, you have to configure your event sources:

  • With PubSub, configure the push subscription to the HTTP URL (as sketched below)

Pretty convenient!
However, be sure to grant the correct permission (roles/run.invoker) in the event source environment to be able to reach the EventSync Cloud Run service, and therefore the endpoints.
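
For instance, a push subscription can be wired to an EventSync endpoint with an OIDC token, so that the pushing identity holds roles/run.invoker; in this sketch the project, topic, subscription, Cloud Run URL and service account are placeholders.

```go
// push_sub.go — a sketch of a push subscription targeting an EventSync endpoint with OIDC auth.
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub"
)

func main() {
	ctx := context.Background()
	client, err := pubsub.NewClient(ctx, "my-project")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	topic := client.Topic("france-daily-load")
	_, err = client.CreateSubscription(ctx, "france-to-eventsync", pubsub.SubscriptionConfig{
		Topic: topic,
		PushConfig: pubsub.PushConfig{
			// The path matches the eventKey declared in the CONFIG.
			Endpoint: "https://eventsync-xxxxx-ew.a.run.app/event/france",
			AuthenticationMethod: &pubsub.OIDCToken{
				// This service account must have roles/run.invoker on the Cloud Run service.
				ServiceAccountEmail: "pubsub-pusher@my-project.iam.gserviceaccount.com",
			},
		},
	})
	if err != nil {
		log.Fatal(err)
	}
}
```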

As output, when an event sync message is generated, all the events in the observationPeriod are included in the output message, grouped by endpoint.

The consumer app can then extract the useful values from that structure: in our case, an integration report on the quality and freshness of the data sources.
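
As a purely hypothetical illustration of such a consumer, the sketch below assumes a sync message shaped as a map of endpoint names to their events; the real schema is described in the project documentation.

```go
// consumer.go — a sketch of a consumer of the sync message; the payload shape is an assumption.
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// syncMessage is a hypothetical shape: one entry per endpoint, each holding the
// raw events received during the observationPeriod.
type syncMessage map[string][]json.RawMessage

func main() {
	payload := []byte(`{
		"france": [{"status": "daily load done"}],
		"brazil": [{"status": "daily load done"}]
	}`)

	var msg syncMessage
	if err := json.Unmarshal(payload, &msg); err != nil {
		log.Fatal(err)
	}
	// Build, for instance, an integration report on freshness per source.
	for endpoint, events := range msg {
		fmt.Printf("%s sent %d event(s) during the observation period\n", endpoint, len(events))
	}
}
```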

More details in the project documentation

EventSync is here now, for free!

You now have a solution for your fan-in use cases. It could help you in many situations.

By default, it costs nothing. The free tiers of Cloud Run, Firestore and PubSub largely cover the cost of this app.

I have already planned several updates and increments to the product (see below).
If it doesn't fit your expectations, don't hesitate to open an issue on GitHub to discuss it for a future implementation.
Or even to contribute to the project!

Known limitations

The limitations are documented here

Advanced config and miscellaneous

Here is a list of current and future features of the service:

  • Automatic trigger deactivation (Included): in that case, only the /trigger API can deliver events to the target
