Azure Event Hubs — Azure Databricks

Let’s implement a quick and simple IBOR scenario using Azure Event Hubs and Azure Databricks.

Laurent Bananier
4 min read · Aug 15, 2020

IBOR (Investment Book of Record) has been used for a while in some OMS (Order Management Systems), data management solutions specialized in finance, and back-office systems. Front and back offices have long faced the challenge of reconciling all events, providing real-time information, and keeping the solution flexible.

An IBOR solution allows portfolio analysis in real time. The goal is to be able to consume any event and evaluate its possible impact on a portfolio.

More information about IBOR can be found here:

https://www.cutterassociates.com/cutter-advantedge/issue-2014-9.cfm

We could summarize the process as below:

For the sake of simplicity, let’s just focus on a scenario based only on transaction events to calculate positions in real time for a specific portfolio.

We will build a simple case with Azure Event Hubs and Azure Databricks.

Our target design is as follows:

What is required?

  1. A mechanism to send events to Azure Event Hubs
  2. Azure Event Hubs resource for ingestion
  3. Azure Databricks resource for processing

Sending Events

In order to send events to Azure Event Hubs, let’s modify an example available from the Microsoft documentation:

The example, with modifications, is available on GitHub.

It allows the transactions to be sent to Event Hubs.

Transaction events use the following JSON format:
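The sender in the linked sample is a .NET console application. As an equivalent minimal sketch, here is what sending one such event could look like in Python with the azure-eventhub SDK; the field names, values, and event hub name are illustrative assumptions, not the article's exact schema.

```python
import json
from azure.eventhub import EventHubProducerClient, EventData

# Placeholders -- use the connection string saved from the Azure portal.
CONNECTION_STR = "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=<key>"
EVENT_HUB_NAME = "ibor-transactions"  # assumed event hub name

# Illustrative transaction event; the real schema may differ.
transaction = {
    "portfolioId": "PF-001",
    "ticker": "MSFT",
    "side": "BUY",
    "quantity": 100,
    "price": 210.50,
    "timestamp": "2020-08-15T10:00:00Z",
}

producer = EventHubProducerClient.from_connection_string(
    CONNECTION_STR, eventhub_name=EVENT_HUB_NAME
)
try:
    # Events are sent in batches; here the batch holds a single transaction.
    batch = producer.create_batch()
    batch.add(EventData(json.dumps(transaction)))
    producer.send_batch(batch)
finally:
    producer.close()
```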

Ingest Events

Let’s create an event hub from the Azure portal.

First, create a resource group:

Choose the region closest to your location.

Then create an event hubs namespace:

The next step is to create the event hub itself:

Then save the value of the connection string key. It will allow events to be sent from the .NET application to the event hub, and consumed from Azure Databricks.
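The steps above use the Azure portal. For reference, here is a minimal sketch of the same provisioning done programmatically with the azure-mgmt Python SDKs; the subscription ID, resource names, and region are placeholders (namespace names must be globally unique).

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.eventhub import EventHubManagementClient

subscription_id = "<subscription-id>"  # placeholder
credential = DefaultAzureCredential()

# 1. Resource group, in the region closest to your location.
resources = ResourceManagementClient(credential, subscription_id)
resources.resource_groups.create_or_update("ibor-rg", {"location": "westeurope"})

# 2. Event Hubs namespace (the name must be globally unique).
eventhubs = EventHubManagementClient(credential, subscription_id)
eventhubs.namespaces.begin_create_or_update(
    "ibor-rg", "ibor-ns",
    {"location": "westeurope", "sku": {"name": "Standard", "tier": "Standard"}},
).result()

# 3. The event hub itself.
eventhubs.event_hubs.create_or_update(
    "ibor-rg", "ibor-ns", "ibor-transactions",
    {"partition_count": 2, "message_retention_in_days": 1},
)

# 4. Retrieve the connection string -- this is the value to save.
keys = eventhubs.namespaces.list_keys(
    "ibor-rg", "ibor-ns", "RootManageSharedAccessKey"
)
print(keys.primary_connection_string)
```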

Process Events

Next, an Azure Databricks resource needs to be created:

After creating the Azure Databricks resource, under the resource group previously created, there should be the following items:

Databricks configuration

Within Databricks, a cluster has to be created. For this example, the Event Hubs connector library (the azure-eventhubs-spark package from Maven) must be attached to the cluster to establish a connection to Azure Event Hubs.

Next, create the notebook in Azure Databricks for processing events from the event hubs.

Below is the notebook created:

The full notebook is available on GitHub: Full IBOR notebook.
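As a minimal sketch of the notebook's connection and parsing step, assuming the azure-eventhubs-spark connector is attached to the cluster and reusing the illustrative event schema from the sender sketch above (the full notebook on GitHub has the complete logic):

```python
from pyspark.sql.functions import col, from_json
from pyspark.sql.types import DoubleType, IntegerType, StringType, StructType

# Connection string saved earlier; EntityPath must name the event hub.
conn = ("Endpoint=sb://<namespace>.servicebus.windows.net/;"
        "SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=<key>;"
        "EntityPath=ibor-transactions")

# The connector expects the connection string to be encrypted.
eh_conf = {
    "eventhubs.connectionString":
        sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(conn),
}

# Schema matching the illustrative transaction event from the sender sketch.
schema = (StructType()
          .add("portfolioId", StringType())
          .add("ticker", StringType())
          .add("side", StringType())
          .add("quantity", IntegerType())
          .add("price", DoubleType())
          .add("timestamp", StringType()))

# Read the stream and parse each event body from JSON into columns.
raw = spark.readStream.format("eventhubs").options(**eh_conf).load()
transactions = (raw
                .select(from_json(col("body").cast("string"), schema).alias("t"))
                .select("t.*"))
```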

After sending a few events with the .NET application, the portfolio valuation (cash and asset positions) should be impacted. While the corresponding cell of the notebook is running, each message sent triggers a recalculation of the portfolio valuation (cash and asset positions).
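As a hedged sketch of what such a cell could compute (not the article's exact notebook code), here is one way to derive cash and asset positions from the parsed transactions stream above:

```python
from pyspark.sql.functions import col, sum as sum_, when

INITIAL_CASH = 1_000_000.0  # the portfolio's initial funding in the example

# Buys add to the asset position and consume cash; sells do the opposite.
signed = (transactions
          .withColumn("signed_qty",
                      when(col("side") == "BUY", col("quantity"))
                      .otherwise(-col("quantity")))
          .withColumn("cash_flow", -col("signed_qty") * col("price")))

# Asset position per instrument.
positions = (signed.groupBy("portfolioId", "ticker")
             .agg(sum_("signed_qty").alias("position")))

# Cash position per portfolio.
cash = (signed.groupBy("portfolioId")
        .agg((sum_("cash_flow") + INITIAL_CASH).alias("cash")))

# In a Databricks notebook, display() renders streaming aggregates and
# refreshes them as new events arrive.
display(positions)
display(cash)
```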

In the example, the portfolio is initially funded with $1,000,000; after sending 16 events simulating buy and sell transactions, Azure Databricks shows the resulting positions.

Conclusion

This scenario shows how simply Azure Event Hubs and Azure Databricks can be used together. More advanced scenarios can be built on the same foundation.

I hope it was useful!

References

https://docs.databricks.com/spark/latest/structured-streaming/streaming-event-hubs.html
