
Aggregating Provenance Blockchain Data: A Developer’s Story

6 min read · Apr 27, 2022


Rise & Shine — The Requirements

As you take another sip of your coffee, you read the message that just came in, asking whether you can collect block data from block height one to current, specifically transactions, from the Provenance Blockchain and massage that data so that it is reportable to non-technical stakeholders. To check how much data you will be dealing with, you open the forty-third tab in your browser and pull up the current block height on the Provenance Blockchain Explorer. Seeing that it is now well above a million, you take another sip of coffee, acknowledge that you have accepted the challenge, squint your eyes, and lightly murmur under your breath, "Big data". You respond to your manager with a thumbs-up emoji, then get up for a quick break.

Before Noon Break — The Event Stream

As we begin to piece together the requirements, what we do know is that we are building something transaction-based, so our main focus is listening for transaction events at every block height, which we can read more on here and here in the Provenance Blockchain docs.

By now we have done a good amount of reading and are ready to code. You crack open a can of La Croix while browsing the Provenance Blockchain GitHub and find a repository called event-stream. As the project name suggests, and after a quick scroll through the README, you are confident that this is the library you need to stream events from the Provenance Blockchain.

You squint your eyes with a light smirk and whisper, just loud enough for anyone sitting a few feet away to hear, "I'm hungry, what's for lunch?".

Afternoon — The Setup

After a satisfying catered lunch, you read through event-stream's README again, more thoroughly this time, and notice that it already provides easy-to-use, out-of-the-box code:

// Decode the streamed payloads from JSON with Moshi
val decoderAdapter = moshiDecoderAdapter()

// Stream blocks from height 1 to the current head (to = null), logging each height
blockDataFlow(netAdapter, decoderAdapter, from = 1, to = null)
    .onEach { log.info { "received: ${it.height}" } }
    .collect()

Taking another sip from a newly opened can of La Croix, you notice there are some prerequisites before you can take advantage of the block data flow: establishing a net adapter, which represents your WebSocket connection to the node, and the decoder adapter, which decodes the JSON payloads coming off the node. As you tear into your Clif bar, you notice that event-stream already provides default factories for the netAdapter and decoderAdapter; you only need to supply the variables, such as the node address and custom options if you want to stream between specific block heights. You quickly check the time, lift your arms for a quick stretch, and decide to take a walk to refresh your brain for the next steps.
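Pieced together, the setup might look something like this minimal sketch. The okHttpNetAdapter factory and the shutdown call are assumptions based on the library's README, and the node address and height window are illustrative, so verify the names against the event-stream version you pull in:

val log = KotlinLogging.logger {}

// WebSocket connection to a node (illustrative address) and a Moshi-based JSON decoder
val netAdapter = okHttpNetAdapter("rpc.test.provenance.io:443")
val decoderAdapter = moshiDecoderAdapter()

// Stream a specific window of heights instead of from one to current
blockDataFlow(netAdapter, decoderAdapter, from = 1_000_000, to = 1_000_100)
    .onEach { log.info { "received: ${it.height}" } }
    .collect()

// Close the underlying connection once the flow completes
netAdapter.shutdown()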

After The Afternoon Walk — The Data

Once you are back from your refreshing walk, you return to your desk with another ice-cold La Croix in hand, a different flavor this time, and study the return type of blockDataFlow, which is simply the data class BlockData. We can then use the BlockResultsResponseResult to collect the transaction events we want and filter them into more concrete data classes that you define yourself. The engine that drives this service seems to come entirely from this single Flow call. Simple enough; time for a bathroom break.
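For example, a sketch of that filtering step might look like the following, where TxEventRecord is a data class of our own design; the txsResults and events accessors are assumptions mirroring the Tendermint block_results shape, so double-check them against the generated model classes:

// A concrete, reportable shape that we own
data class TxEventRecord(val height: Long, val eventType: String)

// Flatten each block's transaction events into our own records
// (accessor names are assumed; verify against BlockResultsResponseResult)
fun toRecords(block: BlockData): List<TxEventRecord> =
    block.blockResult.txsResults.orEmpty()
        .flatMap { it.events.orEmpty() }
        .map { TxEventRecord(block.height, it.type) }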

Back At It — The Storage

The next issue that comes to mind is dealing with the sheer amount of data and where to store it. Ideas begin to fly in and out of your head, such as a data warehouse, specifically one that is cheap and that we do not have to manage ourselves. Alternatively, we could set up a database on a cluster, but if owning a data warehouse isn't too expensive and is one less thing to manage, that seems the better route. Of course, we need to make sure our code has the option to switch back and forth in case requirements change in the future. You nod your head in agreement with yourself.
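One way to keep that option open is to hide the destination behind a small interface so the streaming code never cares where records land; the names here are purely illustrative:

// A thin seam between the stream and the storage
interface TxEventSink {
    suspend fun write(records: List<TxEventRecord>)
}

// One implementation drops batches into an object store for a warehouse to ingest...
class ObjectStoreSink : TxEventSink {
    override suspend fun write(records: List<TxEventRecord>) { /* upload a CSV batch */ }
}

// ...another writes rows straight into a self-managed database
class DatabaseSink : TxEventSink {
    override suspend fun write(records: List<TxEventRecord>) { /* insert rows */ }
}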

Afternoon Snack — The Less Configuration

As you work through a bag of lightly salted cashews with another cup of freshly brewed coffee, you finally have a design in mind. We know that many cloud data warehouse providers have connectors to cloud providers such as AWS. We want our already-transformed Provenance Blockchain data in a simple format (CSV or JSON) so that we can keep it in a cheap data store like S3 for the data warehouse to automatically pick up and load. Since we are dealing with tons and tons of data, S3 also gives us a cheap backstop: if data is ever lost in the warehouse, we can reload it from the bucket instead of reprocessing it from the service side.
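A rough sketch of that S3 drop-off, using the AWS SDK for Java v2 with illustrative bucket and key names:

import software.amazon.awssdk.core.sync.RequestBody
import software.amazon.awssdk.services.s3.S3Client
import software.amazon.awssdk.services.s3.model.PutObjectRequest

// Render a batch of records as CSV and drop it in S3 for the warehouse to pick up
fun uploadBatch(s3: S3Client, bucket: String, records: List<TxEventRecord>) {
    if (records.isEmpty()) return
    val csv = buildString {
        appendLine("height,event_type")
        records.forEach { appendLine("${it.height},${it.eventType}") }
    }
    val key = "tx-events/${records.first().height}-${records.last().height}.csv"
    s3.putObject(
        PutObjectRequest.builder().bucket(bucket).key(key).build(),
        RequestBody.fromString(csv)
    )
}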

Lastly, we don't want to deal with a whole bunch of configuration, especially when it involves configuring our cloud service. If we can throw all of the component configuration into a CI/CD pipeline to automate, then we can top this project off with a chef's kiss. With cloud technology being so popular, we are given a handful of frameworks to work with, such as Serverless, Zappa, and Terraform, to name a few. Munch munch.

C.O.B — The Aggregate Service

As you look at the time and notice it's about the end of the day, you look for a good stopping point and take notes on what you plan to tackle tomorrow. Right before you are about to call it a day, you take one more look at the Provenance Blockchain GitHub, because you happen to have it open, and notice a repository with a familiar-sounding name that was freshly updated.

You click on it only to find that everything you had planned for this data aggregation is already done. You flip your entire desk in your mind, let out a big sigh, and mutter a word that rhymes with luck.

Essentially, you can fork this repository and make it your own. The service is easily customizable and runs as a simple jar; a developer would only need to add or modify the data models found here. The project currently uses the Serverless framework, with an example here, but of course this is totally based on preference. When it comes to the data warehouse, you can either go the self-managed route with this, where you modify what you want or need to store, or go the data warehouse path, where you just feed the data into S3 and let the warehouse pick up from the bucket.

The End.


ROBERT CHAING

Robert is a Software Engineer on the Research & Development team at Figure. If he wasn't doing what he is doing now, he would be on a boat fishing.
