Professional Đapp Architecture

Building Scalable and Upgradeable Systems

Last week, while in Toronto for the ‘Blockchain Futurist Conference’, I had the opportunity to sit down with a few Toronto-based blockchain companies and work through some of the architectural issues that they were dealing with.

Toronto was very humid, if you can’t tell from my shirt

While there were some companies with a pretty firm handle on their tech (Ex: Horizon Games), I found myself gobsmacked at how little some of the better-funded and well-known ventures had invested in their technical architecture. This isn’t too surprising, though, as most of the developers working for these businesses were fresh out of college and had little to no experience building scalable or performant systems.

In fact, one of the developers told me that the rest of the technical team thought of his lack of development experience outside of blockchain as a strength because he had not been sullied by the ‘old way’ of doing things — this was the most absurd thing that I had heard the entire week.

Regarding the technical architecture of these businesses, one of the most glaring problems (at its root) was the conflation of data storage with business logic — this is akin to storing the database in the programs that access and manipulate it. The difficulty with this approach is that it often provides little flexibility for the architecture to grow and evolve over time — and I can’t say that I know of many large systems that were built perfectly the first time.

Note: Some prerequisite CS knowledge is needed to follow through the technical parts of the article. If you’ve used Redux before, this should be a breeze.

In this article, I will be explaining a foundational architecture that I recommend to many of the businesses in the space due to its simplicity, modularity, and platform agnosticism. As an added benefit, the event-sourced design is closely related to Redux, which makes it familiar to many existing JavaScript developers. Here is a generalized / high-level diagram of what we will be reviewing:

Simple!

Now, nothing is a silver bullet, but this architecture can likely serve most of the needs you will have building your decentralized applications.

Inspiration

This architecture was developed with the intention of addressing the following problems:

  • Latency (On-chain confirmation time can hinder user experience)
  • Cost (Storing data on Ethereum is expensive)
  • Flexibility / Upgradability (Upgrading smart contracts can be difficult)

Before I get into how each of these is addressed, I’m going to give a brief overview of Relational Databases and Event Sourcing (how data is persisted within this system).

Relational Databases

If I were to ask you what your personal data looked like as it was stored in the databases of all the companies of which you are a customer, you might think of yourself as a number of entries in a database. Let’s take your account at Wells Fargo, for instance. Maybe you look something like this to them:

And there’s certainly more that we can add here, the idea being that we have separate entities and relate them using unique identifiers (foreign and primary keys). This is known as a relational database. Entities defined in these databases usually have a schema that denotes the fields and the other entities associated with each one.

Drawbacks

And these are great, but when you execute an update to an entity (such as changing the name), you are simply updating a row in a table. So, what if I wanted to know the complete history of a customer’s name? Well, implementers of relational databases partly address this by creating separate entities for the things of greater importance (ex: deposits / withdrawals). But again, this doesn’t persist everything in the system — keeping us from getting a truly longitudinal view of the history of an account.
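To make the drawback concrete, here is a minimal sketch of two related tables and an in-place update. The table names, fields, and helper functions are all hypothetical and exist only for illustration; a real system would use an actual database rather than in-memory objects.

```typescript
// Hypothetical rows; `customerId` on an account is a foreign key to a customer.
interface Customer { id: number; name: string; }
interface Account { id: number; customerId: number; balance: number; }

// Each "table" is keyed by its primary key.
const customers: Record<number, Customer> = { 1: { id: 1, name: "Timmy Tomkins" } };
const accounts: Record<number, Account> = { 10: { id: 10, customerId: 1, balance: 250 } };

// Follow the foreign key from an account to its owner.
function ownerOf(accountId: number): Customer | undefined {
  const account = accounts[accountId];
  return account ? customers[account.customerId] : undefined;
}

// An update overwrites the row in place; the previous name is not kept anywhere.
function renameCustomer(id: number, newName: string): void {
  const row = customers[id];
  if (row) customers[id] = { id: row.id, name: newName };
}

renameCustomer(1, "Timothy Tomkins");
const currentOwner = ownerOf(10); // only the latest name is recoverable
```

After the rename, nothing in either table records that the customer was ever called "Timmy Tomkins", which is exactly the history we would like to keep.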

Another difficulty here relates to backwards-compatibility: if we update how we store things in our database, will it still work for our existing clients / users? Businesses will often leverage API versioning and thorough documentation to curtail this, but then the team has to maintain multiple versions of the software (I’m quite sure that Salesforce knows a lot about this). And when your largest client has a bug on a version which is over two years old, I can tell you that it is not a fun time.

Improving

We can do better! Event Sourcing provides a complete audit trail of all updates made to your persistence layer while enabling greater flexibility in how these systems are implemented and upgraded.

Event Sourcing

If you want to know the current value of a wallet on a blockchain (ex: Ethereum), you can work up to it from the wallet’s history: playing back all past transactions yields the current balance of the wallet. This is an example of event sourcing. At a high level, event-sourced systems store all events (state updates) and then play back the event history to create the current state of the object. Below, we can see a bunch of transactions that can be thought of as an event history. Each transaction specifies its type (IN, OUT), value (in ETH), as well as additional metadata.

As these transactions run through the EVM, the current state of the wallet is updated. The EVM will look at the type of transaction being made and deterministically update state (Ex: an OUT transaction subtracts its value from the wallet’s current balance, and an IN transaction adds its value to it).
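The replay described above can be sketched in a few lines. This is only an illustration of the idea, not how the EVM is implemented: the `Tx` shape, the function names, and the sample values are assumptions, and whole-number ETH values are used to keep the arithmetic exact.

```typescript
// Hypothetical transaction shape: a type (IN or OUT) and a value in ETH.
interface Tx { type: "IN" | "OUT"; value: number; }

// Deterministically apply one transaction to the running balance.
function applyTx(balance: number, tx: Tx): number {
  return tx.type === "IN" ? balance + tx.value : balance - tx.value;
}

// Play back the full history, starting from an empty wallet.
function replay(txs: Tx[]): number {
  return txs.reduce(applyTx, 0);
}

const txHistory: Tx[] = [
  { type: "IN", value: 5 },
  { type: "OUT", value: 2 },
  { type: "IN", value: 1 },
];
const currentBalance = replay(txHistory); // 5 - 2 + 1 = 4
```

Because `applyTx` is a pure function, anyone replaying the same history arrives at the same balance.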

Solidity Example

A good Solidity (what a juxtaposition) example of this behavior can be found in Counterfactual’s counting app:

In this code, we can see that we have a function for updating state, applyAction (event and action are synonymous here). This function will take a state as well as an action to be applied to that state and update state according to the action’s actionType as well as the payload, the payload in this case being byHowMuch.
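For readers more at home in JavaScript, the same shape can be sketched in TypeScript. This is not Counterfactual’s Solidity code: only the names applyAction, actionType, and byHowMuch come from the description above, while the action names and the `CounterState` type are assumptions made for the example.

```typescript
// Assumed action types for a simple counter.
type ActionType = "INCREMENT" | "DECREMENT";

interface Action { actionType: ActionType; byHowMuch: number; }
interface CounterState { count: number; }

// Take a state and an action, and return the next state without mutating the old one.
function applyAction(state: CounterState, action: Action): CounterState {
  switch (action.actionType) {
    case "INCREMENT":
      return { count: state.count + action.byHowMuch };
    case "DECREMENT":
      return { count: state.count - action.byHowMuch };
    default:
      throw new Error("unknown action type");
  }
}

const startState: CounterState = { count: 0 };
const afterIncrement = applyAction(startState, { actionType: "INCREMENT", byHowMuch: 3 });
const afterDecrement = applyAction(afterIncrement, { actionType: "DECREMENT", byHowMuch: 1 });
```

If you have written a Redux reducer before, this should look very familiar: state in, action in, new state out.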

One of the great things about this sort of architecture is that it allows for the easy bifurcation of the data layer from business logic. This means that you can update the way that data is stored and the way that it is processed completely independently of each other. Some will argue that this encourages a monolith datastore, and that this is an anti-pattern. But hey, without this architecture, blockchains wouldn’t be verifiable.

Example Application

Implementing this is quite straightforward. Let’s say that we have an application that allows a user to create an account with their Ethereum address and send direct messages to other users.

So, a user comes to the application and signs a transaction with their private key to create an account. This deploys a smart contract to the blockchain known as an ‘EventStore’. This smart contract holds an array of pointers to events stored in some public / private off-chain database. Storing pointers instead of data greatly decreases on-chain storage and allows events / user data to be hosted on private resources (you should never store someone’s PII on public resources).

Code

Our EventStore smart contract:

Note: tx.origin is used here in place of msg.sender as we use a factory contract to generate event stores.

The EventStore contract is very simple: it has an owner and stores key and value pairs in store, which is provided by the EventStoreLib.

Note: The bytes32 values stored in the key and value fields point to objects in IPFS (content-addressed).

The EventStoreLib library stores these events with the metadata of the event index (to keep them well-ordered) as well as the address of the event sender. When creating these events, you will be required to follow the key, value format where each key and value point to an encrypted / unencrypted (based on your needs) JSON object in IPFS.
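As a rough mental model of what the contract holds, here is an in-memory sketch in TypeScript. It is not the contract itself: the class name, method names, and the sample addresses and values are hypothetical, and the key and value strings stand in for the bytes32 pointers described above.

```typescript
// One entry in the store: ordering metadata, the sender, and two pointers.
interface StoredEvent {
  index: number;   // position in the store, keeps events well-ordered
  sender: string;  // address that wrote the event
  key: string;     // stand-in for the bytes32 pointer to the key object
  value: string;   // stand-in for the bytes32 pointer to the value object
}

class EventStoreModel {
  readonly owner: string;
  private readonly entries: StoredEvent[] = [];

  constructor(owner: string) {
    this.owner = owner;
  }

  // Append a pointer pair; the index is assigned by the store, not the caller.
  write(sender: string, key: string, value: string): StoredEvent {
    const stored: StoredEvent = { index: this.entries.length, sender, key, value };
    this.entries.push(stored);
    return stored;
  }

  read(index: number): StoredEvent {
    const stored = this.entries[index];
    if (!stored) throw new Error("no event at index " + index);
    return stored;
  }

  count(): number {
    return this.entries.length;
  }
}

const modelStore = new EventStoreModel("0xOwner");
modelStore.write("0xOwner", "0xaaa", "0xbbb");
const secondEvent = modelStore.write("0xOwner", "0xccc", "0xddd");
```

Note that the store only ever appends. Nothing is updated or deleted, which is what gives us the full audit trail.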

Key, Value Events

The process for creating one of these events in Ethereum is to have the client send something like this to the storage interface:

{
  "key": {
    "type": "user",
    "id": "2"
  },
  "value": {
    "type": "ACCOUNT_CREATED",
    "name": "Timmy Tomkins"
  }
}

The storage interface (this can be all on-client, if you want) will take this event and store both the key and value in IPFS, which will return a multihash back for both. The interface will then convert these hashes to bytes32 values and write them as an event to the EventStore smart contract.
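The hash conversion deserves a quick illustration. A common approach, and an assumption here since the article does not show its own code, relies on the default IPFS multihash being base58 text that decodes to 34 bytes: a 2-byte prefix (0x12 for sha2-256, 0x20 for a 32-byte length) followed by the 32-byte digest. Dropping the prefix leaves exactly a bytes32. The function names below are hypothetical, and the sample digest is arbitrary.

```typescript
const ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Decode base58 text into bytes (most significant byte first).
function base58Decode(text: string): number[] {
  const bytes: number[] = []; // little-endian while we build it
  for (let i = 0; i < text.length; i++) {
    let carry = ALPHABET.indexOf(text.charAt(i));
    if (carry < 0) throw new Error("invalid base58 character");
    for (let j = 0; j < bytes.length; j++) {
      carry += bytes[j] * 58;
      bytes[j] = carry & 0xff;
      carry >>= 8;
    }
    while (carry > 0) { bytes.push(carry & 0xff); carry >>= 8; }
  }
  for (let i = 0; i < text.length && text.charAt(i) === "1"; i++) bytes.push(0);
  return bytes.reverse();
}

// Encode bytes (most significant byte first) as base58 text.
function base58Encode(bytes: number[]): string {
  const digits: number[] = []; // little-endian base58 digits
  for (let i = 0; i < bytes.length; i++) {
    let carry = bytes[i];
    for (let j = 0; j < digits.length; j++) {
      carry += digits[j] * 256;
      digits[j] = carry % 58;
      carry = Math.floor(carry / 58);
    }
    while (carry > 0) { digits.push(carry % 58); carry = Math.floor(carry / 58); }
  }
  let out = "";
  for (let i = 0; i < bytes.length && bytes[i] === 0; i++) out += "1";
  for (let i = digits.length - 1; i >= 0; i--) out += ALPHABET.charAt(digits[i]);
  return out;
}

// Strip the 0x12 0x20 prefix and return the 32-byte digest as hex.
function multihashToBytes32(multihash: string): string {
  const bytes = base58Decode(multihash);
  if (bytes.length !== 34 || bytes[0] !== 0x12 || bytes[1] !== 0x20) {
    throw new Error("expected a sha2-256 multihash");
  }
  let hex = "0x";
  for (let i = 2; i < bytes.length; i++) hex += (bytes[i] < 16 ? "0" : "") + bytes[i].toString(16);
  return hex;
}

// Re-attach the prefix to get back the multihash that IPFS understands.
function bytes32ToMultihash(hex: string): string {
  const clean = hex.indexOf("0x") === 0 ? hex.slice(2) : hex;
  if (!/^[0-9a-fA-F]{64}$/.test(clean)) throw new Error("expected 32 bytes of hex");
  const bytes: number[] = [0x12, 0x20];
  for (let i = 0; i < 64; i += 2) bytes.push(parseInt(clean.slice(i, i + 2), 16));
  return base58Encode(bytes);
}

const sampleDigest = "0x017dfd85d4f6cb4dcd715a88101f7b1f06cd1e009b2327a0809d01eb9c91f231";
const sampleMultihash = bytes32ToMultihash(sampleDigest);
const roundTripped = multihashToBytes32(sampleMultihash);
```

This only works while the hash function and digest length stay at the sha2-256 default. If your IPFS setup uses a different hash, the prefix would need to be stored as well.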

After the event has been confirmed on Ethereum and added to the events array in the EventStore, the client will catch the value emitted by the EventStore with a reducer to create the following state:

{
  "user": {
    "id": "2",
    "name": "Timmy Tomkins"
  }
}

Now that we have our user, say that they want to update their name and send a message to their friend. We would expect to see a stream of events like this:

[
  {
    "key": {
      "type": "user",
      "id": "2"
    },
    "value": {
      "type": "USER_UPDATED",
      "name": "Timothy Tomkins"
    }
  },
  {
    "key": {
      "type": "conversation",
      "id": "1"
    },
    "value": {
      "type": "CONVERSATION_CREATED",
      "participants": [0, 2]
    }
  },
  {
    "key": {
      "type": "message",
      "id": "0"
    },
    "value": {
      "type": "MESSAGE_CREATED",
      "conversation": "1",
      "sender": "2",
      "message": "Howdy!"
    }
  }
]

If we were to create a reducer to apply all of these events to the state we show above, we would get a new state like this:

{
  "user": {
    "id": "2",
    "name": "Timothy Tomkins"
  },
  "conversations": [
    {
      "id": "1",
      "participants": [0, 2],
      "messages": [
        {
          "id": "0",
          "body": "Howdy!"
        }
      ]
    }
  ]
}
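A reducer that produces this state from the events above might look like the following. Treat it as a sketch under assumptions: the event names and fields are taken from the examples in this article, but the type names and the reducer itself are hypothetical, and the initial state starts with an empty conversations list for convenience.

```typescript
// Event values, discriminated by their "type" field.
type EventValue =
  | { type: "ACCOUNT_CREATED"; name: string }
  | { type: "USER_UPDATED"; name: string }
  | { type: "CONVERSATION_CREATED"; participants: number[] }
  | { type: "MESSAGE_CREATED"; conversation: string; sender: string; message: string };

interface AppEvent { key: { type: string; id: string }; value: EventValue; }
interface ChatMessage { id: string; body: string; }
interface Conversation { id: string; participants: number[]; messages: ChatMessage[]; }
interface AppState { user?: { id: string; name: string }; conversations: Conversation[]; }

// Apply a single event to the state and return the next state.
function reducer(state: AppState, event: AppEvent): AppState {
  const { key, value } = event;
  switch (value.type) {
    case "ACCOUNT_CREATED":
      return { ...state, user: { id: key.id, name: value.name } };
    case "USER_UPDATED":
      return state.user ? { ...state, user: { ...state.user, name: value.name } } : state;
    case "CONVERSATION_CREATED": {
      const created: Conversation = { id: key.id, participants: value.participants, messages: [] };
      return { ...state, conversations: state.conversations.concat([created]) };
    }
    case "MESSAGE_CREATED": {
      const added: ChatMessage = { id: key.id, body: value.message };
      return {
        ...state,
        conversations: state.conversations.map((c) =>
          c.id === value.conversation ? { ...c, messages: c.messages.concat([added]) } : c
        ),
      };
    }
    default:
      return state; // unknown events leave the state untouched
  }
}

const eventStream: AppEvent[] = [
  { key: { type: "user", id: "2" }, value: { type: "ACCOUNT_CREATED", name: "Timmy Tomkins" } },
  { key: { type: "user", id: "2" }, value: { type: "USER_UPDATED", name: "Timothy Tomkins" } },
  { key: { type: "conversation", id: "1" }, value: { type: "CONVERSATION_CREATED", participants: [0, 2] } },
  { key: { type: "message", id: "0" }, value: { type: "MESSAGE_CREATED", conversation: "1", sender: "2", message: "Howdy!" } },
];

const finalState = eventStream.reduce(reducer, { conversations: [] } as AppState);
```

Replaying the stream from the very first event always rebuilds the same state, which is what lets a client recover after its cache has been cleared.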

Migrations / Updates

And at some point in the future, when we want to modify our users, we can simply update all future events to suit our needs. We can also easily modify or add reducers to create ‘projections’ (different ways of applying events). For instance, I can have a projection created from a reducer that only cares about events having to do with conversations or messages, which would look like:

{
  "conversations": [
    {
      "id": "1",
      "participants": [0, 2],
      "messages": [
        {
          "body": "Howdy!"
        }
      ]
    }
  ]
}
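Since a projection is just a different reducer run over the same events, it can be sketched generically. Everything named here (`project`, `conversationsOnly`, `messageCounts`, and the loose `AppEvent` type) is hypothetical; the second projection is an extra example added to show that one event stream can feed several views.

```typescript
// A loosely typed event, enough for the projections below.
interface AppEvent {
  key: { type: string; id: string };
  value: { type: string; name?: string; participants?: number[]; conversation?: string; message?: string };
}

type Reducer<S> = (state: S, event: AppEvent) => S;

// Build any projection by folding a reducer over the same event stream.
function project<S>(events: AppEvent[], reducer: Reducer<S>, initial: S): S {
  return events.reduce(reducer, initial);
}

interface ConversationView { id: string; participants: number[]; messages: { body: string }[]; }

// Projection 1: conversations and message bodies only; user events are ignored.
const conversationsOnly: Reducer<ConversationView[]> = (state, event) => {
  const v = event.value;
  if (v.type === "CONVERSATION_CREATED") {
    const created: ConversationView = { id: event.key.id, participants: v.participants || [], messages: [] };
    return state.concat([created]);
  }
  if (v.type === "MESSAGE_CREATED") {
    return state.map((c) =>
      c.id === v.conversation ? { ...c, messages: c.messages.concat([{ body: v.message || "" }]) } : c
    );
  }
  return state;
};

// Projection 2: number of messages per conversation id.
const messageCounts: Reducer<Record<string, number>> = (state, event) => {
  const v = event.value;
  if (v.type !== "MESSAGE_CREATED" || v.conversation === undefined) return state;
  return { ...state, [v.conversation]: (state[v.conversation] || 0) + 1 };
};

const sharedEvents: AppEvent[] = [
  { key: { type: "user", id: "2" }, value: { type: "USER_UPDATED", name: "Timothy Tomkins" } },
  { key: { type: "conversation", id: "1" }, value: { type: "CONVERSATION_CREATED", participants: [0, 2] } },
  { key: { type: "message", id: "0" }, value: { type: "MESSAGE_CREATED", conversation: "1", message: "Howdy!" } },
];

const conversationView = project(sharedEvents, conversationsOnly, []);
const countView = project(sharedEvents, messageCounts, {});
```

Adding a new view never requires touching the stored events. You write another reducer and replay.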

Wrapping Up

Event sourcing is a simple and very powerful model as it allows us to have all our data in one place and freely manipulate how it is presented — which is great because that logic can even be implemented on the client-side, greatly decreasing the amount of work needing to be done with deploying new business-logic smart contracts.

So, this architecture can be shopped around in a number of different flavors, but at its simplest it consists of a client, database, and blockchain. Integration is simple: if the client is an existing app (mobile / web) already using Redux to manage state and is keeping everything local, only the following would need to be done to leverage this architecture (or you could just use this framework):

  1. Pass events from Redux on to the storage interface.
  2. The storage interface will store the key / value pairs in IPFS and create bytes32 identifiers for them.
  3. The storage interface will store this event in the user’s EventStore smart contract.
  4. The EventStore will emit an event once the event has been confirmed.
  5. The event will be caught on the client and the local state will be updated.
  6. The client caches all state up to that point and only needs to collect updates (as long as the cache is not cleared).
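The six steps above can be traced end to end with in-memory stand-ins. This is a sketch under heavy assumptions: `FakeContentStore` and `FakeEventStore` are hypothetical substitutes for IPFS and the EventStore contract, confirmation is instant and synchronous, and the ids are simple strings rather than real bytes32 values.

```typescript
interface AppEvent { key: { type: string; id: string }; value: { type: string; name?: string }; }
interface Pointer { key: string; value: string; }
interface ClientState { name?: string; }

// Stand-in for IPFS: identical content always maps to the same id.
class FakeContentStore {
  private blobs: string[] = [];
  put(content: object): string {
    const json = JSON.stringify(content);
    let at = this.blobs.indexOf(json);
    if (at < 0) at = this.blobs.push(json) - 1;
    return "blob-" + at;
  }
  get(id: string): unknown {
    return JSON.parse(this.blobs[Number(id.slice(5))]);
  }
}

// Stand-in for the EventStore contract: append a pointer, then notify listeners.
class FakeEventStore {
  private pointers: Pointer[] = [];
  private listeners: Array<(index: number, pointer: Pointer) => void> = [];
  onEvent(listener: (index: number, pointer: Pointer) => void): void {
    this.listeners.push(listener);
  }
  write(pointer: Pointer): void {
    const index = this.pointers.push(pointer) - 1;           // step 3
    this.listeners.forEach((listen) => listen(index, pointer)); // step 4
  }
}

const contentStore = new FakeContentStore();
const eventStore = new FakeEventStore();

// Step 6: the client keeps its state plus the next index it still needs.
let clientCache: { state: ClientState; nextIndex: number } = { state: {}, nextIndex: 0 };

// Step 5: catch the emitted event, resolve the pointer, and update local state.
eventStore.onEvent((index, pointer) => {
  if (index < clientCache.nextIndex) return; // already applied from the cache
  const value = contentStore.get(pointer.value) as AppEvent["value"];
  const state = value.name !== undefined ? { name: value.name } : clientCache.state;
  clientCache = { state, nextIndex: index + 1 };
});

// Steps 1 and 2: the client hands an event over; key and value go to the content store.
function dispatchEvent(appEvent: AppEvent): void {
  eventStore.write({ key: contentStore.put(appEvent.key), value: contentStore.put(appEvent.value) });
}

dispatchEvent({ key: { type: "user", id: "2" }, value: { type: "ACCOUNT_CREATED", name: "Timmy Tomkins" } });
dispatchEvent({ key: { type: "user", id: "2" }, value: { type: "USER_UPDATED", name: "Timothy Tomkins" } });
```

In a real integration both stores would be asynchronous network calls, and the gap between writing the event and seeing it confirmed is where the latency comes from.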

Improvements

This architecture can also be coupled with Eventual Consistency, whereby the client assumes the events it is waiting on will eventually confirm and continues on with its business. This leaves the app in a state which is ahead of the blockchain, allowing for an improved user experience (I believe some Đapps like Peepeth do this). This can be quite problematic, however, if state management is not done well on the client (considering the case where a transaction fails or confirmation takes much longer than the normal amount of time).
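One way to keep that client-side state management honest is to hold confirmed state and pending events separately, and to derive what the user sees from both. The sketch below assumes this approach; the `OptimisticState` class and its method names are hypothetical, and it assumes events confirm in the order they were submitted.

```typescript
class OptimisticState<S, E extends { id: string }> {
  private confirmed: S;
  private pending: E[] = [];
  private readonly reducer: (state: S, event: E) => S;

  constructor(initial: S, reducer: (state: S, event: E) => S) {
    this.confirmed = initial;
    this.reducer = reducer;
  }

  // The user acted: show the result right away, before the chain confirms.
  submit(event: E): void {
    this.pending.push(event);
  }

  // The chain confirmed the event: fold it into the confirmed state.
  confirm(id: string): void {
    const event = this.take(id);
    if (event) this.confirmed = this.reducer(this.confirmed, event);
  }

  // The transaction failed: drop it, and the view rolls back by itself.
  reject(id: string): void {
    this.take(id);
  }

  // What the UI shows: confirmed state with all pending events applied on top.
  view(): S {
    return this.pending.reduce(this.reducer, this.confirmed);
  }

  private take(id: string): E | undefined {
    for (let i = 0; i < this.pending.length; i++) {
      if (this.pending[i].id === id) return this.pending.splice(i, 1)[0];
    }
    return undefined;
  }
}

interface RenameEvent { id: string; name: string; }

const profileName = new OptimisticState<string, RenameEvent>("Timmy", (_previous, e) => e.name);
profileName.submit({ id: "tx-1", name: "Timothy" });
const shownWhilePending = profileName.view();
profileName.reject("tx-1");
const shownAfterFailure = profileName.view();
```

Because the view is always recomputed from confirmed state, a failed or very slow transaction never leaves stale data behind. It simply stays pending or disappears.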

There are naturally a multitude of other ways to improve this (Ex: merklizing events stored in IPFS to reduce the number of entries in the EventStore), but this was just a general review. We will dive into further implementations and extensions in later articles.

Getting Started

In terms of a framework that does all of this for you, check out the Transmute Framework, which is a great starting point. If you have any questions, feel free to reach out!

Feedback / Suggestions

If there’s something about this that I can improve, please don’t hesitate to let me know in the comments section. And if you’ve gained something from this article, please spam the clap button and share it (these simple actions really make a huge difference).

Find me on Twitter or GitHub!


I’d like to thank Georgios Konstantopoulos, Hershy Ash, and Kames for their feedback and contribution to this article!