0chain DevOps
In this post I am going to talk a bit about the development and operations process we have at 0chain. I come from a strong enterprise applications background. Fifteen to twenty years ago, the team responsible for developing a product had nothing to do with its operations. I have seen this lead to a mentality of not caring much about the performance of the product or its operational cost, even among people making high-level strategic decisions. As SaaS became mainstream, the companies building the products also became responsible for hosting them, and that required careful planning of the product from an operational perspective as well. As IT departments moved not just standard applications like HR and CRM but also their own internal applications to the cloud, it became imperative that those who develop these products are also responsible for hosting and operating them. At a minimum, developers need to fully understand the tech stack they are dealing with, which is largely based on several well-integrated micro-services.
At 0chain we are building a platform from the ground up. We plan to make this platform available in different modes. In the publicly deployed 0chain blockchain, anyone can participate. We also expect to offer two other types of blockchains to our enterprise customers: private blockchains and consortium blockchains. Private blockchains are completely owned by individual companies, which use them to increase availability, auditability, or both. Consortium blockchains are useful when a small group of organizations wants to optimize the communication and traceability requirements in their business process pipeline by interacting via a decentralized, permissioned blockchain.
As we need to tailor our blockchain to these different use cases and scenarios, it is very important that we design a flexible platform that is feature rich but also easy to deploy and operate.
Given our focus on fast finality, we chose a micro-services based architecture where we use different types of storage systems for the miners and sharders. For example, we use Redis (with and without persistence), RocksDB, and Cassandra (though this might change as we experiment with even more scalable architectures by tweaking our protocols). If we process one block per second with 10K transactions per block, that is 315.36 billion transactions per year. A number this large requires every trick up our sleeve to reach that level of scalability along with operational reliability.
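The yearly figure above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes a 365-day year with no downtime.

```python
# Seconds in a 365-day year (leap years ignored for simplicity).
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def transactions_per_year(blocks_per_second: int, txns_per_block: int) -> int:
    """Total transactions finalized in a year at a constant block rate."""
    return blocks_per_second * txns_per_block * SECONDS_PER_YEAR


total = transactions_per_year(blocks_per_second=1, txns_per_block=10_000)
print(f"{total:,} transactions per year")  # 315,360,000,000 transactions per year
```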
I use a Mac laptop on which I keep our blockchain running. Our development configuration is 3 miners and 1 sharder. Each miner has 2 Redis databases and a RocksDB database. Each sharder has 3 RocksDB databases and a Cassandra database. Since RocksDB is an embedded database, it runs inside the node's own process rather than as a separate one. Everything else is a separate process (or call them services if you prefer). That is a total of 11 services running on a laptop: 4 nodes, 6 Redis instances and 1 Cassandra instance.
How do you make sure all these services can run on a single laptop without port conflicts and similar issues? How do you ensure each miner is running against its own set of databases and nothing else? That's where we started using Docker and containerizing our miner and sharder services, so it's easy to run a mini 0chain blockchain network on a laptop with sufficient CPU and memory.
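To make the bookkeeping concrete, here is a small sketch that enumerates the separate processes in the development configuration and gives each one its own host port. The service names, the base port and the helper functions are all assumptions for illustration; they are not taken from our actual Docker setup.

```python
from itertools import count
from typing import Dict, List

# Separate processes each node type depends on. RocksDB is embedded in the
# node process, so it is not listed. Names are hypothetical.
EXTERNAL_STORES: Dict[str, List[str]] = {
    "miner": ["redis-a", "redis-b"],
    "sharder": ["cassandra"],
}


def list_services(topology: Dict[str, int]) -> List[str]:
    """Expand a {node_type: replica_count} topology into named services."""
    services: List[str] = []
    for node_type, replicas in topology.items():
        for i in range(1, replicas + 1):
            node = f"{node_type}{i}"
            services.append(node)
            # Each node gets its own private set of stores.
            services.extend(f"{node}-{store}" for store in EXTERNAL_STORES[node_type])
    return services


def assign_host_ports(services: List[str], base_port: int = 7000) -> Dict[str, int]:
    """Hand out consecutive host ports so no two services collide."""
    ports = count(base_port)
    return {service: next(ports) for service in services}


dev_network = list_services({"miner": 3, "sharder": 1})
port_map = assign_host_ports(dev_network)
for service, port in port_map.items():
    print(f"{service:20s} -> host port {port}")
```

With containers, each service can keep its default port internally, and only the host-side mapping needs to be unique.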
Our containerized approach during development eventually paid off big time when we moved to the cloud to deploy our testnet and to experiment with several blockchain networks, trying out different configurations and network topologies. Of course, we still had to spend time automating the cloud deployment, but the containerization needed to support operations came for free, as we were already doing it as part of the development process.
Note that in the above architecture the miners and sharders are shown as web servers as well. This is because in our architecture the miners receive transactions from clients for inclusion in the blockchain, and the sharders subsequently provide the ability to query transactions and blocks in order to confirm those transactions. So not only do these nodes interact among themselves, they also interact with devices connected to the internet. Some of our design choices will help us with scalability. For example, using Redis to temporarily store transactions lets us separate the web server from the mining server, since Redis is an in-memory store that runs in a separate process which several other processes can talk to. By architecting our blockchain services in such a modular form, we are able to achieve the desired levels of scalability.
Since we use a variety of databases (each carefully chosen for the workload it's best at), we started off with a persistence layer abstraction that many persistence stores can implement. Once this was in place, replacing one type of database with another was a simple matter of configuration. This also helped us with rapid prototyping and zeroing in on the right components without spending too much time writing code.
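The general shape of such an abstraction can be sketched as follows. This is an illustration of the idea rather than our actual code: the Store interface, the two toy backends (a dictionary and a directory of files standing in for real databases), the config keys and record_block are all hypothetical.

```python
import hashlib
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Callable, Dict, Optional


class Store(ABC):
    """Minimal key-value contract that every backend implements."""

    @abstractmethod
    def write(self, key: str, value: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> Optional[bytes]: ...


class MemoryStore(Store):
    """Volatile toy backend kept in a dictionary."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def write(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def read(self, key: str) -> Optional[bytes]:
        return self._data.get(key)


class FileStore(Store):
    """Durable toy backend that keeps one file per key."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)
        self._root.mkdir(parents=True, exist_ok=True)

    def _path(self, key: str) -> Path:
        # Hash the key so any string maps to a safe file name.
        return self._root / hashlib.sha256(key.encode()).hexdigest()

    def write(self, key: str, value: bytes) -> None:
        self._path(key).write_bytes(value)

    def read(self, key: str) -> Optional[bytes]:
        path = self._path(key)
        return path.read_bytes() if path.exists() else None


# Backend selection is driven purely by configuration.
BACKENDS: Dict[str, Callable[[dict], Store]] = {
    "memory": lambda cfg: MemoryStore(),
    "file": lambda cfg: FileStore(cfg["path"]),
}


def open_store(config: dict) -> Store:
    return BACKENDS[config["backend"]](config)


def record_block(store: Store, block_hash: str, payload: bytes) -> None:
    """Caller code depends only on the Store interface."""
    store.write(f"block:{block_hash}", payload)


with tempfile.TemporaryDirectory() as tmp:
    for config in ({"backend": "memory"}, {"backend": "file", "path": tmp}):
        store = open_store(config)
        record_block(store, "abc123", b"block bytes")
        print(config["backend"], store.read("block:abc123"))
```

Swapping the backend changes only the config dictionary; record_block stays the same.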
In addition to the persistence layer, we also have a node-to-node communication layer. This layer abstracts how the nodes communicate with each other to send or request the various blockchain protocol artifacts. Just like the persistence layer, we expect the network layer to have different pluggable implementations. It can be a simple multicast, a gossip protocol or a hypercube network protocol. We expect these options to be selected as part of deploying a highly configurable blockchain, based on the needs and the operational budget. Because the network layer is separate, we even managed to squeeze out extra performance by using Snappy-based compression when passing messages between the nodes. Nothing prevents us from moving from Snappy to zstd-based compression if someone wants to trade CPU cycles for reduced network latency, and the blockchain protocol logic doesn't need to care about this optimization at the network layer.
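A rough sketch of how compression can stay hidden inside the network layer is shown below. To keep it runnable with only the standard library, zlib and lzma stand in for Snappy and zstd; the Codec and Transport names, the codec labels and the in-memory "wire" are assumptions for illustration only.

```python
import json
import lzma
import zlib
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Codec:
    compress: Callable[[bytes], bytes]
    decompress: Callable[[bytes], bytes]


# zlib (fast setting) and lzma are stand-ins for Snappy and zstd here.
CODECS: Dict[str, Codec] = {
    "none": Codec(lambda b: b, lambda b: b),
    "fast": Codec(lambda b: zlib.compress(b, 1), zlib.decompress),
    "small": Codec(lzma.compress, lzma.decompress),
}


@dataclass
class Transport:
    """Network layer: callers pass messages, never compressed bytes."""

    codec_name: str
    wire: List[bytes] = field(default_factory=list)  # stand-in for a socket

    def send(self, message: dict) -> None:
        raw = json.dumps(message).encode()
        self.wire.append(CODECS[self.codec_name].compress(raw))

    def receive(self) -> dict:
        payload = self.wire.pop(0)
        return json.loads(CODECS[self.codec_name].decompress(payload))


# Protocol-level code is identical whichever codec is configured.
block = {"type": "block", "round": 42, "txns": ["txn"] * 1000}
for name in CODECS:
    transport = Transport(codec_name=name)
    transport.send(block)
    size_on_wire = len(transport.wire[0])
    assert transport.receive() == block
    print(f"{name:5s} {size_on_wire} bytes on the wire")
```

Changing the codec is a one-word configuration change, and the code that builds and consumes the block message does not change at all.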
Here is a video of starting our mini development 0chain blockchain network on a laptop.
DevOps is a mindset not just for developers but for all levels (developers, managers) and roles (CTO, CSO, Product Owner) of a successful organization. At 0chain, every design decision made in building out the platform is carefully evaluated for both its development cost and its operational ease and efficiency.