ScyllaDB and CASSANDRA BASED FULL HISTORY SOLUTION FOR EOS ECOSYSTEM

Attic Lab
Apr 19, 2019

Proposal for EVB (EOS Voter Bounties) by Attic Lab

Why does EOS need ScyllaDB / Cassandra?

We (the EOS community) are well aware of the issues the EOS network is facing with History APIs. The blockchain is growing by leaps and bounds in terms of transactions, and configuring a centralised server for logs is a daunting task given the variety of transactions on the blockchain. Replaying after a system failure is complex and very time-consuming. We need a solution that is both reliable and scalable to meet the demands of all wallets, explorers and dApplications.

The community has developed a few innovative solutions, such as Hyperion and the MongoDB plugin, since the legacy history solution was deprecated. We feel that if EOS scales and gains mass adoption, most of these solutions will fail to scale with it.

Notable Cassandra Advantages:

1. Elastic Scalability

Cassandra can easily be scaled up or down. BPs and developers can add or remove any number of nodes without restarting the cluster or disturbing the network and the wallets and block explorers built on top of it. Throughput increases as nodes are added.

2. Resilient and Fault Tolerant

Cassandra supports a “multiple-master” model, in contrast to the “single-master” model of MongoDB and the legacy history APIs, where a standby node takes time to take over if the master node (MongoDB/Hyperion) goes down. Because Cassandra relies on multiple masters, i.e. a cluster of peer nodes, there is no single point of failure, which helps to achieve near-100% service uptime.

3. Read and Write Scalability

A Cassandra cluster can accept writes (transaction history) on any of its nodes. Current solutions depend on a single node acting as the writer.

Cassandra was designed for rapid writes and lightning-fast reads. Throughput scales close to linearly with the number of nodes, so after measuring the read/write performance of a single-node cluster, it is easy to calculate how many nodes must be added to reach a given performance level.
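The sizing arithmetic described above can be sketched in a few lines of Python. This is only a back-of-the-envelope estimate that assumes perfectly linear scaling; the function name, the `replication_factor` parameter and all throughput figures are hypothetical and are not measurements from our cluster.

```python
import math


def nodes_needed(target_ops_per_sec: float,
                 single_node_ops_per_sec: float,
                 replication_factor: int = 1) -> int:
    """Estimate cluster size, assuming throughput scales linearly with nodes.

    replication_factor is an assumption for illustration: every write is
    applied on that many replicas, so the raw work is multiplied by it.
    Leave it at 1 to size for the measured workload only.
    """
    if single_node_ops_per_sec <= 0:
        raise ValueError("single_node_ops_per_sec must be positive")
    raw_ops = target_ops_per_sec * replication_factor
    estimate = math.ceil(raw_ops / single_node_ops_per_sec)
    # A cluster cannot hold N replicas of the data with fewer than N nodes.
    return max(replication_factor, estimate)


# Hypothetical numbers: one node sustains 10,000 ops/s.
print(nodes_needed(25_000, 10_000))                        # 3
print(nodes_needed(50_000, 10_000, replication_factor=3))  # 15
```

In practice the measured single-node figure should come from a benchmark on the actual hardware, and some headroom should be added on top of the estimate.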

4. Easy Data Distribution

Cassandra provides the flexibility to place data where you need it by replicating it across multiple data centres.
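As an illustration of multi-data-centre replication, the sketch below builds a CQL `CREATE KEYSPACE` statement that uses `NetworkTopologyStrategy` with a replica count per data centre. The keyspace name `eos_history` and the data centre names `dc_eu` and `dc_us` are hypothetical, and the helper function is not part of the plugin.

```python
from typing import Mapping


def create_keyspace_cql(keyspace: str, replicas_per_dc: Mapping[str, int]) -> str:
    """Build a CREATE KEYSPACE statement with one replica count per data centre."""
    if not replicas_per_dc:
        raise ValueError("at least one data centre is required")
    options = ["'class': 'NetworkTopologyStrategy'"]
    for dc_name, replica_count in replicas_per_dc.items():
        options.append(f"'{dc_name}': {replica_count}")
    replication = "{" + ", ".join(options) + "}"
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {replication};"


# Hypothetical layout: 3 replicas in one data centre, 2 in another.
print(create_keyspace_cql("eos_history", {"dc_eu": 3, "dc_us": 2}))
```

The data centre names must match the names reported by the cluster's snitch, so the values above would need to be adapted to a real deployment.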

5. Rich Data Model

Cassandra has a column-oriented data model, which allows quick slicing and therefore improves performance. Columns are stored sorted by column name, and the names themselves can carry actual data, unlike a traditional database where column names are only metadata. Cassandra rows can also hold a very large number of columns.
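To make the idea of slicing concrete, here is a toy in-memory model of a single wide row whose columns are kept sorted by name. It is a simplification for illustration, not Cassandra's storage engine; the class, the `(block_num, sequence)` column names and the action values are all hypothetical.

```python
import bisect


class WideRow:
    """One row whose columns stay sorted by name, so range slices are cheap."""

    def __init__(self):
        self._names = []   # column names, kept in sorted order
        self._values = {}  # column name -> value

    def put(self, name, value):
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice(self, start, end):
        """Return (name, value) pairs with start <= name < end."""
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_left(self._names, end)
        return [(name, self._values[name]) for name in self._names[lo:hi]]


# Hypothetical row for one account: the column name itself carries data
# (block number, sequence) and the value is the action name.
row = WideRow()
row.put((150, 1), "transfer")
row.put((90, 0), "newaccount")
row.put((200, 0), "buyram")
row.put((199, 7), "voteproducer")

# All actions in blocks 100..199:
print(row.slice((100, 0), (200, 0)))
```

Because the column names are ordered, a query for a block range only needs to locate the two ends of the range rather than scan every column.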

6. Gossip Protocol

Cassandra uses a gossip protocol to discover the state of all nodes in a cluster. Nodes learn about other nodes by exchanging state information about themselves and about the nodes they already know. Each exchange involves a maximum of 3 other nodes. To reduce network load, nodes do not exchange information with every other node in the cluster. They exchange information with only a few nodes, and over time the state of every node propagates throughout the cluster. The gossip protocol also facilitates failure detection.
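The way state spreads through a cluster can be illustrated with a small simulation. This is a simplified sketch, not Cassandra's actual gossip implementation: every node is assumed to bump a heartbeat counter each round and to swap its full view with up to 3 randomly chosen peers. The function names and the cluster size are hypothetical.

```python
import random

FANOUT = 3  # "a maximum of 3 other nodes", as described in the text


def merge(view_a, view_b):
    """Combine two views, keeping the highest heartbeat seen per node."""
    merged = dict(view_a)
    for node, heartbeat in view_b.items():
        if heartbeat > merged.get(node, -1):
            merged[node] = heartbeat
    return merged


def rounds_until_all_known(n_nodes, seed=0, max_rounds=1000):
    """Count gossip rounds until every node has heard of every other node."""
    rng = random.Random(seed)
    # Initially each node only knows its own heartbeat.
    views = {node: {node: 0} for node in range(n_nodes)}
    for round_no in range(1, max_rounds + 1):
        for node in range(n_nodes):
            views[node][node] += 1  # bump own heartbeat
            others = [peer for peer in range(n_nodes) if peer != node]
            for peer in rng.sample(others, min(FANOUT, len(others))):
                combined = merge(views[node], views[peer])
                views[node] = dict(combined)
                views[peer] = dict(combined)
        if all(len(view) == n_nodes for view in views.values()):
            return round_no
    raise RuntimeError("views did not converge within max_rounds")


print(rounds_until_all_known(50, seed=1))
```

Even though no node ever talks to more than a handful of peers per round, the number of rounds needed grows slowly with cluster size, which is why gossip keeps network load low.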

7. Bloom Filters

A bloom filter is an extremely fast way to test whether an item is a member of a set. It can tell whether an item might exist in the set or definitely does not exist in it. False positives are possible, but false negatives are not. Bloom filters are a good way of avoiding expensive I/O operations.
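A minimal bloom filter can be sketched as follows. This is an illustrative version only, not the implementation used inside Cassandra; the bit-array size, the number of hash functions and the use of SHA-256 to derive positions are assumptions chosen for readability, and the transaction IDs are made up.

```python
import hashlib


class BloomFilter:
    """Set membership test: 'might contain' or 'definitely does not contain'."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # If any bit is unset, the item was definitely never added.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter()
seen.add("tx-abc123")           # hypothetical transaction ID
print(seen.might_contain("tx-abc123"))  # True: added items are never missed
```

A lookup that returns `False` lets the database skip a disk read entirely, while a `True` result still has to be confirmed against the real data because it may be a false positive.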

ScyllaDB is reported to offer up to 10x the performance of Cassandra, with lower latency and higher throughput, so our DevOps team is working in parallel on a ScyllaDB-based solution for the community. Much like Cassandra, ScyllaDB is a wide-column store, but it is implemented in C++ (C++17), and low-level details such as autotuning result in significant performance improvements.

ScyllaDB/Cassandra History API Architecture for EOSIO

ScyllaDB-Cassandra Cluster

Setup details:

Apache Cassandra 3.6, leveled compaction strategy, LZ4 compression.

3 servers

Configuration:

  • Dedicated Root Server EX42-NVMe
  • Intel® Core™ i7-6700 Quad-Core Skylake
  • 64 GB DDR4 RAM
  • 2 x 512 GB + 1 x 1000 GB NVMe SSD
  • 1 Gbit/s bandwidth
  • ScyllaDB 3.05

Native APIs

Our DevOps team is working hard to release a native API for Cassandra.

Current Implementation

The solution is now going through the alpha-testing stage.

  • Reached head block 38,500,000; 2.5 TB of disk space used.
  • No major issues found.
  • Started syncing the ScyllaDB cluster.

As soon as the alpha stage is finished, we’ll start developing the API layer for both the ScyllaDB and Apache Cassandra storage backends and announce a public beta in order to fully test the plugin.

For more information, please visit our GitHub repository: https://github.com/atticlab/scylladb_cassandra_history_plugin

Similar initiatives

1. Cassandra Solution by EOS Nairobi and EOS ZA Partnership.

https://docs.google.com/document/d/1IiqvHUXbav6iRTYSpJTG4wJgSdTAYKi72dco1azmMuI/edit#

2. Chronicle Project
https://github.com/EOSChronicleProject/eos-chronicle

Parallel projects working on History solutions on EOS Mainnet.

Block Producers and developers have released several solutions to support the community with reliable history APIs. A few notable contributions are:

1. Elasticsearch plugin (by Attic Lab and EOSLAOMAO)

* Attic Lab ES plugin (developed in Go): https://github.com/atticlab/eos-es-historyapi

* LaoMao ES plugin: https://github.com/EOSLaoMao/elasticsearch_plugin

2. Hyperion History Plugin (by EOSRIO)

https://github.com/eosrio/Hyperion-History-API

3. EOS Chronicle Project

https://github.com/EOSChronicleProject/eos-chronicle

4. Legacy EOS History Plugin

https://github.com/EOSIO/eos/tree/master/plugins/history_api_plugin

5. EOSIO MongoDB plugin (by Cryptolions)

https://github.com/CryptoLions/EOS-mongo-history-API

6. Hapi branch of history plugin (by Greymass)

https://github.com/greymass/eos/tree/hapi-production/plugins/history_plugin

Setup Costs

As mentioned above, the current setup includes:

3 Servers

  • Dedicated Root Server EX42-NVMe
  • Intel® Core™ i7–6700 Quad-Core Skylake
  • 64 GB DDR4 RAM
  • 2 x 512 GB NVMe SSD
  • 1 Gbit/s bandwidth
  • ScyllaDB 3.05

The setup cost is around €180.

We are reinvesting the rewards we earn to support the EOS ecosystem, and we would really appreciate the support of proxies to help run and maintain the clusters for the project. We believe the Cassandra/ScyllaDB solution will put the history issues behind us, and that the community and developers will be able to use it without paying anything for the service.

Follow us!

Website: http://atticlab.net/eos/
Twitter: https://twitter.com/atticlab_it
Facebook: https://www.facebook.com/atticlab/
Reddit: https://www.reddit.com/user/atticlab_it
Steemit: https://steemit.com/eos/@attic-lab
Medium: https://medium.com/eosatticlab
Golos: https://golos.io/@atticlab
Telegram Chat: https://t.me/atticlabeosb
