Polkascan Development Update #4

Emiel Sebastiaan
Published in Polkadot Network
Sep 12, 2019 · 18 min read

Completion of wave one grant work

WEB3SCAN’s mission is to make multi-chain data accessible and understandable. With the Polkascan-project we are working towards a multi-chain exploration & data analytic platform. This development update is part of the work related to the Web3 Foundation Grant that ensures that the Polkadot-ecosystem has access to an open-source block explorer from day one!

‘Day one’ is something we at WEB3SCAN have always taken to heart, and it shows because — due to the Polkascan-project — the recently launched Kusama-network has a block explorer already.

WEB3SCAN contributes to the Polkadot-ecosystem with the Polkascan-project by providing a generalized open-source block explorer for any Substrate-based blockchain, such as the Polkadot relay chain. This fourth development update is released together with the delivery of our fourth project milestone as announced here.

Image 0.1: WEB3SCAN’s Emiel Sebastiaan at Web3 Summit 2019.

Milestone 4 | search, optimizations and usability: The fourth milestone has extended the base system-architecture for all artifacts of the Polkascan ‘open-source block explorer’-stack with the basics of search & filter functions. Additionally this phase has implemented a wide-range of performance optimizations and usability enhancements.

1. Getting started

This section of the development update will help you get started by running the fourth milestone release of Polkascan PRE yourself.

1.1. Polkascan PRE on Github

Repositories: The source code for Polkascan PRE can be found at our Github organization. This organization consists of a number of distinct repositories which collectively form Polkascan PRE. We apply a number of conventions for branches and versioned releases across these Polkascan PRE repositories.

Image 1.1: Polkascan on Github: https://github.com/polkascan

Branches: Each repository has our most recent — but possibly unstable — work in the master-branch. The milestone 4 work can be found in the milestone4-branch.

Releases: The Milestone 4 work is released as ‘v0.4.x’.

1.2. What makes Polkascan PRE?

Polkascan PRE consists of a number of distinct software artifacts which collectively orchestrate Polkascan PRE. The Harvester transforms a Substrate node’s raw data into relational data, with the help of the Substrate Interface Library and the SCALE Codec Library. The produced relational data is disseminated by the Explorer API and in turn made accessible to end-users by the Explorer GUI. We have chosen to provide full Docker support for all our artifacts, hence all our repositories have Dockerfiles in their root.

Although these five distinct components of Polkascan PRE could be applied independently (even in other projects), we offer a sixth piece of software, also called Polkascan PRE, that glues all these components together with Docker Compose.

1.3. Requirements

  1. Recommended hardware: Memory: >8GB (more is better), Storage: >100GB (SSD is better), Processor: more and faster cores is better (Intel i7 quad core), Bandwidth: more and faster is better (>10 Mbps). These requirements will go down as the project matures.
  2. Software requirements: Git, Docker & Docker Compose.
  3. Deployment of Polkascan PRE has been tested on Mac, Linux and Windows.
  4. The Explorer GUI has been tested on Safari, Firefox, Chrome, Edge and Brave.

1.4. Running Polkascan PRE

This paragraph provides a step-by-step guide on running our milestone 4 release of Polkascan PRE on your own machine. The instructions outlined in this paragraph can be found in our Github repository.

Step 1: Clone the repository:

git clone https://github.com/polkascan/polkascan-pre.git

Step 2: Go to the new folder:

cd polkascan-pre

Step 3: Check available releases:

git tag

Step 4: Checkout the latest release in the v0.4.x range (replace ‘x’ with highest number):

git checkout v0.4.x

Step 5: Initialize and update submodules:

git submodule update --init --recursive

Step 6: Build and initialize the MySQL container:

docker-compose -p dev -f docker-compose.yml up -d mysql

Step 7: Build and initialize the other containers:

docker-compose -p dev -f docker-compose.yml up --build

Result: The Explorer GUI should now be available in a browser: http://127.0.0.1:8080

Image 1.2: Polkascan PRE in a browser: http://127.0.0.1:8080

1.5. Interact through Polkadot-JS Apps

Polkascan PRE is ‘block exploration & data analytic’-technology and does not offer features to actively interact with a blockchain. However, the Docker Compose setup ships with Polkadot-JS Apps, which allows you to interact with the chain — for example by composing and submitting transactions. Polkadot-JS Apps should now be available in a browser: http://127.0.0.1:8081

Image 1.3: Polkadot-JS Apps in a browser: http://127.0.0.1:8081

1.6. Cleaning up

The following docker[-compose] commands should help you clean up.

  • Stop all containers of the Docker Compose file.
docker-compose -p dev -f docker-compose.yml down
  • Remove all unused containers and images (caution!!).
docker system prune
[confirm] Y
  • Remove all containers and images (caution!!).
docker system prune -a
[confirm] Y
  • Remove all unused volumes (caution!!).
docker volume prune
[confirm] Y

2. Exploring Polkascan PRE

The milestone 4 release of Polkascan PRE offers various performance optimizations and usability enhancements. The release provides more, better and richer details on useful data entities that were already supported by milestone 3, such as: logs, accounts, indices, transfers, sessions, validators, nominators, proposals and referenda. The release also provides more, better and richer details on the more abstract entities that were already supported by milestone 2, such as: blocks, extrinsics, transactions, inherents, events and runtime specification entities. Basic definitions of these notions can be found in Substrate’s official documentation.

2.1. Exploring the GUI

The Explorer GUI is an Angular (mobile friendly) application that consumes data from the Explorer API. The application’s landing page is accessible at the following URL when running the steps in the previous paragraph: http://127.0.0.1:8080.

The top section of the landing page displays a navigation menu to the following sections: chain, account, staking, governance, analytics & runtime. The Polkascan-logo returns you to this landing page.

The header section shows you the name of the network you are on, optionally supported by a color-code if it is a known (pre-configured) network. Additionally this section allows you to search by block number, transaction hash or account address.

The next section of the landing page lists a number of key indicators, such as: the most recent finalized blocks, the number of transactions, the number of module events, the number of active accounts and the number of distinct runtimes that have been harvested.

Furthermore, the bottom section of the landing page lists some details of the most recently harvested blocks and of the most recently harvested balance transfers. Both of these lists have buttons that allow you to find more details.

Below you can find a list of useful sections of the Explorer GUI:

Chain menu entities

Blocks: http://127.0.0.1:8080/block
Extrinsics: http://127.0.0.1:8080/extrinsic
Transactions: http://127.0.0.1:8080/transaction
Inherents: http://127.0.0.1:8080/inherent
Events: http://127.0.0.1:8080/event
Logs: http://127.0.0.1:8080/log

Account menu entities

Accounts: http://127.0.0.1:8080/account
Account Indices: http://127.0.0.1:8080/indices/account
Transfers: http://127.0.0.1:8080/balances/transfer
Contracts: http://127.0.0.1:8080/contracts/contract

Staking menu entities

Sessions: http://127.0.0.1:8080/session/session
Validators: http://127.0.0.1:8080/session/validator
Nominators: http://127.0.0.1:8080/session/nominator

Governance menu entities

Democracy Proposals: http://127.0.0.1:8080/democracy/proposal
Democracy Referenda: http://127.0.0.1:8080/democracy/referendum

Analytics menu entities

Search: http://127.0.0.1:8080/analytics/search

Runtime menu entities

Runtime modules: http://127.0.0.1:8080/runtime-module
Runtime type registry: http://127.0.0.1:8080/runtime-type
Runtime upgrade history: http://127.0.0.1:8080/runtime

2.2. Exploring the API

The Explorer API is a Falcon-application that disseminates data from the relational database that is maintained by the Harvester. Falcon offers a fast RESTful API and we apply the JSON-API standard as message envelope.

Image 2.1: Exploring the API

We use NGINX proxy rules to embed the Explorer API in the routing schema of the Explorer GUI. Below you can find a list of useful API endpoints of the Explorer API:

Chain API endpoints

Blocks: http://127.0.0.1:8080/api/v1/block
Extrinsics: http://127.0.0.1:8080/api/v1/extrinsic
Transactions: http://127.0.0.1:8080/api/v1/extrinsic?filter[signed]=1
Inherents: http://127.0.0.1:8080/api/v1/extrinsic?filter[signed]=0
Events: http://127.0.0.1:8080/api/v1/event
Logs: http://127.0.0.1:8080/api/v1/log

Account API endpoints

Accounts: http://127.0.0.1:8080/api/v1/account
Account Indices: http://127.0.0.1:8080/api/v1/accountindex
Transfers: http://127.0.0.1:8080/api/v1/balances/transfer
Contracts: http://127.0.0.1:8080/api/v1/contract/contract

Staking API endpoints

Sessions: http://127.0.0.1:8080/api/v1/session/session
Validators: http://127.0.0.1:8080/api/v1/session/validator
Nominators: http://127.0.0.1:8080/api/v1/session/nominator

Governance API endpoints

Democracy proposals: http://127.0.0.1:8080/api/v1/democracy/proposal
Democracy referenda: http://127.0.0.1:8080/api/v1/democracy/referendum

Runtime API endpoints

Runtime modules: http://127.0.0.1:8080/api/v1/runtime-module
Runtime type registry: http://127.0.0.1:8080/api/v1/runtime-type
Runtime upgrade history: http://127.0.0.1:8080/api/v1/runtime
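The endpoints above can also be consumed from a script. The following is a minimal sketch, not part of the Polkascan codebase: it builds endpoint URLs with the `filter[...]` query syntax shown in the list and unpacks the generic JSON-API envelope (`data` holding resource objects with `attributes`). The helper names `build_url`, `extract_attributes` and `fetch` are hypothetical, and the attribute names returned by a real instance are not assumed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Default address of the Explorer API in the Docker Compose setup.
BASE_URL = "http://127.0.0.1:8080/api/v1"


def build_url(resource, filters=None, base_url=BASE_URL):
    """Build an Explorer API URL, adding filter[...] query parameters if given."""
    url = f"{base_url}/{resource}"
    if filters:
        params = {f"filter[{key}]": value for key, value in filters.items()}
        # Keep the square brackets readable instead of percent-encoding them.
        url = f"{url}?{urlencode(params, safe='[]')}"
    return url


def extract_attributes(document):
    """Return the 'attributes' object of every resource in a JSON-API document."""
    data = document.get("data") or []
    if isinstance(data, dict):  # a single resource instead of a collection
        data = [data]
    return [resource.get("attributes", {}) for resource in data]


def fetch(resource, filters=None):
    """Fetch a collection from a running Explorer API (requires the stack to be up)."""
    with urlopen(build_url(resource, filters)) as response:
        return extract_attributes(json.load(response))
```

With the stack running, `fetch("extrinsic", {"signed": 1})` would request the same URL as the 'Transactions' entry in the list above.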

2.3. Exploring the DB

Polkascan PRE is DBMS-agnostic. That said, we have set up the Docker Compose configuration for this milestone with a recent version of MySQL. You can connect to the MySQL-DBMS with your favorite tool. The default connection details are listed below:

Host: 127.0.0.1
Port: 33061
Database: polkascan
Username: root
Password: root

Image 2.2: Exploring the DB
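
For scripted access, the default connection details can be turned into a SQLAlchemy-style database URL. This is only a sketch: the helper name `mysql_url` is hypothetical, and the `pymysql` driver is an assumption, since this update does not state which MySQL driver is used.

```python
from urllib.parse import quote_plus


def mysql_url(host="127.0.0.1", port=33061, database="polkascan",
              username="root", password="root", driver="pymysql"):
    """Build a URL of the form dialect+driver://user:password@host:port/database."""
    # Credentials are percent-encoded so special characters do not break the URL.
    user = quote_plus(username)
    secret = quote_plus(password)
    return f"mysql+{driver}://{user}:{secret}@{host}:{port}/{database}"


print(mysql_url())
```

The resulting string can be passed to `sqlalchemy.create_engine` when SQLAlchemy and the chosen driver are installed.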

The database consists of a number of tables which are outlined below:

alembic_version: version data to enforce data-definition by SQLAlchemy.
data_account: account data.
data_account_audit: account audit data.
data_account_index: indices account data.
data_account_index_audit: indices account audit data.
data_block: block data.
data_block_total: additional block data.
data_contract: contract data.
data_democracy_proposal: democracy proposal data.
data_democracy_proposal_audit: democracy proposal audit data.
data_democracy_referendum: democracy referendum data.
data_democracy_referendum_audit: democracy referendum audit data.
data_democracy_vote: democracy vote data.
data_democracy_vote_audit: democracy vote audit data.
data_event: event data.
data_extrinsic: extrinsic (and transaction and inherent) data.
data_log: digest log data.
data_session: session data.
data_session_total: additional session data.
data_session_nominator: session nominator data.
data_session_validator: session validator data.
runtime: runtime specification data.
runtime_call: runtime specification data of call functions.
runtime_call_param: runtime specification data of call function parameters.
runtime_constant: runtime specification data of constants and their values.
runtime_storage: runtime specification data of storage functions.
runtime_event: runtime specification data of events.
runtime_event_attribute: runtime specification data of event attributes.
runtime_module: runtime specification data of modules.
runtime_type: runtime specification data of types.

2.4. Exploring Docker Compose

This paragraph documents and explains the components that can be found in the Docker Compose setup.

Version

version: '3.2' | At one point this version was chosen for compatibility with AWS-deployments of the stack. This may need to be revisited.

Services

substrate-node: this instance of the image runs a Polkasource-distribution of the Substrate-node.
mysql: this instance of the image runs the MySQL-dbms.
redis: this instance of the image runs the Redis key/value-store as message broker.
harvester-worker: this instance of the image runs Celery as distributed task queue and listens to the Redis message broker for available tasks.
harvester-beat: this instance of the image runs the task scheduler engine for Celery-processes.
harvester-api: this instance of the image runs Falcon which serves RESTful endpoints with control-functions for the harvester.
harvester-monitor: this instance of the image runs Flower as a front-end application for Celery.
explorer-api: this instance of the image runs Falcon which serves RESTful endpoints following the JSON-API specification to serve data from the database.
explorer-gui: this instance of the image serves a SPA Angular-application.
polkadot-ui: this instance of the image runs a Polkasource-distribution of the Polkadot-JS Apps.

Volumes

db-data: this volume provides a storage-volume for the (MySQL-)dbms.
substrate-data: this volume provides a storage-volume for the Substrate-node’s data.

3. Usability enhancements

The milestone 4 release has implemented a number of usability enhancements.

3.1. Menu-structure

The menu-structure groups the various sections of the explorer into coherent clusters, such as: ‘Chain’, ‘Accounts’, ‘Staking’, ‘Governance’ and ‘Runtime’.

Image 3.1: Polkascan PRE menu-structure

3.2. Color-coding

The default Substrate-node set-up in the Docker Compose configuration is a ‘development’ configuration for the latest Kusama-runtime. The color-coding used for this default configuration is ‘grey’. Other network configurations have a distinct color-code, such as: Kusama (kusama-pink), Alexander (alexander-pink), Edgeware (blue) and Joystream (turquoise).

3.3. Tab-structure

Tab UI-controls can be found on detail pages such as the ‘block detail’-page. The tab UI-control allows for a better overview of the many 1-to-n relationships an entity may have. A block for example may have many transactions, inherents, logs, etc. This UI-control minimizes the need for vertical page scrolling.

Image 3.2: Polkascan PRE tab-structure

3.4. Data-rendering

In many sections of the Explorer GUI complex and nested data-structures are rendered. In many cases Substrate’s metadata provides proper decoding context for the data-structure. The examples serve as a proof-of-concept and this feature will be implemented in many more sections during our future development cycle.

Account address

Whenever an account-address is encountered it is rendered in such a way that it may include an identicon, resource deep-links and abbreviation-rules.

Image 3.3: Rich account-address rendering

Referendum proposal

Whenever a referendum takes place, votes need to be cast on a proposal. Substrate allows anything that can be expressed as a call to be voted on through a proposal. Since the available calls are defined dynamically by the Substrate runtime, no objective decoding context exists for proposals. The way the proposal needs to be rendered is therefore subject to the runtime’s dynamic metadata and can even be recursive with nested-structures. The example shows a democracy proposal through which the free balance of a specific account is set to a specific value. Recursion is shown with the rendering of the account address.

Image 3.4: Rich democracy-referendum-proposal rendering

3.5. Filters

Basic filters have been added to the transaction, inherent and event overview pages. These filters allow you to easily find the records that match the dynamic categories set by the runtime. The examples serve as a proof-of-concept and this feature will be implemented in many more sections during our future development cycle.

Image 3.5: Basic filters on transaction overview page

3.6. Search

Basic search is available on the landing page and on a specific page in the ‘Analytics’-menu. Search is available for the following entities and their attributes: block hash, block number, account address, account index, transaction hash and inherent id. A search-match will direct to the particular resource’s detail page.

Image 3.6: Basic search page

Additionally the landing page is a placeholder for future advanced search functions with logical search and a results-page for many more indexed data-points.
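
To illustrate how a single search box can route a query to the right detail page, here is a rough sketch of query classification. It is not the Explorer GUI's actual logic. The function name and category labels are hypothetical, and the `<block number>-<index>` shape assumed for an inherent id is an assumption that this update does not confirm. The 64 hex digits for a hash reflect a 32-byte hash.

```python
import re

BLOCK_NUMBER = re.compile(r"[0-9]+")
HASH = re.compile(r"0x[0-9a-fA-F]{64}")      # 32-byte hash, hex encoded
RECORD_ID = re.compile(r"[0-9]+-[0-9]+")     # assumed id format, see lead-in


def classify_query(query):
    """Guess which kind of entity a search string refers to."""
    text = query.strip()
    if BLOCK_NUMBER.fullmatch(text):
        return "block_number"
    if HASH.fullmatch(text):
        # A hash alone does not tell blocks and transactions apart;
        # a real search would have to look it up in both.
        return "block_or_transaction_hash"
    if RECORD_ID.fullmatch(text):
        return "inherent_id"
    # Everything else is treated as an account address or account index.
    return "account_address_or_index"
```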

4. Performance optimizations

The milestone 4 release has implemented a number of performance optimizations.

Through our work on ‘multi-chain exploration & data analytic’-technologies we learned that performance is a continuous challenge. Any single application or any single attack vector can push a blockchain to its capacity limits. In general, if you require the explorer to provide near real-time data — and thus keep up with the chaintip of the blockchain — you need to be able to perform all data processing and storage operations within the constraints of the blocktime.

4.1. Basic benchmark

The milestone 4 release has undergone a basic benchmark test with a preliminary conclusion that the Polkascan-stack will be able to scale-up and handle Substrate-based blockchains at very high capacity.

The setup for this benchmark test used a mock-up of Substrate’s JSON-RPC to serve a million blocks with an increasing order of magnitude for the number of distinct extrinsics (transactions) per block to simulate a Substrate-based chain at high capacity.

Vanilla Polkascan PRE will be able to process up to 5000 transactions per block at a constant rate on consumer-grade hardware. Up-scaling performance beyond this constant capacity threshold can be achieved through horizontal scaling.
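
The blocktime constraint can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical and the 6-second block time in the usage note is an assumed value, not a figure from the benchmark.

```python
def required_throughput(transactions_per_block, block_time_seconds):
    """Transactions per second the explorer must sustain to stay at the chaintip."""
    return transactions_per_block / block_time_seconds


def keeps_up(processing_seconds_per_block, block_time_seconds):
    """True if the average processing time per block fits within the blocktime."""
    average = sum(processing_seconds_per_block) / len(processing_seconds_per_block)
    return average <= block_time_seconds
```

For example, `required_throughput(5000, 6.0)` gives roughly 833 transactions per second under the assumed block time, and `keeps_up([4.0, 7.0, 5.0], 6.0)` is true because an occasional slow block is absorbed as long as the average stays below the blocktime.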

4.2. Horizontal scaling

The Polkascan-stack allows for horizontal scaling on multiple levels.

Horizontal scaling is achieved on a high-level by dividing the work per blockchain. I.e. as long as the harvester — on average — is capable of processing the blockchain’s data within the blocktime of the chain, it will be able to keep up with the chaintip and serve real-time data. The multi-chain explorer of the Polkascan-project consists of one harvester application per blockchain.

Further horizontal scaling is achieved on a more fine-grained level within a single blockchain through asynchronous multi-threading by Celery-workers on a Redis message-broker architecture. The Harvester has two distinct processes: the accumulator, which harvests stateless data, and the sequencer, which builds stateful data based on the already available stateless data. The accumulator’s tasks scale really well because they can be parallelized and executed in any order. Only very few tasks have to be executed by the sequencer.
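
The split between the two processes can be sketched as follows. This is not the Harvester's code: a standard-library thread pool stands in for Celery workers and Redis, and the per-block payload (a made-up transfer count) and the function names are purely hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def accumulate(block_number):
    """Stateless task: depends only on the block itself, so order does not matter."""
    # Hypothetical payload: pretend each block carries (block_number % 3) transfers.
    return block_number, block_number % 3


def sequence(harvested):
    """Stateful task: builds a running total, so blocks must be visited in order."""
    running_total = 0
    totals = {}
    for block_number in sorted(harvested):
        running_total += harvested[block_number]
        totals[block_number] = running_total
    return totals


def harvest(block_numbers, workers=4):
    """Accumulate blocks in parallel and in random order, then sequence them."""
    shuffled = list(block_numbers)
    random.shuffle(shuffled)  # emphasise that accumulation order is irrelevant
    with ThreadPoolExecutor(max_workers=workers) as pool:
        harvested = dict(pool.map(accumulate, shuffled))
    return sequence(harvested)
```

Adding workers speeds up only the accumulation step. The sequencing step remains a single ordered pass, which is why it pays to keep it small.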

4.3. Database optimizations

Through our previous work on our generalized EVM block explorer we learned that managing very large databases is a continuous challenge. In general, if you require the explorer to provide analytics over all data in a particular blockchain then your databases will grow very large. The database for our Ethereum mainnet explorer is >10TB in size. The following strategies have proven to successfully deal with such large databases.

Transaction-based commits

The Harvester applies a transaction-based commit strategy. Such a strategy either writes all data of an entire block or fails with an exception and writes nothing. This ensures data integrity per block.
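
A minimal sketch of the all-or-nothing idea, using an in-memory SQLite database as a stand-in for MySQL. The two-table schema and the function names are hypothetical and much simpler than the real data model.

```python
import sqlite3


def open_database():
    """Create an in-memory database with a simplified block/extrinsic schema."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE block (id INTEGER PRIMARY KEY)")
    connection.execute(
        "CREATE TABLE extrinsic (block_id INTEGER, extrinsic_idx INTEGER, "
        "payload TEXT, PRIMARY KEY (block_id, extrinsic_idx))"
    )
    return connection


def store_block(connection, block_id, extrinsics):
    """Write a block and all its extrinsics atomically; roll back everything on error."""
    try:
        with connection:  # commits on success, rolls back if an exception escapes
            connection.execute("INSERT INTO block (id) VALUES (?)", (block_id,))
            for index, payload in enumerate(extrinsics):
                if payload is None:  # stand-in for a decoding failure
                    raise ValueError(f"cannot decode extrinsic {block_id}-{index}")
                connection.execute(
                    "INSERT INTO extrinsic (block_id, extrinsic_idx, payload) "
                    "VALUES (?, ?, ?)",
                    (block_id, index, payload),
                )
        return True
    except (ValueError, sqlite3.Error):
        return False
```

If the second extrinsic of a block fails, the block row and the first extrinsic are rolled back as well, so the database never contains a partially written block.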

Deterministic keys

The data-definition of the explorer’s data-structures has deterministic primary keys. In many cases this has resulted in composite key-structures. Examples are extrinsics and events, for which the primary key is a composite key of the block_id and the record’s index within that block. Deterministic keys, and thus deterministic data for the entire database, make it possible to perform data integrity checks across databases. Please note that some entities have yet to be optimized.
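
The following sketch shows why deterministic composite keys make cross-database checks possible. Two databases that were filled independently, even in a different order, yield the same checksum when rows are read in primary-key order. SQLite stands in for MySQL, and the `event` table, its columns and the function names are hypothetical.

```python
import hashlib
import sqlite3

SCHEMA = """
CREATE TABLE event (
    block_id  INTEGER NOT NULL,
    event_idx INTEGER NOT NULL,
    name      TEXT NOT NULL,
    PRIMARY KEY (block_id, event_idx)
)
"""


def build_database(events):
    """Create an in-memory database and insert (block_id, event_idx, name) rows."""
    connection = sqlite3.connect(":memory:")
    connection.execute(SCHEMA)
    with connection:
        connection.executemany("INSERT INTO event VALUES (?, ?, ?)", events)
    return connection


def checksum(connection):
    """Hash all rows in primary-key order, independent of insertion order."""
    digest = hashlib.sha256()
    query = "SELECT block_id, event_idx, name FROM event ORDER BY block_id, event_idx"
    for row in connection.execute(query):
        digest.update(repr(row).encode("utf-8"))
    return digest.hexdigest()
```

With auto-increment keys the same event could receive a different id in each database, and a row-by-row comparison like this would not be meaningful.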

Write-once

Write-once database-strategies are applied to ensure that data is only written once to the data-structure without the need for additional reads. This implies that in almost all cases no update-statements are required, thus limiting database-i/o.

Bulk-writes

Whenever possible bulk-write database-strategies are applied. E.g. when a block consists of ‘n’-transactions, one insert-statement with ‘n’ transaction-records is more efficient than ‘n’ insert-statements with one transaction-record, thus limiting database-i/o.
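
The difference between the two approaches can be sketched with SQLite as a stand-in for MySQL. The table `tx_record` and the function names are hypothetical. Each function returns the number of insert-statements it issued.

```python
import sqlite3


def create_table(connection):
    connection.execute(
        "CREATE TABLE tx_record (block_id INTEGER, tx_idx INTEGER, "
        "PRIMARY KEY (block_id, tx_idx))"
    )


def insert_one_by_one(connection, rows):
    """'n' insert-statements with one record each."""
    for row in rows:
        connection.execute("INSERT INTO tx_record (block_id, tx_idx) VALUES (?, ?)", row)
    return len(rows)


def insert_bulk(connection, rows):
    """One insert-statement with 'n' records."""
    if not rows:
        return 0
    # Note: databases cap the number of bound parameters per statement,
    # so very large blocks would need to be split into chunks.
    placeholders = ", ".join(["(?, ?)"] * len(rows))
    values = [value for row in rows for value in row]
    connection.execute(
        f"INSERT INTO tx_record (block_id, tx_idx) VALUES {placeholders}", values
    )
    return 1
```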

Index optimizations

Additional database-indices result in higher read-performance (querying, filtering & sorting), but result in lower write-performance. Optimization-strategies have been applied so that the footprint of each added index is justified by the additional functionality it provides.

Additionally a hard separation between the database-models of Harvester and the Explorer API has been made. The Harvester inherently has the task of writing data to the database and the Explorer API inherently has the task of reading data from the database. Although both applications currently use the same database-model, these models have currently been duplicated over the two repositories. This separation will eventually allow for different database index-strategies for each application. The Harvester’s index-strategies could be optimized for database-writes and the Explorer API’s index-strategies could be optimized for database-reads.

Partition-ready

Blockchain databases have an inherent property that the bulk of their data is stateless, which implies that it does not change. Partitioning-strategies per block-range allow for optimizations of service management activities for large static partitions of data. Although the partition-strategies have yet to be implemented, the data-structure for the bulk of the data can be partitioned by block-range of, say, 100,000 blocks.
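
As an illustration of range partitioning, the sketch below maps a block number to its partition, using the 100,000-block range mentioned above as an example. Since partitioning has not been implemented yet, the function name and return shape are hypothetical.

```python
PARTITION_SIZE = 100_000  # example block-range from the text


def partition_for(block_number, size=PARTITION_SIZE):
    """Return (partition index, first block, last block) for a block number."""
    index = block_number // size
    first_block = index * size
    last_block = first_block + size - 1
    return index, first_block, last_block
```

Block 250,000 falls into partition 2, which covers blocks 200,000 to 299,999. Once the chaintip has moved past that range, the partition no longer changes.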

4.4. API Caching

A simple argument for implementing cache-strategies for any web-application is to limit database-i/o. Filesystem-i/o or memory-i/o is more efficient than database-i/o. As stated earlier, blockchain databases have an inherent property that the bulk of their data is stateless. Any stateless resource that is served more than once benefits from a cache-engine.

Cache-requirements are amplified whenever web-applications are publicly serviced and have high traffic. A significant number of technical preparations have been made to effectively deal with cache- and traffic-control, such as: json-output caching, mutex locking, resource-based TTL, rate-limiting & throttling and access control. This will be further explored in a future blog post.
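
Two of the techniques named above, resource-based TTL and mutex locking, can be sketched in a few lines. This is an in-process illustration with hypothetical names, not the cache used by the Explorer API, and the TTL values are made up.

```python
import threading
import time


class TTLCache:
    """Cache with a time-to-live per resource type, guarded by a mutex."""

    def __init__(self, ttl_by_resource, default_ttl=5.0, clock=time.monotonic):
        self._ttl_by_resource = ttl_by_resource
        self._default_ttl = default_ttl
        self._clock = clock
        self._entries = {}
        self._lock = threading.Lock()

    def get(self, resource, key, load):
        """Return the cached value, or call load() and cache its result."""
        now = self._clock()
        # Holding the lock while loading means concurrent requests for an
        # expired entry hit the database once instead of once per request.
        # It also serialises all loads, which keeps this sketch simple.
        with self._lock:
            entry = self._entries.get((resource, key))
            if entry is not None and entry[0] > now:
                return entry[1]
            value = load()
            ttl = self._ttl_by_resource.get(resource, self._default_ttl)
            self._entries[(resource, key)] = (now + ttl, value)
            return value
```

A finalized block could be given a long TTL because it no longer changes, while a list of the latest blocks would get a short one.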

4.5. Application optimizations

The Explorer GUI is an Angular-application which uses Typescript as primary language to transpile into pure Javascript. The application has a so-called SPA-architecture that runs entirely in the browser and interacts with the user by dynamically rewriting the current page rather than loading entire new pages from a server. The application’s data is served entirely through external — local or remote — API-endpoints (Explorer-API).

The architecture can fairly easily be made compatible with many of the libraries that are available in the ecosystem.

The Bulma-framework is implemented to ensure cross-browser compatibility of user-interface components and controls and applies responsive-design standards to allow screen-layout optimizations for various devices, including: web, tablet and mobile.

The Angular-architecture is mobile ready and is ready to be shipped as cross-platform hybrid and progressive web apps for Android and iOS and their respective App Stores with the Ionic-framework.

5. Hacking on Polkascan PRE

The Polkascan-stack is based on Python frameworks for the Harvester and the Explorer API and the Angular framework for the Explorer GUI. Our application-stack is glued together with Docker and Docker-Compose which allows it to run virtually everywhere. Many cool hacks can be achieved by hacking the docker compose file we referenced earlier in this document. We have documented many cool hacks in a previous development update, such as: customizing ports, running other networks, using your own Substrate-node and using your own database server. Our workshop at Web3 Summit 2019 explored this a bit further.

Image 5.1: WEB3SCAN’s Arjan Zijderveld at Web3 Summit 2019: Hacking on Polkascan PRE.

6. About Polkascan

The Polkascan-project contributes to the Polkadot-ecosystem by providing a generalized open-source block explorer, called Polkascan PRE. Polkascan PRE offers a block explorer for any Substrate-based blockchain, such as the Polkadot relay chain (Kusama network). This block explorer harvests and decodes data from Substrate-nodes, stores the decoded data in a relational database and disseminates the data through an API to be used in a block explorer user interface. The Web3 Foundation Wave One Grant ensures that the Polkadot-ecosystem has access to an open source block explorer from day one!

6.1. Polkascan and the ecosystem

Our development activities are closely aligned with key ecosystem-organizations (Web3 Foundation and Parity Technologies). We support various new networks and future parachains (Edgeware, Robonomics, Joystream and many more to come). We are very excited about Parathreads and we expect Polkascan PRE to be fully compatible with any Parathread blockchain. We are in ongoing dialog with other ecosystem service providers (wallets, faucets, clients) about integration with our platform.

We have had our first community code-contributions to our repositories and we have been granted a first — of hopefully many — Web3 Foundation Bounties for scoped feature-development activities of our open-source repositories. The first Gitcoin bounty funds the internationalization and localization of the Explorer GUI which should allow for multi-language support of Polkascan (English, Chinese, Russian, Japanese, etc). This project will add to wide-spread adoption of our multi-chain exploration & data analytics technology.

WEB3SCAN is building multi-chain exploration and data analytic technology of which the Polkascan-stack is a key component. The Polkadot-ecosystem already shows signs of other block explorers. Particularly interesting is Polkadot JS Apps. In contrast to Polkascan, this project focuses on the current state of the blockchain rather than on full and multi-chain analytics. Like Polkascan, however, it has developed its entire stack from the ground up and has been around from day one. A key difference is that the Polkadot JS Apps project has a Typescript/Javascript codebase and the Polkascan project has a Python codebase. This is a logical choice given the different goals and objectives of both projects. The other block explorers in the ecosystem all re-use major parts of the technology-stack of either the Polkadot JS Apps project (Polka.io) or the Polkascan-project (Subscan and Boka).

6.2. Polkascan-project updates

This post wraps up our work related to Web3 Foundation’s Wave One Grant. We will continue working on the open-source components of our block explorer stack and we will continue providing updates to our multi-chain explorer platform polkascan.io. The following public resources enable tracking of the project’s progress: Medium, Twitter & GitHub. We encourage you to reach out if you would like to collaborate, especially if you intend to be a Substrate implementer or ecosystem service provider. You can find us on the Riot channels on a daily basis. Come say hello and talk to us about how to get involved.

7. About WEB3SCAN

7.1. WEB3SCAN | Service Provider

WEB3SCAN — the organization behind the Polkascan-project — offers professional services around blockchain data and blockchain information management, including but not limited to providing full-service multi-chain exploration & data analytic technologies, consultancy services & systems integration services.

Image 7.1: WEB3SCAN’s mission: making multi-chain data accessible and understandable

7.2. Polkascan.io | The Polkadot-ecosystem multi-chain explorer

WEB3SCAN is working on a multi-chain explorer called Polkascan MC of which polkascan.io is one instance. This multi-chain explorer aims to make multi-chain data accessible and understandable. In order to further our goals with the Polkascan Multi-chain Explorer we are developing block explorers that work well with many individual chains. Block explorers for these individual chains are in turn aggregated into the Polkascan Multi-chain Explorer.

Image 7.2: Polkascan.io the multi-chain block explorer for the Polkadot-ecosystem. (link)

7.3. WEB3SCAN | Media

Learn more about WEB3SCAN and reach out to us if you have any questions.

Blogs

Podcasts

Presentations

Image 7.3: WEB3SCAN’s Dave Hoogendoorn at DotCon 0.5 2019.
