High Availability in Distributed Systems

Antoine Toulme
Whiteblock
Nov 26, 2019

This article was a collaboration between Antoine and Kevin (Bapecoin).

Definition of high availability

High availability (HA) describes systems designed to operate continuously, without failure, for long periods of time. The term implies that the parts of a system have been thoroughly tested and backed by redundant components, so that the system remains functional when a failure occurs.

Many modern applications are distributed and cloud-based. Highly available architectures let them scale to meet demand and make them more resilient to failure. These applications are also deployed frequently, and the larger number of moving parts during each deployment increases the risk of something going wrong. To meet these challenges, application and infrastructure teams may adopt a deployment strategy suited to their use case, such as blue-green or canary deployments. However, we will not be discussing high availability as part of the development process in this article.

History of HA and when it became important

HA is a relatively new term that matches the expansion and professionalization of Web 2.0 companies. Back in 2004, it was common to take websites down for maintenance. The story of how HA became important is closely tied to the story of distributed databases. Historically, a database was either up or down; there was no such thing as a partial failure. As the internet was adopted globally, downtime began to directly impact users and business. At the same time, there was a disconnect between development and operations teams. They worked as two independent squads, each with its own objective: operators had little understanding of the code base, and developers had little understanding of operational practices. Developers were concerned with shipping code; operators were concerned with reliability.

This lack of communication between teams hurt the product and, ultimately, end users. Since reliability is arguably the single most important feature of any system today, a set of practices and culture coined DevOps emerged to improve communication and build better products. DevOps subsequently became one of the most critical roles in many companies, along with Site Reliability Engineers; the SRE role is a very opinionated, concrete implementation of DevOps principles. Without reliability, users will not trust the system and thus will not use it, leaving organizations with an expensive system that is useless.

HA in traditional client-server architecture

Middleware HA — load balancer
A number of high-availability solutions exist, such as load balancing and basic clustering. Load balancing goes hand in hand with horizontal scaling: new nodes with the same functionality as existing ones are added, and the load is redistributed among them all. By routing requests only to available servers, reliability and availability can be maintained.
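
To make the idea concrete, here is a minimal sketch in Python of round-robin selection that skips unhealthy backends. The backend addresses and the crude TCP health check are placeholders for illustration, not a production load balancer.

import itertools
import socket

# Hypothetical backend pool; these addresses are placeholders.
BACKENDS = ["10.0.0.11:8080", "10.0.0.12:8080", "10.0.0.13:8080"]
_pool = itertools.cycle(BACKENDS)

def is_available(backend, timeout=0.5):
    # Crude health check: can we open a TCP connection to the backend?
    host, port = backend.split(":")
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False

def pick_backend():
    # Round-robin over the pool, skipping backends that fail the health check.
    for _ in range(len(BACKENDS)):
        backend = next(_pool)
        if is_available(backend):
            return backend
    raise RuntimeError("no healthy backends available")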

Database master/slave set up
Other solutions, such as PostgreSQL master/standby architectures, enable us to maintain a master database with one or more standby servers ready to take over operations if the primary server fails. Replication between the master and the standby can be established by shipping SQL statements or by replaying lower-level changes to internal data structures, as with PostgreSQL's write-ahead log shipping. When the main server fails, a standby server, which by then holds almost all of the main server's data, can swiftly act as the new master database server.
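
From the application side, failover can be as simple as pointing connections at whichever server currently accepts writes. The Python sketch below uses psycopg2; the hostnames and credentials are placeholders, and in practice the switchover is usually driven by dedicated tooling (such as repmgr or Patroni) rather than application code.

import psycopg2

# Placeholder connection strings for illustration only.
PRIMARY = "host=pg-primary dbname=app user=app password=secret"
STANDBY = "host=pg-standby dbname=app user=app password=secret"

def connect_with_failover():
    # Try the primary first; if it is unreachable, fall back to the
    # (presumably promoted) standby.
    try:
        return psycopg2.connect(PRIMARY)
    except psycopg2.OperationalError:
        return psycopg2.connect(STANDBY)

conn = connect_with_failover()
with conn.cursor() as cur:
    cur.execute("SELECT now()")
    print(cur.fetchone())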

Learn more at the official PostgreSQL site, https://www.postgresql.org/.

Database clustering
In a database cluster, multiple nodes or instances work together to serve a single database. Database clustering enables redundancy, availability, scalability and monitoring. Several tools from IBM and Oracle exist to manage the variety of configurations that can be used to provide high availability. For example, if a node or an Oracle instance process fails in an Oracle RAC configuration, the other Oracle instances can keep providing service, because they all have access to a common database file system and communicate over the interconnect, a private high-speed network used for messaging between nodes.

NoSQL revolution towards modern web architecture

MongoDB
Different database technologies have been developed in response to the rising volume of data stored about users, objects and products. To handle the scale and agility challenges that face modern applications, and to capitalize on the cheap storage and processing power available today, architectures like MongoDB's sharded cluster were developed. This architecture consists of three main components: shards, mongos routers and config servers. A shard is a replica set or a single mongod instance that holds a subset of the data in a sharded cluster. If you connect directly to a shard, only a fraction of the data contained in the cluster can be seen. Consider deploying a sharded cluster when your system's dataset outgrows the storage capacity of a single MongoDB instance; otherwise it will only add unnecessary complexity.
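
As a rough sketch of how this looks from a client, the Python snippet below (using pymongo) talks to a mongos router, enables sharding for a database, and shards a collection on a hashed key. The hostname, database and collection names are placeholders, and the admin commands assume cluster-admin privileges.

from pymongo import MongoClient

# Connect to a mongos router (placeholder hostname); the router hides the
# individual shards behind a single endpoint.
client = MongoClient("mongodb://mongos0.example.net:27017")

# Enable sharding for a database and shard a collection on a hashed key.
client.admin.command("enableSharding", "app")
client.admin.command("shardCollection", "app.users",
                     key={"user_id": "hashed"})

# Reads and writes go through mongos, which routes them to the right shard.
client.app.users.insert_one({"user_id": 42, "name": "alice"})
print(client.app.users.find_one({"user_id": 42}))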

Learn more about MongoDB’s sharded clusters here: https://docs.mongodb.com/manual/core/sharded-cluster-components/

Cassandra
Apache Cassandra is another notable example of a “NoSQL database.” By using a peer-to-peer gossip protocol, Cassandra enables every node in the cluster to communicate state information about itself and other nodes.
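
For a feel of how availability is configured, here is a minimal sketch using the Python cassandra-driver. The contact points and keyspace name are placeholders; a replication factor of 3 means each row lives on three nodes, so the keyspace can survive the loss of a node.

from cassandra.cluster import Cluster

# Contact points are placeholders; the driver discovers the rest of the ring.
cluster = Cluster(["10.0.0.1", "10.0.0.2"])
session = cluster.connect()

# Replicating each row to three nodes lets the cluster tolerate node failures.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS app
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS app.events (
        id uuid PRIMARY KEY,
        payload text
    )
""")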

More details can be found in the official documentation at http://cassandra.apache.org/. It's also worth pointing out that Cassandra is a distributed open-source database, so we encourage you to contribute as well.

Elasticsearch
Equipped to handle large amounts of data, Elasticsearch is an open-source analytics and full-text search engine. It's often used to power search functionality for web applications as well. Its resilience comes from shard allocation and cluster-level routing: indexes are split into shards, and every shard is a self-contained instance of Apache Lucene. Shards are spread across multiple nodes, so when more capacity is needed, one can simply add more machines and the load is spread more efficiently. When you tell any server in your Elasticsearch cluster which document you're interested in, it hashes the document's ID into a particular shard ID; this simple mathematical function very quickly determines which shard owns the queried document.

[Figure: primary and replica shards spread across a three-node Elasticsearch cluster. Image: Sundog Education with Frank Kane]

In the example above we can see how Elasticsearch maintains resilience to failure. If node 1 fails, we lose primary shard 1 and replica shard 0. Elasticsearch would then promote one of the replicas of shard 1 held on node 2 or node 3 to be the new primary, and the cluster can continue to serve read requests and accept new data.
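
The shard and replica counts behind that behavior are just index settings. Below is a small sketch using Python's requests library against the Elasticsearch REST API; the node address and index name are placeholders.

import requests

# Any node in the cluster can accept this request; the hostname is a placeholder.
ES = "http://es-node1:9200"

# Three primary shards, each with one replica placed on a different node,
# so the index survives the loss of any single node.
settings = {"settings": {"number_of_shards": 3, "number_of_replicas": 1}}
print(requests.put(f"{ES}/articles", json=settings).json())

# Index and fetch a document; Elasticsearch hashes the document ID to pick the shard.
requests.put(f"{ES}/articles/_doc/1", json={"title": "High availability"})
print(requests.get(f"{ES}/articles/_doc/1").json())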

Learn more at Elastic.co.

Redis
Redis Cluster is a data-sharding solution with automatic management, failover and replication. In cluster mode, all nodes communicate with each other. Note that every Redis Cluster node requires two open TCP connections: the standard Redis port used to serve clients, plus the port obtained by adding 10,000 to it. “Redis Cluster uses a master-slave model where every hash slot has from 1 (the master itself) to N replicas (N-1 additional slaves nodes).” In the example below, there is a cluster of three masters, A, B and C, each with one slave used to replicate it. The final cluster is composed of A, B and C as master nodes and A1, B1 and C1 as slave nodes, and the system is able to continue operating if node B fails, because B1 can be promoted in its place.
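
From a client's perspective this is mostly transparent. The sketch below uses the cluster client shipped with recent versions of redis-py; the hostname and port are placeholders, and any reachable master works as an entry point.

from redis.cluster import RedisCluster  # available in redis-py 4.1 and later

# Any master is a valid entry point; the client learns the slot map from it.
rc = RedisCluster(host="redis-a.example.net", port=7000)

# The key is hashed to one of 16384 slots, and the command is routed to the
# master that owns that slot; its replica takes over if that master fails.
rc.set("user:42:name", "alice")
print(rc.get("user:42:name"))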

Learn more at Redis.io: https://redis.io/topics/cluster-tutorial.

Big Data and high throughput systems

Apache Kafka
Streaming technology processes new data as it is generated in your cluster: instead of dealing with it in batches, big data can be handled in real time and published directly to your cluster. Kafka Streams is a data processing and transformation library that offers data parallelism, distributed coordination, fault tolerance and operational simplicity. Kafka servers can be set up so that they store messages from publishers and deliver them to anyone who wants to consume them. Messages are associated with topics, and each topic represents a specific stream. A consumer subscribes to one or more topics and receives data as it is published. Kafka itself can be spread out over a cluster of its own, avoiding a single point of failure: multiple servers run multiple processes and distribute both the storage of your topics' data and the handling of all the producers and consumers.

[Figure: The anatomy of an application that uses the Kafka Streams library. (kafka.apache.org)]
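
As an illustrative sketch, the snippet below uses the kafka-python client; the broker addresses, topic and consumer group are placeholders. Listing several brokers lets the client bootstrap even if one is down, and acks="all" waits for the topic's in-sync replicas before considering a write successful.

from kafka import KafkaProducer, KafkaConsumer

# Listing several brokers means the client can bootstrap even if one is down.
brokers = ["broker1:9092", "broker2:9092", "broker3:9092"]

producer = KafkaProducer(bootstrap_servers=brokers, acks="all")
producer.send("events", b"user signed up")
producer.flush()

# Consumers in the same group share the topic's partitions between them.
consumer = KafkaConsumer("events", bootstrap_servers=brokers,
                         group_id="analytics", auto_offset_reset="earliest")
for message in consumer:  # blocks until a message arrives
    print(message.value)
    break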

Learn more at https://kafka.apache.org/23/documentation/streams/architecture

Apache Spark

The Spark framework is used for the distributed processing of large datasets. A driver program is developed and built around the SparkContext object. It runs against a cluster manager, which is responsible for distributing the work defined by the driver script among multiple nodes. Each node runs an executor process with its own cache and its own set of tasks; results and status flow back to the driver, while the cluster manager decides where the next tasks run.
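
A minimal PySpark sketch of this flow is shown below; the master URL is a placeholder for whatever cluster manager you run (standalone, YARN, Mesos or Kubernetes).

from pyspark.sql import SparkSession

# The master URL is a placeholder for your cluster manager.
spark = (SparkSession.builder
         .master("spark://cluster-manager:7077")
         .appName("ha-overview")
         .getOrCreate())

# The driver defines the work; the cluster manager schedules it onto executors.
rdd = spark.sparkContext.parallelize(range(1_000_000), numSlices=8)
print(rdd.map(lambda x: x * x).sum())

spark.stop()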

Learn more at https://spark.apache.org/docs/latest/cluster-overview.html.

Distributed systems

Bitcoin and Ethereum clients
Blockchain technology has changed the way we think about data and privacy. Notably, blockchains allow every actor on the network to hold the entire data set on their own computer. Ethereum is a distributed public blockchain network like Bitcoin, although the two have different purposes and capabilities. In both cases, the nature of this distributed network enables high availability: if one node fails, all of the other nodes in the network still hold the full data set, so bringing down the network would require bringing down every node providing computation power.

Light clients and quick response time
Light clients expose the web3 interface, which allows decentralized applications, for example running in a browser, to access the Ethereum network or other distributed systems. They are useful if one is operating under resource constraints. In order to achieve quick response times, light clients do not interact directly with the blockchain; instead, they use full nodes as intermediaries. Another advantage is that running a light node takes significantly fewer resources and less storage than running a full node, without compromising that much security. Currently, Parity Ethereum and Geth are the most popular clients offering a light mode. As mentioned by the Parity team, “They help users access and interact with a blockchain in a secure and decentralized manner without having to sync the full blockchain.” Learn more about light clients here: https://www.parity.io/what-is-a-light-client/
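
Whether the node behind that interface runs as a light client or a full node is transparent to the application. Here is a minimal sketch using the web3.py library (a recent version with snake_case method names); the JSON-RPC endpoint is a placeholder for a local Geth or Parity node.

from web3 import Web3

# The endpoint is a placeholder: a local node (full or light) exposing JSON-RPC.
w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))

if w3.is_connected():
    # The dapp only talks to the web3 interface; how the node obtains its data
    # (full sync or light protocol) is hidden behind it.
    print("latest block:", w3.eth.block_number)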

The future of HA
The emergence of blockchain technology and the development of distributed systems have introduced capabilities we are still creating use cases for. Today, mobile applications can run offline. As the demand for highly available applications increases, offline-first applications like Scuttlebutt are redefining the term itself. Unlike Twitter or Discord, there is no social separation between those who run servers and those who use the application; instead, every user runs the application directly on their own device. There are some tradeoffs, as messages can only be passed directly between friends, peer to peer. As we contemplate the future of high availability, it's worth taking a closer look at events such as Fortnite's final showdown: the entire game was down and, although no one could play, millions remained connected, waiting for the rumored new map to be uploaded. This poses the question: is high availability the end goal in itself?

Originally published at https://whiteblock.io on November 25, 2019.
