Distributed Databases Concepts

Learn more about distributed databases in these articles written with Dr. Patrick Valduriez, one of the world’s leading experts on distributed databases.

The Case for Shared Nothing

Shared-nothing has become the dominant parallel architecture for big data systems such as MapReduce and Spark, analytics platforms, NoSQL databases, and search engines [Özsu & Valduriez 2020]. The reason is simple: it is the only architecture that can provide scalability at reasonable cost, typically within a cluster of servers. In cluster computing, scalability is further characterized as scale-up versus scale-out. Scale-up (also called vertical scaling) means adding more power (processors, memory, I/O devices) to a single server, and is therefore limited by the maximum size of that server, e.g., 32 processors. Scale-out (also called horizontal scaling) means adding more servers, called “scale-out servers,” in a loosely coupled fashion, which allows the system to scale almost infinitely.
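As a rough illustration of the difference, here is a minimal Python sketch. The 32-processor ceiling is the example figure from the paragraph above; the function names and the 8 processors per scale-out server are assumptions made purely for illustration, and the sketch ignores any coordination overhead between servers.

```python
import math

# Example ceiling for a single server, taken from the text above.
MAX_PROCESSORS_PER_SERVER = 32


def scale_up(processors_needed: int) -> int:
    """Vertical scaling: grow one server, capped at its maximum size.

    Returns the number of processors actually obtained.
    """
    return min(processors_needed, MAX_PROCESSORS_PER_SERVER)


def scale_out(processors_needed: int, processors_per_server: int = 8) -> int:
    """Horizontal scaling: add loosely coupled servers until demand is met.

    Returns the number of servers required. The default of 8 processors
    per server is a hypothetical value, not a figure from the article.
    """
    return math.ceil(processors_needed / processors_per_server)


if __name__ == "__main__":
    for demand in (16, 32, 256):
        print(
            f"need {demand} processors: "
            f"scale-up delivers {scale_up(demand)}, "
            f"scale-out uses {scale_out(demand)} servers"
        )
```

Under these assumptions, a demand of 256 processors cannot be met by scale-up, which stops at the single-server ceiling, whereas scale-out meets it by adding more servers.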

But what does shared-nothing mean? The term was first proposed in 1985 by Professor Michael Stonebraker, an ACM Turing Award winner (the Turing Award is often described as the Nobel Prize of computer science), to characterize an emerging class of parallel database systems [Stonebraker 1985]. The problem faced by conventional data management has long been known as the “I/O bottleneck,” induced by high disk access time relative to main memory access time (main memory is typically hundreds of thousands of times faster) and by ever-growing processor speeds. The solution used by parallel database systems is to increase the I/O bandwidth…

Published in Distributed Databases Concepts

Written by Prof. Ricardo Jimenez-Peris, PhD in CS

Professor and scientist on distributed databases. Founder and CEO of LeanXcale
