101 on Aerospike Database for Beginners

Prabhu Rajendran
Published in Everything at Once
4 min read · Feb 16, 2020

Before diving into Aerospike, we should understand the difference between an RDBMS and NoSQL.

RDBMS (Relational Database Management System): stores data in related tables with a structured schema, and is used effectively to insert, search, update and delete database records.

  1. An RDBMS is the right option for problems that need ACID guarantees.
  2. Supports dynamic queries (a great choice for complex queries).
  3. Uses a predefined schema to determine the structure of the data.
  4. SQL databases typically scale vertically (examples: MySQL, PostgreSQL, Oracle…).

NoSQL (non-SQL / non-relational): a database that provides a mechanism for storing and retrieving data, with a focus on simplicity of design, simpler horizontal scaling to clusters of machines, and control over availability (in favor of speed and partition tolerance).

  1. High scalability: NoSQL databases use sharding for horizontal scaling (partitioning the data and placing it on multiple machines). Examples: MongoDB, Cassandra, Aerospike, etc.
  2. High availability: the automatic replication built into NoSQL databases keeps them highly available; after a failure, data is restored from its replicas to the last consistent state.
  3. Types of NoSQL databases: key-value stores, wide-column (tabular) stores, graph databases, and document stores.

Now that we have the basic difference between RDBMS and NoSQL databases, it is time to jump into Aerospike.

What is Aerospike? It is a distributed database that supports key-value and document-oriented data models, providing robustness and strong consistency with no downtime.

  1. Scalability: Flash and Hybrid Memory architectures allow the Aerospike database to scale to petabytes of data.
  2. Speed: low latency is maintained at high scale (which enables better decisions in real time).
  3. Ease of deployment and management.
  4. Low total cost of ownership (TCO): fueled by a hybrid memory architecture and compression, Aerospike provides significantly lower TCO than first-generation NoSQL and relational databases.
  • In these modern times of transformation, organizations are required to make lightning-fast decisions, which power applications like recommendation engines, digital payments, and fraud detection.

Technology Behind Aerospike Database :

  1. Real Time Transaction Engine
  2. Data Distribution
  3. Smart Cluster Management
  4. Dynamic Data Rebalancer
  5. High Performance Storage Engine
  6. Cross Data Center Replication

1. Real Time Transaction Engine: by taking full advantage of the capabilities of available technology, Aerospike's real-time engine delivers the maximum performance possible and can scale to millions of transactions per second at sub-millisecond latency.

i. Responsible for reading and writing data on request while providing consistency and isolation (which involves synchronous and asynchronous replication).

ii. Reroutes requests to an alternate node if a node becomes unavailable, and handles conflict/duplicate resolution after the node rejoins the cluster.

iii. Multi-core systems: improves latency by reducing data movement across multiple regions, grouping multiple threads per CPU socket.

iv. Context switches: some operations run in the network thread itself instead of being handed off to another thread, which avoids a context switch.

v. Data structure design: safe, concurrent read, write and delete access to the index tree without holding multiple locks. Each tree element has both a reference count and its own lock, so there is no need to acquire multiple locks at each level.

vi. Scheduling and prioritization: in addition to key-value store operations, Aerospike supports batch queries, scans and secondary index queries. Jobs are partitioned by type (the CPU effort required) and prioritized so that the load they generate is controlled.

vii. Memory allocation: system resources are used efficiently by keeping the index packed in RAM, while minimizing the need for memory defragmentation. Fragmentation is minimized by grouping data objects by namespace, so long-term creation, access, modification and deletion are optimized.
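
To make points i and ii more concrete, here is a minimal in-memory sketch in Python of synchronous replication and read failover. It is not Aerospike's implementation or client API: `Node`, `TinyCluster`, the CRC32-based owner selection and the replication factor of 2 are all hypothetical choices made for illustration.

```python
import zlib


class Node:
    """Hypothetical in-memory stand-in for one database node."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.records = {}


class TinyCluster:
    """Toy cluster: each key has one master and (replication_factor - 1) replicas."""

    def __init__(self, names, replication_factor=2):
        self.nodes = [Node(n) for n in names]
        self.replication_factor = replication_factor

    def owners(self, key):
        # Assumption: pick the master with a CRC32 hash, replicas are the next nodes.
        start = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        return [self.nodes[(start + i) % len(self.nodes)]
                for i in range(self.replication_factor)]

    def write(self, key, value):
        # Synchronous replication: return only after every live owner has the record.
        live = [n for n in self.owners(key) if n.up]
        if not live:
            raise RuntimeError("no live owner for key")
        for node in live:
            node.records[key] = value
        return [n.name for n in live]

    def read(self, key):
        # Try the master first, then fall back to the replicas.
        for node in self.owners(key):
            if node.up and key in node.records:
                return node.name, node.records[key]
        raise KeyError(key)


cluster = TinyCluster(["node-a", "node-b", "node-c"])
cluster.write("user:42", {"name": "Asha"})
cluster.owners("user:42")[0].up = False      # simulate the master going down
served_by, record = cluster.read("user:42")  # answered by the replica instead
```

Because the write reached both owners before returning, the read still succeeds after the master is marked as down. A real system also has to resolve conflicts when the failed node rejoins, which this sketch leaves out.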

2. Data Distribution: data partitioning yields a uniform distribution of keys in the digest space, avoiding the creation of hot spots during data access, which helps achieve a high level of scale and fault tolerance.

i. Application workload is uniformly distributed.

ii. Performance of database operations is predictable.
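
The idea of spreading keys evenly over the digest space can be sketched in a few lines of Python. Aerospike hashes each key into a 20-byte digest with RIPEMD-160 and uses 12 bits of it to pick one of 4096 partitions. The sketch below makes two assumptions that are mine, not Aerospike's: it uses SHA-1 as a stand-in (it also yields 20 bytes, and RIPEMD-160 is not available in every Python build), and the way the key is serialized and which 12 bits are taken are illustrative only.

```python
import hashlib
from collections import Counter

N_PARTITIONS = 4096  # 2**12 partitions per namespace


def digest(set_name, user_key):
    # Stand-in hash: SHA-1 produces 20 bytes, like the RIPEMD-160 digest Aerospike uses.
    # The "set:key" serialization is a hypothetical choice for this sketch.
    return hashlib.sha1(f"{set_name}:{user_key}".encode("utf-8")).digest()


def partition_id(d):
    # Keep 12 bits of the digest; taking them from the first two bytes is an assumption.
    return int.from_bytes(d[:2], "little") & (N_PARTITIONS - 1)


# Count how many of 200,000 synthetic keys land in each partition.
counts = Counter(
    partition_id(digest("users", f"user-{i}")) for i in range(200_000)
)
print(len(counts), min(counts.values()), max(counts.values()))
```

With 200,000 keys the average is roughly 49 keys per partition, and the minimum and maximum counts should stay reasonably close to it. That is what "no hot spots" means in practice: no partition, and so no node, receives a disproportionate share of the keys.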

3. Cross-data center Replication: supports different replication topologies, including active-active, active-passive, chain, star, and multi-hop configurations.

  1. Load sharing
  2. Remote cluster management
  3. Data shipping
  4. Pipelining

4. Storage Engine: it is not just throughput and latency characteristics, but also the ability to store and process large swaths of data, that defines the ability of a DBMS to scale up. Aerospike has been designed from the ground up to leverage SSD technology. This allows Aerospike to manage dozens of terabytes of data on a single machine with sub-millisecond record access times. Aerospike supports three kinds of storage structures: Hybrid-Memory, All-Flash, and In-Memory.

5. Dynamic Data Rebalancer: the data rebalancing mechanism ensures that transaction volume is distributed evenly across all nodes and is robust if a node fails during rebalancing itself. The system is designed to be continuously available, so data rebalancing doesn't impact cluster behavior.
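
One way to picture rebalancing is as a deterministic mapping from partitions to nodes that changes as little as possible when the cluster changes. The Python sketch below uses rendezvous (highest-random-weight) hashing for that mapping. This is a technique swapped in for illustration, not Aerospike's actual algorithm, and the node names and helper functions are hypothetical.

```python
import hashlib
from collections import Counter


def score(node, pid):
    # Deterministic pseudo-random weight for a (node, partition) pair.
    h = hashlib.sha256(f"{node}:{pid}".encode("utf-8")).digest()
    return int.from_bytes(h[:8], "big")


def assign(nodes, n_partitions=4096):
    # Each partition goes to the node with the highest score for it.
    return {pid: max(nodes, key=lambda n: score(n, pid))
            for pid in range(n_partitions)}


before = assign(["A", "B", "C", "D"])
after = assign(["A", "B", "C"])  # node D has left the cluster

# Only partitions that lived on D need a new owner; everything else stays put.
moved = [pid for pid in before if before[pid] != after[pid]]
print(len(moved), Counter(after.values()))
```

Every node computes the same mapping from the same node list, so no central coordinator is needed in this sketch, and the partitions of the departed node are spread over the survivors rather than dumped on a single neighbor.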

6. Smart Cluster Management (self-healing cluster management): nodes are added to and removed from the cluster seamlessly. It consists of three subsystems:

i. Heartbeat subsystem.

ii. Clustering subsystem.

iii. Exchange subsystem.
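
As a rough illustration of what a heartbeat subsystem does, here is a small Python sketch of timeout-based failure detection. It is not Aerospike's protocol: the `HeartbeatMonitor` class, the 1500 ms timeout and the explicit `now_ms` timestamps are assumptions chosen to keep the example deterministic.

```python
class HeartbeatMonitor:
    """Hypothetical failure detector: a node is alive if it was heard from recently."""

    def __init__(self, timeout_ms):
        self.timeout_ms = timeout_ms
        self.last_seen = {}  # node name -> timestamp of its latest heartbeat

    def heartbeat(self, node, now_ms):
        # Record that a heartbeat from this node arrived at time now_ms.
        self.last_seen[node] = now_ms

    def alive(self, now_ms):
        # Nodes whose latest heartbeat is no older than the timeout.
        return sorted(n for n, t in self.last_seen.items()
                      if now_ms - t <= self.timeout_ms)


monitor = HeartbeatMonitor(timeout_ms=1500)
for node in ("A", "B", "C"):
    monitor.heartbeat(node, now_ms=0)
for node in ("A", "B"):          # C stops sending heartbeats
    monitor.heartbeat(node, now_ms=1000)

print(monitor.alive(now_ms=2000))  # ['A', 'B']
```

Once a node drops out of the alive list, the remaining nodes have to agree on the new membership and redistribute its partitions, which is presumably where the clustering and exchange subsystems come in.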

This is an overview of the key points that make Aerospike a powerful distributed database engine. We will look at development and the different memory architectures in the next parts.

Thanks for reading!

Resources :

  1. https://www.aerospike.com/technology/#1545435074921-0-15246-ac1d
