Apache Cassandra | Spring Data for Cassandra — Cassandra Native Driver CRUD Performance Comparison

Fatih Yıldızlı
Published in turkcell · 5 min read · Nov 28, 2020

✅ My Purpose

In this article, I compare the performance of two Cassandra drivers, Spring Data for Cassandra and the Cassandra Native Driver, behind a REST API on Spring Boot. I compared how each driver behaves for CRUD operations exposed as REST endpoints. The project was written as a proof of concept, without any high-level architecture or software patterns. The main goal is to compare the performance of the two drivers over a loop of 1M operations. The results are informational only and were measured against a local Cassandra cluster; they may vary under different conditions.
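The measurement idea described above, timing a loop of 1M operations, can be sketched roughly as follows. This is a minimal illustration and not the code from the project: the class name `CrudLoopTimer`, the method `timeLoopMillis`, and the placeholder workload are all assumptions made for this example. In the real comparison, the operation would be a single insert, select, update, or delete issued through one of the two drivers.

```java
import java.util.function.LongConsumer;

// Hypothetical helper (not from the original project): times N calls of one operation.
class CrudLoopTimer {

    /** Runs the operation once per id in [0, iterations) and returns elapsed wall-clock milliseconds. */
    static long timeLoopMillis(long iterations, LongConsumer operation) {
        long start = System.nanoTime();
        for (long id = 0; id < iterations; id++) {
            operation.accept(id);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Placeholder workload; a real run would execute one CRUD statement per id.
        long[] calls = new long[1];
        long elapsedMs = timeLoopMillis(1_000_000L, id -> calls[0] += 1);
        System.out.println(calls[0] + " operations took " + elapsedMs + " ms");
    }
}
```

A single timed run like this includes JVM warm-up and any connection setup, which is one reason the numbers later in the article should be read as informational only.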

🔗 Github Repo link :

❔ What is Apache Cassandra ?

Apache Cassandra is a highly-scalable partitioned row store. Rows are organized into tables with a required primary key.

Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent manner. Cassandra will automatically repartition as machines are added and removed from the cluster.
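To make the idea of partitioning more concrete, here is a toy token ring in Java. It is only a sketch under simplifying assumptions: each node owns a single token, and `tokenFor` is a made-up hash for illustration, whereas Cassandra's default partitioner is Murmur3-based. The class `ToyTokenRing` and the node names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of a token ring; not Cassandra's actual implementation.
class ToyTokenRing {
    // token -> node name; a node owns the token range that ends at its token
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(String node, long token) {
        ring.put(token, node);
    }

    void removeNode(long token) {
        ring.remove(token);
    }

    /** Illustrative token function mapping a key into [0, 1000); an assumption, not Murmur3. */
    static long tokenFor(long partitionKey) {
        return Math.floorMod(partitionKey * 2654435761L, 1000L);
    }

    /** Returns the node holding the first token >= the key's token, wrapping around the ring. */
    String ownerOf(long partitionKey) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(tokenFor(partitionKey));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        ToyTokenRing ring = new ToyTokenRing();
        ring.addNode("node-a", 250L);
        ring.addNode("node-b", 500L);
        ring.addNode("node-c", 750L);
        for (long id = 0; id < 4; id++) {
            System.out.println("id " + id + " -> " + ring.ownerOf(id));
        }
        // Adding a node changes ownership only for keys whose tokens fall in its new range.
        ring.addNode("node-d", 900L);
        System.out.println("after adding node-d: id 1 -> " + ring.ownerOf(1L));
    }
}
```

The application only ever asks for a row by its key; which machine holds that row is decided by the ring, which is what "application-transparent" refers to here.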

Row store means that like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.

🌐 History

Developed by: Apache Software Foundation

Stable release: 3.11.9 / August 31, 2020; 2 months ago

Original author(s): Avinash Lakshman, Prashant Malik / Facebook

Written in: Java

License: Apache License 2.0

Initial release: July 2008; 12 years ago

📢 Prerequisites:

  • A Cassandra cluster (local installation or Docker instance) listening on the default port 9042

http://cassandra.apache.org/download/

  • Execute the initial script below:

-- Note: on a single-node local cluster, a replication_factor above 1 cannot be fully satisfied.
CREATE KEYSPACE local
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '4'}
    AND durable_writes = true;

USE local;

CREATE TABLE local.dummy
(
    id       bigint PRIMARY KEY,
    column_1 text,
    column_2 text
)
    WITH caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND compaction = {'max_threshold': '32', 'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
    AND compression = {'class': 'org.apache.cassandra.io.compress.LZ4Compressor', 'chunk_length_in_kb': '64'}
    AND dclocal_read_repair_chance = 0.1;

  • JDK 1.8

📝Dependencies

❔ What is Spring Data for Cassandra ?

The primary goal of the Spring Data project is to make it easier to build Spring-powered applications that use new data access technologies such as non-relational databases, map-reduce frameworks, and cloud based data services.

The Apache Cassandra NoSQL Database offers many new capabilities for teams seeking a solution to handle high velocity, high volume and variable data flows. This new way of thinking introduces new concepts and a learning curve that can be intimidating to team members and team managers. Spring Data for Apache Cassandra offers a familiar interface to those who have used other Spring Data modules in the past.

The learning curve for developing applications with Apache Cassandra is significantly reduced when using Spring Data for Apache Cassandra. With the power to stay at a high level with annotated POJOs, or at a low level with high performance data ingestion capabilities, the Spring Data for Apache Cassandra templates are sure to meet every application need.

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-cassandra</artifactId>
</dependency>

❔ What is Cassandra Native Driver ?

A modern, feature-rich and highly tunable Java client library for Apache Cassandra® (2.1+) and DataStax Enterprise (4.7+), and DataStax Astra, using exclusively Cassandra’s binary protocol and Cassandra Query Language (CQL) v3.

<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
</dependency>

☑️ CRUD OPERATIONS

Basic table fields:

Dummy Data

✏️1 Million Row Insert Comparison

📍Cassandra Native Driver

Cassandra Native Driver 1M insert

📍Spring Data for Cassandra

Spring Data for Cassandra 1M insert

✏️1 Million Row Select Comparison ( * )

📍Cassandra Native Driver

Cassandra Native Driver 1M select

📍Spring Data for Cassandra

Spring Data for Cassandra 1M select

✏️1 Million Row Update Comparison (with where condition)

📍Cassandra Native Driver

Cassandra Native Driver 1M update

📍Spring Data for Cassandra

Spring Data for Cassandra 1M update

✏️1 Million Row Delete Comparison

📍 Cassandra Native Driver

Cassandra Native Driver 1M delete

📍Spring Data for Cassandra

Spring Data for Cassandra 1M delete

⚠️ Conclusion

The main goal was to compare the performance of the two drivers on Spring Boot. The results are informational only and were measured against a local Cassandra cluster; they may vary under different conditions.

👁‍🗨 Performance Ranking for CRUD on 1 million rows

Create

Cassandra Native Driver: 19,723 ms
Spring Data for Cassandra: 3,170,808 ms (about 52.8 min)

Read

Cassandra Native Driver: 8,668 ms
Spring Data for Cassandra: 50,455 ms

Update

Cassandra Native Driver: 16,691 ms
Spring Data for Cassandra: 3,125,532 ms (about 52.1 min)

Delete

Cassandra Native Driver: 30,055 ms
Spring Data for Cassandra: 3,093,708 ms (about 51.6 min)
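To put these timings in perspective, the short Java sketch below converts them into rows per second and into a ratio between the two drivers. The millisecond values are copied from the ranking above; the class `CrudThroughput` and its helper methods are hypothetical names introduced only for this calculation.

```java
// Hypothetical helper for interpreting the measured timings; not part of the original project.
class CrudThroughput {

    /** Rows processed per second for a run that handled `rows` rows in `elapsedMillis`. */
    static double rowsPerSecond(long rows, long elapsedMillis) {
        return rows * 1000.0 / elapsedMillis;
    }

    /** How many times longer the slower run took compared with the faster one. */
    static double slowdown(long slowerMillis, long fasterMillis) {
        return (double) slowerMillis / fasterMillis;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L;
        String[] operations = {"Create", "Read", "Update", "Delete"};
        // Timings in milliseconds, taken from the ranking in this article.
        long[] nativeMs = {19_723L, 8_668L, 16_691L, 30_055L};
        long[] springMs = {3_170_808L, 50_455L, 3_125_532L, 3_093_708L};

        for (int i = 0; i < operations.length; i++) {
            System.out.printf("%-6s native=%.0f rows/s, spring=%.0f rows/s, ratio=%.1fx%n",
                    operations[i],
                    rowsPerSecond(rows, nativeMs[i]),
                    rowsPerSecond(rows, springMs[i]),
                    slowdown(springMs[i], nativeMs[i]));
        }
    }
}
```

For example, the create run works out to roughly a 160-fold difference between the two drivers in this setup.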

I hope you enjoyed it. Thank you for reading!
