SSTables : The secret sauce that behind Cassandra’s write performance..
Whats SSTables ?
- Sorted Strings Table (SSTable) is a file format used by Apache Cassandra, ScyllaDB, Bigtable to store data
- It organizes data for efficient insertion, making them particularly well-suited for write-heavy workloads.
A database like Cassandra has 3 Main Components:
- Write-ahead Log (WAL): Sequentially records all write operations to ensure durability and maintain write order.
- Memtable: In-memory data structure that caches recent write operations for fast access.
- Sorted String Table (SSTable): On-disk data structure containing sorted key-value pairs, usually generated from flushed memtables
Sorted Strings Table (SSTable) is a persistent file format used to take the in-memory data stored in memtables, and store it on disk in a persistent, ordered, immutable set of files.
- Immutable means SSTables are never modified. They are later merged into new SSTables or deleted as data is updated.
What is the Use of SSTable in Cassandra?
The purpose of a database is to persistently and efficiently store data. That storage needs to be durable, so that the data isn’t lost when the system is shut down .
Options:
- In Memory: Keeping all the data only in memory would be fast, but not durable.
- Normal Disk : Writing every update to storage immediately would be very slow and inefficient at scale.
- B tree : B-tree-based databases use balanced tree structures which provide efficient read and write operations (Used in : PostgreSQL, MySQL.). It provides more balanced performance for sequential and random access.
- SSTable
Benefits of SSTable
- SSTables are quick and efficient to read and write since they are append only and do not overwrite existing data to disk.
- They use binary search and index files to locate blocks and keys within blocks.
- Merging SSTables is similar to doing a merge sort
For details See: How Cassandra works?
Indexing in SS table :
- To find if a key exists we don’t need an index of all the keys in memory, instead we can keep an index for every few kilobytes and then perform a scan (sparse index). So rather than store keys individually, they gain speed by maintaining a selective or sparse index, keeping index entries for only a subset of the data rather than every individual record.
Alternatives to SSTable:
- B-tree-based : B-tree-based databases use balanced tree structures which provide efficient read and write operations without the need for compaction processes. Used in : PostgreSQL, MySQL.It provides more balanced performance for sequential and random access. It is efficient for range queries as traversing the tree can quickly identify a range of keys.
- Columnar Databases: Columnar databases like Apache Parquet or Apache ORC store data in column-oriented format, which can offer significant performance benefits for analytical workloads, though they may not be suitable for all use cases where SSTables are used.
tldr:
SSTable (Sorted Strings Table) is a file of key/value string pairs, sorted by keys
- It provides a persistent,ordered immutable store for keys,values pair.
- Internally, each SSTable contains a sequence of blocks .
- A index block (stored at the end of the SSTable) is used to locate blocks
- The index is loaded into memory when the SSTable is opened.
- A lookup can be performed with a single disk seek: we first find the appropriate block by performing a binary search in the in-memory index, and then reading the appropriate block from disk.