Cloud Spanner — Demystifying load-based Splitting

Pablo Arrojo
Google Cloud - Community
7 min read · May 7, 2024

As a distributed database, Cloud Spanner divides your tables and indexes into chunks called “splits.” A split holds a range of contiguous rows, where the start and end keys of this range are called “split boundaries”.

Cloud Spanner can automatically add and remove split boundaries based on a load-based splitting strategy. This means that in addition to dividing splits by size (when a split reaches approximately 8 GB), Spanner can detect hot splits and divide them to distribute load across different servers.

This sounds great, but what is the heuristic behind this feature? Does Spanner detect the load among a range of keys and then add new split boundaries based on this? Or does it just detect that a single split is overloaded and divide it into two or more splits? How much time does it take to detect and resolve these hotspots?

Let’s run some tests to answer these questions.

Tests description

For each test, I created a table with 1k rows to ensure that all the data was stored in a single split.

I used the YCSB tool (link) to load data and run read-only workloads (to avoid splitting based on size) against these tables.

To detect when Cloud Spanner adds new split boundaries, I checked the remote_server_calls statistic, scheduling a query that performs a full table scan every minute to obtain it.

As Cloud Spanner adds new split boundaries, we should see the number of remote_server_calls increase.

Snippet

import logging
from datetime import datetime

def query_profile(db, query, tzinfo):
    # Run the query in PROFILE mode and log the remote_server_calls statistic.
    logger = logging.getLogger(__name__)
    logging.basicConfig(filename='query_autosplit.log', level=logging.DEBUG)

    with db.snapshot() as snapshot:
        results = snapshot.execute_sql(
            query,
            query_mode='PROFILE',
        )

        # The query stats are only available after the result set is fully consumed.
        for row in results:
            continue

        logger.info("{} || {}: {}".format(
            datetime.now(tzinfo),
            "remote_server_calls",
            results.stats.query_stats["remote_server_calls"]))
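
For reference, a minimal sketch of one way to run this check once per minute with the google-cloud-spanner Python client. The instance and database IDs below are placeholders, and the simple loop is just one way to schedule it; the UTC-3 offset matches the timestamps in the logs further down:

import time
from datetime import timezone, timedelta

from google.cloud import spanner

# Placeholder instance/database IDs: adjust to your environment.
client = spanner.Client()
db = client.instance("test-instance").database("test-db")
tz = timezone(timedelta(hours=-3))  # UTC-3, as seen in the log timestamps

# Issue the full table scan once per minute and log remote_server_calls.
while True:
    query_profile(db, "SELECT id FROM usertable", tz)
    time.sleep(60)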

Uniformly distributed workload

As an initial test, I’m going to run workload C (queries reading a single row), increasing the traffic by 5k QPS every 10 minutes. The traffic has a uniform distribution across the 1k rows.

Load command:

python2.7 bin/ycsb load cloudspanner -P cloudspanner.properties -P workloads/workloadc -p recordcount=1000 -p cloudspanner.batchinserts=1000 -p zeropadding=1 -p insertorder=ordered -threads 1 -s

Workload command:

python2.7 bin/ycsb run cloudspanner -P cloudspanner.properties -P workloads/workloadc -p recordcount=1001 -p operationcount=1000000000 -p zeropadding=1 -p insertorder=ordered -p insertstart=1 -p insertcount=1000 -p requestdistribution=uniform -threads 100 -target 5000

Results

After hitting around 24k QPS and 40% CPU usage (at 10:30), the number of remote_server_calls increased from 0 to 2, so Cloud Spanner added new split boundaries based on load:

INFO:__main__:2024-04-26 10:24:54.379478-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 10:25:54.446098-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 10:26:54.498320-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 10:27:54.646448-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 10:28:54.733668-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 10:29:54.844556-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 10:30:55.009776-03:00 || remote_server_calls: 2/2 <<<<<
INFO:__main__:2024-04-26 10:31:55.097217-03:00 || remote_server_calls: 2/2
INFO:__main__:2024-04-26 10:32:55.187852-03:00 || remote_server_calls: 2/2
INFO:__main__:2024-04-26 10:33:55.333891-03:00 || remote_server_calls: 2/2
INFO:__main__:2024-04-26 10:34:55.445702-03:00 || remote_server_calls: 2/2
INFO:__main__:2024-04-26 10:35:55.552410-03:00 || remote_server_calls: 2/2

Now, let’s check the new split boundaries. Using Key Visualizer, we confirm a uniform load distribution across the keys:

Hence the key ranges of these new splits should have a similar size. Here I use a custom Python script to check the remote_server_calls involved in a range of keys:

Snippet:

def query_profile(db, query):
    # Run the query in PROFILE mode and print the remote_server_calls statistic.
    with db.snapshot() as snapshot:
        results = snapshot.execute_sql(
            query,
            query_mode='PROFILE',
        )

        # The query stats are only available after the result set is fully consumed.
        for row in results:
            continue

        print("{}: {}".format(
            "remote_server_calls",
            results.stats.query_stats["remote_server_calls"]))
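
In this data, a query whose key range stays within a single split reports 0 remote_server_calls, while one that spans splits reports a higher value, so probing with boundary predicates reveals where the splits start and end. As a usage sketch, the probes below can be driven with the function above, reusing the db handle from the earlier snippet:

# Probe candidate boundaries: a range confined to one split reports 0 remote calls.
for q in [
    "SELECT id FROM usertable",
    "SELECT id FROM usertable WHERE id <= 'user49'",
    "SELECT id FROM usertable WHERE id < 'user49'",
    "SELECT id FROM usertable WHERE id >= 'user49'",
]:
    print(">" + q)
    query_profile(db, q)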

Output:

>select id from usertable
remote_server_calls: 2/2
>select id from usertable where id <= 'user49';
remote_server_calls: 2/2
>select id from usertable where id < 'user49';
remote_server_calls: 0/0
>select id from usertable where id >= 'user49';
remote_server_calls: 0/0

It is confirmed that the keys were distributed evenly across the new splits (note that with zeropadding=1 the keys sort lexicographically, so user489 immediately precedes user49):

Split boundaries: [user0,user489] ; [user49, user999]

Hotspot workload

Now, I’ll run workload C again, but this time most of the traffic will read a hot key.

To achieve this, I loaded a subset of rows (50 rows) with key prefix user0000n (zeropadding=5), and ran the workload over these rows with a hotspot request distribution.

Load commands:

python2.7 bin/ycsb load cloudspanner -P cloudspanner.properties -P workloads/workloadc -p recordcount=200 -p cloudspanner.batchinserts=200 -p zeropadding=12 -p insertorder=ordered -threads 1 -s
python2.7 bin/ycsb load cloudspanner -P cloudspanner.properties -P workloads/workloadc -p recordcount=51 -p cloudspanner.batchinserts=51 -p zeropadding=5 -p insertstart=1 -p insertcount=50 -p insertorder=ordered -threads 1 -s
python2.7 bin/ycsb load cloudspanner -P cloudspanner.properties -P workloads/workloadc -p recordcount=751 -p cloudspanner.batchinserts=750 -p zeropadding=1 -p insertstart=1 -p insertcount=750 -p insertorder=ordered -threads 1 -s

Workload commands:

 python2.7 bin/ycsb run cloudspanner -P cloudspanner.properties -P workloads/workloadc -p recordcount=6 -p operationcount=1000000000 -p zeropadding=5 -p insertorder=ordered -p insertstart=1 -p insertcount=5 -p requestdistribution=hotspot -threads 100 -target 5000

Results

As before, after hitting around 24k QPS and 40% CPU usage (at 11:35), Cloud Spanner added new split boundaries:


INFO:__main__:2024-04-26 11:29:46.789904-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 11:30:46.944606-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 11:31:47.055084-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 11:32:47.193315-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 11:33:47.324391-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 11:34:47.596390-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 11:35:47.738780-03:00 || remote_server_calls: 2/2 <<<<
INFO:__main__:2024-04-26 11:36:47.829214-03:00 || remote_server_calls: 2/2
INFO:__main__:2024-04-26 11:37:47.998237-03:00 || remote_server_calls: 2/2
INFO:__main__:2024-04-26 11:38:48.066118-03:00 || remote_server_calls: 2/2

Using the Key Visualizer tool, we can confirm the hot key (user00001):

This time, Cloud Spanner didn’t create split boundaries of similar size. They were created based on the hot key (user00001):

>select id from usertable;
remote_server_calls: 2/2
>select id from usertable where id <= 'user00001';
remote_server_calls: 2/2
>select id from usertable where id < 'user00001';
remote_server_calls: 0/0
>select id from usertable where id >= 'user00001';
remote_server_calls: 0/0

Split boundaries: [user0,user000000000199] ; [user00001, user999]

Something really interesting is that 4 minutes later (at 11:39), Cloud Spanner added another split boundary. The number of remote_server_calls increased from 2 to 3:

INFO:__main__:2024-04-26 11:35:47.738780-03:00 || remote_server_calls: 2/2
INFO:__main__:2024-04-26 11:36:47.829214-03:00 || remote_server_calls: 2/2
INFO:__main__:2024-04-26 11:37:47.998237-03:00 || remote_server_calls: 2/2
INFO:__main__:2024-04-26 11:38:48.066118-03:00 || remote_server_calls: 2/2
INFO:__main__:2024-04-26 11:39:48.190337-03:00 || remote_server_calls: 3/3 <<<<<
INFO:__main__:2024-04-26 11:40:48.349421-03:00 || remote_server_calls: 3/3
INFO:__main__:2024-04-26 11:41:48.475653-03:00 || remote_server_calls: 3/3
INFO:__main__:2024-04-26 11:42:48.631019-03:00 || remote_server_calls: 3/3

It seems that the hot key was isolated in a dedicated split, meaning that Cloud Spanner actually detects and mitigates hotspot issues:

>select id from usertable;
remote_server_calls: 3/3
>select id from usertable where id <= 'user00001';
remote_server_calls: 2/2
>select id from usertable where id < 'user00001';
remote_server_calls: 0/0
>select id from usertable where id > 'user00002';
remote_server_calls: 0/0

Final split boundaries: [user0, user000000000199]; [user00001]; [user00002,user999]

Scan reads workload

As a last test, I’ll run workload E. Instead of queries that read a single row, these queries scan a range of keys.

Load command:

python2.7 bin/ycsb load cloudspanner -P cloudspanner.properties -P workloads/workloade -p recordcount=1000 -p cloudspanner.batchinserts=1000 -p zeropadding=1 -p insertorder=ordered -threads 1 -s

Workload command:

 python2.7 bin/ycsb run cloudspanner -P cloudspanner.properties -P workloads/workloade -p recordcount=1001 -p operationcount=1000000000 -p updateproportion=0 -p insertproportion=0 -p zeropadding=1 -p insertorder=ordered -p insertstart=1 -p insertcount=1000 -p requestdistribution=uniform -threads 30 -target 5000

Results

Unlike the previous tests, where Cloud Spanner added new split boundaries once QPS and CPU usage got high, in this case we don’t see any splitting operation.

INFO:__main__:2024-04-26 13:03:03.614536-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:04:03.746457-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:05:03.863365-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:06:04.033387-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:07:04.187267-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:08:04.312385-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:09:04.434225-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:10:04.534277-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:11:04.664288-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:12:04.790135-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:13:04.964234-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:14:05.099777-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:15:05.208324-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:16:05.366656-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:17:05.540617-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:18:05.676499-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:19:05.823225-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:20:06.076858-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:21:06.232584-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:22:06.411329-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:23:06.606161-03:00 || remote_server_calls: 0/0
INFO:__main__:2024-04-26 13:24:06.709737-03:00 || remote_server_calls: 0/0

Why didn’t Cloud Spanner divide this split?

I think that because these queries read multiple rows per execution, Cloud Spanner detects that adding split boundaries would cause a single query to access multiple splits, which could incur greater overhead and degrade performance.
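
To illustrate the idea, workload E issues short range scans rather than point reads. A scan whose key range straddles a split boundary has to touch multiple splits in a single query; for example, if the boundary at user49 from the first test existed, a probe like the following (a simplified sketch reusing the earlier function, not the exact statement YCSB generates) would already report remote calls:

# A workload E style scan reads a contiguous range of keys. With a boundary at
# 'user49' (as created in the first test), this range would straddle two splits,
# so a single query would need remote server calls.
scan_query = ("SELECT id FROM usertable "
              "WHERE id >= 'user488' ORDER BY id LIMIT 10")
query_profile(db, scan_query)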

Conclusions

  • The load-based splitting heuristic considers both the number of operations a single split is serving and how data is accessed within it.
  • This means that even if a single split is serving a large number of QPS, it won’t be divided if dividing it doesn’t provide any performance improvement.
  • New split boundaries are decided based on the workload distribution across the keys.
  • Cloud Spanner can detect and isolate hot keys in dedicated splits.
