Cloud Spanner — Table Interleaving use cases

Pablo Arrojo
Google Cloud - Community
7 min readOct 3, 2023

Spanner’s table interleaving is a good choice for many parent-child relationships. With interleaving, Spanner physically co-locates child rows with parent rows in storage. That means Spanner will try to keep the parent row and its child rows in the same split.

Co-location between tables can significantly improve performance, but in what situations would we see these improvements? Are these improvements related to write latency or query latency?

Here are some tests to try to understand how table interleaving works and when we would obtain performance improvements.

Spanner Instance

All tests run in a Regional Instance (us-east4).

Instance size: 1 node.

Schema

I’ve created two pairs of tables:

  • table_parent & table_child (non-interleaved)
  • table_parent_i & table_child_i (interleaved)
CREATE SEQUENCE sequence_id_child OPTIONS (
sequence_kind = 'bit_reversed_positive'
);

CREATE SEQUENCE sequence_id_parent OPTIONS (
sequence_kind = 'bit_reversed_positive'
);

CREATE TABLE table_parent (
id STRING(MAX), ---> UUID
field1 STRING(MAX),
field2 STRING(MAX),
) PRIMARY KEY(id);

CREATE TABLE table_child (
id STRING(MAX),
id_child FLOAT64 DEFAULT (GET_NEXT_SEQUENCE_VALUE(SEQUENCE sequence_id_child)),
field1 STRING(MAX),
field2 STRING(MAX),
field3 STRING(MAX),
field4 STRING(MAX),
field5 STRING(MAX),
field6 STRING(MAX),
field7 STRING(MAX),
field8 STRING(MAX),
) PRIMARY KEY(id, id_child);

CREATE TABLE table_parent_i (
id STRING(MAX), ---> UUID
field1 STRING(MAX),
field2 STRING(MAX),
) PRIMARY KEY(id);


CREATE TABLE table_child_i (
id STRING(MAX),
id_child FLOAT64 DEFAULT (GET_NEXT_SEQUENCE_VALUE(SEQUENCE sequence_id_child)),
field1 STRING(MAX),
field2 STRING(MAX),
field3 STRING(MAX),
field4 STRING(MAX),
field5 STRING(MAX),
field6 STRING(MAX),
field7 STRING(MAX),
field8 STRING(MAX),
) PRIMARY KEY(id, id_child),
INTERLEAVE IN PARENT table_parent_i ON DELETE CASCADE;

Write Tests

Let’s start checking how interleaving could affect write latency.

Writing Parent+Childs rows in same tx

In this test, I’m writing in the same transaction a parent row + 3 child rows using mutations.

Non-Interleaved tables

Interleaved tables

As you can see, using interleaving we significantly reduced commit latency (3ms vs 7ms).

This improvement is justified checking avg_participants metric (I’ve written about that here). With interleaving, parents and their childs are co-located in the same split, so the number of participants is just one avoiding the 2PC and obtaining a lower commit latency (keep in mind that adding secondary indexes or change streams could affect this latency improvement).

However, in our test with non-interleaved tables two leader splits are involved in the transaction (one split for table_parent and another one for table_child) hence we have an avg_participants of two (commit latency affected by 2PC).

Writing child rows

Test similar to the previous one, but this time I’m just writing rows on child tables (appending child rows to existing parents). Three rows per commit.

Non-Interleaved tables

Interleaved tables

In this case there is no improvement. It makes sense as we’re writing in a single table (child) so data is expected to be co-located. This is demonstrated by an avg_participants of one in both transactions.

Query Tests

Documentation explains that table interleaving could improve queries performance where parent and childs tables are joined on by Primary Key.

So, is table interleaving an improvement in all cases that meet this condition ? Let ‘s check it out.

Filtering by Parent table ID

In this test I’m joining parent and child tables by Primary Key, looking for a single parent id and retrieving the parent row and all their child rows:

SELECT p.*, 
ARRAY(SELECT AS STRUCT c.* FROM table_child c WHERE c.id = p.id) as c01,
FROM table_parent p WHERE p.id in ('002498b4-16bc-4ae8-9e24-879f04297753');

SELECT p.*,
ARRAY(SELECT AS STRUCT c.* FROM table_child_i c WHERE c.id = p.id) as c01,
FROM table_parent_i p WHERE p.id in ('0017a4a1-5e34-4983-bf7a-72f6959a018a');

Non-Interleaved tables

Interleaved tables

This query is faster with interleaving (20ms vs 31ms).

Checking the execution plans we can notice two distributed operations for the first execution but a single local operation for the last one over interleaved tables.

To obtain more details about these executions, I run queries again using PROFILE mode.

With the returned query stats we can confirm that query executed over interleaved tables is solved by root server (remote_server_calls: 0):

   "queryStats": {
"optimizer_version": "5",
"runtime_creation_time": "0.43 msecs",
"statistics_load_time": "0",
"cpu_time": "41.14 msecs",
"remote_server_calls": "0/0", <<<<<<<<<<<
"elapsed_time": "41.21 msecs",
"rows_returned": "1",
"query_text": "SELECT p.*,ARRAY(SELECT AS STRUCT c.* FROM table_child_i c WHERE c.id = p.id) as c01, FROM table_parent_i p WHERE p.id in ('0017a4a1-5e34-4983-bf7a-72f6959a018a');",
"deleted_rows_scanned": "0",
"filesystem_delay_seconds": "0 msecs",
"query_plan_creation_time": "4.54 msecs",
"locking_delay": "0 msecs",
"bytes_returned": "8149054",
"optimizer_statistics_package": "auto_20230925_04_09_10UTC",
"rows_scanned": "2001",
"data_bytes_read": "444819"
}

However, for query execution over regular tables we can see remote servers calls:

 "queryStats": {
"rows_returned": "1",
"query_plan_creation_time": "3.61 msecs",
"rows_scanned": "2001",
"cpu_time": "46.76 msecs",
"locking_delay": "0 msecs",
"deleted_rows_scanned": "0",
"runtime_creation_time": "0.64 msecs",
"elapsed_time": "46.64 msecs",
"data_bytes_read": "503138",
"remote_server_calls": "1/1", <<<<<<<<<
"filesystem_delay_seconds": "0 msecs",
"optimizer_statistics_package": "auto_20230925_04_09_10UTC",
"query_text": "SELECT p.*, ARRAY(SELECT AS STRUCT c.* FROM table_child c WHERE c.id = p.id) as c01, FROM table_parent p WHERE p.id in ('002498b4-16bc-4ae8-9e24-879f04297753');",
"optimizer_version": "5",
"bytes_returned": "8149054",
"statistics_load_time": "0"
}

These metrics demonstrate the improvement achieved by data co-location for this kind of queries.

Filtering by Secondary Index on Parent table

What if we are joining on a parent and child table by PK but we have to filter parents by a non-id field?

In that case we have to create a secondary index to avoid a full table scan over the parent table. Would interleaving provide some improvement?

Let ‘s see.

Indexes:

create index field2_idx_s on table_parent(field2) storing (field1);
create index field2i_idx_s on table_parent_i(field2) storing (field1);

Queries:

SELECT p.*,
ARRAY(SELECT AS STRUCT c.* FROM table_child c WHERE c.id = p.id) as c01
from table_parent p where p.field2='field2_test';

SELECT p.*,
ARRAY(SELECT AS STRUCT c.* FROM table_child_i c WHERE c.id = p.id) as c01
from table_parent_i p where p.field2='field2_test';

Non-Interleaved tables

Interleaved tables

In this case there is no difference either in latency and execution plans between the executions.

Both executions are doing two distributed operations. First filtering the parent rows doing an index-only scan and then looking for matched child rows in the child table.

We can confirm this by running these queries using PROFILE mode. In both cases we can see the same number of remote server calls:

    "queryStats": {
"deleted_rows_scanned": "0",
"elapsed_time": "38.86 msecs",
"cpu_time": "40.75 msecs",
"rows_returned": "1",
"bytes_returned": "8201526",
"rows_scanned": "2014",
"optimizer_statistics_package": "auto_20230925_04_09_10UTC",
"runtime_creation_time": "0.74 msecs",
"query_text": "SELECT p.*,ARRAY(SELECT AS STRUCT c.* FROM table_child c WHERE c.id = p.id) as c01 from table_parent p where p.field2='field2_test';",
"optimizer_version": "5",
"locking_delay": "0 msecs",
"data_bytes_read": "669826",
"filesystem_delay_seconds": "0 msecs",
"remote_server_calls": "1/1", <<<<<<<
"query_plan_creation_time": "3.78 msecs",
"statistics_load_time": "0"
}

Interleaved tables:

 "queryStats": {
"data_bytes_read": "669786",
"elapsed_time": "45.91 msecs",
"cpu_time": "46.1 msecs",
"runtime_creation_time": "0.56 msecs",
"filesystem_delay_seconds": "0 msecs",
"query_plan_creation_time": "3.78 msecs",
"statistics_load_time": "0",
"rows_scanned": "2014",
"query_text": "SELECT p.*,ARRAY(SELECT AS STRUCT c.* FROM table_child_i c WHERE c.id = p.id) as c01 from table_parent_i p where p.field2='field2_test';",
"optimizer_version": "5",
"deleted_rows_scanned": "0",
"optimizer_statistics_package": "auto_20230925_04_09_10UTC",
"bytes_returned": "8201526",
"remote_server_calls": "1/1", <<<<<<<<<
"rows_returned": "1",
"locking_delay": "0 msecs"
}

Based on these results, we can say that for this kind of query table interleaving doesn’t provide any improvement.

Important consideration: Split Size

As we have seen Spanner keeps parent and child table rows co-located in the same Split, but is important to keep in mind that (according to documentation) Split max size is about 8Gb. So, What happens if the total size for parent+child rows is larger than 8Gb?

Let ‘s check it out.

The below ID has more than 4.5M rows in child table:

As row size is 4Kb, total size for child rows is about 18Gb, that means we need more than 2 Splits to store them.

Here is a simple test adding child rows to the id 00000168-efd7–4588-a8a8-ce342f76657a. I’m writing three rows per commit.

Look at the avg_participants metric, it is about 2.5. So for this parent id their child rows are all distributed across three Splits hence we’re losing the benefit of data co-location that interleaving should provides.

Conclusion

According to the results of our tests, table Interleaving is a good choice when:

  • Parent and child rows are written together in the same transaction.
  • Queries where parent and child tables are joined by primary key, and filter parents by id.

Table interleaving doesn’t provide any improvement when:

  • Appending child rows to an existing parent.
  • Queries that need to filter data using a secondary index over parent or child tables. Despite of parent and childs tables are joined or not.
  • Total size for parent and its child rows is larger than 8Gb.

--

--