Cloud Spanner: All about Spanner in one blog!

Vasu Mittal
Google Cloud - Community
7 min readAug 5, 2024

I know there are multiple data services in Google Cloud Platform(GCP) that we are discussing parallely and it might be confusing for you to select the one which is right for your use-case. Hence, I thought it’s better to explain each one of them one by one and believe me this one, Cloud Spanner has no comparable offering anywhere in any cloud. Excited to learn more? Let’s dive in!

Cloud Spanner

Cloud Spanner is the only enterprise-grade, globally-distributed and strongly consistent managed database service built for cloud specifically to combine the benefits of relational database structure with non-relational horizontal scale.

Spanner is a fully managed database service. It is very simple and super fast to create spanner instances and all the maintenance is managed for you. It has virtually unlimited capacity, automatic replication and is also can be distributed across multiple regions and zones.

Spanner scales horizontally by adding nodes. Spanner databases are replicated across multiple zones in a region and can be replicated across regions.

To summarize, key spanner capabilities are:

  • Highly Available(99.999% SLA)
  • Global Deployments
  • Strongly Consistent
  • Horizontally Scalable
  • Automatic Replication

Spanner databases deployed to a single region offer a 99.99% availability SLA. Spanner databases deployed across multiple regions offer a 99.999% availability SLA.

Spanner scales horizontally i.e it is designed to scale up to any number of machines across multiple zones and regions and can easily scale up and down to meet demand.

Spanner Architecture

Spanner is a distributed database in which nodes provide the compute and storage capacity. Spanner Tables are automatically divided into pieces called splits. Splits are replicated automatically. These replicas are stored in multiple zones and possibly regions. Splits are sorted by Primary Key. Splits will automatically be merged and rebalanced overtime.

Let’s create our first spanner instance!

Creating Cloud Spanner Instance via Console

  1. Go to the Google Cloud Console, on the search bar on the top, write “Cloud Spanner”.
  2. You are now on the home page for Cloud Spanner. Click on “Create a Provisioned Instance”. Creating an instance is a 3 Step Process:

Step 1 — Name your Instance: On the Instance Name field, write your Instance Name. In our usecase, we will write “mediadirectory”. You will notice that the Instance Id is automatically populated. You can leave it as it is and click on “Continue”.

Step 2 — Configure your Instance: Now “Choose a configuration” for your instance from Regional, Dual-Region and Multi-Region options. We will select “Regional” in our case. In “Select a configuration”, we will select “us-east1” in our case and will click on “Continue”.

Step 3 — Allocate Compute Capacity: Now we will select the compute capacity defined in terms of the number of “Nodes” in Spanner. We will select “ 1” in our case and will click on “Create”.

and in less than a min your Spanner Instance in ready to be used!

Create Database in Spanner

Now, you are redirected to the Instance Overview Page automatically and you will see “Create Database” option. You can create a Database in your spanner instance by clicking on “Create Database”. Creating a Database is 2 Step process:

Step 1 — Name your Database: Give a name to your database in “Database name” field. In our case, we will give “mediadatabase”.

Step 2 — Select a Database Dialect: You can select between “Google Standard SQL” and “PostgreSQL”. We will select “Google Standard SQL” in our case and will click on “Create”.

Please note that, you will have an option to specify DDL as well but we will skip this for now as we are going to create tables manually in the next step. Hence, we will click on “Create”.

Now your Database is created and you will land at you Database Overview page automatically. You can validate your Database Name, Database Dialect, Encryption type etc.

Now click on “Create Table” and you can write your create table DDL here. We will write the below in our case:

CREATE TABLE Singers(
SingerId INT64 NOT NULL,
FirstName STRING(1024),
LastName STRING(1024)
) PRIMARY KEY (SingerId);

Now let’s insert a couple of records in this table as below:

Insert into Singers(SingerId, FirstName, LastName) Values(101,"Mike","San");
Insert into Singers(SingerId, FirstName, LastName) Values(102,"John","Tyler");

and now let’s select some data:

select * from Singers where SingerId = 101;

and you will get the details about the Singer with SingerId — 101. Similarly like this you can create databases, tables, views and write SQL queries for your business use-cases.

Spanner Databases

Spanner instances can have one or more databases. Databases can be created using Google Cloud Console, Command Line Interface(CLI), Terraform or your preferred programming language. While creating the database you will have an option to select the SQL dialect between Google Standard SQL or PostgreSQL.

Google Standard SQL is the common SQL Syntax used for Spanner and Bigquery with extensions. It supports ANSI 2011 SQL Standard.

PostgreSQL is the compatible with PostgreSQL tools such as psql command line tool. Please note that some PostgreSQL features such as stored procedures and triggers might not be supported.

Defining Relationships in Spanner

In Spanner, there are two ways of defining relationships:

  1. Primary Key-Foreign Key Constraint: This is similar to what we do in other traditional databases as well. But in this case as the data grows, splits can be added anywhere in the table and this may lead to parent-child data being stored on different splits which might slow down reads. Use Primary key -Foreign key constraint when you are more frequently querying the parent or child data separately. You can create a Foreign key constraint in spanner as below:
ALTER TABLE Orders
ADD FOREIGN KEY (ProductID) REFERENCES Products (ProductID);

2. Interleaved Tables: Parent-Child data is stored together on the same split. An interleaved table is a table that you declare to be an interleaved child of another table because you want the rows of the child table to be physically stored with the associated parent row.

Use Interleaved tables when you are more likely to get the child records along with the parents.

You can create interleaved tables as below:

CREATE TABLE Singers (
SingerId INT64 NOT NULL,
FirstName STRING(1024),
LastName STRING(1024)
) PRIMARY KEY (SingerId);

CREATE TABLE Albums (
SingerId INT64 NOT NULL,
AlbumId INT64 NOT NULL,
AlbumTitle STRING(1024)
) PRIMARY KEY (SingerId, AlbumId),
INTERLEAVE IN PARENT Singers ON DELETE CASCADE;

Interleaved tables are stored on the same splits as the parent but are treated as separate tables. You can use “join” just like any other relational database to retrieve data from the interleaved tables. For example:

SELECT s.FirstName, a.AlbumTitle
FROM Singers AS s JOIN Albums AS a ON s.SingerId = a.SingerId;

Spanner Indexes

In Spanner, Indexes are stored in the same way as tables. Indexes are used to speed up queries and sorts. An index is always created for the Primary Key(s). Secondary Indexes can also be created from one or more fields within a table. For example:

CREATE INDEX SingersByFirstLastName ON Singers(FirstName, LastName);

Views in Spanner

A View is a virtual table defined by a SQL query as in other relational databases as well. When a query that refers to a view is executed, Spanner creates the virtual table by executing the query defined in the view, and that virtual table’s content is used by the referring query.

CREATE OR REPLACE VIEW SingerNames
SQL SECURITY INVOKER
AS SELECT
CAST(Singers.SingerId AS INT64) AS SingerId,
CAST(Singers.FirstName AS STRING) || " " || CAST(Singers.LastName AS STRING) AS Name
FROM Singers;

Choosing Primary Key

If the Primary key of a table keeps increasing(or decreasing) sequentially, then one node will be responsible for all the writes to a table. This will create a hotspot. Hence, you should choose a primary key that will be evenly distributed across the table splits.

A good option is having UUIDs as Primary Key. UUID stands for universally unique identifier. It is a long, randomly generated, unique string. UUID version 1 strings use timestamps to generate unique values. Hence, they monotonically increase, so they should not be used. UUID version 4 uses random numbers to generate the ID and hence, it does not monotonically increase and therefore, will make a good spanner primary key.

Sometimes timestamps along with any other field can be combined to create a composite primary key. You can also hash monotonically increasing keys to create primary key for spanner.

Summary

Using Cloud Spanner you can create realtime mission critical applications at scale. Spanner is a Fully Managed Relational Database, with unlimited scale, strong consistency and upto 99.999% availability. It uses TrueTime, which provides a global notion of time allowing it to scale horizontally, while still delivering strong external consistency across rows, regions and continents. You can serve data with low latency and deliver a unified experience. You can expect everything from a relational database, schemas, SQL Queries and ACID Transactions. In Spanner, data is encrypted at rest and in transit providing you full security. Spanner is a perfect choice for a high-scale online transaction processing and realtime decision making workloads. So, next time you want to deliver mission critical apps at scale and a unified experience with low maintenance do explore Cloud Spanner!

Keep Learning, Keep Growing!!!

--

--