GCP Crash Course: Storage and Databases
Basic Concepts
Storage
Disks? Not so simple…
Common cloud disks are called block storage; they are usually made of slices of several physical disks presented “virtually” as one, and replicated behind the scenes.
You can choose the type (SSD, HDD), the speed, and the location (which affects latency). Disks can be zonal or regional.
Different options come with different costs and performance.
But you may also use RAM disks and distributed file systems.
Object (key/value) storage, similar in spirit to Drive or Dropbox → Cloud Storage.
Cloud Storage (the equivalent of S3) may only contain files (not applications, for example),
and it is inexpensive and full of useful features (versioning, lifecycle management, archiving…).
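To make lifecycle management concrete: a bucket can automatically move old objects to a cheaper storage class or delete them. The Python dict below mirrors the JSON layout that `gsutil lifecycle set` accepts; the 365-day and 5-year thresholds and the Coldline target are purely illustrative choices, not recommendations from these notes.

```python
import json

# Illustrative lifecycle policy (thresholds are hypothetical examples):
# after 365 days move objects to Coldline, after ~5 years delete them.
lifecycle_policy = {
    "lifecycle": {
        "rule": [
            {
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {"age": 365},
            },
            {
                "action": {"type": "Delete"},
                "condition": {"age": 1825},
            },
        ]
    }
}

# Saved as lifecycle.json, this could be applied with:
#   gsutil lifecycle set lifecycle.json gs://YOUR_BUCKET
print(json.dumps(lifecycle_policy, indent=2))
```

Rules are evaluated per object, so per-object storage classes (mentioned below in the cheatsheet) combine naturally with this.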
Databases
We are talking about managed services from GCP, that is, prêt-à-porter databases.
They usually come with automated backups and failover capabilities, with minimal or no effort on your part.
Amazing! But pay attention: it is not a free meal. Databases are usually the most expensive services.
Main Features/Products:
Relational
Cloud SQL (MySQL and PostgreSQL): for small (MySQL) to medium (PostgreSQL) databases. Lives inside a single region.
Spanner: a big, multi-region, highly scalable RDBMS.
NoSQL
Cloud Datastore: an economical document NoSQL database (like MongoDB),
with a SQL-like query language; scalable, but it does not always offer strong consistency.
Bigtable: at the same time a database and a big-data tool, unique of its kind.
A key/value NoSQL database, but the values are structured in hierarchical columns.
Used by Google for Gmail, Maps, etc.
Petabytes served in milliseconds.
Ask yourself
What do RPO and RTO mean? Why are they so important?
Consider a WordPress site, a gaming startup, and a big corporation or institution.
What are their database and security requirements? Which products are suitable? Why?
How can I build a static website with Cloud Storage? What are the benefits?
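For the RPO/RTO question, a back-of-the-envelope calculation is enough to see the idea: RPO (Recovery Point Objective, how much data you can afford to lose) is bounded by how often you back up, while RTO (Recovery Time Objective, how long you can afford to be down) is bounded by how long a restore takes. The numbers below are made up for illustration:

```python
# Hypothetical backup policy: one nightly backup, and a restore
# procedure that takes about two hours end to end.
backup_interval_hours = 24   # time between backups
restore_duration_hours = 2   # time to rebuild the service from a backup

# Worst-case data loss: failure strikes just before the next backup runs.
rpo_hours = backup_interval_hours
# Worst-case downtime: the full restore procedure.
rto_hours = restore_duration_hours

print(f"RPO <= {rpo_hours}h, RTO ~= {rto_hours}h")
```

A gaming startup that cannot lose a day of transactions would need continuous replication (much smaller RPO), which is exactly where the HA and replication options in the cheatsheet below come in.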
Cheatsheet
No long explanations, only the relevant info with links. There is more here than you actually need for the certification; just pick what you need.
Disks — Block Storage
Zonal standard persistent disk and zonal SSD persistent disk: max 64 TB. IOPS: 3,000 (standard) / 60,000 (SSD).
Regional persistent disk and regional SSD persistent disk: max 64 TB, replicated in two zones. IOPS: 3,000 / 60,000, with greater latency.
Local SSD: more expensive, ephemeral, max 3 TB per instance. IOPS: 280,000 / 680,000.
Create a file server or distributed file system on Compute Engine to use as a network file system with NFSv3 and SMB3 capabilities.
Mount a RAM disk within instance memory to create a block storage volume with high throughput and low latency.
Snapshots — incremental backups, encrypted with system-defined keys or with customer-supplied keys.
Cloud Storage
Objects and buckets; 99.999999999% (eleven nines) durability.
gsutil tool
encryption at rest
object versioning, object notification, access logging,
lifecycle management, per-object storage classes,
composite objects and parallel uploads.
Storage Transfer Service transfers data from an online data source to a data sink. The data source can be an Amazon Simple Storage Service (Amazon S3) bucket, an HTTP/HTTPS location, or a Cloud Storage bucket. The data sink (the destination) is always a Cloud Storage bucket.
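A transfer job is described declaratively. The dict below sketches the general shape of a job for the Storage Transfer Service REST API (`transferJobs.create`); the project and bucket names are placeholders, and the exact field names should be verified against the current API reference rather than taken from these notes.

```python
# Sketch of a Storage Transfer Service job: S3 source -> Cloud Storage sink.
# All identifiers are placeholders; field names follow the v1 REST API as
# best remembered -- check the official reference before using this shape.
transfer_job = {
    "description": "Nightly S3 -> GCS sync (example)",
    "projectId": "my-project",                 # placeholder project
    "transferSpec": {
        "awsS3DataSource": {
            "bucketName": "my-s3-bucket",      # placeholder source bucket
        },
        "gcsDataSink": {
            "bucketName": "my-gcs-bucket",     # placeholder destination bucket
        },
    },
    "status": "ENABLED",
}
```

The same sink-side shape applies when the source is an HTTP/HTTPS list or another Cloud Storage bucket; only the source block changes.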
Classes:
Multi-Regional Storage 99.99%
Regional Storage 99.99%
Nearline Storage 99.9%
Coldline Storage 99.9%, $0.007/GB per month
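At the Coldline price quoted above, archival storage is very cheap. A quick sanity check, with a hypothetical 500 GB archive:

```python
coldline_price_per_gb_month = 0.007   # $/GB per month, figure from the notes above
archive_size_gb = 500                 # hypothetical archive size

monthly_cost = coldline_price_per_gb_month * archive_size_gb
print(f"${monthly_cost:.2f}/month")   # prints $3.50/month
```

Remember that Coldline (and Nearline) also bill retrieval and early-deletion fees, so the storage price alone understates the cost of data you read back often.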
Useful functions:
Cloud Pub/Sub notifications for Cloud Storage; triggers for Cloud Functions.
Storage Transfer Service vs. Transfer Appliance (the equivalent of AWS Snowball).
SQL DBs
MySQL: regional. 1st Generation up to MySQL 5.5; 2nd Generation up to MySQL 5.7.
1st Generation: max 16 GB of RAM and 500 GB of data storage.
2nd Generation: up to 416 GB of RAM and 64 CPUs.
sql-proxy → secure external connections.
Data replication across zones; connections from the MySQL client and external applications.
Applications on Kubernetes Engine can also connect to it.
HA configuration cluster: a primary instance plus a failover replica (in a different zone; only one).
Semisynchronous replication → watch the replication lag.
PostgreSQL: regional, 2nd Generation, version 9.6. Data replication across zones.
No point-in-time recovery (PITR).
PostgreSQL extensions; PL/pgSQL procedural language.
Applications running on Compute Engine.
Applications running on Kubernetes Engine.
HA configuration: a regional instance located in a primary and a secondary zone, plus a standby instance.
Synchronous replication.
Multi-zone and multi-region replicated configurations.
Cloud Datastore: NoSQL document database. GQL → SQL-like query language. Cheap, plus a free tier.
Terminology: Table → Kind; Row → Entity; Field → Property; primary key → Key.
ACID transaction properties.
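To make GQL's SQL-like flavor concrete, here is a query against a hypothetical `Task` kind; the kind and property names are invented for illustration, not taken from any real schema:

```python
# A hypothetical GQL query. 'Task' is a Kind (the analogue of a table);
# 'done' and 'priority' are Properties (the analogue of fields).
gql_query = (
    "SELECT * FROM Task "
    "WHERE done = false AND priority >= 4 "
    "ORDER BY priority DESC"
)
print(gql_query)
```

Note that, unlike real SQL, GQL queries run against entities and their indexes, so combining filters and sort orders like this typically requires a matching composite index.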
Cloud Bigtable is a key/value store. It does not support joins, and it supports transactions only within a single row.
Clusters are zonal; an instance becomes regional (or multi-region) by adding replicated clusters.
Cloud Bigtable performs best with 1 TB or more of data.
A NoSQL database that scales to billions of rows and thousands of columns, letting you store terabytes or even petabytes of data. Managed, but not so easy to configure and optimize.
An instance is a container for your clusters and nodes.
Single-keyed data with very low latency; integrates with the Apache big-data ecosystem through the Apache HBase library for Java.
To replicate → just add a second cluster; replication is all automatic.
Data is organized in column families; values are stored as raw byte strings (integers must be 8-byte big-endian for increment operations).
Architecture: a front-end server pool routes requests to nodes, which hold pointers to the data stored on Colossus.
Tables are sharded into blocks of contiguous rows, called tablets.
Understanding Cloud Bigtable performance; choosing a row key; designing a Cloud Bigtable schema; designing for time series and avoiding hotspotting.
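Row-key design is the main lever against hotspotting: keys sort lexicographically, so a key that starts with a plain timestamp piles all current writes onto one tablet. A common pattern, sketched below with made-up field names, is to prefix a spreading identifier and reverse the timestamp so the newest rows sort first:

```python
# Hypothetical time-series row key for Cloud Bigtable.
# Prefixing the device id spreads writes across tablets; subtracting the
# timestamp from a far-future constant makes newer rows sort first.
MAX_SECONDS = 10_000_000_000  # arbitrary far-future cutoff (~year 2286)

def row_key(device_id: str, unix_seconds: int) -> str:
    reversed_ts = MAX_SECONDS - unix_seconds
    # Zero-pad so the string ordering matches the numeric ordering.
    return f"{device_id}#{reversed_ts:011d}"

older = row_key("sensor-42", 1_700_000_000)
newer = row_key("sensor-42", 1_700_000_060)
assert newer < older   # within a device, the newest reading sorts first
```

This keeps "latest N readings for a device" a cheap prefix scan, at the cost of making cross-device time-range scans harder, which is the usual schema trade-off in Bigtable.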
Cloud Spanner: a fully managed, mission-critical, relational database. SQL (ANSI 2011 with extensions) and automatic, synchronous replication for HA.
Replication is globally synchronous, so reads always see the most up-to-date data. Three replica types: read-write, read-only, and witness.
Cloud Spanner divides your data into chunks called “splits”, where individual splits can move independently from each other and get assigned to different servers, which can be in different physical locations.
Read-only replicas are only used in multi-region instances, are not eligible to become a leader and may serve stale reads without needing a round-trip to the default leader region.
Regional configurations contain exactly three read-write replicas.
Multi-region configurations contain more replicas, but not all of them have voting rights. Witness replicas vote on commits but do not hold a full copy of the data.
Study Material
Links may not work if you are not enrolled in Coursera.
AWS Storage short and free
Demos Videos
Don’t get scared: many videos last just one minute, each only a little demo. If you are in a hurry, they can replace the labs.
Uploading Files and Folders to Google Cloud Storage
Google Cloud Storage: Massive Scalability Plus More | Google Cloud Labs
How to Create a Bucket with the Google Cloud Storage Browser
Using Google Cloud SQL with Compute Engine
Getting Started with Cloud SQL for MySQL
Getting Started with Cloud SQL for PostgreSQL
Connect to Google Cloud SQL on the Command Line
Connecting to Google Cloud SQL with the Cloud SQL Proxy
PostgreSQL instance with Cloud Launcher
Deploying Microsoft SQL Server to Google Compute Engine
Getting Started: Cloud Spanner
First Steps with Google Cloud Spanner
Labs Qwiklabs
Migrate a MySQL Database to Google Cloud SQL
Loading Data into Google Cloud SQL
Cloud SQL for PostgreSQL: Qwik Start
Cloud Spanner: Qwik Start
Bigtable: Qwik Start — Command Line