Choose the right database service for your application on GCP

Nirav Kothari
Published in GDGCloudMumbai
5 min read · Jun 23, 2020

Google Cloud Platform provides a number of services to store application data. The portfolio includes different database and data warehouse services to suit your application's requirements. Be it a database or a data warehouse, relational or NoSQL, document or columnar, scalable, highly available, low latency: GCP database services have a solution for all of these.

In this blog I will try to give insights on how you could go about choosing the right database service for your application. Often, multiple services can fulfill the primary database need of an application, and in such scenarios finding the most appropriate one can be a bit challenging. I hope this blog will help you overcome that challenge. In the previous blogs I shared my thoughts on:

Choose the right GCP compute service for your workload and

Choose the right storage for your apps in GCP

Cloud SQL:

It’s a fully managed RDBMS implementation on Google infrastructure. You can choose between MySQL, PostgreSQL and Microsoft SQL Server. Provisioning this service on GCP takes just a few clicks. Because it is fully managed, database provisioning, storage capacity management, updates and patches are handled automatically. It provides high reliability, high scalability, high security, automatic backups and point-in-time recovery.

Cloud SQL supports 2 types of replication: read replicas and failover replicas. Read replicas help absorb heavy read traffic, and a failover replica helps achieve high availability. Cloud SQL provides a 99.95% availability SLA.

Cloud SQL supports automatic storage scaling: it increases storage automatically when usage gets close to capacity. This spares application owners from estimating future storage needs and paying for unused capacity. An instance can scale vertically up to 30 TB of storage, 400 GB of RAM, 64 processor cores and up to 60,000 IOPS. These numbers keep changing, though.
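The automatic storage increase can be pictured as a simple threshold rule. The sketch below is only an illustration in plain Python: the 10% free-space threshold and the 10% growth step are assumed values, not Cloud SQL's actual policy, and `grow_storage_if_needed` is a hypothetical helper. Only the 30 TB ceiling comes from the figures quoted above.

```python
MAX_STORAGE_GB = 30 * 1024  # the 30 TB ceiling quoted above, expressed in GB


def grow_storage_if_needed(capacity_gb: float, used_gb: float,
                           free_threshold: float = 0.10,
                           growth_step: float = 0.10) -> float:
    """Return the capacity after one check (assumed policy, for illustration)."""
    free_fraction = (capacity_gb - used_gb) / capacity_gb
    if free_fraction >= free_threshold:
        return capacity_gb  # enough headroom, nothing to do
    # Grow by a fixed fraction of the current size, never past the ceiling.
    return min(capacity_gb * (1 + growth_step), MAX_STORAGE_GB)


capacity = 100.0
for used in (50.0, 85.0, 95.0, 105.0):
    capacity = grow_storage_if_needed(capacity, used)
    print(f"used={used:.0f} GB -> capacity={capacity:.0f} GB")
```

The point of the rule is that capacity follows actual usage, so you provision for today's data rather than for a guess about next year's.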

The downside of Cloud SQL is that it cannot scale horizontally for writes: read replicas spread read traffic, but all writes go to a single primary instance. It also needs a strict schema to store the data.

This service should be used when a relational database such as MySQL, PostgreSQL or Microsoft SQL Server is needed. Typical use cases are websites, blogs, CMS, ERP, CRM, eCommerce applications and OLTP workloads.

Cloud Spanner:

Cloud Spanner is another fully managed, relational database service. It implements a home-grown RDBMS built by Google. It is specifically designed to support horizontal scaling while keeping the benefits of a relational database and the SQL query language.

It provides strong consistency while scaling horizontally: a write is replicated synchronously, so once it is acknowledged, a read served by any node returns the latest data. This avoids data inconsistency issues.
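To make the idea of strong consistency concrete, here is a toy sketch in plain Python. It is not how Spanner is implemented (the real service relies on consensus-based replication, and the toy simply writes to every replica); the class names are hypothetical. It contrasts a store that acknowledges a write only after all replicas have it with one that replicates later.

```python
import random


class SyncReplicatedStore:
    """Toy store: a write is acknowledged only after every replica has it."""

    def __init__(self, replica_count: int = 3) -> None:
        self.replicas = [{} for _ in range(replica_count)]

    def write(self, key: str, value: str) -> None:
        # Apply the write to all replicas before returning to the caller.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key: str) -> str:
        # Any replica may serve the read; all of them hold the latest value.
        return random.choice(self.replicas)[key]


class AsyncReplicatedStore(SyncReplicatedStore):
    """Toy contrast: only the first replica is updated until sync() runs."""

    def write(self, key: str, value: str) -> None:
        self.replicas[0][key] = value  # acknowledged before replication

    def sync(self) -> None:
        for replica in self.replicas[1:]:
            replica.update(self.replicas[0])


strong = SyncReplicatedStore()
strong.write("balance", "100")
strong.write("balance", "80")
print({strong.read("balance") for _ in range(20)})  # always a single value

eventual = AsyncReplicatedStore()
eventual.write("balance", "100")
eventual.sync()
eventual.write("balance", "80")  # not yet replicated to the other replicas
print({replica.get("balance") for replica in eventual.replicas})
```

In the second store, a reader can see the old balance depending on which replica answers. That is the inconsistency a strongly consistent database rules out.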

It allows multi-regional instances to be created, which provides high availability (99.999%), high durability and low latency.
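Availability percentages are easier to compare when converted into allowed downtime. The short calculation below does that for the 99.999% figure above and the 99.95% figure quoted earlier for Cloud SQL. The 30-day month is an assumption made for the arithmetic, and `allowed_downtime_minutes` is a hypothetical helper name.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Maximum downtime, in minutes, that still meets the availability target."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - availability_percent) / 100


for service, sla in (("Cloud SQL", 99.95), ("Cloud Spanner multi-region", 99.999)):
    minutes = allowed_downtime_minutes(sla)
    print(f"{service}: {sla}% -> {minutes:.2f} minutes per 30 days")
```

Over 30 days, 99.95% allows roughly 21.6 minutes of downtime, while 99.999% allows about 26 seconds.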

It can be used in mission-critical, high-transaction applications where an RDBMS is needed along with scalability and consistency. Possible use cases are AdTech, financial services, global supply chain, retail etc.

Cloud Firestore / Datastore:

It’s a fully managed, serverless, schema-less NoSQL document database service for storing your application's non-relational data. It supports 2 modes: Native mode and Datastore mode. The service was earlier named Datastore, and GCP has since re-branded it under the Firestore title. The re-branded version offers a few advantages, such as strong consistency in a horizontally scaled instance and client library enhancements like live synchronization and offline support. This allows a Firestore instance to be accessed directly from web servers as well as mobile devices. It also supports ACID transactions.

It scales horizontally in and out automatically based on load. It also supports multi-region replication with strong consistency, for high availability, high durability, high performance and low latency. The service is also upgraded automatically without downtime.

Typical use cases include storing semi-structured and hierarchical data such as user profiles, application state, product catalogs etc. Also, if a mobile application needs to connect directly to the database, Firestore is a good choice.
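As a rough picture of what "schema-less" and "hierarchical" mean in practice, the sketch below models two user profile documents with plain Python dictionaries. Nothing here calls the Firestore API; the collection, the field names and the `get_field` helper are all hypothetical.

```python
# A "collection" of documents keyed by document ID. Plain dicts stand in for
# Firestore documents here; nothing below calls the Firestore API.
users = {
    "alice": {
        "name": "Alice",
        "preferences": {"theme": "dark", "language": "en"},  # nested map
        "devices": ["phone", "tablet"],                       # array field
    },
    "bob": {
        "name": "Bob",
        "loyalty_points": 120,  # a field that Alice's document does not have
    },
}


def get_field(document: dict, path: str, default=None):
    """Walk a dotted path such as 'preferences.theme' through nested maps."""
    current = document
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


print(get_field(users["alice"], "preferences.theme"))
print(get_field(users["bob"], "preferences.theme", default="light"))
```

The two documents live in the same collection but carry different fields, which a strict relational schema would not allow without altering the table.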

Cloud Bigtable:

It's a fully managed, scalable, NoSQL, wide-column database. It's a high-performance database service capable of handling millions of requests per second. High performance and low latency make it a good choice for use cases involving data analytics.

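The reason it can process queries quickly lies in how it lays out the data. Rows are kept sorted by row key and related columns are grouped into column families, whereas an RDBMS typically stores and reads each row as a whole. So a query that targets particular row keys and only the columns it needs reduces the data to be scanned by the database.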
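The effect of reading only the columns a query needs can be seen in a toy comparison. The sketch below is plain Python with made-up data and hypothetical function names, and it says nothing about how Bigtable's storage engine is actually implemented. It simply counts how many values each layout has to touch to total one column.

```python
rows = [
    {"user_id": i, "country": "IN", "clicks": i % 7, "bio": "x" * 50}
    for i in range(1000)
]

# Row-wise layout: every record is stored (and read) as a whole.
row_store = rows

# Column-wise layout: all values of one column are stored together.
column_store = {name: [row[name] for row in rows] for name in rows[0]}


def total_clicks_row_wise(store):
    """Sum 'clicks' by reading whole records; also count the values touched."""
    values_read = 0
    total = 0
    for record in store:
        values_read += len(record)  # the whole record is fetched
        total += record["clicks"]
    return total, values_read


def total_clicks_column_wise(store):
    """Sum 'clicks' by reading just that column; also count the values touched."""
    column = store["clicks"]  # only this column is fetched
    return sum(column), len(column)


print(total_clicks_row_wise(row_store))
print(total_clicks_column_wise(column_store))
```

Both functions return the same total, but the row-wise version touches four values per record while the column-wise version touches one.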

It scales automatically without downtime. An instance can be scaled to hundreds of nodes and replicated across multiple regions, to cater to high demand while also providing high availability.

Typical use cases are IoT, User personalization, AdTech, FinTech, recommendations etc.

Cloud BigQuery:

It’s a fully managed, highly scalable, low-cost data warehouse solution from GCP. It’s an enterprise data warehouse typically used to ingest data from various sources in streaming and batch fashion, and then run analytics and dashboards on top of it to help you make informed business decisions. It can store and analyze petabytes of data quickly, and it uses ANSI SQL as its query language. BigQuery can be integrated with multiple external sources, e.g. transactional databases like Cloud SQL, Spanner and Bigtable, Cloud Storage for the open-source Parquet and ORC file formats, or something as simple as spreadsheets. You do not need to move the data in order to process it.

It supports building ML models and applying them to stored data using SQL-like commands. This eliminates the need to export the data from BigQuery, which increases speed and aids HIPAA compliance.
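For a feel of what building and applying a model in SQL looks like, here is a hedged sketch of two BigQuery ML statements held in Python strings. The dataset, table and column names (`mydataset.sales`, `ad_spend` and so on) are hypothetical. The `run_query` helper assumes the `google-cloud-bigquery` package and valid credentials, so it is defined but not called here.

```python
# Train a linear regression model directly over a table (hypothetical names).
CREATE_MODEL_SQL = """
CREATE OR REPLACE MODEL `mydataset.sales_model`
OPTIONS (model_type = 'linear_reg', input_label_cols = ['revenue']) AS
SELECT ad_spend, store_visits, revenue
FROM `mydataset.sales`
"""

# Apply the trained model to new rows without exporting any data.
PREDICT_SQL = """
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.sales_model`,
  (SELECT ad_spend, store_visits FROM `mydataset.new_sales`)
)
"""


def run_query(sql: str):
    """Run a statement in BigQuery (needs google-cloud-bigquery and credentials)."""
    from google.cloud import bigquery  # imported lazily; not needed to read the SQL

    client = bigquery.Client()
    return client.query(sql).result()


print(CREATE_MODEL_SQL)
print(PREDICT_SQL)
```

Both training and prediction are ordinary queries, so the data never has to leave the warehouse.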

It is a serverless service that provides auto-scaling, high availability and low latency. Typical use cases include real-time analytics, predictive analytics etc.

Final Words:

These points will help you in deciding the right database service needed for your application. Finally, this decision tree can help you in decision making.
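The guidance from the sections above can also be written down as a small function. This is only one possible encoding of the points made in this blog, not a reproduction of any official decision tree; the function name and its parameters are hypothetical, and real decisions usually involve more factors such as cost and existing skills.

```python
def choose_database(relational: bool,
                    analytics_warehouse: bool = False,
                    needs_horizontal_scale: bool = False,
                    document_model: bool = False) -> str:
    """Suggest a GCP database service from a few yes/no answers."""
    if analytics_warehouse:
        # Analytics and dashboards over very large datasets.
        return "BigQuery"
    if relational:
        # Relational data: Spanner when horizontal scale is required.
        return "Cloud Spanner" if needs_horizontal_scale else "Cloud SQL"
    if document_model:
        # Semi-structured, hierarchical data or direct mobile access.
        return "Cloud Firestore"
    # High-throughput, low-latency NoSQL workloads such as IoT or AdTech.
    return "Cloud Bigtable"


print(choose_database(relational=True))
print(choose_database(relational=True, needs_horizontal_scale=True))
print(choose_database(relational=False, document_model=True))
print(choose_database(relational=False))
print(choose_database(relational=False, analytics_warehouse=True))
```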

