Preparing for GCP Cloud Digital Leader Exam [Part 1]

Shravan Shenoy
6 min read · Jun 15, 2023

The Google Cloud Digital Leader certification is a popular exam among people who are new to Google Cloud Platform (GCP) and want to build a solid understanding of the core Google Cloud products and services and how they can benefit organizations.

If you are planning to take the exam, you have come to the right place!

In this series of articles, we will go through the key concepts from the Cloud Digital Leader learning path, followed by some practice questions to help you solidify your understanding.

This article covers Module 2 of the Cloud Digital Leader learning path, i.e. Innovating with Data and Google Cloud.

What topics will be covered?

  • First, we will cover Data Storage, i.e. how data is stored and the different storage options Google Cloud provides
  • Second, we will cover Getting Value from Data, i.e. the different BI and ML solutions provided by Google Cloud and how organizations can use them to generate business value

Data Storage

Data can be stored in the following places:
- Database: A collection of data, generally stored as tables. Databases efficiently ingest large amounts of real-time data and are used to support Online Transaction Processing (OLTP).
- Data warehouse: Combines disparate data sources so that multidimensional datasets can be analyzed. A data warehouse is a giant database that is optimized for analytics.
- Data lake: A centralized repository that stores raw data.

Different storage options for structured data

The two diagrams below give a good summary of the different storage options provided by GCP for structured data.

Source : https://k21academy.com/google-cloud/google-cloud-storage-and-database

Points to Note:

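  • Cloud Spanner vs Cloud SQL: Both are relational databases that support strongly consistent transactions. Cloud SQL is a managed MySQL, PostgreSQL, or SQL Server service, well suited to regional workloads that fit on a single primary instance (its read replicas are updated asynchronously and can lag slightly behind the primary). Cloud Spanner adds horizontal scaling across regions while keeping strong consistency, so reads reflect the latest committed state of the data even at global scale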
  • Bigtable vs Firestore: Bigtable suits extremely write-heavy workloads and massive volumes of data generated continuously over time, such as IoT device data, analytics (fraud detection, personalization, recommendations), and ad serving (where every microsecond counts). Firestore is generally used for mobile and web applications, since it is well integrated with the Firebase ecosystem
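The rules of thumb in this section can be written down as a small decision helper. This is only a study aid and a rough sketch: the `Workload` fields and the `suggest_database` function are hypothetical names made up for illustration, and real product selection involves many more factors (cost, latency, data model, existing tooling).

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an application's storage needs."""
    relational: bool           # needs SQL tables and transactions
    global_scale: bool         # must scale horizontally across regions
    write_heavy_streams: bool  # e.g. IoT sensor or ad-serving event streams
    mobile_or_web_app: bool    # client apps built on the Firebase ecosystem


def suggest_database(w: Workload) -> str:
    """Return a candidate product using the rules of thumb from this section."""
    if w.relational:
        # Both options are relational; Spanner is the one that scales globally.
        return "Cloud Spanner" if w.global_scale else "Cloud SQL"
    if w.write_heavy_streams:
        return "Cloud Bigtable"
    if w.mobile_or_web_app:
        return "Firestore"
    return "no clear match; review requirements"


# Example: a worldwide payments system that needs consistent transactions.
print(suggest_database(Workload(True, True, False, False)))
```

Exam questions often hinge on exactly one of these signals (global relational scale, heavy sensor writes, mobile apps), so it helps to spot the keyword first.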

Different storage options for unstructured data

Cloud Storage is used to store unstructured data, especially large files. Different storage classes are available based on how often the data is accessed: Standard (for hot data), Nearline (data accessed or modified about once a month or less), Coldline (about once every 3 months or less), and Archive (about once a year or less).
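To make the storage classes easier to remember, here is a minimal sketch that maps an expected access interval to a class. The function name `suggest_storage_class` is hypothetical, and the day thresholds (30, 90, 365) are an assumption that approximates "a month", "3 months" and "a year" from the description above.

```python
def suggest_storage_class(days_between_accesses: int) -> str:
    """Pick a Cloud Storage class from how often the data is expected to be read.

    Thresholds are approximations of the access frequencies described above.
    """
    if days_between_accesses < 30:
        return "Standard"   # hot data, accessed frequently
    if days_between_accesses < 90:
        return "Nearline"   # roughly once a month or less
    if days_between_accesses < 365:
        return "Coldline"   # roughly once a quarter or less
    return "Archive"        # roughly once a year or less


for days in (1, 45, 120, 400):
    print(days, suggest_storage_class(days))
```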

Firestore and Cloud Bigtable can also be used to store unstructured data, along with structured and semi-structured data.

Getting Value from Data

To get value from data, it is important to ensure data integrity. This can be done by:
- Implementing a set of rules when the database is designed
- Running error checking and validation routines as data is collected
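As an illustration of the second point, a validation routine can be as simple as a function that checks each incoming record before it is stored. The sketch below is not tied to any Google Cloud product; the field names (`device_id`, `timestamp`, `temperature_c`) and the plausible temperature range are assumptions chosen for the example.

```python
from datetime import datetime

# Hypothetical schema for an incoming sensor reading.
REQUIRED_FIELDS = ("device_id", "timestamp", "temperature_c")


def validate_reading(record: dict) -> list:
    """Return a list of problems found in the record (empty if it is valid)."""
    errors = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            errors.append(f"missing field: {field}")

    timestamp = record.get("timestamp")
    if timestamp:
        try:
            datetime.fromisoformat(timestamp)
        except (TypeError, ValueError):
            errors.append("timestamp is not ISO 8601")

    temperature = record.get("temperature_c")
    if temperature is not None and temperature != "":
        if not isinstance(temperature, (int, float)):
            errors.append("temperature_c is not a number")
        elif not -90 <= temperature <= 60:  # assumed plausible range
            errors.append("temperature_c out of plausible range")
    return errors


good = {"device_id": "d-1", "timestamp": "2023-06-15T10:00:00", "temperature_c": 21.5}
bad = {"device_id": "", "timestamp": "yesterday", "temperature_c": 500}
print(validate_reading(good))
print(validate_reading(bad))
```

Rejecting or flagging bad records at collection time is usually cheaper than cleaning them up after they have reached the warehouse.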

Characteristics of High Quality data
- Coverage, completeness, cleanliness

Google’s AI Platform

This is a platform which provides various ML services for data scientists and developers.

Google AI Platform (Source: https://www.forbes.com/sites/janakirammsv/2019/04/16/why-you-should-consider-google-ai-platform-for-your-machine-learning-projects/?sh=17229ce22ab7)

Some of the key components of the AI platform are:

  • AI Hub: Catalog of reusable plug-and-play AI models/components that can be quickly deployed to one of the execution environments of AI Platform
  • Deep Learning VM: Lets users quickly and easily instantiate a virtual machine (VM) image containing the most popular deep learning and machine learning frameworks on a Google Compute Engine instance (Compute Engine is one of the many Google Cloud compute options; refer to Part 2 for more details).

Other AI and ML Products

Pretrained APIs: Useful if you don’t have specialized data scientists but do have business analysts and developers. A low-effort approach, but less customizable.
Some of the APIs include:
- Vision API: offers pre-trained machine learning models, built on Google data, that automatically detect faces, objects, text, and even sentiment in images
- Natural Language API: discovers syntax, entities, and sentiment in text
- Dialogflow API: builds conversational interfaces (for example, chatbots and voice-powered apps)
- Recommendations AI: enables retailers to deliver highly personalized product recommendations
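To give a feel for how little effort a pretrained API needs, the sketch below builds the JSON body for a sentiment-analysis call to the Natural Language API's REST endpoint. It only constructs the request and does not send it; a real call also needs authentication (an API key or OAuth token), which is left out here. The helper name `build_sentiment_request` and the sample review text are made up for illustration.

```python
import json

# REST endpoint for sentiment analysis (shown for reference; no request is sent).
SENTIMENT_ENDPOINT = "https://language.googleapis.com/v1/documents:analyzeSentiment"


def build_sentiment_request(text: str) -> dict:
    """Build the JSON body for a sentiment request on plain text."""
    return {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }


body = build_sentiment_request("The delivery was fast and the product works great.")
print(json.dumps(body, indent=2))
```

No model training is involved: the developer sends text and receives a sentiment score back, which is what makes this the low-effort, less customizable option.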

Business solutions
Contact Center AI: Helps contact centers handle conversations with customers
Document AI: Extracts insights from documents
Cloud Talent Solution: Powers job search

Vertex AI
Vertex AI is a platform for creating custom end-to-end AI models

AutoML
Trains an ML model on your own data without writing code

TensorFlow
An open-source machine learning platform for building and deploying custom machine learning applications, including on Tensor Processing Units (TPUs)

The following flowchart gives a high-level idea of which service suits which kind of audience.

Points to Note

  • Vision API vs AutoML Vision: Vision API offers powerful pre-trained machine learning models through REST and RPC APIs. It can be used to detect objects and faces, read printed and handwritten text, and build valuable metadata into your image catalog. AutoML Vision, in contrast, automates the training of your own custom machine learning models on your own images
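For reference, this is roughly what a request to the pre-trained Vision API looks like over REST. The sketch only builds the JSON body for label detection and does not send it; authentication is omitted, and the bucket path `gs://example-bucket/cat.jpg` and the helper name `build_label_request` are hypothetical.

```python
import json

# REST endpoint for image annotation (shown for reference; no request is sent).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_uri: str, max_results: int = 5) -> dict:
    """Build the JSON body asking the Vision API to label one image."""
    return {
        "requests": [
            {
                "image": {"source": {"imageUri": image_uri}},
                "features": [
                    {"type": "LABEL_DETECTION", "maxResults": max_results}
                ],
            }
        ]
    }


body = build_label_request("gs://example-bucket/cat.jpg")
print(json.dumps(body, indent=2))
```

With AutoML Vision, by contrast, the first step would be uploading your own labeled images and training a model before any prediction request can be made.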

When to use ML

ML can be used in the following scenarios:
1. Replacing rule-based systems
2. Automating processes
3. Understanding unstructured data
4. Personalizing applications

Practice Questions

Note: Most of the questions are from https://www.examtopics.com/exams/google/cloud-digital-leader/ , so I will share the question number for each one so that you can verify the answer.

Question 1

Your organization is developing an application that will capture a large amount of data from millions of different sensor devices spread all around the world. Your organization needs a database that is suitable for worldwide, high-speed data storage of a large amount of unstructured data.
Which Google Cloud product should your organization choose?

· A. Firestore
· B. Cloud Data Fusion
· C. Cloud SQL
· D. Cloud Bigtable

Question 2

An organization is looking for a data warehouse for analysis and reporting with seamless scaling. Which Google Cloud product or service should the organization use?

· A. Cloud Storage
· B. Dataflow
· C. BigQuery
· D. Looker

Question 3

Your organization is developing an application that will manage payments and online bank accounts located around the world. The most critical requirement for your database is that each transaction is handled consistently. Your organization anticipates almost unlimited growth in the amount of data stored.
Which Google Cloud product should your organization choose?

· A. Cloud SQL
· B. Cloud Storage
· C. Firestore
· D. Cloud Spanner

Question 4

Your organization wants an economical solution to store data such as files, graphical images, and videos and to access and share them securely.
Which Google Cloud product or service should your organization use?

· A. Cloud Storage
· B. Cloud SQL
· C. Cloud Spanner
· D. BigQuery

Question 5

Your organization needs to categorize objects in a large group of static images using machine learning. Which Google Cloud product or service should your organization use?

· A. BigQuery ML
· B. AutoML Video Intelligence
· C. Cloud Vision API
· D. AutoML Tables

Question 6

An organization is searching for an open-source machine learning platform to build and deploy their own custom machine learning applications using TPUs.
Which Google Cloud product or service should the organization use?

· A. TensorFlow
· B. BigQuery ML
· C. Vision API
· D. AutoML Vision

Question 7

An organization needs to categorize text-based customer reviews on their website using a pre-trained machine learning model.
Which Google Cloud product or service should the organization use?

· A. Cloud Natural Language API
· B. Dialogflow
· C. Recommendations AI
· D. TensorFlow

Answers :

  1. Refer to Q20 of https://www.examtopics.com/exams/google/cloud-digital-leader/
  2. Refer to Q32 of the above site
  3. Refer to Q36 of the above site
  4. Refer to Q37 of the above site
  5. Refer to Q28 of the above site
  6. Refer to Q69 of the above site
  7. Refer to Q67 of the above site
