Get Certified: Google Cloud Platform Professional Data Engineer

Denys van Kempen
Sep 13, 2020

In this blog post series, you will find some personal suggestions about how to get certified on the Google Cloud Platform.

The Professional Data Engineer (PDE) exam is particularly challenging as it requires familiarity with both data processing and machine learning, including data warehousing (BigQuery), messaging (Pub/Sub), NoSQL (Cloud Bigtable), ETL (Dataproc, Dataflow, Dataprep), databases (Cloud SQL, Cloud Spanner), AI Platform, and the different Google ML technologies.

Any good? Post a comment, share on social media, and/or give a like. Thanks!

Previously on…

To host its platform-as-a-service, SAP partners with cloud providers like Google. We already covered the “Embrace Project” and SAP’s multicloud strategy in the previous blog (see blue box above), so there is no need to repeat ourselves here. Should you need an update on today’s multicloud world (Cloud Foundry, Kubernetes, Kyma, Hybrid, Neo, etc.), see

SAP on Google Cloud

For general information about running SAP on Google Cloud, there is some excellent material available in blog, video, and other formats:

Although not included in any certification (yet), the SAP on GCP Documentation provides a quickly growing library of guides for planning, deploying, and operating SAP solutions.

Google Cloud Certification Program

To promote its platform, Google has a number of programs that make it easier to get certified; see inthecloud.withgoogle.com.

For the argument why you might want to take up a certification (should you need any convincing, or want to get some funding from your manager), see Google Cloud Certified Program | Boost your career.

The Data Engineer exam is one of the professional-level exams, with 3+ years of industry experience recommended, including 1+ years on Google Cloud. The Machine Learning and Cloud DevOps exams were recently added (2020). For the program, see cloud.google.com/certification.

On the exam overview page, you find all the information you need about the exam, how to prepare, how to register, etc.: cloud.google.com/certification/data-engineer.

As with most other certifications these days, you no longer have to travel to a location but can take the exam online as well.

Exam Guide

The exam guide lists exactly the topics you need to know. This is an extensive and comprehensive list that you can take quite literally: you can expect questions about each of the points mentioned.

For the data engineer exam, the topics are:

  1. Designing data processing systems, incl. BigQuery, Cloud Composer, Cloud Dataflow, Cloud Dataproc, Apache Beam, Apache Spark and Hadoop ecosystem, Cloud Pub/Sub, Apache Kafka
  2. Building and operationalizing data processing systems, incl. Cloud Bigtable, Cloud Spanner, Cloud SQL, BigQuery, Cloud Storage, Cloud Datastore, Cloud Memorystore
  3. Operationalizing machine learning models, incl. Cloud Machine Learning Engine, BigQuery ML, Kubeflow, Spark ML, Dialogflow, and the ML APIs
  4. Ensuring solution quality, incl. IAM, Stackdriver

The practice exam (32 questions) gives a very accurate indication of the type of questions you can expect. Both the questions and the answers can be quite verbose and sometimes very similar in wording. You just might need the full 2 hours allocated for the real exam to answer the 55 questions.

Documentation

You will need to familiarise yourself with the documentation of the Google Cloud products covered in the exam, e.g. BigQuery, Bigtable, Cloud Composer, Dataproc, Dataflow, and Pub/Sub, as documented on Google Cloud.

As you can spend days just reading the docs for BigQuery alone, you will need to scan/browse the pages and make notes.

Online Courses

Coursera

Google training courses are available online/on-demand from Coursera. Currently, the first month of access is free, so if you carve out some time to prepare, this should give you a good head start. The training consists of 6 courses with an exam study guide to wrap it up.

Note that although these courses will help you understand what the material is about, they will not be enough to pass the exam. Complement them with documentation, trial exams, and other resources.

Not to be missed is the exam-prep course: Preparing for the Google Cloud Professional Data Engineer Exam

Pluralsight

Google courses are also available on Pluralsight. You can get a free 10-day trial when you sign up, but your company may also have a corporate subscription (like at SAP). The PDE path contains 18 hours of video content.

Linux Academy

The Linux Academy (now part of A Cloud Guru) also offers a complete course: Google Cloud Certified Professional Data Engineer. Again, with a free 7-day trial.

Google Machine Learning Crash Course

Although only a handful of questions in the exam are about machine learning (possibly fewer than before, as there is now also a Professional ML Engineer exam), you are expected to be quite familiar with the terminology and to know, for example, when to use L1 or L2 regularisation and how to recognise whether your model is overfitting or underfitting.
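To make the L1 versus L2 distinction a bit more concrete, here is a minimal sketch in plain Python. It is not taken from the crash course or from any Google library: the function names, the example weights, and the regularisation strength `lam` are all made up for illustration. It computes both penalties and applies one shrinkage step of each kind to the same weight vector.

```python
# Illustrative only: names, weights, and lam are hypothetical values.

def l1_penalty(weights, lam):
    """L1 penalty: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)

def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)

def l1_shrink(w, lam):
    """Soft-thresholding step for L1: weights within lam of zero become exactly 0."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0

def l2_shrink(w, lam):
    """Shrinkage step for the penalty lam * w**2: scales w towards zero, never to zero."""
    return w / (1.0 + 2.0 * lam)

weights = [3.0, -0.5, 0.2, -2.0]
lam = 0.5

l1_result = [l1_shrink(w, lam) for w in weights]
l2_result = [l2_shrink(w, lam) for w in weights]

print("L1 penalty:", l1_penalty(weights, lam))
print("L2 penalty:", l2_penalty(weights, lam))
print("after L1 step:", l1_result)  # small weights are set to exactly zero
print("after L2 step:", l2_result)  # every weight is scaled down, none is zeroed
```

The pattern to remember: L1 tends to produce sparse models by dropping weak features, while L2 keeps all features but makes their weights smaller. Both are tools against overfitting; if a model is underfitting, the usual move is to reduce the regularisation strength rather than to add more.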

Qwiklabs

An essential study aid is provided by the hands-on labs integrated into the courses provided by Google. There are well over 400 labs (no point doing them all) and close to 90 challenges (great learning resource) to test your knowledge. The labs provide free access to the Google Cloud Platform for the duration of the lab (30–120 minutes) and although you typically finish the exercises in a fraction of the time, if you pay close attention to what you are doing and why, you should be able to answer (almost) any exam question without problems.

You can sign up for free and, although most labs require credits, you get these for free as well when you accept the challenges posted by Qwiklabs on LinkedIn, Twitter, and Facebook.

Official Google Cloud Certified Study Guides

Highly recommended reading is the official study guide by Dan Sullivan. It is well-written and comprehensive, and gives you a good overall idea of all the material you need to study. As mentioned by Dan in the introduction, the book alone will not suffice to pass the exam, but it certainly is a great reference and resource. The book includes two sample exams very similar to the real exam. You can get the book on Amazon or from the publisher, Wiley.

Dan Sullivan has also published a course on Udemy with 6 hours of video and another test exam: Google Cloud Professional Data Engineer: Get Certified 2020

Free Tier and Free Products

To help you prepare for the exam, you can make use of the $300 free credit. In addition, there are also some products which are always free (up to monthly limits).

With normal usage, this is enough for your exam training.

Certified Directory

Once you get your certification(s), you can choose to be listed in the Google Cloud Credential Holder Directory.

Ten months ago there were 5,600 credential holders. Now, at the time of writing, this has nearly tripled to 15,400 (including 3,700 data engineers).

Wait, There is More

For all GCP products, see

See also Awesome GCP Certifications on GitHub.

Post a comment, Share on Social Media, Like

Any good? Post a comment, share on social media, and/or give a like. Thanks.

If you would like to receive updates, connect with me on

Best,

Denys van Kempen
