Néstor Campos

Cloud Certifications: GC Professional Data Engineer

  • Jul 13, 2020
  • 8 Min read
  • 552 Views

Introduction

This guide provides information and resources to help you prepare for the Google Cloud (GC) Professional Data Engineer certification. It offers advice based on the experience of other certified data engineers and architects.

Who This Certification Is For

This certification and its exam are designed for data engineers with at least three years of experience using GC technologies for large-scale data processing, both in real time and in batch. Candidates must also be able to apply best practices for each technology and know how the technologies connect to one another to build scalable solutions.

This certification is recommended if you are a data engineer with experience delivering business intelligence, machine learning, or big data projects with GC, and you know the concepts behind technologies such as HDFS, Spark, Kafka, Hadoop, and stream processing.

What This Certification Is For

This certification is useful for many reasons, including:

  • Demonstrating your GC capabilities to solve real problems concerning massive data processing
  • Reinforcing your knowledge of certain aspects behind each technology
  • Understanding the path that the industry will follow in the coming years and how to focus your career
  • Increasing your recognition in the industry as you search for new job opportunities, given the high demand for professionals in this field and the value the market places on certification

Applicable Exams

For this certification, you only need to take the Professional Data Engineer exam. This exam assesses your ability to design data products, put them into production, and monitor and protect them so that they remain scalable and performant. It also measures your ability to design machine learning models using the range of options available in GC, including pre-trained models (such as the Vision or Speech APIs).

Your preparation for the exam should look like this:

  • Analyze and study the contents of the Pluralsight and GC courses.
  • After studying one of the technologies, practice with your own GC account by creating and deleting resources and connecting them with other GC services, so that you discover the steps and configurations involved. Hands-on practice is essential for this exam, as many questions can be answered only from experience.
  • If you can, you should take a practice exam. GC offers one here.
  • Implement a complete data solution in your GC account, working with both real-time and batch data.
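
For the last step, it helps to be clear about how batch and real-time processing differ before you build anything. The following is a minimal sketch in plain Python, not a GC service: the sample events, the function names, and the 60-second window are all assumptions made for illustration. It computes the same per-key totals once over a complete dataset (batch) and once per fixed time window as events arrive (streaming style).

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical events: (timestamp in seconds, sensor id, reading).
Event = Tuple[int, str, float]

EVENTS = [
    (0, "a", 1.0),
    (30, "b", 2.0),
    (65, "a", 3.0),
    (70, "a", 5.0),
    (125, "b", 4.0),
]


def batch_totals(events: Iterable[Event]) -> Dict[str, float]:
    """Batch style: the whole dataset is available, so aggregate it in one pass."""
    totals: Dict[str, float] = defaultdict(float)
    for _, key, value in events:
        totals[key] += value
    return dict(totals)


def windowed_totals(
    events: Iterable[Event], window_seconds: int
) -> Iterator[Tuple[int, Dict[str, float]]]:
    """Streaming style: emit one result per fixed window as events arrive.

    Assumes events arrive in timestamp order. Real streaming systems
    also have to deal with late and out-of-order data.
    """
    current_window = None
    totals: Dict[str, float] = defaultdict(float)
    for ts, key, value in events:
        # Start of the fixed window this event falls into.
        window_start = (ts // window_seconds) * window_seconds
        if current_window is not None and window_start != current_window:
            # The previous window is complete, so emit it and start a new one.
            yield current_window, dict(totals)
            totals = defaultdict(float)
        current_window = window_start
        totals[key] += value
    if current_window is not None:
        # Emit the final, still-open window when the input ends.
        yield current_window, dict(totals)


if __name__ == "__main__":
    print(batch_totals(EVENTS))
    for start, result in windowed_totals(EVENTS, 60):
        print(start, result)
```

In a real project on GC, a managed service such as Cloud Dataflow running an Apache Beam pipeline would typically take the place of these functions. The sketch is only meant to make the batch versus windowed distinction concrete before you practice with the actual services.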

Prerequisites

You can take this certification exam directly without taking any others, but do not take it if you have never had the opportunity to work with GC. You should also expect to study quite a few topics that you may never have worked on in practice, but that are still part of the technologies and standards covered.

Because this certification focuses very heavily on technical details, you should have at least five years of work experience with data solutions (ETL, machine learning, etc.), and two to three years of experience implementing projects with the GC technologies that you'll be tested on in this exam.

Skills Measured

Skills

The skills that will be measured during the exam may vary, but can include:

  • Understanding business requirements for implementing solutions
  • Collecting data in batch, real-time, and near-real-time processes
  • Online and batch predictions in machine learning
  • Data processing, both at the file and database levels (SQL and NoSQL)
  • Creating conversational experiences for users with machine learning
  • Information analysis and visualization
  • Implementing pre-built machine learning models and custom models
  • Applying security, encryption, and management to each part of the process
  • Notions of infrastructure, especially in a hybrid environment
  • Monitoring every process and data movement

In other words, you must understand, and be able to work across, the full life cycle of a data project.

Technologies

The technologies that you must understand and use to successfully complete the exam are:

  • BigQuery
  • Cloud Dataflow
  • Cloud Dataproc
  • Apache Beam
  • Apache Spark
  • Hadoop (and its ecosystem)
  • Cloud Pub/Sub
  • Apache Kafka
  • Cloud Composer
  • Data Transfer Service
  • Transfer Appliance
  • Cloud Networking
  • Cloud Bigtable
  • Cloud Spanner
  • Cloud SQL
  • Cloud Storage
  • Cloud Datastore
  • Cloud Memorystore
  • Cloud Dataprep
  • ML APIs (such as Vision, Speech)
  • AutoML Vision
  • AutoML Text
  • Dialogflow
  • Other ML technologies, such as Cloud Machine Learning Engine, BigQuery ML, Kubeflow, and Spark ML
  • Cloud IAM
  • Data Loss Prevention API
  • Stackdriver
  • Key management and encryption

As you can see, there are many technologies you'll need to understand, so experience with them and with how they connect to one another is effectively a prerequisite as well.

Resources

Pluralsight Courses

Pluralsight has very good courses on each of the technologies mentioned, developed by experts in the industry. All the content relevant to this exam can be found here.

GCP Pluralsight

It is also a good idea to follow Pluralsight's Data Engineer path when studying for this exam, because it covers many technologies in detail, similar to what you will see in the Professional Data Engineer exam.

Hands-on Practice

GC has a platform called Qwiklabs that offers guided, hands-on labs. These can help you get to know the technologies, but they are not a substitute for real project experience. You can find these exercises here.

Qwiklabs

Compensation and Employment Outlook

The benefits of obtaining this certification include:

  • Being able to participate in technically complex projects involving large-scale data processing
  • Recognition from the industry and your colleagues
  • Direct benefits from GC, such as being listed in the Google certificates directory where anyone can find you, badges to share with the community, and other recognitions and discounts
  • Qualification for better jobs, since the certification is valid anywhere in the world

According to Paysa, the average annual salary of someone with the GC Professional Data Engineer certificate is US$152,428, and it can be higher with additional experience and certifications.

The Certification Path

As mentioned, you can take this certification without having another one as a prerequisite. But if you have never taken a certification exam, I recommend that you start with a less complex exam (Foundation, for example) so that you become familiar with how these exams work.

Certification list

Conclusion

Finally, some advice for your certification plan for becoming a GC Professional Data Engineer:

  • It's good to learn the technologies, but you should also focus on how they connect to each other on a global, high-availability platform.
  • You should also study the best practices that GC suggests for its technologies and understand various use cases for them.
  • Be very disciplined in your study and don't assume that you know a technology well if you have not studied it, since there are always details that can deepen your understanding.
  • Be a multifaceted engineer: one with the skills to process data, build machine learning models, ensure high availability of your solutions, and understand infrastructure constraints, among others.

Achieving this certification is a firm step forward in your professional career, wherever you choose to develop it. Congratulate yourself for taking this step, and give it 100% to become a certified and recognized engineer.

I wish you a lot of success in becoming a GC Professional Data Engineer!
