Google Cloud Certified Professional Data Engineer

Author: Google Cloud

The foundation of Professional Data Engineer mastery is the real-world job role of the cloud data engineer. Along with relevant experience, the training in this learning path...

What you will learn:

  • Design and build data processing systems on Google Cloud
  • Lift and shift your existing Hadoop workloads to the Cloud using Cloud Dataproc
  • Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow
  • Manage your data pipelines with Data Fusion and Cloud Composer
  • Derive business insights from extremely large datasets using BigQuery
  • Use pre-built ML APIs
  • Enable instant insights from streaming data

Prerequisites

Learners should be familiar with the fundamentals of cloud computing and have relevant practical experience. Before attempting the Professional Data Engineer exam, 3+ years of industry experience is recommended, including 1+ years designing and managing solutions using Google Cloud.

Preparing for the Google Cloud Professional Data Engineer (PDE) Exam

Along with relevant experience, the training in this learning path will help you prepare for the Professional Data Engineer (PDE) exam, better understand the areas covered by the exam, and navigate the recommended resources: https://cloud.google.com/certification/data-engineer.

To learn more about the official Google Cloud certification exam and to register for it, visit cloud.google.com/certification/data-engineer.

Google Cloud Platform Big Data and Machine Learning Fundamentals

by Google Cloud

Jun 29, 2020 / 4h 55m

Description

This 1-week accelerated on-demand course introduces participants to the Big Data and Machine Learning capabilities of Google Cloud Platform (GCP). It provides a quick overview of Google Cloud Platform and a deeper dive into its data processing capabilities.

Table of contents
  1. Course Overview
  2. Introduction to Google Cloud Platform
  3. Recommending Products using Cloud SQL and Spark
  4. Predict Visitor Purchases Using BigQuery ML
  5. Create Streaming Data Pipelines with Cloud Pub/Sub and Cloud Dataflow
  6. Classify Images with Pre-Built Models using Vision API and Cloud AutoML
  7. Summary

Modernizing Data Lakes and Data Warehouses with GCP

by Google Cloud

Jan 14, 2020 / 3h 34m

Description

The two key components of any data pipeline are data lakes and data warehouses. This course highlights use cases for each type of storage and dives into the available data lake and data warehouse solutions on Google Cloud Platform in technical detail. The course also describes the role of a data engineer and the benefits of a successful data pipeline to business operations, and examines why data engineering should be done in a cloud environment. Learners will get hands-on experience with data lakes and warehouses on Google Cloud Platform using QwikLabs.

Table of contents
  1. Introduction
  2. Introduction to Data Engineering
  3. Building a Data Lake
  4. Building a Data Warehouse
  5. Summary

Building Batch Data Pipelines on GCP

by Google Cloud

Jan 14, 2020 / 2h 42m

Description

Data pipelines typically fall under one of the Extract-Load, Extract-Load-Transform, or Extract-Transform-Load paradigms. This course describes which paradigm to use, and when, for batch data. The course also covers several technologies on Google Cloud Platform for data transformation, including BigQuery, executing Spark on Cloud Dataproc, pipeline graphs in Cloud Data Fusion, and serverless data processing with Cloud Dataflow. Learners will get hands-on experience building data pipeline components on Google Cloud Platform using QwikLabs.
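
As a rough illustration of how two of these paradigms differ, the sketch below contrasts Extract-Transform-Load (transform in the pipeline, then load) with Extract-Load-Transform (load raw data, then transform with SQL inside the store). It is not taken from the course: an in-memory SQLite database stands in for a data warehouse, and the sample rows, table names, and function names are all hypothetical. Extract-Load would simply be the load step with no transformation at all.

```python
import sqlite3

# Hypothetical raw records standing in for a batch source file.
RAW_ROWS = [("alice", " 10 "), ("bob", "n/a"), ("carol", "25")]


def clean(rows):
    """Transform step: keep rows whose amount is numeric and cast it to int."""
    return [
        (name, int(amount.strip()))
        for name, amount in rows
        if amount.strip().isdigit()
    ]


def run_etl(conn):
    """Extract-Transform-Load: transform in the pipeline, then load the result."""
    conn.execute("CREATE TABLE etl_sales (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO etl_sales VALUES (?, ?)", clean(RAW_ROWS))
    return conn.execute(
        "SELECT name, amount FROM etl_sales ORDER BY name"
    ).fetchall()


def run_elt(conn):
    """Extract-Load-Transform: load raw data first, then transform with SQL."""
    conn.execute("CREATE TABLE raw_sales (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?)", RAW_ROWS)
    # The transformation runs inside the store (SQLite here, as a stand-in).
    conn.execute(
        "CREATE TABLE elt_sales AS "
        "SELECT name, CAST(TRIM(amount) AS INTEGER) AS amount "
        "FROM raw_sales "
        "WHERE TRIM(amount) <> '' AND TRIM(amount) NOT GLOB '*[^0-9]*'"
    )
    return conn.execute(
        "SELECT name, amount FROM elt_sales ORDER BY name"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(run_etl(conn))
    print(run_elt(conn))
```

Both functions end with the same two cleaned rows; the difference is only where the transformation runs, which is the kind of trade-off the paradigm choice is about.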

Table of contents
  1. Introduction
  2. Introduction to Batch Data Pipelines
  3. Executing Spark on Cloud Dataproc
  4. Manage Data Pipelines with Cloud Data Fusion and Cloud Composer
  5. Serverless Data Processing with Cloud Dataflow
  6. Summary

Building Resilient Streaming Analytics Systems on GCP

by Google Cloud

Jan 14, 2020 / 3h 11m

Description

Processing streaming data is becoming increasingly popular as streaming enables businesses to get real-time metrics on business operations. This course covers how to build streaming data pipelines on Google Cloud Platform. Cloud Pub/Sub is described for handling incoming streaming data. The course also covers how to apply aggregations and transformations to streaming data using Cloud Dataflow, and how to store processed records to BigQuery or Cloud Bigtable for analysis. Learners will get hands-on experience building streaming data pipeline components on Google Cloud Platform using QwikLabs.
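
One common way to aggregate streaming data is to group events into fixed-size time windows. The sketch below shows that idea in plain Python only; it does not use the Cloud Dataflow or Cloud Pub/Sub APIs, and the event layout `(timestamp, key, value)`, the function name, and the window size are assumptions made for illustration.

```python
from collections import defaultdict


def fixed_window_sums(events, window_seconds):
    """Sum values per (window_start, key) for (timestamp, key, value) events.

    Timestamps are whole seconds. Events may arrive out of order; each one is
    assigned to the window that contains its timestamp.
    """
    sums = defaultdict(int)
    for timestamp, key, value in events:
        # Round the timestamp down to the start of its window.
        window_start = timestamp - (timestamp % window_seconds)
        sums[(window_start, key)] += value
    return dict(sums)


if __name__ == "__main__":
    # Hypothetical click counts per page, with one late-arriving event.
    sample = [(0, "home", 1), (30, "home", 2), (61, "home", 5), (59, "cart", 7)]
    for (start, key), total in sorted(fixed_window_sums(sample, 60).items()):
        print(start, key, total)
```

A real streaming system also has to decide when a window is complete and how to treat very late data, which this sketch leaves out.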

Table of contents
  1. Introduction
  2. Introduction to Processing Streaming Data
  3. Serverless Messaging with Cloud Pub/Sub
  4. Cloud Dataflow Streaming Features
  5. High-Throughput BigQuery and Bigtable Streaming Features
  6. Advanced BigQuery Functionality and Performance
  7. Summary

Smart Analytics, Machine Learning, and AI on GCP

by Google Cloud

Jan 14, 2020 / 1h 39m

Description

Incorporating machine learning into data pipelines increases the ability of businesses to extract insights from their data. This course covers several ways machine learning can be included in data pipelines on Google Cloud Platform, depending on the level of customization required. For little to no customization, this course covers AutoML. For more tailored machine learning capabilities, this course introduces AI Platform Notebooks and BigQuery Machine Learning. The course also covers how to productionize machine learning solutions using Kubeflow. Learners will get hands-on experience building machine learning models on Google Cloud Platform using QwikLabs.

Table of contents
  1. Introduction
  2. Introduction to Analytics and AI
  3. Prebuilt ML model APIs for Unstructured Data
  4. Big Data Analytics with Cloud AI Platform Notebooks
  5. Productionizing Custom ML Models
  6. Custom Model Building with SQL in BigQuery ML
  7. Custom Model Building with Cloud AutoML
  8. Summary

Preparing for the Google Cloud Professional Data Engineer Exam

by Google Cloud

Apr 10, 2020 / 2h 25m

Description

This course helps prospective candidates structure their preparation for the Professional Data Engineer exam. The session covers the structure and format of the examination, as well as its relationship to other Google Cloud certifications. Through lectures, quizzes, and discussions, candidates will familiarize themselves with the domain covered by the examination, to help them devise a preparation strategy. The course also rehearses useful skills, including exam question reasoning and case comprehension, offers tips, and reviews topics from the Data Engineering curriculum.

Table of contents
  1. Welcome to Preparing for the Professional Data Engineer Exam
  2. Designing Data Processing Systems
  3. Building and Operationalizing Data Processing Systems
  4. Operationalizing Machine Learning Models
  5. Reliability, Policy, and Security to Ensure Solution Quality
  6. Resources and Next Steps