What you will learn:
- Identify the critical data roles in an organization
- Analyze large datasets with BigQuery in your lab
- Review how businesses use recommendation models
- Evaluate how and where you will compute and store your housing rental model results
- Analyze how running Hadoop in the cloud with Cloud Dataproc can enable scale
- Evaluate different approaches for storing recommendation data off-cluster
- Learn how BigQuery processes queries and stores data at scale
- Walk through key ML terms: features, labels, and training data
- Evaluate the different types of models for structured datasets
- Create custom ML models with BigQuery ML
- Identify modern data pipeline challenges and how to solve them at scale with Cloud Dataflow
- Design streaming pipelines with Apache Beam
- Evaluate how businesses use unstructured ML models and how the models work
- Choose between pre-built and custom approaches for machine learning models
- Create a high-performing custom image classification model with no code using Cloud AutoML
- Review the solution architectures you created using Google Cloud Platform big data tools
- Understand the role of a data engineer and benefits of data engineering on GCP
- Discuss the challenges of data engineering practice and how building data pipelines in the cloud helps address them
- Review and understand the purpose of a data lake versus a data warehouse, and when to use which
- Understand why Cloud Storage is a great option for building a data lake on GCP
- Understand why BigQuery is the scalable data warehousing solution on GCP
This section introduces participants to the big data and machine learning capabilities of Google Cloud Platform (GCP). It provides a quick overview of Google Cloud Platform and a deeper dive into its data processing capabilities.
This course introduces participants to the big data capabilities of Google Cloud. Through a combination of presentations, demos, and hands-on labs, participants get an overview of Google Cloud and a detailed view of the data processing and machine learning capabilities. This course showcases the ease, flexibility, and power of big data solutions on Google Cloud.
Table of contents
- Introduction to the Data and Machine Learning on Google Cloud Course
- Introduction to Google Cloud Platform
- Recommending Products using Cloud SQL and Spark
- Predict Visitor Purchases Using BigQuery ML
- Real-time IoT Dashboards with Pub/Sub, Dataflow, and Data Studio
- Deriving Insights from Unstructured Data using Machine Learning
In this section, we look at the common challenges data analysts face and how to solve them with the big data tools on Google Cloud Platform. You’ll pick up some SQL along the way and become very familiar with using BigQuery and Cloud Dataprep to analyze and transform your datasets.
By the end, you’ll be able to query and draw insight from millions of records in our BigQuery public datasets. You’ll learn how to assess the quality of your datasets and develop an automated data cleansing pipeline that will output to BigQuery. Lastly, you’ll get to practice writing and troubleshooting SQL on a real Google Analytics e-commerce dataset to drive marketing insights.
Table of contents
- Welcome to From Data to Insights with Google Cloud Platform - Exploring and Preparing your Data
- Introduction to Data on Google Cloud Platform
- Big Data Tools Overview
- Exploring your Data with SQL
- Google BigQuery Pricing
- Cleaning and Transforming your Data
This section covers the two key components of any data pipeline: data lakes and data warehouses. It highlights use cases for each type of storage and examines the data lake and warehouse solutions available on Google Cloud Platform in technical detail. It also describes the role of a data engineer, explains the benefits of a successful data pipeline to business operations, and examines why data engineering should be done in a cloud environment. Learners get hands-on experience with data lakes and warehouses on Google Cloud Platform using QwikLabs.
Table of contents
- Introduction to Data Engineering
- Building a Data Lake
- Building a Data Warehouse