Building Batch Data Processing Solutions in Microsoft Azure
In this course, you will learn how to implement ETL and ELT batch data processing workflows by using Microsoft Azure products and partner technologies.
What you'll learn
How can you gain business insights from data lakes and data warehouses? How can you use Hadoop, Spark, and Databricks in Microsoft Azure? In this course, Building Batch Data Processing Solutions in Microsoft Azure, you will gain the ability to implement scalable, performant, and accurate batch processing in the Microsoft Azure cloud. First, you will learn how to run batch processing jobs in Azure SQL Data Warehouse. Next, you will discover how HDInsight enables cloud-hosted Hadoop clusters. Finally, you will explore Apache Spark and Azure Databricks, and learn how to integrate them with other Azure products. When you are finished with this course, you will have the batch data processing skills and knowledge needed to advance your career as a data engineer.
Table of contents
- Overview 3m
- Preliminary Terminology 7m
- About Batch Processing 3m
- Azure SQL DB vs. Azure SQL DW 4m
- Demo: Create an Azure Data Lake Storage Gen2 Account 3m
- Demo: Deploy Azure SQL Data Warehouse 4m
- Understand Data Lake Storage Gen2 3m
- PolyBase 1m
- Demo: Use PolyBase in Azure SQL DW 7m
- Data Analysis Options 2m
- Azure Data Factory 1m
- Demo: Perform ETL Operations with Azure Data Factory 12m
- For Further Learning 1m
- Summary 1m

- Overview 1m
- Introducing Apache Hadoop 4m
- The Hadoop Ecosystem 3m
- Hadoop vs. Traditional RDBMS 6m
- Azure HDInsight Architecture 3m
- Demo: Create an HDInsight Cluster 8m
- Azure Data Factory/HDInsight Integration 1m
- Demo: Ingest a Dataset into Data Lake Storage 4m
- Demo: Perform Data Extraction with Hive 5m
- Demo: Perform Data Transformation with Hive 2m
- Demo: Perform Data Loading with Sqoop 3m
- About Apache Spark 1m
- About Azure Databricks 2m
- Demo: Perform Data Visualization with HDInsight and Spark 8m
- For Further Learning 1m
- Summary 2m

- Overview 2m
- Understand Azure Databricks 2m
- MapReduce vs. Spark 3m
- The Azure Databricks Ecosystem 4m
- The Notebook Paradigm in Data Analysis 3m
- Demo: Deploy an Azure Databricks Service 5m
- Demo: Define a Cluster and Workspace 4m
- Demo: Perform ETL with Azure Databricks 8m
- About Azure Event Hub 3m
- Demo: Data Processing with Event Hub and Azure Databricks 7m
- About Azure Batch 2m
- Azure Distributed Data Engineering Toolkit 2m
- For Further Learning 1m
- Summary 2m