Building Data Pipelines in Microsoft Azure

Authors: Michael Bender, Reza Salehi, Tim Warner, Marcelo Pastorino

In a world increasingly dominated by data, it’s more important than ever for data engineers and scientists to build data pipeline solutions that can support both traditional data...

What you will learn

  • How to use tools like Azure Databricks and HDInsight Kafka to implement data pipeline solutions
  • How to use Azure Stream Analytics to process and visualize live data
  • How to implement scalable and accurate batch processing in Microsoft Azure
  • How to use Azure Data Factory to construct ETL processes and data integration pipelines

Pre-requisites

This path is intended for learners who already are, or who want to skill up to become, data scientists and data engineers. Learners should already be familiar with fundamental data concepts and the Azure portal.

Beginner

The course in this section of the path teaches you to implement data pipeline solutions using Azure Databricks.

Implementing an Azure Databricks Environment in Microsoft Azure

by Michael Bender

May 27, 2020 / 2h 5m

Description

Every day, we have more and more data, and the problem is how to get it to the point where it can be used for business needs. In this course, Implementing an Azure Databricks Environment in Microsoft Azure, you will learn foundational knowledge and gain the ability to implement Azure Databricks for use by all your data consumers, such as business users and data scientists. First, you'll learn the basics of Azure Databricks and how to implement its components. Next, you will discover how to work with Azure Databricks during ETL (Extract, Transform, Load) operations. Then, you'll move on to performing batch scoring with machine learning models. Finally, you will explore how to work with streaming data from HDInsight Kafka. When you’re finished with this course, you will have the skills and knowledge of Azure Databricks needed to implement data pipeline solutions for your data consumers. Software required: Microsoft Azure Subscription.

Table of contents
  1. Course Overview
  2. Implementing an Azure Databricks Environment
  3. Performing ETL (Extract, Transform, Load) Operations with Azure Databricks
  4. Batch Scoring of Apache Spark ML Models with Azure Databricks
  5. Streaming HDInsight Kafka Data into Azure Databricks

Intermediate

These intermediate courses take you through some of the more intricate elements of building data pipelines in Microsoft Azure, including performing ETL and ELT batch data processing and using Azure Stream Analytics to process live data. Once you fully comprehend these topics, you’ll be ready to move on to the advanced course.

Building Streaming Data Pipelines in Microsoft Azure

by Reza Salehi

Jun 30, 2020 / 3h 2m

Description

Processing live data streams in real time can be challenging and expensive. In this course, Building Streaming Data Pipelines in Microsoft Azure, you will gain the ability to effectively use Azure Stream Analytics for your live data processing needs. First, you will learn to configure stream and reference inputs for the service. Next, you will discover how to process your data using the Stream Analytics Query Language. Finally, you will explore how to visualize Azure Stream Analytics output with Microsoft Power BI. When you are finished with this course, you will have the skills and knowledge of Azure Stream Analytics needed to turn your live stream data into meaningful, actionable information.

Table of contents
  1. Course Overview
  2. Azure Stream Analytics Overview
  3. Configure Azure Stream Analytics with Event Hub and Blob Storage Inputs
  4. Query Data Using Azure Stream Analytics
  5. Implement Azure Stream Analytics Data Visualization with Power BI
  6. Anomaly Detection in Azure Stream Analytics
  7. Analyzing Stream Data Using Azure Data Explorer

Building Batch Data Processing Solutions in Microsoft Azure

by Tim Warner

Jun 15, 2020 / 1h 45m

Description

Long-running batch data processing can be difficult to manage locally. Why not use Microsoft Azure? In this course, Building Batch Data Processing Solutions in Microsoft Azure, you’ll gain the ability to perform high-scale ETL and ELT operations entirely in the Azure public cloud. First, you’ll explore hosted Apache Spark processing with Azure Databricks. Next, you’ll discover how to transfer data in bulk with Azure Data Factory and Azure Data Explorer. Finally, you’ll learn how to handle stream data processing jobs with Azure Stream Analytics. When you’re finished with this course, you’ll have the skills and knowledge of batch data processing needed to earn your Azure Data Engineer certification and be productive with the Microsoft Azure Data Platform.

Table of contents
  1. Course Overview
  2. Develop a Batch Processing Solution with Azure Databricks
  3. Develop a Batch Processing Solution with Azure Synapse Analytics
  4. Develop a Batch Processing Solution with Azure Data Explorer
  5. Implement Event Processing with Azure Stream Analytics

Advanced

In the final course in this path, you’ll learn advanced topics such as migrating data from on-premises systems and AWS to Azure, constructing ETL processes and data integration pipelines with Azure Data Factory, and creating real-time data pipelines.

Integrating Data in Microsoft Azure

by Marcelo Pastorino

Sep 17, 2019 / 2h 22m

Description

Data-driven decision making is the path to business success. In this course, Integrating Data in Microsoft Azure, you will gain foundational knowledge to integrate data utilizing the power of Microsoft Azure.

First, you will learn how to migrate data from on-premises systems and Amazon Web Services to Azure. Next, you will discover how to easily construct ETL processes and create data integration pipelines using Azure Data Factory.

Finally, you will explore how to create a real-time pipeline to ingest and process real-time events sent by IoT devices using Azure Event Hubs, Azure Stream Analytics, and Power BI.

When you’re finished with this course, you will have the skills and knowledge needed to create data integration pipelines using some of the great tools that are part of the Azure ecosystem.

Table of contents
  1. Course Overview
  2. Data Integration Services on Azure
  3. Migrate On-premise Data to Azure SQL Server
  4. Migrate Data from Amazon S3 to Azure Blob Storage
  5. Create Data Pipelines with Azure Data Factory Copy Data Tool
  6. Create Data Pipelines with Azure Data Factory
  7. Create Real-time Data Pipelines with Azure Event Hubs and Azure Stream Analytics
  8. Real-time Monitoring with Power BI