Microsoft Azure Data Engineer (DP-200)

Authors: Gary Grudzinskas, Reza Salehi, John Savill, Michael Bender, Tim Warner, Marcelo Pastorino, Robert Lindley

Microsoft Azure offers a spread of services dedicated to addressing common business data engineering problems. This skill teaches how these Azure services work together to enable...

What you will learn

  • Selecting the appropriate Azure data service for your specific business needs
  • Implementing relational, non-relational, hybrid, and data warehouse solutions in Microsoft Azure
  • Deploying storage solutions in Microsoft Azure
  • Managing and monitoring Azure data services
  • Implementing and deploying data processing solutions in Microsoft Azure

Pre-requisites

This path is designed for learners familiar with common data engineering concepts, such as relational databases, data pipelines, and stream vs. batch processing. It is not assumed that the learner has experience applying these concepts in Microsoft Azure.

Implementing Storage Solutions in Microsoft Azure

The courses in this level teach you how to build storage solutions in Microsoft Azure that are appropriate for your business needs. Specifically, building data warehouses, NoSQL and relational databases, and hybrid data solutions are addressed.

Implementing a Cloud Data Warehouse in Microsoft Azure Synapse Analytics

by Gary Grudzinskas

Dec 13, 2019 / 1h 18m

Description

Microsoft Azure Synapse Analytics is a new environment that merges all the Azure data resources into one shared space. In this course, Implementing a Cloud Data Warehouse in Microsoft Azure Synapse Analytics, you'll focus on building an SQL data warehouse in Synapse. First, you'll learn what Azure Synapse Analytics does. Next, you'll discover how to deploy a data warehouse. Finally, you'll explore how to load data into your warehouse and work with it, saving money in the process. By the end of this course, you’ll have a working Synapse Analytics SQL data warehouse, loaded with data, that will let you work with all the new features.

Table of contents
  1. Course Overview
  2. Understanding Microsoft Azure Synapse Analytics
  3. Deploying a Data Warehouse in Microsoft Azure Synapse Analytics
  4. Tuning and Optimizing a Data Warehouse in Microsoft Azure Synapse Analytics

Implementing NoSQL Databases in Microsoft Azure

by Reza Salehi

Sep 10, 2019 / 3h 19m

Description

Discover the main Microsoft Azure managed NoSQL offerings. In this course, Implementing NoSQL Databases in Microsoft Azure, you will gain the ability to use Cosmos DB, Azure Table Storage and Azure Data Lake Gen2 for big data. First, you will learn about the cheap and fast Azure Table Storage NoSQL database. Next, you will discover the power of Azure Cosmos DB multi-model APIs for new and migrated NoSQL workloads. Finally, you will explore how to provision an Azure Data Lake Storage Gen2 instance and upload unstructured data to it. When you are finished with this course, you will have the skills and knowledge of managed NoSQL database offerings needed to start using NoSQL databases in Microsoft Azure.

Table of contents
  1. Course Overview
  2. An Introduction on NoSQL Databases
  3. Azure Storage Accounts - Table, Blob, Queue, and File
  4. Azure Cosmos DB Overview
  5. Working with Azure Cosmos DB - SQL (Core) API
  6. Working with Azure Cosmos DB - MongoDB API
  7. Working with Azure Cosmos DB - Table API
  8. Working with Azure Cosmos DB - Gremlin (Graph) API
  9. Working with Azure Cosmos DB - Cassandra API
  10. Working with Azure Data Lake Storage Gen2

Implementing a Relational Database in Microsoft Azure SQL Database

by Reza Salehi

Aug 8, 2019 / 2h 51m

Description

Azure SQL Database is Microsoft's main relational database offering in the cloud. Getting the most out of this sophisticated service can be a challenge. In this course, Implementing a Relational Database in Microsoft Azure SQL Database, you will gain the ability to quickly provision and configure your relational database in the cloud. First, you will learn which deployment option best suits your project. Next, you will discover the automatic backup and data sync features. Finally, you will explore how to create database scheduled jobs to meet your data management and other needs. When you are finished with this course, you will have the skills and knowledge of Azure SQL Database needed to deploy and run your relational databases in the cloud.

Table of contents
  1. Course Overview
  2. Getting Started
  3. Provisioning Azure SQL Database Single Database & Elastic Pool
  4. Provisioning Azure SQL Database Managed Instance
  5. Configuring Data Backup
  6. Configuring Elastic Database Jobs
  7. Configuring SQL Agent Jobs
  8. Managing Data Synchronization between Azure and SQL Server On-premises

Implementing Hybrid Data Solutions in Microsoft Azure

by John Savill

Nov 27, 2019 / 1h 30m

Description

Understanding the capabilities related to hybrid use for Azure data services can be overwhelming. In this course, Implementing Hybrid Data Solutions in Microsoft Azure, you will learn foundational knowledge of Azure data services use in a hybrid solution. First, you will learn core hybrid data considerations. Next, you will discover how to utilize Azure data services in a hybrid solution. Finally, you will explore how to optimize your data solution. When you’re finished with this course, you will have the skills and knowledge of hybrid data solutions needed to include Azure data services as part of a complete solution.

Table of contents
  1. Course Overview
  2. Understanding Hybrid Data Systems
  3. Implementing Hybrid Data Systems
  4. Optimizing and Maintaining a Hybrid Data Solution

Implementing Data Processing Solutions in Microsoft Azure

The courses in this section address building data processing solutions for specific types of business problems. For example, this section of the skill teaches how to move data across Azure services using Databricks, how to implement batch and streaming data processing solutions, and how to integrate your existing data pipelines into Microsoft Azure.

Implementing an Azure Databricks Environment in Microsoft Azure

by Michael Bender

Sep 17, 2019 / 2h 8m

Description

Every day, we have more and more data, and the problem is how to get to the point where we can use that data for business needs. In this course, Implementing an Azure Databricks Environment in Microsoft Azure, you will learn foundational knowledge and gain the ability to implement Azure Databricks for use by all your data consumers, such as business users and data scientists. First, you'll learn the basics of Azure Databricks and how to implement its components. Next, you will discover how to work with Azure Databricks during ETL (Extract, Transform, Load) operations. Then, you'll move on to performing batch scoring with machine learning models. Finally, you will explore how to work with streaming data from HDInsight Kafka. When you’re finished with this course, you will have the skills and knowledge of Azure Databricks needed to implement data pipeline solutions for your data consumers. Software required: Microsoft Azure Subscription

Table of contents
  1. Course Overview
  2. Implementing an Azure Databricks Environment
  3. Performing ETL (Extract, Transform, Load) Operations with Azure Databricks
  4. Batch Scoring of Apache Spark ML Models with Azure Databricks
  5. Streaming HDInsight Kafka Data into Azure Databricks

Building Batch Data Processing Solutions in Microsoft Azure

by Tim Warner

Aug 23, 2019 / 2h 34m

Description

How can you gain business insights from data lakes and data warehouses? How can you use Hadoop, Spark, and Databricks in Microsoft Azure? In this course, Building Batch Data Processing Solutions in Microsoft Azure, you will gain the ability to implement scalable, performant, and accurate batch processing in the Microsoft Azure cloud. First, you will learn how to run batch processing jobs in Azure SQL Data Warehouse. Next, you will discover how HDInsight enables cloud-hosted Hadoop clusters. Finally, you'll explore Apache Spark and Azure Databricks, and learn how to integrate them with other Azure products. When you are finished with this course, you will have the skills and knowledge of batch data processing needed to advance your career as a data engineer.

Table of contents
  1. Course Overview
  2. Developing Batch Processing Solutions with Azure SQL Data Warehouse
  3. Developing Batch Processing Solutions with Azure HDInsight
  4. Developing Batch Processing Solutions with Azure Databricks

Building Streaming Data Pipelines in Microsoft Azure

by Reza Salehi

Oct 23, 2019 / 1h 58m

Description

Processing live data streams in real time can be challenging and expensive. In this course, Building Streaming Data Pipelines in Microsoft Azure, you will gain the ability to effectively use Azure Stream Analytics for your live data processing needs. First, you will learn to configure stream and reference inputs for the service. Next, you will discover how to process your data using the Stream Analytics Query Language. Finally, you will explore how to visualize Azure Stream Analytics output with Microsoft Power BI. When you are finished with this course, you will have the skills and knowledge of Azure Stream Analytics needed to turn your live stream data into meaningful, actionable information.

Table of contents
  1. Course Overview
  2. Azure Stream Analytics Overview
  3. Configure Azure Stream Analytics with Event Hub and Blob Storage Inputs
  4. Query Data Using Azure Stream Analytics
  5. Implement Azure Stream Analytics Data Visualization with PowerBI

Integrating Data in Microsoft Azure

by Marcelo Pastorino

Sep 17, 2019 / 2h 23m

Description

Data-driven decision making is the path to business success. In this course, Integrating Data in Microsoft Azure, you will gain foundational knowledge to integrate data utilizing the power of Microsoft Azure. First, you will learn how to migrate data from on-premises systems and Amazon Web Services to Azure. Next, you will discover how to easily construct ETL processes and create data integration pipelines using Azure Data Factory. Finally, you will explore how to create a real-time pipeline to ingest and process real-time events sent by IoT devices using Azure Event Hubs, Azure Stream Analytics, and Power BI. When you’re finished with this course, you will have the skills and knowledge needed to create data integration pipelines using some of the great tools that are part of the Azure ecosystem.

Table of contents
  1. Course Overview
  2. Data Integration Services on Azure
  3. Migrate On-premise Data to Azure SQL Server
  4. Migrate Data from Amazon S3 to Azure Blob Storage
  5. Create Data Pipelines with Azure Data Factory Copy Data Tool
  6. Create Data Pipelines with Azure Data Factory
  7. Create Real-time Data Pipelines with Azure EventHubs and Azure Stream Analytics
  8. Real-time Monitoring with Power BI

Deploying Microsoft Azure Data Solutions

This section of the skill addresses deploying your storage and processing solutions to Microsoft Azure.

Deploying Microsoft Azure SQL Data Warehouse and Azure SQL Database

by Robert Lindley

Sep 17, 2019 / 2h 0m

Description

Agile methodologies have radically changed application development. Databases now need to continually evolve along with our applications, especially with the focus on highly iterative, microservice-driven architectures. In this course, Deploying Microsoft Azure SQL Data Warehouse and Azure SQL Database, you will learn foundational knowledge to create and deploy databases as part of a CI/CD process. First, you'll learn to create new Azure SQL Databases, Azure Data Warehouses, and Azure Data Factory resources along with creating ARM templates for each resource. Next, you’ll learn how to set up build and release pipelines with Azure DevOps to automatically provision new resources using ARM templates and deploy a data warehouse or SQL database using data-tier application packages (DACPAC). Finally, you'll explore how to migrate existing on-premises SQL Server Databases to Azure using Azure Migrate Service, or keep your on-premises databases automatically synchronized with Azure SQL Databases using Azure Data Sync. When you’re finished with this course, you will have the skills and knowledge to use Azure DevOps to deploy both Azure SQL Data Warehouses and Azure SQL Databases as part of your Agile processes.

Table of contents
  1. Course Overview
  2. Deploying the Modern Data Warehouse Environment
  3. Choosing a Code Branching Strategy
  4. Implementing a Data Warehouse Build and Release Pipeline
  5. Managing Hybrid Azure SQL Data Warehouse Solutions

Deploying Data Pipelines in Microsoft Azure

by Marcelo Pastorino

Dec 12, 2019 / 1h 29m

Description

Data engineers working with Azure Data Factory can take advantage of Continuous Integration and Continuous Delivery practices to deploy robust and well-tested data pipelines to production. In this course, Deploying Data Pipelines in Microsoft Azure, you will learn foundational knowledge to apply CI/CD methodologies to your data pipeline creation process. First, you will learn to create the right environments to fall into the pit of success when creating data pipelines in ADF. Next, you will discover how to deploy data pipelines using ADF visual tools and ARM templates. Finally, you will explore how to create a release pipeline in Azure DevOps to automate the deployment process between three distinct environments: development, staging, and production. When you are finished with this course, you will have the skills and knowledge to apply CI/CD practices to your data pipeline creation process, effortlessly.

Table of contents
  1. Course Overview
  2. Getting Started Deploying Data Pipelines in Azure
  3. Creating the Data Pipeline Deployment Infrastructure
  4. Creating Azure Data Factory Environments
  5. Integrating Azure Data Factory Pipelines with Source Control Using Azure DevOps
  6. Deploying Data Pipelines Using ARM Templates and Azure DevOps
  7. Implementing Continuous Integration and Delivery of Azure Data Factory Pipelines Using Azure DevOps

Deploying SQL Server Containers in Microsoft Azure

by John Savill

Sep 11, 2019 / 2h 18m

Description

At first glance, SQL Server and containers seem like complete opposites, but running SQL Server containers in Azure provides a number of business and technical benefits. In this course, Deploying SQL Server Containers in Microsoft Azure, you will gain the ability to design and deploy a container-based SQL Server deployment in Azure. First, you will learn core skills around containers and Kubernetes. Next, you will discover how to deploy SQL Server in a container. Finally, you will explore how to piece everything together by deploying SQL Server to container services hosted in Azure, while meeting key SQL Server requirements. When you’re finished with this course, you will have the skills and knowledge of deploying SQL Server containers in Azure needed to meet your business and technical goals.

Table of contents
  1. Course Overview
  2. SQL Server and Containers in Azure
  3. Running SQL Server in a Container and Kubernetes
  4. SQL Server and AKS Deployment

Monitoring Microsoft Azure Data Solutions

The final section of this skill covers monitoring your Microsoft Azure data environment performance. Specific attention is given to monitoring databases, monitoring data pipelines, configuring alerts, and optimizing your solutions.

Monitoring Microsoft Azure Data Storage

by John Savill

May 31, 2019 / 1h 8m

Description

Understanding the right ways to gain monitoring insight into Azure Storage services and HDInsight can be challenging. In this course, Monitoring Microsoft Azure Data Storage, you will learn foundational knowledge of monitoring these services in Azure. First, you will learn core skills around Azure Monitor. Next, you will discover Azure Storage-specific aspects of monitoring. Finally, you will explore how to monitor HDInsight, including using the open source monitoring components. When you’re finished with this course, you will have the skills and knowledge of monitoring needed to achieve all the insight required for Azure Storage and HDInsight.

Table of contents
  1. Course Overview
  2. Monitoring Key Concepts
  3. Implementing Azure Storage and Azure Data Lake Storage Monitoring
  4. Implementing HDInsight Monitoring

Microsoft Azure Database Monitoring Playbook

by John Savill

Jun 6, 2019 / 1h 26m

Description

Using database services in the cloud can initially present challenges as to the right way to monitor the deployments. In this course, Microsoft Azure Database Monitoring Playbook, you will gain the ability to implement and use monitoring to achieve the results required. First, you will learn key aspects of monitoring databases in Azure. Next, you will discover monitoring options for Azure SQL Database in the portal and beyond. Finally, you will explore how to monitor Azure Cosmos DB and use that monitoring to ensure proper data partitioning. When you’re finished with this course, you will have the skills and knowledge of monitoring Azure database offerings needed to ensure the smooth operation and use of Azure-based deployments.

Table of contents
  1. Course Overview
  2. Monitoring Key Concepts Refresher
  3. Implementing Monitoring for Azure SQL Database and Data Warehouse
  4. Implementing Cosmos DB Monitoring

Monitoring Microsoft Azure Data Pipelines and Processing

by John Savill

Jun 18, 2019 / 54m

Description

Organizations need to get business insights from multiple, disparate sources, and it is important that the pipelines and processes that enable this insight are healthy. In this course, Monitoring Microsoft Azure Data Pipelines and Processing, you will learn foundational knowledge of the core structure of data pipeline services and how to monitor them. First, you will examine the foundation of data pipelines and processes. Next, you will discover how to monitor Azure Data Factory as a control flow solution. Finally, you will explore how to monitor the HDInsight and Databricks data flow solutions. When you’re finished with this course, you will have the skills and knowledge of monitoring data pipelines and processes needed to ensure the health and accuracy of generated business insights.

Table of contents
  1. Course Overview
  2. Data Pipeline Exploration
  3. Implementing Monitoring for Data Factory
  4. Implementing HDInsight Monitoring

Microsoft Azure Alert Configuration Playbook

by John Savill

Jun 12, 2019 / 1h 6m

Description

Proactive notification of issues is critical for data services. In this course, Microsoft Azure Alert Configuration Playbook, you will learn foundational knowledge of alerting technologies and options available in Azure. First, you will examine how to use the centralized alerting in Azure Monitor. Next, you will discover additional alerting capabilities when utilizing Log Analytics-based queries. Finally, you will explore how to put these skills to use for specific data services. When you’re finished with this course, you will have the skills and knowledge of alerting in Azure needed to ensure the right level of proactive communications related to incidents for data services in Azure.

Table of contents
  1. Course Overview
  2. Azure Alerting Capabilities
  3. Configuring Alerts on Azure Data Services

Optimizing Microsoft Azure Data Solutions

by John Savill

Jul 17, 2019 / 2h 42m

Description

Getting the most out of cloud data service spend in terms of performance is critical for every organization. In this course, Optimizing Microsoft Azure Data Solutions, you will gain the ability to access and optimize a number of key data services in Azure. First, you will learn how to optimize Azure SQL Database and Azure SQL Data Warehouse. Next, you will discover the key optimization considerations related to Cosmos DB and the importance of partitioning data the right way. Finally, you will explore how to achieve maximum performance when using Azure Data Lake and a number of key analysis services. When you’re finished with this course, you will have the skills and knowledge of optimizing Azure data services needed to maximize your Azure data service deployments.

Table of contents
  1. Course Overview
  2. Core Optimization Concepts
  3. Optimizing Azure SQL Database and Data Warehouse
  4. Optimizing Azure Cosmos DB
  5. Optimization Considerations for Data Services