Building Your First ETL Pipeline Using Azure Databricks

In this course, you will learn about the Spark-based Azure Databricks platform, see how to set up the environment, quickly build the extract, transform, and load steps of your data pipelines, orchestrate them end-to-end, and run them automatically and reliably.

What you'll learn

With exponential growth in data volumes, a growing variety of data sources, demand for faster data processing, and dynamically changing business requirements, traditional ETL tools struggle to keep up with the needs of modern data pipelines. While Apache Spark is very popular for big data processing and can help overcome these challenges, managing the Spark environment is no cakewalk.

In this course, Building Your First ETL Pipeline Using Azure Databricks, you will gain the ability to use the Spark-based Databricks platform running on Microsoft Azure and leverage its features to quickly build and orchestrate an end-to-end ETL pipeline. Along the way, you will learn about the collaboration options and optimizations it brings, without having to worry about infrastructure management.

First, you will learn about the fundamentals of Spark, the Databricks platform and its features, and how it runs on Microsoft Azure.

Next, you will discover how to set up the environment, including the workspace, clusters, and security, and build each of the extract, transform, and load phases separately to implement the dimensional model.

Finally, you will explore how to orchestrate the pipeline using Databricks jobs and Azure Data Factory, followed by other features, like Databricks APIs and Delta Lake, that help you build automated and reliable data pipelines.

When you’re finished with this course, you will have the skills and knowledge of the Azure Databricks platform needed to build and orchestrate an end-to-end ETL pipeline.

Course Overview

1min

Course Overview 2m

Getting Started with Azure Databricks

41mins

Setting up Your Databricks Environment

22mins

Module Overview 1m
Setting up Workspace 3m
Creating Cluster 8m
Working with Notebook 3m
Configuring Security 3m
Scenario Walkthrough 3m
Summary 1m

Extracting Data from Multiple Sources

16mins

Module Overview 1m
Extracting from Azure Storage Services 7m
Reading Multiple File Formats 4m
Applying Schemas 3m
Summary 1m

Transforming and Cleaning Data

30mins

Module Overview 2m
Understanding Common Transformations 3m
Analyzing and Cleaning Data 6m
Applying Transformations 9m
Working with Spark SQL 5m
Handling Corrupt Data 4m
Summary 2m

Loading Data

17mins

Module Overview 1m
Loading to Files 9m
Working with Databricks Tables 6m
Summary 2m

Orchestrating ETL Pipeline

15mins

Module Overview 1m
Setting up Workflow 6m
Scheduling with Databricks Jobs 3m
Orchestrating with Azure Data Factory 4m
Summary 2m

Building Better Pipelines on Databricks

14mins

Module Overview 1m
Using Databricks APIs 3m
Understanding Delta Lake 8m
Summary 2m

About the author

Mohit Batra

Mohit is a Data Engineer, a Microsoft Certified Trainer (MCT), and a consultant. Mohit has 15+ years of extensive experience in architecting large-scale Business Intelligence, Data Warehousing, and Big Data solutions with companies like Microsoft and some leading investment banks. As an expert in his field, Mohit has often shared his knowledge of Azure, Spark, SQL Server, and Power BI at various public forums and as a corporate trainer. Mohit truly loves to teach and enjoys producing high-quality, ...
