Course

Conceptualizing the Processing Model for Azure Databricks Service

In this course, you will learn about the Spark based Azure Databricks platform. You will see how Spark Structured Streaming processing model works, and then use it to build end-to-end production ready streaming pipeline on Azure Databricks platform.

Intermediate

2h 51m

(52)

Created by Mohit Batra

Last Updated Jul 21, 2020

Get started today

Access this course and other top-rated tech content with one of our business plans.

Start a free team trial

Buy now

Try this course for free

Access this course and other top-rated tech content with one of our individual plans.

Start a free trial

Buy now

This course is included in the libraries shown below:

Cloud
Data

Course

Conceptualizing the Processing Model for Azure Databricks Service

Intermediate

2h 51m

(52)

Created by Mohit Batra

Last Updated Jul 21, 2020

Get started today

Access this course and other top-rated tech content with one of our business plans.

Start a free team trial

Buy now

Try this course for free

Access this course and other top-rated tech content with one of our individual plans.

Start a free trial

Buy now

This course is included in the libraries shown below:

Cloud
Data

What you'll learn

Modern data pipelines often include streaming data, that needs to be processed in real-time. While Apache Spark is very popular for big data processing and can help us build reliable streaming pipelines, managing the Spark environment is no cakewalk.

In this course, Conceptualizing the Processing Model for Azure Databricks Service, you will learn how to use Spark Structured Streaming on Databricks platform, which is running on Microsoft Azure, and leverage its features to build an end-to-end streaming pipeline quickly and reliably. And all this while learning about collaboration options and optimizations that it brings, but without worrying about the infrastructure management.

First, you will learn about the processing model of Spark Structured Streaming, about the Databricks platform and features, and how it is runs on Microsoft Azure.

Next, you will see how to setup the environment, like workspace, clusters, and security; configure streaming sources and sinks, and see how Structured Streaming fault tolerance works.

Followed by this, you will learn how to build each phase of streaming pipeline, by extracting the data from source, transforming it, and loading it in a sink. And then make it production ready, and run it using Databricks jobs.

You will also see, how to customize the cluster using Initialization scripts and Docker containers, to suit your business requirements.

Finally, you will explore other aspects. You will see what are the different workloads available, and how pricing works. We will also talk about best practices, in terms of development, performance, stability and cost. And lastly, you will see how Spark Structured Streaming on Azure Databricks compares to other managed services, like Flink on AWS, Azure Stream Analytics, Beam on Google Cloud etc.

By the end of this course, you will have the skills and knowledge of Azure Databricks platform needed to build an end-to-end streaming pipeline, using Spark Structured streaming.

Conceptualizing the Processing Model for Azure Databricks Service

Intermediate

2h 51m

(52)

Table of contents

Course Overview | 1m 38s

About the author

Mohit Batra

14 courses

4.7 author rating

871 ratings

Mohit is a Data Engineer, a Microsoft Certified Trainer (MCT) and a consultant. Mohit has 15+ years of extensive experience in architecting large scale Business Intelligence, Data Warehousing and Big Data solutions with companies like Microsoft and some leading investment banks.

More Courses by Mohit

Conceptualizing the Processing Model for Azure Databricks Service

Conceptualizing the Processing Model for Azure Databricks Service

Get started today

Try this course for free

Conceptualizing the Processing Model for Azure Databricks Service

What you'll learn

Conceptualizing the Processing Model for Azure Databricks Service

Course Overview 1m

Getting Started with Structured Streaming on Azure Databricks 42m

Setting up Databricks Environment 27m

Configuring Source and Sink Stores 29m

Building Streaming Pipeline Using Structured Streaming 21m

Making Streaming Pipeline Production Ready 14m

Understanding Pricing, Workloads, and Competition 15m

Customizing the Cluster 19m

2025 Forrester Wave™ names Pluralsight as a Leader among tech skills dev platforms