Getting Started with Stream Processing with Spark Streaming

The Spark Streaming module lets you work with large-scale streaming data using familiar batch-processing abstractions. This course begins with how standard transformations and actions are performed on streams, then moves on to more advanced topics.
Course info
Rating: (60)
Level: Beginner
Updated: Jan 27, 2017
Duration: 2h 35m
Description

Traditional distributed systems like Hadoop work on data stored in a file system, and jobs can run for hours, sometimes days. This is a major limitation when processing real-time data such as trends and breaking news. The Spark Streaming module extends the Spark batch infrastructure to handle data for real-time analysis. In this course, Getting Started with Stream Processing with Spark Streaming, you'll learn the nuances of dealing with streaming data using the same basic Spark transformations and actions that work with batch processing. Next, you'll explore how to extend machine learning algorithms to work with streams. Finally, you'll learn the subtle details of how the streaming K-means clustering algorithm helps find patterns in data. By the end of this course, you'll feel confident in your knowledge, and you can start integrating what you've learned into your own projects.
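
To make the idea concrete, here is a minimal sketch of a streaming word count in Scala. The socket source on localhost:9999 and the one-second batch interval are illustrative assumptions, not details taken from the course.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object StreamingWordCount {
      def main(args: Array[String]): Unit = {
        // One-second micro-batches: each batch is an ordinary RDD.
        val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingWordCount")
        val ssc = new StreamingContext(conf, Seconds(1))

        // A DStream backed by a TCP text source (host and port are placeholders).
        val lines = ssc.socketTextStream("localhost", 9999)

        // The same transformations used in batch jobs apply to each micro-batch.
        val counts = lines.flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)

        counts.print() // an output action, run once per batch interval

        ssc.start()
        ssc.awaitTermination()
      }
    }

Because each micro-batch is just an RDD, flatMap, map, and reduceByKey behave exactly as they do in a batch job; only the source and the output action are streaming-specific.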

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

Section Introduction Transcripts

Course Overview
Hi, my name is Janani Ravi, and I'm very happy to meet you today. I have a master's degree in electrical engineering from Stanford, and I have worked at companies such as Microsoft, Google, and Flipkart. At Google, I was one of the first engineers working on real-time collaborative editing in Google Docs, and I hold four patents for its underlying technologies. I currently work on my own startup, Loonycorn, a studio for high-quality video content. Traditional distributed systems work on a large number of files partitioned across multiple machines in a cluster, and jobs may take hours, even days, to run. This is a major limitation when we want to analyze real-time data to see what's trending, or to track things like breaking news. Apache Spark is a general-purpose engine for large-scale data processing, which runs super fast and is very easy and intuitive to use. Spark has a special streaming module, which deals with real-time data. It is built on the discretized stream abstraction, which treats a stream as a sequence of small batches of data. In this course, you'll learn the nuances of dealing with streaming data using the same basic Spark transformations and actions that work with batch processing. This course also shows you how to extend machine learning algorithms to work with streams, and it will help you understand the subtle details of how the streaming k-means clustering algorithm helps find patterns in streaming data. And to top it all off, you'll build a fault-tolerant, real-world project, where you connect to a live stream to track trending hashtags in tweets.
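
As a taste of the streaming k-means material, here is a minimal sketch in Scala using MLlib's StreamingKMeans. The training directory, the "[x,y]" point format, and the parameter choices (k = 3, two-dimensional points) are illustrative assumptions, not the course's own dataset.

    import org.apache.spark.SparkConf
    import org.apache.spark.mllib.clustering.StreamingKMeans
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object StreamingKMeansSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingKMeansSketch")
        val ssc = new StreamingContext(conf, Seconds(5))

        // Assumes each line dropped into this directory is a point such as "[1.0,2.0]"
        // (the path and format are placeholders for illustration).
        val trainingData = ssc.textFileStream("/tmp/kmeans-train").map(Vectors.parse)

        // Cluster centers are updated incrementally as each micro-batch arrives;
        // the decay factor controls how quickly older batches are forgotten.
        val model = new StreamingKMeans()
          .setK(3)
          .setDecayFactor(1.0)
          .setRandomCenters(2, 0.0)

        model.trainOn(trainingData)
        model.predictOn(trainingData).print() // cluster assignments per batch

        ssc.start()
        ssc.awaitTermination()
      }
    }

Because the model is updated batch by batch rather than retrained from scratch, it can follow clusters that drift over time, which is what makes it suitable for finding patterns in live data.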