Conceptualizing the Processing Model for Apache Flink

by Janani Ravi

Flink is a stateful, fault-tolerant, large-scale system with excellent latency and throughput characteristics. It handles both bounded and unbounded datasets on the same stream-first architecture, with a primary focus on streaming (unbounded) data.

Course info

Rating: (20)
Level: Intermediate
Updated: Nov 20, 2020
Duration: 4h 30s

What you'll learn

Apache Flink is built on a stream-first architecture, in which the stream is the source of truth. Flink offers extensive APIs for processing both batch and streaming data in an easy, intuitive manner.

In this course, Conceptualizing the Processing Model for Apache Flink, you’ll be introduced to Flink’s architecture and processing APIs to get started on your data analysis journey.

First, you’ll explore the differences between processing batch and streaming data, and understand how stream-first architecture works. You’ll study the stream-first processing model that Flink uses to process data at scale, and Flink’s architecture, which uses a JobManager, TaskManagers, and task slots to execute the operators and streams of a Flink application in a data-parallel manner.

Next, you’ll understand the difference between stateless and stateful stream transformations and apply these concepts in a hands-on manner in your Flink stream processing. You’ll process data in a stateless manner using the map(), flatMap(), and filter() transformations, and use keyed streams and rich functions to work with Flink state.
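
To make the stateless/stateful distinction concrete, here is a minimal sketch in plain Java. It deliberately does not use the Flink API: java.util.stream stands in for the map(), flatMap(), and filter() transformations, and a HashMap stands in for per-key state. The class name, method names, and sample data are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustration only: plain JDK collections, not Flink operators.
public class StatelessVsKeyedSketch {

    // Stateless: every output depends only on the element being processed.
    static List<String> tokenize(List<String> lines) {
        return lines.stream()
                .filter(line -> !line.trim().isEmpty())                     // filter: drop blank lines
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+"))) // flatMap: one line to many words
                .map(String::toUpperCase)                                   // map: one word to one word
                .collect(Collectors.toList());
    }

    // Stateful and keyed: the output for an event depends on earlier events
    // with the same key. The map is a stand-in for per-key state.
    static List<String> runningMax(List<Map.Entry<String, Double>> events) {
        Map<String, Double> maxByKey = new HashMap<>();
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Double> e : events) {
            double max = Math.max(
                    maxByKey.getOrDefault(e.getKey(), Double.NEGATIVE_INFINITY),
                    e.getValue());
            maxByKey.put(e.getKey(), max);   // update the state for this key
            out.add(e.getKey() + "=" + max); // emit the running maximum so far
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize(List.of("to be", "", "or not")));
        System.out.println(runningMax(List.of(
                Map.entry("A", 10.0), Map.entry("B", 7.0),
                Map.entry("A", 9.0), Map.entry("A", 12.0))));
    }
}
```

In the second method, the third event (A, 9.0) still reports a maximum of 10.0, which is only possible because state from the first event was retained for key A.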

Finally, you’ll round off your understanding of the state persistence and fault-tolerance mechanism that Flink uses by exploring the checkpointing architecture in Flink. You’ll enable checkpoints and savepoints in your streaming application, see how state can be restored from a snapshot in the case of failures, and configure your Flink application to support different restart strategies.
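
The idea of restoring state from a snapshot after a failure can also be sketched in plain Java. This is a toy model under stated assumptions, not Flink's checkpointing mechanism: a snapshot records the running sum together with the input position it corresponds to, a single failure is injected at a chosen index, and the job restarts from the last snapshot up to a bounded number of times. The restart limit is loosely analogous to a fixed-delay restart strategy with the delay omitted. All names and parameters are hypothetical.

```java
import java.util.List;

// Illustration only: a toy snapshot-and-replay loop, not Flink's implementation.
public class CheckpointRestartSketch {

    // Operator state plus the input position that state corresponds to.
    static final class Snapshot {
        final int offset; // index of the next element to read
        final long sum;   // state as of that position
        Snapshot(int offset, long sum) { this.offset = offset; this.sum = sum; }
    }

    static long run(List<Integer> input, int checkpointEvery,
                    int failAtIndex, int maxRestarts) {
        Snapshot last = new Snapshot(0, 0L); // initial empty snapshot
        boolean failureInjected = false;
        int restarts = 0;
        while (true) {
            // (Re)start: restore state and input position from the last snapshot.
            int offset = last.offset;
            long sum = last.sum;
            try {
                while (offset < input.size()) {
                    if (offset == failAtIndex && !failureInjected) {
                        failureInjected = true; // fail only once
                        throw new IllegalStateException("simulated failure at index " + offset);
                    }
                    sum += input.get(offset);
                    offset++;
                    if (offset % checkpointEvery == 0) {
                        last = new Snapshot(offset, sum); // take a snapshot
                    }
                }
                return sum;
            } catch (IllegalStateException e) {
                if (++restarts > maxRestarts) {
                    throw e; // restart budget exhausted: give up
                }
                // Otherwise loop around and replay from the last snapshot.
            }
        }
    }

    public static void main(String[] args) {
        // Fails once at index 5, restores the snapshot taken at offset 3, then finishes.
        System.out.println(run(List.of(1, 2, 3, 4, 5, 6, 7), 3, 5, 1));
    }
}
```

Because the snapshot pairs the state with its input position, the elements replayed after the restart are not counted twice, and the result matches a failure-free run. With maxRestarts set to 0 the simulated failure propagates instead, which mirrors a no-restart policy.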

When you’re finished with this course, you’ll have the skills and knowledge to design Flink pipelines performing stateless and stateful transformations, and you’ll be able to build fault-tolerant applications using checkpoints and savepoints.

Course Overview

2mins

Course Overview 2m

Getting Started with Apache Flink

40mins

Executing and Monitoring Streaming Queries

48mins

Demo: Starting a Flink Cluster and Submitting Streaming Applications 7m
Demo: Debugging Errors Using the Flink Dashboard 1m
Demo: Exploring Default Configuration Settings 3m
Demo: Cluster and Job Specific Configuration Settings 4m
Demo: Setting up a Maven Flink Project 3m
Demo: Implementing Your First Streaming Application 6m
Demo: Configuring Job Specific Properties 5m
Demo: Explicitly Specifying UIDs 2m
Flink Clusters and Deployment 6m
High Availability with Flink 3m
Demo: Reading Streaming Data from a Text File 3m
Demo: Packaging and Submitting a Streaming Job to the Flink Cluster 5m

Performing Stateless Transformations on Streams

52mins

Stateless Transformations 2m
Demo: Performing Filter Operations on Input Streams 4m
Demo: Performing Map Operations on Input Streams 6m
Demo: Performing Flatmap Operations on Input Streams 6m
Demo: More Flatmap Operations on Input Streams 4m
Flink APIs 4m
Demo: Introducing the Dataset API 3m
Demo: Map and Filter Using Datasets 2m
Demo: Introducing the Table API 5m
Demo: Running SQL Queries on Streaming Data 2m
Demo: Reading Continuously from a File Source 4m
Demo: Writing out to a Streaming File Sink 3m
Demo: Streaming Sink with Rollover Policy 2m
Demo: Processing a File Exactly Once 2m
Fault Tolerance Guarantees 4m

Performing Stateful Transformations on Streams

46mins

Stateful Transformations 2m
Keyed Streams 4m
Demo: Keyed Streams 5m
States in Flink 4m
Keyed State Interfaces and Rich Functions 5m
Demo: Value State - Max Closing Price 7m
Demo: Value State - Rolling Average 4m
Demo: Value State - Rolling Average per Key 2m
Demo: List State - Days since Price Threshold Breach 4m
Demo: Reducing State - Rolling Average 5m
State Backends 5m

Exploring the Checkpointing Architecture in Flink

50mins

Checkpoints 2m
Stream Barriers and Aligned Checkpoints 6m
Unaligned Checkpoints 2m
Demo: Enabling and Configuring Checkpoints 5m
Demo: Default in Memory Checkpoints 4m
Demo: Persistent Checkpoints Using the Fs State Backend 4m
Demo: Configuring the State Backend for the Cluster 3m
Savepoints 4m
Demo: Manually Triggering Savepoints 4m
Demo: Restoring Applications from Savepoints 3m
Restart Strategies 4m
Demo: Restart Strategy - Fixed Delay 5m
Demo: Restart Strategy - No Restart 1m
Demo: Restart Strategy - Failure Rate 3m
Summary and Further Study 1m

About the author

Janani Ravi

Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework. After spending years working in tech in the Bay Area, New York, and Singapore at companies such as Microsoft, Google, and Flipkart, Janani decided to combine her love for technology with her passion for teaching. She is now the co-founder of Loonycorn, a content studio focused on providing ...
