Getting Started with Stream Processing Using Apache Flink

Flink is a stateful, fault-tolerant, large-scale system with excellent latency and throughput characteristics. It works with bounded and unbounded datasets using the same underlying stream-first architecture, focusing on streaming or unbounded data.
Course info
Rating
(21)
Level
Beginner
Updated
Apr 17, 2017
Duration
2h 44m
Description

Apache Flink is a distributed computing engine used to process large-scale data. Flink is built on the concept of stream-first architecture, where the stream is the source of truth. This course, Getting Started with Stream Processing Using Apache Flink, walks you through exploratory data analysis and data munging with Flink. You'll start off learning about simple data transformations on streams such as map(), filter(), flatMap(), reduce(), sum(), min(), and max() on simple DataStreams and KeyedStreams. You'll then learn about window transformations in detail using tumbling, sliding, count, and session windows. You'll wrap up the course by exploring operations on multiple streams such as union and joins. All of this comes with hands-on demos using Flink's Java API, along with a real-world project using Twitter's streaming API. After you've watched this course, you'll have a strong foundation in stream processing concepts using Apache Flink.

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

Section Introduction Transcripts

Course Overview
Hi, my name is Janani Ravi, and I'd like to welcome you to this course. A little bit about myself: I have a master's degree in electrical engineering from Stanford and have worked with companies such as Microsoft, Google, and Flipkart. At Google, I was one of the first engineers working on real-time collaborative editing in Google Docs, and I hold four patents for its underlying technologies. I currently work on my own startup, Loonycorn, a studio for high-quality video content. This course focuses on Apache Flink, a distributed computing engine for processing large-scale data. Flink is built on the concept of stream-first architecture, where the stream of data is the source of truth. This course walks users through exploratory data analysis and data munging with Flink from first principles. You'll learn how to perform simple data transformations on streams such as map, filter, flatMap, and reduce, and understand in detail how window transformations work using tumbling, sliding, count, and session windows. All of this comes with hands-on demos in Java using Flink's runtime library, including a project that connects to the Twitter streaming API to parse and extract information from real-time tweets.