Getting Started with Stream Processing Using Apache Flink
Flink is a stateful, fault-tolerant, large-scale system with excellent latency and throughput characteristics. It processes both bounded and unbounded datasets with the same underlying stream-first architecture, with a focus on streaming (unbounded) data.
What you'll learn
Apache Flink is a distributed computing engine used to process large-scale data. Flink is built on a stream-first architecture in which the stream is the source of truth. This course, Getting Started with Stream Processing Using Apache Flink, walks you through exploratory data analysis and data munging with Flink. You'll start off by learning simple data transformations on streams, such as map(), filter(), flatMap(), reduce(), sum(), min(), and max(), on simple DataStreams and KeyedStreams. You'll then learn about window transformations in detail using tumbling, sliding, count, and session windows. You'll wrap up the course by exploring operations on multiple streams, such as union and joins. All of this comes with hands-on demos using Flink's Java API, along with a real-world project using Twitter's streaming API. After you've watched this course, you'll have a strong foundation in stream processing concepts using Apache Flink.
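To give a feel for two of the ideas named above, here is a minimal sketch in plain Java that deliberately does not use the Flink API. It only imitates, over an in-memory list, what a map/filter step followed by a keyed rolling sum looks like, and what a tumbling count window looks like. The class name `StreamSketch` and both method names are hypothetical and chosen for illustration. Dropping a trailing partial window is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this is not Flink code.
public class StreamSketch {

    // Imitates map -> filter -> keyed rolling sum:
    // normalize each word, drop empty ones, and emit the updated
    // per-word count every time an element arrives.
    public static List<String> rollingWordCounts(List<String> words) {
        Map<String, Integer> state = new HashMap<>(); // per-key state
        List<String> out = new ArrayList<>();
        for (String raw : words) {
            String word = raw.trim().toLowerCase();   // "map" step
            if (word.isEmpty()) {
                continue;                             // "filter" step
            }
            int updated = state.merge(word, 1, Integer::sum); // keyed sum
            out.add(word + "=" + updated);
        }
        return out;
    }

    // Imitates a tumbling count window: sum every `size` consecutive
    // elements, with no overlap between windows. In this sketch a
    // trailing partial window is never emitted.
    public static List<Integer> tumblingCountSums(List<Integer> stream, int size) {
        List<Integer> out = new ArrayList<>();
        int sum = 0;
        int count = 0;
        for (int value : stream) {
            sum += value;
            count++;
            if (count == size) { // window is full: emit and reset
                out.add(sum);
                sum = 0;
                count = 0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(rollingWordCounts(List.of("Flink", "stream", " flink ", "", "Stream")));
        System.out.println(tumblingCountSums(List.of(1, 2, 3, 4, 5), 2));
    }
}
```

In a real Flink job the per-key state and the window bookkeeping are managed by the framework rather than by hand-written loops like these.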
Table of contents
- Version Check | 16s
- Why Stream Processing? | 2m 16s
- Batch Processing vs. Stream Processing | 7m 3s
- Requirements of Stream Processing Systems | 5m 12s
- Micro-batches for Stream Processing | 2m 17s
- Introducing Apache Flink for Stream Processing | 4m 51s
- Clients, Masters, and Workers | 4m 13s
- Install and Set up Flink | 7m 43s
About the author
A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.