Handling Streaming Data with GCP Dataflow

Dataflow is a serverless, fully managed service on the Google Cloud Platform for batch and stream processing.
Course info
Level: Advanced
Updated: Dec 11, 2020
Duration: 3h 12m
Table of contents
Course Overview
Executing Pipelines on Cloud Dataflow
Integrating Dataflow with Cloud Pub/Sub
Performing Windowing Operations on Streaming Data
Performing Join Operations on Streaming Data
Description

Dataflow allows developers to process and transform data using easy, intuitive APIs. Dataflow is built on the Apache Beam architecture and unifies batch as well as stream processing of data. In this course, Handling Streaming Data with GCP Dataflow, you will discover that GCP provides a wide range of connectors to integrate the Dataflow service with other GCP services, such as the Pub/Sub messaging service and the BigQuery data warehouse.
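To make the unified programming model concrete, here is a minimal Beam pipeline sketch in Python. The pipeline shape is the same for batch and streaming jobs; only the source, sink, and options change. The project, region, and bucket names below are placeholders, not values from the course.

```python
# A minimal sketch of the Beam model that Dataflow executes.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    runner="DataflowRunner",             # swap for "DirectRunner" to test locally
    project="my-gcp-project",            # assumed project id
    region="us-central1",
    temp_location="gs://my-bucket/tmp",  # assumed staging bucket
)

with beam.Pipeline(options=options) as p:
    (p
     | "Read" >> beam.io.ReadFromText("gs://my-bucket/input/*.txt")
     | "ToUpper" >> beam.Map(str.upper)
     | "Write" >> beam.io.WriteToText("gs://my-bucket/output/result"))
```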

First, you will see how you can integrate your Dataflow pipelines with other GCP services, using them as a source of streaming data or as a sink for your final results.
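As an illustration of that source-and-sink pattern, the sketch below reads messages from Pub/Sub and streams them into BigQuery. This is a hedged example under assumed names: the topic, table, and schema are hypothetical, not taken from the course.

```python
# Streaming pipeline sketch: Pub/Sub source -> BigQuery sink.
import json
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Streaming mode is required for unbounded Pub/Sub reads.
options = PipelineOptions(streaming=True)

with beam.Pipeline(options=options) as p:
    (p
     | "ReadPubSub" >> beam.io.ReadFromPubSub(
           topic="projects/my-project/topics/tweets")    # assumed topic
     | "Decode" >> beam.Map(lambda b: json.loads(b.decode("utf-8")))
     | "WriteBQ" >> beam.io.WriteToBigQuery(
           "my-project:tweets_dataset.raw_tweets",       # assumed table
           schema="user:STRING,text:STRING,ts:TIMESTAMP",
           write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
           create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED))
```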

Next, you will stream live Twitter feeds to the Pub/Sub messaging service and implement your pipeline to read and process these Twitter messages. Finally, you will implement pipelines with a side input, as well as branching pipelines that write your final results to multiple sinks. When you are finished with this course, you will have the skills and knowledge to design complex Dataflow pipelines, integrate these pipelines with other Google services, and test and run these pipelines on the Google Cloud Platform.
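The sketch below shows the two patterns just named, under assumed names and with inlined data so it runs locally: a side input (a small lookup collection passed into a transform) and a branching pipeline where one PCollection feeds two sinks.

```python
import apache_beam as beam

with beam.Pipeline() as p:
    # Main collection: (user, text) pairs. In the course these arrive from
    # Pub/Sub; here they are inlined so the sketch runs locally.
    tweets = p | "Tweets" >> beam.Create(
        [("alice", "love #beam"), ("bob", "meh")])

    # Side input: a small lookup collection, materialized as a Python list
    # and handed to the main transform alongside each element.
    keywords = p | "Keywords" >> beam.Create(["love", "great"])

    flagged = tweets | "Flag" >> beam.Map(
        lambda kv, kws: (kv[0], kv[1], any(k in kv[1] for k in kws)),
        kws=beam.pvalue.AsList(keywords))

    # Branching: the same PCollection feeds two independent sinks.
    (flagged
     | "AllToText" >> beam.Map(str)
     | "WriteAll" >> beam.io.WriteToText("out/all_tweets"))
    (flagged
     | "KeepPositive" >> beam.Filter(lambda t: t[2])
     | "PosToText" >> beam.Map(str)
     | "WritePositive" >> beam.io.WriteToText("out/positive_tweets"))
```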

About the author

A problem solver at heart, Janani has a master's degree from Stanford and worked for 7+ years at Google. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.

Section Introduction Transcripts

Course Overview
[Autogenerated] Hi, my name is Janani Ravi, and welcome to this course on Handling Streaming Data with GCP Dataflow. A little about myself: I have a master's degree in electrical engineering from Stanford and have worked at companies such as Microsoft, Google, and Flipkart. At Google, I was one of the first engineers working on real-time collaborative editing in Google Docs, and I hold four patents for its underlying technologies. I currently work on my own startup, Loonycorn, a studio for high-quality video content. Dataflow allows developers to process and transform data using easy, intuitive APIs. Dataflow is built on the Apache Beam architecture and unifies batch as well as stream processing of data. In this course, you will first see how you can integrate your Dataflow pipelines with other GCP services to use as a source of streaming data or as a sink for your final results. You will read data from Cloud Storage buckets and the Pub/Sub messaging service, and write data to the BigQuery data warehouse. You will see how you can use the Dataflow monitoring interface to debug slow stages in your pipeline code. Next, you will stream live Twitter feeds to the Pub/Sub messaging service and implement your pipeline to read and process these Twitter messages. You will perform transformations such as extracting embedded hashtags and performing sentiment analysis on tweets. You will also perform windowing operations on input streams and learn the right method to extract event-time timestamps from your streaming elements. Finally, you will implement pipelines with a side input and branching pipelines to write your final results to multiple sinks. You will perform join operations on input streams and write unit tests as well as end-to-end tests for your pipeline code. When you're finished with this course, you will have the skills and knowledge to design complex Dataflow pipelines, integrate these pipelines with other Google services, and test and run these pipelines on the Google Cloud Platform.
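The transcript mentions windowing on event-time timestamps; the sketch below illustrates that pattern under assumed field names and inlined data. Each element is moved onto the event-time timeline using its own timestamp, then grouped into fixed one-minute windows and counted per key.

```python
import apache_beam as beam
from apache_beam import window

with beam.Pipeline() as p:
    counts = (
        p
        | "Events" >> beam.Create([
              {"hashtag": "#beam", "ts": 1600000000},
              {"hashtag": "#beam", "ts": 1600000030},
              {"hashtag": "#gcp",  "ts": 1600000090},
          ])
        # Attach each element's own event-time timestamp, rather than
        # relying on processing time.
        | "Stamp" >> beam.Map(
              lambda e: window.TimestampedValue(e["hashtag"], e["ts"]))
        | "Window" >> beam.WindowInto(window.FixedWindows(60))  # 60-second windows
        | "PairOne" >> beam.Map(lambda tag: (tag, 1))
        | "Count" >> beam.CombinePerKey(sum))
    counts | "Print" >> beam.Map(print)
```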