Handling Streaming Data with Messaging Systems

Authors: Paweł Kordek, Bogdan Sucaciu, James Wilson, Axel Sirota, Ivan Mushketyk, Eugene Meidinger, Vitthal Srinivasan

Processing streaming data poses unique challenges that traditional batch data systems are unable to handle. Messaging systems are a useful platform for managing and processing streaming data.

What You Will Learn

  • Processing streaming data using Kafka, Pulsar, and cloud platforms
  • Deploying Kafka clusters locally and in the cloud
  • Deploying Pulsar in the cloud
  • Designing and building stream processing solutions on AWS, Azure, and GCP

Prerequisites

  • Data Engineering Literacy
  • Cloud Platform Literacy

Beginner

Learn the basics of processing streaming data using Kafka.

Deploying a Kafka Cluster

by Paweł Kordek

Jun 10, 2020 / 2h 30m

Description

Apache Kafka is a messaging platform that is reportedly deployed in more than a third of Fortune 500 companies. In this course, Deploying a Kafka Cluster, you’ll learn foundational knowledge of Apache Kafka. First, you’ll discover how it can be useful in a modern digital platform. Next, you’ll explore Kafka’s core concepts. Finally, you’ll learn how to deploy it in order to achieve fault tolerance. When you’re finished with this course, you’ll have the skills and knowledge of Apache Kafka needed to work with Kafka deployments and the applications built around them.
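
To make the fault-tolerance idea concrete, here is a minimal sketch using the third-party kafka-python client: it creates a topic whose partitions are copied to three brokers, so the data survives broker failures. The broker address and topic name are placeholders, not anything prescribed by the course.

    from kafka.admin import KafkaAdminClient, NewTopic

    # Connect to any broker in the cluster (assumed at localhost:9092).
    admin = KafkaAdminClient(bootstrap_servers="localhost:9092")

    # replication_factor=3 stores every partition on three brokers,
    # so the data survives the loss of up to two of them.
    admin.create_topics([
        NewTopic(name="orders", num_partitions=3, replication_factor=3)
    ])
    admin.close()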

Table of contents
  1. Course Overview
  2. Message Brokers
  3. High-level Kafka Architecture
  4. Producers and Consumers
  5. Fault Tolerance and High Availability
  6. Serialization
  7. Installing Kafka Manually
  8. Ecosystem Extensions
  9. Summary

Handling Streaming Data with a Kafka Cluster

by Bogdan Sucaciu

May 22, 2020 / 2h 9m

Description

There are a lot of common scenarios that occur when using a streaming platform inside an organization. In this course, Handling Streaming Data with a Kafka Cluster, you’ll learn to handle the variety of scenarios you may encounter. First, you’ll explore why Kafka is such a great solution for handling streaming data, along with different options for optimization and integration with other models. Next, you’ll discover how to manage your data and perform various operations against your Kafka cluster. Finally, you’ll learn how to secure your data streams by applying different techniques. When you’re finished with this course, you’ll have the skills and knowledge of handling streaming data with Apache Kafka needed to build and manage streaming pipelines in your organization.
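
For orientation, the produce-and-consume loop at the heart of these scenarios looks roughly like this with the third-party kafka-python client; the broker address, topic, and group id are placeholders.

    from kafka import KafkaProducer, KafkaConsumer

    # Write one record; the key controls which partition it lands in.
    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer.send("events", key=b"user-42", value=b'{"action": "login"}')
    producer.flush()

    # Read it back; the group id lets Kafka track this consumer's offsets.
    consumer = KafkaConsumer(
        "events",
        bootstrap_servers="localhost:9092",
        group_id="demo-group",
        auto_offset_reset="earliest",
    )
    for record in consumer:
        print(record.key, record.value)
        break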

Table of contents
  1. Course Overview
  2. Experiencing Kafka as a Streaming Platform
  3. Producing Data to Kafka
  4. Consuming Data from Kafka
  5. Managing Data Streams
  6. Transforming Non-streaming Models

Intermediate

Learn to migrate your streaming data systems to the cloud, using Apache Pulsar, AWS Kinesis, Azure Event Hub, and Google Cloud Pub/Sub.

Deploying Apache Pulsar to Google Kubernetes Engine

by James Wilson

Sep 23, 2020 / 2h 27m

Description

Apache Pulsar is a messaging system with an architecture powerful enough to meet demanding needs, and it is arguably even more powerful running on Google Kubernetes Engine. Together, the two systems provide the scalability and fault tolerance to grow your existing business or to open possibilities for products that do not exist today. In this course, Deploying Apache Pulsar to Google Kubernetes Engine, you’ll learn to stand up Apache Pulsar on a system that can scale. First, you’ll explore the pros and cons of Apache Pulsar compared to Apache Kafka. Next, you’ll discover how to install, configure, and manage Apache Pulsar on Google Kubernetes Engine. Finally, you’ll learn how to create producers and consumers that utilize your Apache Pulsar system. When you’re finished with this course, you’ll have the skills and knowledge of running Apache Pulsar on Google Cloud needed to build a powerful messaging system that can scale horizontally around the world.
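
As a taste of the producer and consumer modules, here is a minimal sketch using the official pulsar-client Python library. It assumes the Kubernetes deployment's proxy has been port-forwarded to localhost:6650; the topic and subscription names are placeholders.

    import pulsar

    # Connect through the proxy exposed by the Kubernetes deployment.
    client = pulsar.Client("pulsar://localhost:6650")

    producer = client.create_producer("persistent://public/default/demo")
    producer.send("hello from GKE".encode("utf-8"))

    # A subscription tracks this consumer's position in the topic.
    consumer = client.subscribe("persistent://public/default/demo",
                                subscription_name="demo-sub")
    msg = consumer.receive()
    print(msg.data())
    consumer.acknowledge(msg)

    client.close()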

Table of contents
  1. Course Overview
  2. Messaging with Apache Pulsar
  3. Comparing Pulsar and Kafka
  4. Deploying Pulsar to Google Kubernetes Engine
  5. Configuring the Pulsar Cluster
  6. Creating Topics, Producers, and Consumers
  7. Leveraging the Pulsar Schema Registry

Handling Streaming Data with Apache Pulsar

by Axel Sirota

Sep 9, 2020 / 2h 7m

Description

Real-time applications are hard to scale: they can receive high volumes of data in an instant and need to route messages correctly. Apache Pulsar is a highly scalable, low-latency, high-throughput pub-sub system built to attack this problem. In this course, Handling Streaming Data with Apache Pulsar, you’ll learn how to tame these workloads by adopting Apache Pulsar. First, you’ll explore Pulsar Functions for serverless ETL. Next, you’ll discover how to connect your Pulsar deployment to Kafka and databases with Pulsar IO. Finally, you’ll learn how to migrate from Kafka to Pulsar with the client wrapper. When you’re finished with this course, you’ll have the skills and knowledge of Apache Pulsar needed to handle high-volume streaming data in your applications with ease.
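
To illustrate the serverless ETL idea, a Pulsar Function in Python is just a class implementing process(): Pulsar invokes it for every message on the input topics and publishes the return value to the output topic. The class name below is hypothetical; only the Function interface comes from the Pulsar Functions SDK.

    from pulsar import Function

    class UppercaseFunction(Function):
        # Invoked once per message; returning a value publishes it
        # to the function's configured output topic.
        def process(self, input, context):
            context.get_logger().info("processing one message")
            return input.upper()

A function like this is typically deployed with pulsar-admin functions create, pointing its --inputs and --output flags at the topics to transform.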

Table of contents
  1. Course Overview
  2. Transforming Messages with Pulsar Functions
  3. Connecting to External Resources with Pulsar IO
  4. Inspecting Topic Data with Pulsar SQL
  5. Migrating from Apache Kafka to Apache Pulsar

Developing Stream Processing Applications with AWS Kinesis

by Ivan Mushketyk

May 13, 2020 / 4h 10m

Description

The landscape of the Big Data field is changing. Previously, you could get away with processing incoming data over hours or even days. Now you need to do it in minutes or even seconds. These challenges require new solutions, new architectures, and new tools. AWS Kinesis is a stream processing service that allows you to build applications that were impossible to create before. In this course, Developing Stream Processing Applications with AWS Kinesis, you'll learn the ins and outs of AWS Kinesis. First, you'll discover how it works, how to scale it up and down, and how to write applications using it. Next, you'll learn how to use a variety of tools to work with it, such as the Kinesis Client Library, the Kinesis Connector Library, Apache Flink, and AWS Lambda. Finally, you'll explore higher-level Kinesis products such as Kinesis Firehose and how to write streaming applications using SQL queries with Kinesis Analytics. When you are finished with this course, you'll have an in-depth knowledge of AWS Kinesis that will help you build your own streaming applications.
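
For a first look at the low-level API the course builds on, here is a sketch using boto3: one record goes in with put_record and comes back out through a shard iterator. The stream name, region, and payload are placeholders, and the stream is assumed to already exist.

    import boto3

    kinesis = boto3.client("kinesis", region_name="us-east-1")

    # The partition key determines which shard receives the record.
    kinesis.put_record(
        StreamName="clickstream",
        Data=b'{"page": "/home"}',
        PartitionKey="user-42",
    )

    # Read from the first shard, starting at the oldest available record.
    shard_id = kinesis.describe_stream(StreamName="clickstream")[
        "StreamDescription"]["Shards"][0]["ShardId"]
    iterator = kinesis.get_shard_iterator(
        StreamName="clickstream",
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]
    for record in kinesis.get_records(ShardIterator=iterator)["Records"]:
        print(record["Data"])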

Table of contents
  1. Course Overview
  2. Kinesis Fundamentals
  3. Reading and Writing Data to Kinesis
  4. Developing Applications Using Kinesis Client Library
  5. Implementing Advanced Kinesis Consumers
  6. Funneling Data with Kinesis Firehose
  7. Implementing Stream Analysis Applications Using Streaming SQL
  8. Kinesis in Production

Handling Streaming Data with Azure Event Hub

by Eugene Meidinger

May 15, 2020 / 2h 23m

Description

Processing streaming data often requires low latency and scalability. In this course, Handling Streaming Data with Azure Event Hub, you’ll gain the ability to use Azure Event Hubs to receive and process events in real time. First, you’ll learn how Azure Event Hubs is designed and how it compares to other tools like Kafka. Next, you’ll see how to send and receive events. Finally, you’ll understand how to extract the data for streaming analysis or archiving. By the end of this course, you’ll know what you need to start using Azure Event Hubs.
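
As a preview of sending events, here is a minimal sketch with the azure-eventhub Python SDK (v5). The connection string and hub name are placeholders; events are added to a batch so the client can enforce the hub's size limits.

    from azure.eventhub import EventHubProducerClient, EventData

    producer = EventHubProducerClient.from_connection_string(
        conn_str="<your-connection-string>",
        eventhub_name="telemetry",
    )

    # Events are sent in batches; create_batch() enforces size limits.
    batch = producer.create_batch()
    batch.add(EventData('{"temperature": 21.5}'))
    producer.send_batch(batch)
    producer.close()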

Table of contents
  1. Course Overview
  2. Comparing Event Hubs to Other Messaging Tools
  3. Creating a Minimal Event Hubs Solution
  4. Architecting an Advanced Event Hubs Solution
  5. Extracting Events for Archiving and Batch Processing
  6. Extracting Events for Streaming Processing and Analytics

Architecting Stream Processing Solutions Using Google Cloud Pub/Sub

by Vitthal Srinivasan

Jan 8, 2019 / 1h 44m

Description

As data warehousing and analytics become more and more integrated into the business models of companies, the need for real-time analytics and data processing has grown. Stream processing has quickly gone from being nice-to-have to must-have. In this course, Architecting Stream Processing Solutions Using Google Cloud Pub/Sub, you will gain the ability to ingest and process streaming data on the Google Cloud Platform, including the ability to take snapshots and replay messages. First, you will learn the basics of a publisher-subscriber architecture. Publishers are apps that send out messages; these messages are organized into topics. Topics are associated with subscriptions, and subscribers listen in on subscriptions. Each subscription is a message queue, and messages are held in that queue until at least one subscriber per subscription has acknowledged the message; this is why Pub/Sub is said to be a reliable messaging system. Next, you will discover how to create topics, as well as push and pull subscriptions. As their names suggest, push and pull subscriptions differ in who controls the delivery of messages to the subscriber. Finally, you will explore how to leverage advanced features of Pub/Sub, such as creating snapshots and seeking to a specific timestamp, either in the past or in the future. You will also learn the precise semantics of creating snapshots and the implications of turning on the “retain acknowledged messages” option on a subscription. When you’re finished with this course, you will have the skills and knowledge of Google Cloud Pub/Sub needed to effectively and reliably process streaming data on GCP.
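
To ground the publisher/subscriber vocabulary, here is a minimal sketch with the google-cloud-pubsub library: publish to a topic, then pull and acknowledge from a subscription. The project, topic, and subscription ids are placeholders, and both resources are assumed to already exist.

    from google.cloud import pubsub_v1

    project = "my-project"

    # Publish one message; result() blocks until Pub/Sub stores it.
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project, "demo-topic")
    future = publisher.publish(topic_path, b"hello", origin="example")
    print("published message", future.result())

    # Pull from the subscription, then acknowledge each message so
    # Pub/Sub stops redelivering it.
    subscriber = pubsub_v1.SubscriberClient()
    sub_path = subscriber.subscription_path(project, "demo-sub")
    response = subscriber.pull(subscription=sub_path, max_messages=10)
    for received in response.received_messages:
        print(received.message.data)
        subscriber.acknowledge(subscription=sub_path,
                               ack_ids=[received.ack_id])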

Table of contents
  1. Course Overview
  2. Getting Started with Cloud Pub/Sub
  3. Configuring Publishers, Subscribers, and Topics
  4. Using the Cloud Pub/Sub Client Library

Kafka Connect Fundamentals

by Bogdan Sucaciu

Dec 24, 2019 / 2h 14m

Description

You may be wondering why the word "Connect" has suddenly sprung up next to "Kafka". Isn’t Kafka a Distributed Streaming Platform? Well, Kafka is more than that. Apache Kafka is an entire ecosystem and Kafka Connect is a part of it. In this course, Kafka Connect Fundamentals, you will gain the ability to create your own real-time ETL pipelines from and to Apache Kafka. First, you will learn what the ETL model is and how to set up your own ETL pipeline using Kafka Connect. Next, you will discover the inner details of Kafka Connect by exploring its architecture. Finally, you will explore how to successfully manage your Kafka Connect installation in a production environment. When you are finished with this course, you will have the skills and knowledge of Kafka Connect needed to set up, build, and maintain your own Kafka Connect installation.
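
Since connectors are configured rather than coded, a pipeline is often set up by posting JSON to the Connect REST API. Here is a sketch registering the simple file source connector that ships with Kafka; the Connect worker address, file path, and topic name are placeholders.

    import json
    import requests

    # Each line appended to the file becomes a record on the topic.
    connector = {
        "name": "file-source-demo",
        "config": {
            "connector.class":
                "org.apache.kafka.connect.file.FileStreamSourceConnector",
            "file": "/tmp/input.txt",
            "topic": "file-lines",
            "tasks.max": "1",
        },
    }

    # The Connect worker's REST API is assumed to be on localhost:8083.
    response = requests.post(
        "http://localhost:8083/connectors",
        headers={"Content-Type": "application/json"},
        data=json.dumps(connector),
    )
    response.raise_for_status()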

Table of contents
  1. Course Overview
  2. Building ETL Pipelines with Apache Kafka
  3. Exploring Kafka Connect Architecture
  4. Building Your Own Connector
  5. Data Processing Using Transforms and Converters
  6. Using Kafka Connect in Production

Coming Soon

Kafka Streams and KSQL Fundamentals

by Pluralsight

Advanced

Enforce data contracts and implement event logs using Kafka event processing features.

Enforcing Data Contracts with Kafka Schema Registry

by Bogdan Sucaciu

Aug 24, 2020 / 2h 29m

Description

In a world of data, governance can become chaotic very quickly. In this course, Enforcing Data Contracts with Kafka Schema Registry, you’ll learn to enforce and manage data contracts in your Apache Kafka-powered system. First, you’ll explore how the serialization process takes place and why Avro is such a great option. Next, you’ll discover how to manage data contracts using Schema Registry. Finally, you’ll learn how to use other serialization formats with Apache Kafka. When you’re finished with this course, you’ll have the skills and knowledge of data governance with Schema Registry needed to enforce and manage data contracts in your Apache Kafka setup.
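
To make the idea of a data contract concrete, here is a sketch that registers an Avro schema through Schema Registry's REST API, under the subject the registry conventionally uses for a topic's values. The registry address and topic name are placeholders.

    import json
    import requests

    # An Avro schema acts as the data contract for the "orders" topic.
    schema = {
        "type": "record",
        "name": "Order",
        "fields": [
            {"name": "id", "type": "string"},
            # Defaults keep later schema versions backward compatible.
            {"name": "quantity", "type": "int", "default": 1},
        ],
    }

    # Register under the "<topic>-value" subject (registry assumed to be
    # on localhost:8081); the response carries the global schema id.
    response = requests.post(
        "http://localhost:8081/subjects/orders-value/versions",
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        data=json.dumps({"schema": json.dumps(schema)}),
    )
    print(response.json())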

Table of contents
  1. Course Overview
  2. Serializing Data in Messaging Systems
  3. Exploring AVRO
  4. Managing Schemas
  5. Handling Schema Evolution
  6. Using Other Serialization Formats

Implementing an Event Log with Kafka

by Ivan Mushketyk

Sep 4, 2020 / 3h 35m

Description

In this course, Implementing an Event Log with Kafka, you will gain the ability to build complex microservice architectures around immutable events stored in Kafka. First, you’ll explore the issues you can encounter when migrating an application to a microservices architecture. Next, you’ll master Kafka fundamentals and learn how they allow you to address common issues in microservices applications. Finally, you’ll learn advanced architectural patterns for working with data in Kafka. When you’re finished with this course, you’ll have the skills and knowledge needed to implement complex event-driven applications using Kafka.
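
The payoff of an event log is that state can be rebuilt by replaying it. Here is a sketch with the third-party kafka-python client that replays a single-partition topic of hypothetical JSON deposit events and folds them into per-account balances; the topic name and event shape are placeholders.

    import json
    from collections import defaultdict
    from kafka import KafkaConsumer, TopicPartition

    consumer = KafkaConsumer(bootstrap_servers="localhost:9092")
    tp = TopicPartition("account-events", 0)
    consumer.assign([tp])
    consumer.seek_to_beginning(tp)

    # Fold every event from the start of the log into current state.
    balances = defaultdict(int)
    end = consumer.end_offsets([tp])[tp]
    while consumer.position(tp) < end:
        for records in consumer.poll(timeout_ms=1000).values():
            for message in records:
                event = json.loads(message.value)
                balances[event["account"]] += event["amount"]
    print(dict(balances))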

Table of contents
  1. Course Overview
  2. Introduction
  3. Kafka as a Distributed Log
  4. Event Log with Kafka
  5. Kafka and Databases
  6. Analytics with an Event Log