Handling Fast Data with Apache Spark SQL and Streaming
Course info
Description
Analyzing data used to be something you did once a night. Now you need to process data on the fly so you can provide up-to-the-minute insights. But how do you accomplish in real time what used to take hours, without a complicated code base? In this course, Handling Fast Data with Apache Spark SQL and Streaming, you'll learn to use the Apache Spark SQL and Streaming libraries to handle this new world of real-time, fast data processing. First, you'll dive into Spark SQL. Next, you'll explore how to catch potential fraud by analyzing streams with Spark Streaming. Finally, you'll discover the newer Structured Streaming API. By the end of this course, you'll have a deeper understanding of these APIs, along with a number of the streaming concepts that have driven their design.
Section Introduction Transcripts
Course Overview
Hi, my name is Justin Pihony, and welcome to my course, Fast Data Handling with Apache Spark SQL and Streaming. Being a top contributor of Apache Spark answers on Stack Overflow, as well as the developer support manager at Lightbend, has given me a lot of insight into how to maximize Spark's power while sidestepping possible pitfalls. Fast data is the next big thing in the world of data. Nowadays, we want valuable business insights now, not after having to wait for batch jobs to complete, and we're at a point where we can build systems that reactively handle our needs at scale. In this course, we're going to see how to use Spark's SQL and streaming capabilities to build these fast data applications without breaking a sweat. Some of the major topics that we'll cover include a deep dive into Spark's SQL library, learning both the untyped side via DataFrames and the type-safe side via Datasets, as well as covering Spark's take on streaming via both the older, more stable Spark Streaming library and its modernized, up-and-coming Structured Streaming library. By the end of this course, you'll have extensive knowledge of Spark's SQL and streaming APIs, knowing how to utilize them to create a fast data application capable of pulling out business insights in no time at all. Before beginning the course, you should have a basic understanding of Apache Spark, which you can get from my other course, Apache Spark Fundamentals. I hope you'll join me on this journey to learn about Spark's SQL and streaming libraries, and how they can be used in this new architecture overtaking the big data world, with the Fast Data Handling with Apache Spark SQL and Streaming course at Pluralsight.