Featured resource
2026 Tech Forecast
2026 Tech Forecast

Stay ahead of what’s next in tech with predictions from 1,500+ business leaders, insiders, and Pluralsight Authors.

Get these insights
  • Course

Apache Spark Fundamentals

This course will teach you how to use Apache Spark to analyze your big data at lightning-fast speeds; leaving Hadoop in the dust! For a deep dive on SQL and Streaming check out the sequel, Handling Fast Data with Apache Spark SQL and Streaming.

Intermediate
4h 15m
(433)

Created by Justin Pihony

Last Updated Jun 21, 2024

Course Thumbnail
  • Course

Apache Spark Fundamentals

This course will teach you how to use Apache Spark to analyze your big data at lightning-fast speeds; leaving Hadoop in the dust! For a deep dive on SQL and Streaming check out the sequel, Handling Fast Data with Apache Spark SQL and Streaming.

Intermediate
4h 15m
(433)

Created by Justin Pihony

Last Updated Jun 21, 2024

Get started today

Access this course and other top-rated tech content with one of our business plans.

Try this course for free

Access this course and other top-rated tech content with one of our individual plans.

This course is included in the libraries shown below:

  • Data
What you'll learn

Our ever-connected world is creating data faster than Moore's law can keep up, making it so that we have to be smarter in our decisions on how to analyze it. Previously, we had Hadoop's MapReduce framework for batch processing, but modern big data processing demands have outgrown this framework. That's where Apache Spark steps in, boasting speeds 10-100x faster than Hadoop and setting the world record in large scale sorting. Spark's general abstraction means it can expand beyond simple batch processing, making it capable of such things as blazing-fast, iterative algorithms and exactly once streaming semantics. In this course, you'll learn Spark from the ground up, starting with its history before creating a Wikipedia analysis application as one of the means for learning a wide scope of its core API. That core knowledge will make it easier to look into Spark's other libraries, such as the streaming and SQL APIs. Finally, you'll learn how to avoid a few commonly encountered rough edges of Spark. You will leave this course with a tool belt capable of creating your own performance-maximized Spark application.

Apache Spark Fundamentals
Intermediate
4h 15m
(433)
Table of contents

About the author
Justin Pihony - Pluralsight course - Apache Spark Fundamentals
Justin Pihony
5 courses 3.8 author rating 1167 ratings

Justin is a software journeyman, continuously learning and honing his skills.

Get started with Pluralsight