
Transform Data Using the Pandas API in Apache Spark

Learn to transform data with the Pandas API in Apache Spark. This course will teach you practical techniques for data manipulation, performance optimization, and using advanced window functions in Spark workflows.

Bismark Adomako
What you'll learn

Efficient data manipulation is essential in large-scale data processing. In this course, Transform Data Using the Pandas API in Apache Spark, you'll learn how to use the Pandas API for data transformation in Spark. First, you'll cover essential techniques such as filtering, grouping, and merging. Next, you'll optimize workflows with Apache Arrow. Finally, you'll dive into rolling and expanding window functions. When you're finished with this course, you'll know how to integrate the Pandas API with Apache Spark to handle complex data manipulation tasks with improved performance and efficiency.


About the author
Bismark Adomako

Bismark is a BI and Big Data Engineer obsessed with applying his knowledge of computer engineering and mathematics to data science, artificial intelligence, machine learning, big data, and human-computer interaction. Through research into novel methods and algorithms for computation, he aims to contribute to disease cures, better healthcare and technology, autonomous systems, education, and productivity.
