  • Course

Data Transformations with Apache Pig

Pig is an open source engine for executing parallelized data transformations that run on Hadoop. This course shows you how Pig can help you work on incomplete data with an inconsistent schema, or perhaps no schema at all.

Beginner
3h 15m
(35)

Created by Janani Ravi

Last Updated Jun 20, 2024

This course is included in the libraries shown below:

  • Data
What you'll learn

Pig is open source software that is part of the Hadoop ecosystem of technologies. It is well suited to data that falls outside traditional data warehouses: it copes with missing, incomplete, and inconsistent data, even data with no schema. In this course, Data Transformations with Apache Pig, you'll learn how to transform data with Apache Pig. First, you'll start with the basics: getting Pig installed and working with the Grunt shell. Next, you'll discover how to load data into relations in Pig and store transformed results to files using the load and store commands. Then, you'll work on a real-world dataset, analyzing accidents in NYC using collision data from the City of New York. Finally, you'll explore advanced constructs such as the nested foreach and get a brief glimpse into the world of MapReduce, seeing how easy it is to implement this construct in Pig. By the end of this course, you'll have a better understanding of data transformations with Apache Pig.


About the author
Janani Ravi
193 courses, 4.5 author rating, 6,281 ratings

A problem solver at heart, Janani has a Master's degree from Stanford and worked at Google for more than 7 years. She was one of the original engineers on Google Docs and holds 4 patents for its real-time collaborative editing framework.
