
Deepak Goyal - Test Shell

In this lab, you will build a simple data ingestion and preparation workflow using AWS Glue. You will begin by downloading a sample CSV dataset and uploading it to an Amazon S3 source bucket. Next, you will use an AWS Glue Crawler to discover and catalog the data, create an AWS Glue ETL job to transform and prepare the dataset, and store the processed output in a destination Amazon S3 bucket. By the end of the lab, you will have practical experience with data ingestion, schema discovery, catalog management, and ETL processing using AWS Glue.
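The first step described above (fetch the sample CSV, then place it in the source bucket) could also be scripted instead of done through the console. The following is a minimal sketch, not the lab's own method. It assumes the `boto3` package and AWS credentials are available, and the bucket name `my-glue-source-bucket` and key `input/order.csv` are hypothetical placeholders. The URL rewrite relies on the usual mapping from a github.com "blob" page to raw.githubusercontent.com and only handles branch names without slashes.

```python
from urllib.parse import urlparse
from urllib.request import urlopen

# Dataset page named in the lab's first challenge.
BLOB_URL = ("https://github.com/pluralsight-cloud/"
            "Lab-Ingesting-Data-Using-AWS-Glue/blob/main/order.csv")


def raw_url(blob_url: str) -> str:
    """Map a github.com 'blob' page URL to the raw file URL."""
    parts = urlparse(blob_url).path.strip("/").split("/")
    owner, repo, marker, ref = parts[:4]
    if marker != "blob":
        raise ValueError(f"not a GitHub blob URL: {blob_url}")
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo, ref, *parts[4:]])


def upload_csv(s3, bucket: str, key: str, data: bytes) -> bool:
    """Upload bytes to S3, then confirm the stored object has the same size."""
    s3.put_object(Bucket=bucket, Key=key, Body=data)
    head = s3.head_object(Bucket=bucket, Key=key)
    return head["ContentLength"] == len(data)


def main() -> None:
    import boto3  # third-party; imported here so the helpers work without it

    with urlopen(raw_url(BLOB_URL)) as resp:
        data = resp.read()
    # Hypothetical bucket and key: substitute the lab's pre-created source bucket.
    ok = upload_csv(boto3.client("s3"), "my-glue-source-bucket", "input/order.csv", data)
    print("uploaded" if ok else "size mismatch")
```

Calling `main()` performs the download and upload. Passing the S3 client in as an argument keeps `upload_csv` easy to exercise with a stand-in object before touching a real bucket.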

Lab Info
Level: Intermediate
Last updated: Jun 29, 2026
Duration: 30m

Table of Contents
  1. Challenge

    Upload source data to Amazon S3

    Download the provided order.csv dataset (https://github.com/pluralsight-cloud/Lab-Ingesting-Data-Using-AWS-Glue/blob/main/order.csv) and upload it to the pre-created Amazon S3 source bucket. Verify that the upload succeeded and that the file is available for AWS Glue to process.

  2. Challenge

    Use an AWS Glue crawler to catalog the dataset

    Create and configure an AWS Glue crawler to scan the uploaded dataset. Run the crawler so that it registers the discovered schema in the AWS Glue Data Catalog, then verify that the database and table definitions were created.

  3. Challenge

    Create and configure an AWS Glue ETL job

    Create an AWS Glue ETL job that uses the cataloged dataset as its source. Configure the job to perform simple data preparation tasks, such as selecting columns, renaming fields, and filtering records, to produce a refined dataset. Then configure the job to write the transformed output to the destination Amazon S3 bucket.

  4. Challenge

    Run and validate the processed dataset

    Run the job, monitor its execution, and verify that the processed dataset has been successfully generated and stored in the target location.
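The crawl, run, and validate steps in the challenges above can also be driven from code rather than the console. The sketch below is one possible way to do that, assuming `boto3` clients are passed in (`boto3.client("glue")` and `boto3.client("s3")`) and that the ETL job from challenge 3 already exists. Every name, role ARN, and path supplied by the caller is a placeholder you would choose yourself; none of them comes from the lab.

```python
import time

# Job run states after which polling can stop.
TERMINAL_JOB_STATES = {"SUCCEEDED", "FAILED", "STOPPED", "TIMEOUT", "ERROR", "EXPIRED"}


def run_crawler(glue, name, role_arn, database, s3_path, poll_seconds=15):
    """Create and run a crawler, wait for it, return the cataloged table names."""
    glue.create_crawler(
        Name=name,
        Role=role_arn,
        DatabaseName=database,
        Targets={"S3Targets": [{"Path": s3_path}]},
    )
    glue.start_crawler(Name=name)
    # A crawler reports RUNNING, then STOPPING, then READY once the run is over.
    while glue.get_crawler(Name=name)["Crawler"]["State"] != "READY":
        time.sleep(poll_seconds)
    # Reads a single page of results, which is enough for a one-table lab.
    tables = glue.get_tables(DatabaseName=database)["TableList"]
    return [table["Name"] for table in tables]


def run_job(glue, job_name, poll_seconds=15):
    """Start an existing Glue job and poll until it reaches a terminal state."""
    run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
    while True:
        run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        if run["JobRunState"] in TERMINAL_JOB_STATES:
            return run["JobRunState"]
        time.sleep(poll_seconds)


def list_output(s3, bucket, prefix=""):
    """Return the object keys under a prefix in the destination bucket."""
    resp = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [obj["Key"] for obj in resp.get("Contents", [])]
```

A run that returns `"SUCCEEDED"` followed by a non-empty `list_output(...)` result is a reasonable signal that the processed dataset landed in the target location. Any other terminal state is worth checking in the job's run details.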

About the author

Pluralsight Skills gives leaders confidence that they have the skills needed to execute their technology strategy. Technology teams can benchmark expertise across roles, speed up release cycles, and build reliable, secure products. With our expert content, skill assessments, and one-of-a-kind analytics, you can keep up with the pace of change, put the right people on the right projects, and boost productivity. It's the most effective path to developing tech skills at scale.

Real skill practice before real-world application

Hands-on Labs are real environments created by industry experts to help you learn. In these environments you can gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.

Learn by doing

Engage hands-on with the tools and technologies you’re learning. You pick the skill, we provide the credentials and environment.

Follow your guide

All labs have detailed instructions and objectives, guiding you through the learning process and ensuring you understand every step.

Turn time into mastery

On average, you retain 75% more of your learning if you take time to practice. Hands-on labs set you up for success to make those skills stick.
