Perform Feature Engineering Using Amazon SageMaker

Imagine you are a data engineer assigned to preprocess data so that the machine learning engineers can build a highly predictive model. Your data contains both text and numerical features. The numerical features span different ranges, and some text features have a natural order that must be preserved. In this hands-on lab, you will learn how to encode, scale, and bin the data using scikit-learn.

Lab Info
Level: Intermediate
Last updated: Oct 14, 2025
Duration: 45m

Table of Contents
  1. Challenge

    Launch SageMaker Notebook

    Log in to the AWS console and navigate to **Amazon SageMaker**. From there, load the Jupyter Notebook that has been provided with this hands-on lab.

  2. Challenge

    Load Libraries and Prepare the Data
    1. Use the Pandas library and load the data from "Employee_encoding.csv".
    2. Display the top few rows and ensure the data is read successfully.
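
The loading step above can be sketched as follows. This is a minimal illustration, not the lab's notebook: the inline CSV text is a hypothetical stand-in for "Employee_encoding.csv", and its column names (`title`, `gender`, `department`, `salary`, `age`) are assumed from the features named in the later challenges.

```python
import io

import pandas as pd

# Hypothetical stand-in for "Employee_encoding.csv". The real file's
# columns and values may differ; these names are assumptions.
SAMPLE_CSV = """title,gender,department,salary,age
Junior,F,Engineering,52000,24
Senior,M,Sales,88000,41
Manager,F,Engineering,120000,50
Junior,M,HR,48000,29
"""


def load_employees(source):
    """Read employee data from a file path or a file-like object."""
    return pd.read_csv(source)


# In the lab notebook this would be: load_employees("Employee_encoding.csv")
df = load_employees(io.StringIO(SAMPLE_CSV))

# Display the top few rows to confirm the data was read successfully.
print(df.head())
```

Checking `df.shape` and `df.dtypes` alongside `head()` is a quick way to confirm that numeric columns were not read in as text.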
  3. Challenge

    Apply Encoding Techniques
    1. Use **OrdinalEncoder** and encode the title feature.
    2. Check the categories and ensure the encoder's categories follow the required ordering.
    3. Use OneHotEncoder and encode the gender feature.
    4. Use LabelEncoder and encode the department feature.
  4. Challenge

    Apply Scaling Techniques
    1. Use MinMaxScaler and scale the salary feature to values between 0 and 1.
    2. Use the pandas describe function on the scaled data and validate that the values fall between 0 and 1.
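
A minimal sketch of the scaling step, using hypothetical salary values. `MinMaxScaler` itself has no `describe` method, so the validation here calls pandas `describe` on the scaled column.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical salaries; the lab's dataset will have its own values.
df = pd.DataFrame({"salary": [48000, 52000, 88000, 120000]})

# The default feature_range is (0, 1): the minimum maps to 0, the maximum to 1.
scaler = MinMaxScaler()
df["salary_scaled"] = scaler.fit_transform(df[["salary"]]).ravel()

# Summary statistics of the scaled column; min and max should be 0 and 1.
summary = df["salary_scaled"].describe()
print(summary)
```

The fitted scaler keeps the original range in `scaler.data_min_` and `scaler.data_max_`, which helps when the same transformation has to be applied to new data later.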
  5. Challenge

    Apply Binning Techniques
    1. Initialize KBinsDiscretizer and apply the equal-frequency strategy (strategy="quantile" in scikit-learn) to the age feature.
    2. Use matplotlib and plot the binned data.
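
The binning step could look something like the sketch below. The ages, the choice of four bins, and the bar chart of bin counts are all illustrative assumptions; in scikit-learn, equal-frequency binning corresponds to `strategy="quantile"`.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; omit this line in Jupyter
import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# Hypothetical ages, shaped (n_samples, 1) as scikit-learn expects.
ages = np.array([22, 25, 28, 31, 34, 37, 40, 43, 46, 49, 52, 55]).reshape(-1, 1)

# strategy="quantile" puts roughly the same number of samples in each bin;
# encode="ordinal" returns the bin index instead of a one-hot matrix.
n_bins = 4  # assumed bin count for illustration
binner = KBinsDiscretizer(n_bins=n_bins, encode="ordinal", strategy="quantile")
bins = binner.fit_transform(ages).ravel().astype(int)

# Count the samples per bin and plot them as a bar chart.
counts = np.bincount(bins, minlength=n_bins)
fig, ax = plt.subplots()
ax.bar(range(n_bins), counts)
ax.set_xlabel("Age bin index")
ax.set_ylabel("Number of employees")
ax.set_title("Equal-frequency binning of age")

print(binner.bin_edges_[0])  # the learned bin boundaries
print(counts)
```

Bars of similar height indicate that the quantile strategy split the data evenly. With `strategy="uniform"` the bins would instead have equal width and possibly very different counts.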
About the author

Pluralsight Skills gives leaders confidence they have the skills needed to execute technology strategy. Technology teams can benchmark expertise across roles, speed up release cycles and build reliable, secure products. By leveraging our expert content, skill assessments and one-of-a-kind analytics, keep up with the pace of change, put the right people on the right projects and boost productivity. It's the most effective path to developing tech skills at scale.
