Index Objects with Pandas Hands-on Practice

In this lab, you'll master data manipulation and retrieval using DataFrame operations, various indexing methods including datetime and multi-indexing, and advanced categorization techniques.

Lab Info

Level: Beginner
Last updated: Jan 07, 2026
Duration: 30m

Exploring DataFrames and Indexing in Pandas
Jupyter Guide

To get started, open the file on the right titled "Step 1...". Complete each task for Step 1 in that Jupyter Notebook file. Run the cells (Ctrl/Cmd + Enter) for each task before moving on to the next task. When you have completed all tasks in the step, come back and open the file for the next step, and continue until you have completed every step of the lab.

Exploring DataFrames and Indexing in Pandas

To review the concepts covered in this step, please refer to the Introduction to Indexing Objects in Pandas module of the Index Objects with Pandas course.

Understanding the structure of DataFrames and the concept of indexing in Pandas is important because it forms the foundation for data extraction, manipulation, and modification. This step will allow you to practice the basics of indexing and explore a dataset using position-based indexing.

Let's dive into the world of data analysis with Pandas! In this step, you'll get hands-on experience with the basics of DataFrames and indexing in Pandas. You'll use the Learning_Management.csv dataset to practice selecting columns by label, selecting rows by numerical position, and combining the two to extract a subset of the data. The goal is to become familiar with the structure of DataFrames and to understand the role indexing plays in data extraction.

Task 1.1: Importing the Pandas Library

Before you can start working with DataFrames, you need to import the pandas library. Import the pandas library as pd.

🔍 Hint

Use the import keyword followed by the library name and as keyword to give it a short alias. For example, import pandas as pd.
🔑 Solution

import pandas as pd
Task 1.2: Loading the Dataset

Load the Learning_Management.csv file into a DataFrame using pandas. Name the DataFrame df.

🔍 Hint

Use the pd.read_csv() function to read the csv file. Pass the file path as a string to the function. For example, df = pd.read_csv('file_path').
🔑 Solution

df = pd.read_csv('Learning_Management.csv')
Task 1.3: Inspecting the DataFrame

Inspect the first 5 rows of the DataFrame using the head() function.

🔍 Hint

Use the head() function on the DataFrame to view the first 5 rows. For example, df.head().
🔑 Solution

df.head()
Task 1.4: Selecting a Single Column

Select the employee_name column from the DataFrame.

🔍 Hint

Use the column label as an index to select a single column. For example, df['column_name'].
🔑 Solution

df['employee_name']
Task 1.5: Selecting Multiple Columns

Select the employee_name and course_name columns from the DataFrame.

🔍 Hint

Use a list of column labels as an index to select multiple columns. For example, df[['column1', 'column2']].
🔑 Solution

df[['employee_name', 'course_name']]
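The difference between Tasks 1.4 and 1.5 can be sketched on a small stand-in table. The rows and the score column below are hypothetical; only the employee_name and course_name column names come from the lab. A single label should give back a Series, while a list of labels gives back a DataFrame.

```python
import pandas as pd

# Hypothetical sample rows; only the column names mirror the lab dataset.
df = pd.DataFrame({
    "employee_name": ["Ana", "Ben", "Cleo"],
    "course_name": ["Python Basics", "SQL Intro", "Python Basics"],
    "score": [88, 92, 79],
})

single = df["employee_name"]                     # one label -> Series
multiple = df[["employee_name", "course_name"]]  # list of labels -> DataFrame

print(type(single).__name__)    # Series
print(type(multiple).__name__)  # DataFrame
print(multiple.shape)           # (3, 2)
```

Note the double brackets in the second selection: the outer pair is the indexing operation and the inner pair is the Python list of column labels.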
Task 1.6: Selecting Rows Using Index

Select the first 10 rows of the DataFrame using numerical indexing.

🔍 Hint

Use the iloc indexer with a slice to select rows by position. For example, df.iloc[start:end]. The end position is excluded, so df.iloc[:10] returns the rows at positions 0 through 9. To start at the beginning, leave start empty and include only end. For example, df.iloc[:end].
🔑 Solution

df.iloc[:10]
Task 1.7: Selecting Subsets of Data

Select the employee_name and course_name columns for the first 10 rows of the DataFrame.

🔍 Hint

Use the iloc property with a slice for rows and a list of column labels for columns. For example, df.iloc[:10][column list].
🔑 Solution

df.iloc[:10][['employee_name', 'course_name']]
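As a rough illustration of the subset selection in Task 1.7, the sketch below compares the chained form from the solution with a single-step iloc call. The 12 rows and the score column are made up for the example, and the single-step variant is an alternative, not the lab's prescribed answer.

```python
import pandas as pd

# Hypothetical data: 12 rows, so taking the first 10 is visible.
df = pd.DataFrame({
    "employee_name": [f"Employee {i}" for i in range(12)],
    "course_name": [f"Course {i % 3}" for i in range(12)],
    "score": list(range(60, 72)),
})

# Chained form used in the task: rows by position, then columns by label.
chained = df.iloc[:10][["employee_name", "course_name"]]

# Single-step alternative: translate labels to positions, then call iloc once.
col_positions = df.columns.get_indexer(["employee_name", "course_name"])
single_step = df.iloc[:10, col_positions]

print(chained.shape)                # (10, 2)
print(chained.equals(single_step))  # True
```

Both forms return the same data. The single-step form can be preferable when you later assign into the selection, because it avoids chained indexing.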
Working with Time Series Data in Pandas

To review the concepts covered in this step, please refer to the Pandas Index Objects for Time Series Data module of the Index Objects with Pandas course.

Understanding how to use datetime and timedelta indexing in Pandas is important because it allows for efficient handling and manipulation of time-series data. This step will provide you with the opportunity to practice these concepts using a real-world dataset.

Time to tackle time-series data! In this step, you'll explore how to use datetime and timedelta indexing in Pandas to manipulate and extract data. Using the completion_date column from the Learning_Management.csv dataset, you'll practice creating a datetime index, extracting data for specific time periods, and performing basic operations on pandas built-in datetime objects. The goal is to get comfortable with handling time-series data in Pandas.

Task 2.1: Load the Dataset

Start by loading the Learning_Management.csv dataset into a pandas DataFrame. Name the DataFrame df.

After loading the data, display the head of the DataFrame to view the first few rows.

🔍 Hint

Use the pd.read_csv() function to load the dataset. The file path is 'Learning_Management.csv'.
🔑 Solution

import pandas as pd

df = pd.read_csv('Learning_Management.csv')
df.head()
Task 2.2: Convert the 'completion_date' Column to Datetime

Convert the 'completion_date' column in the DataFrame to a datetime data type. This will allow you to perform time-series operations on the data.

After converting, print the data type of the column again to see the change.

🔍 Hint

Use the pd.to_datetime() function to convert the 'completion_date' column to datetime. Make sure to assign the result back to the 'completion_date' column in the DataFrame.
🔑 Solution

# Provided code
print(df['completion_date'].dtype)

# Convert the 'completion_date' column to datetime and print column dtype
df['completion_date'] = pd.to_datetime(df['completion_date'])
print(df['completion_date'].dtype)
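To see what the conversion buys you, here is a minimal sketch on a hypothetical three-row frame (the names and dates are invented; the completion_date column name is the lab's). Once the column holds datetimes, the .dt accessor exposes components such as the month.

```python
import pandas as pd

# Hypothetical rows; dates start out as plain strings, as they would after read_csv.
df = pd.DataFrame({
    "employee_name": ["Ana", "Ben", "Cleo"],
    "completion_date": ["2022-05-03", "2022-05-17", "2022-06-01"],
})

before = df["completion_date"].dtype
df["completion_date"] = pd.to_datetime(df["completion_date"])
after = df["completion_date"].dtype

print(before)  # a string/object dtype; the exact name depends on the pandas version
print(after)   # a datetime64 dtype

# Datetime columns expose date parts through the .dt accessor.
months = df["completion_date"].dt.month.tolist()
print(months)  # [5, 5, 6]
```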
Task 2.3: Set the 'completion_date' Column as the DataFrame Index

Set the 'completion_date' column as the index of the DataFrame. This will allow you to use datetime indexing to select data based on the completion date.

After setting the index, display the head of the DataFrame to see the changes.

🔍 Hint

Use the df.set_index() method to set the 'completion_date' column as the index. Make sure to assign the result back to df.
🔑 Solution

df = df.set_index('completion_date')
df.head()
Task 2.4: Select Data for a Specific Time Period

Select all rows in the DataFrame where the completion date is in May 2022.

After selecting, display the selected data to verify the result.

🔍 Hint

Use the df.loc[] indexer to select data for May 2022. The syntax for selecting a specific month is 'YYYY-MM'.
🔑 Solution

may_2022_data = df.loc['2022-05']
may_2022_data
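The month lookup in Task 2.4 relies on partial string indexing against a datetime index. The sketch below shows the idea on four invented rows; the slice across two months is an extra illustration that is not part of the lab task.

```python
import pandas as pd

# Hypothetical rows with a datetime index, sorted by date.
df = pd.DataFrame({
    "employee_name": ["Ana", "Ben", "Cleo", "Dev"],
    "completion_date": pd.to_datetime(
        ["2022-04-28", "2022-05-03", "2022-05-17", "2022-06-01"]
    ),
}).set_index("completion_date")

# A 'YYYY-MM' string matches every timestamp that falls inside that month.
may_2022 = df.loc["2022-05"]

# A slice of partial strings includes both endpoints (all of April through all of May).
april_to_may = df.loc["2022-04":"2022-05"]

print(may_2022["employee_name"].tolist())  # ['Ben', 'Cleo']
print(len(april_to_may))                   # 3
```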
Task 2.5: Perform a Basic Operation on a Datetime Object

Calculate the number of days between the earliest and latest completion dates in the DataFrame.

Display the result to see the number of days.

🔍 Hint

Use the df.index.min() and df.index.max() methods to get the earliest and latest completion dates, respectively. Subtracting the earliest date from the latest date returns a pandas Timedelta; its days attribute gives the number of whole days between them.
🔑 Solution

date_span = df.index.max() - df.index.min()
num_days = date_span.days
num_days
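The date arithmetic in Task 2.5 can be checked by hand on a tiny index. The three dates below are invented for illustration; the sketch shows that subtracting two timestamps yields a Timedelta, and that a Timedelta can in turn be added to a timestamp.

```python
import pandas as pd

# Hypothetical completion dates.
index = pd.DatetimeIndex(["2022-01-15", "2022-05-03", "2022-12-31"])

span = index.max() - index.min()  # Timestamp - Timestamp -> Timedelta
print(type(span).__name__)        # Timedelta
print(span.days)                  # 350

# Timedelta objects also shift timestamps forward or backward.
shifted = index.min() + pd.Timedelta(days=30)
print(shifted.date())             # 2022-02-14
```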
Interval, Categorical, and Period Indexing in Pandas

To review the concepts covered in this step, please refer to the Interval, Categorical, and Period Indexing in Pandas module of the Index Objects with Pandas course.

Understanding how to create and use interval, categorical, and period indices in Pandas is important because these indexing techniques enable advanced data extraction from a DataFrame. This step will allow you to practice creating these indices and using them to extract data from a DataFrame.

Ready to level up your indexing skills? In this step, you'll delve into interval, categorical, and period indexing in Pandas. You'll create each kind of index from sample data, attach it to a DataFrame, and use it to look up rows by interval membership, category label, and time period.

Task 3.1: Create an Interval Index

Create an interval index based on a range of values. Use the pd.cut() function to divide the range 0 to 100 into 5 equal intervals. This method helps in binning or bucketing the data.

🔍 Hint

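Use the pd.cut() function with a range of values (e.g., range(0, 101)) as the first argument and 5 as the second argument to bin the values into 5 equal-width intervals. For list-like input, pd.cut() returns a Categorical whose categories form an IntervalIndex; it can be passed as the index when creating a DataFrame, so that each row is labeled with the interval its value falls into.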
🔑 Solution

# Provided Code
import pandas as pd
import numpy as np

interval_index = pd.cut(range(0, 101), 5)
Task 3.2: Create a DataFrame Using Interval Index

Use the interval index created in Task 3.1 to create a DataFrame with random data. The DataFrame should have 101 rows and 2 columns. Use the provided code to create the random data.

🔍 Hint

Use pd.DataFrame() with np.random.randn(101, 2) to create random data. Use the interval index created in Task 3.1 as the index of the DataFrame.
🔑 Solution

# Provided code
random_data = np.random.randn(101, 2)

df_interval = pd.DataFrame(random_data, index=interval_index, columns=['A', 'B'])
Task 3.3: Index into the Interval Indexed DataFrame

Select the rows from the DataFrame created in Task 3.2 where the interval index includes the value 42.

🔍 Hint

To index into the DataFrame, use the indexer df_interval.loc[] with the specific value (e.g., 42) you want to find within the intervals.
🔑 Solution

df_interval.loc[42]
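To make the interval lookup easier to inspect, the sketch below first checks what pd.cut() returns, then builds a small index explicitly with pd.IntervalIndex.from_breaks. The explicit five-row frame and its label column are assumptions made for illustration, not the lab's data; np.arange stands in for the lab's range(0, 101).

```python
import numpy as np
import pandas as pd

# pd.cut on array-like input returns a Categorical with interval categories.
binned = pd.cut(np.arange(0, 101), 5)
print(type(binned).__name__)             # Categorical
print(type(binned.categories).__name__)  # IntervalIndex
print(len(binned.categories))            # 5

# Hypothetical frame with one row per interval, built from explicit break points.
# from_breaks closes intervals on the right by default: (0, 20], (20, 40], ...
breaks_index = pd.IntervalIndex.from_breaks([0, 20, 40, 60, 80, 100])
df_bins = pd.DataFrame({"label": ["a", "b", "c", "d", "e"]}, index=breaks_index)

# A scalar passed to .loc selects the interval that contains it.
row = df_bins.loc[42]
print(row["label"])  # c
```

With one row per interval, .loc[42] returns a single row. In the lab's DataFrame, many rows share each interval label, so the same lookup returns every row in the matching bin.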
Task 3.4: Create a Categorical Index

Create a categorical index using a list of 4 categories. Categorical data is a Pandas data type corresponding to categorical variables in statistics.

🔍 Hint

Create a list of categories and pass it to pd.Categorical(), which builds an array-like object representing a categorical variable. When this object is used as the index of a DataFrame, pandas stores it as a CategoricalIndex.
🔑 Solution

categories = ['Category1', 'Category2', 'Category3', 'Category4']
categorical_index = pd.Categorical(categories)
Task 3.5: Create a DataFrame Using Categorical Index

Use the categorical index created in Task 3.4 to create a DataFrame with random data. The DataFrame should have 4 rows and 2 columns. Use the provided code to create the random data.

🔍 Hint

Use pd.DataFrame() with np.random.randn(4, 2) to create random data. Use the categorical index created in Task 3.4 as the index of the DataFrame.
🔑 Solution

# Provided code
random_data = np.random.randn(4, 2)

df_categorical = pd.DataFrame(random_data, index=categorical_index, columns=['A', 'B'])
Task 3.6: Index into the Categorical Indexed DataFrame

Select the row from the DataFrame created in Task 3.5 that corresponds to your second category.

🔍 Hint

To index into the DataFrame, use the indexer df_categorical.loc[] with the specific category (e.g., 'Category2') you want to access.
🔑 Solution

df_categorical.loc['Category2']
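The categorical tasks can be traced with fixed values instead of random ones. In this sketch the column values are invented so the result is predictable; it shows that a Categorical passed as the index becomes a CategoricalIndex and that rows are then addressed by category label.

```python
import pandas as pd

categories = ["Category1", "Category2", "Category3", "Category4"]
categorical = pd.Categorical(categories)

# Each value is stored as an integer code pointing into the category list.
print(categorical.codes.tolist())  # [0, 1, 2, 3]

# Hypothetical fixed values in place of the lab's random data.
df_cat = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0]}, index=categorical)

print(type(df_cat.index).__name__)   # CategoricalIndex
print(df_cat.loc["Category2", "A"])  # 2.0
```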
Task 3.7: Create a Period Index

Create a period index representing each month in 2023. Period indices are useful for time series data that needs to be aggregated or indexed by a particular time period.

🔍 Hint

Use the pd.period_range() function to create a period index that represents each month in a year. The first argument should be in the format 'YYYY-MM', followed by the keyword arguments periods=12 and freq='M'. This function returns a PeriodIndex which can be used to index data in a DataFrame.
🔑 Solution

period_index = pd.period_range('2023-01', periods=12, freq='M')
Task 3.8: Create a DataFrame Using Period Index

Use the period index created in Task 3.7 to create a DataFrame with random data. The DataFrame should have 12 rows and 2 columns.

🔍 Hint

Use pd.DataFrame() with np.random.randn(12, 2) to create random data. Use the period index created in Task 3.7 as the index of the DataFrame.
🔑 Solution

# Provided code
random_data = np.random.randn(12, 2)

df_period = pd.DataFrame(random_data, index=period_index, columns=['A', 'B'])
Task 3.9: Index into the Period Indexed DataFrame

Select the row from the DataFrame created in Task 3.8 that corresponds to the month '2023-05'.

🔍 Hint

To index into the DataFrame, use the indexer df_period.loc[] with the specific period (e.g., '2023-05') you want to access.
🔑 Solution

df_period.loc['2023-05']
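A period-indexed lookup can be sketched with fixed values so the selected rows are easy to verify. The column A below simply holds the month number, which is an assumption made for the example; the slice across a quarter is an extra illustration beyond the lab task.

```python
import pandas as pd

period_index = pd.period_range("2023-01", periods=12, freq="M")

# Hypothetical values: column A holds the month number of each period.
df_period = pd.DataFrame({"A": list(range(1, 13))}, index=period_index)

may = df_period.loc["2023-05"]           # a single monthly period -> one row
q2 = df_period.loc["2023-04":"2023-06"]  # slice; both endpoints inclusive

print(may["A"])           # 5
print(q2["A"].tolist())   # [4, 5, 6]
```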
Multi-indexing in Pandas

To review the concepts covered in this step, please refer to the Multi-indexing in Pandas module of the Index Objects with Pandas course.

Understanding how to create and use a MultiIndex in Pandas is important because it allows for efficient organization and retrieval of hierarchical data. This step will provide you with the opportunity to practice creating a MultiIndex and using it to retrieve data at different hierarchy levels.

Let's dive into the world of multi-indexing! In this step, you'll learn how to create a MultiIndex for hierarchical data organization in a DataFrame. Using the Learning_Management.csv dataset, you'll practice creating a MultiIndex and using it to retrieve data at different hierarchy levels. The goal is to understand the benefits of using MultiIndexing in pandas for hierarchical data organization and efficient data retrieval.

Task 4.1: Importing Pandas

Before we start working with the data, we need to import pandas. In this task, import the pandas library which will be used throughout this step.

🔍 Hint

Use the import keyword to import pandas. It's common to import pandas as pd.
🔑 Solution

import pandas as pd
Task 4.2: Loading the Dataset

Now that we have imported pandas, let's load the dataset. The dataset is stored in a CSV file named 'Learning_Management.csv'.

🔍 Hint

Use the read_csv function from pandas to load the dataset. The file path is 'Learning_Management.csv'.
🔑 Solution

df = pd.read_csv('Learning_Management.csv')
Task 4.3: Creating a MultiIndex

Now that we have loaded the dataset, let's create a MultiIndex. We will use the 'employee_id' and 'course_id' columns as our index. This will allow us to organize our data hierarchically.

After creating the MultiIndex, display the head of the DataFrame to visualize the change.

🔍 Hint

Use the set_index function on the dataframe and pass in a list of column names ['employee_id', 'course_id'] to create a MultiIndex. Then use df.head() to display the first few rows of the DataFrame.
🔑 Solution

df.set_index(['employee_id', 'course_id'], inplace=True)
df.head()

# or
# df = df.set_index(['employee_id', 'course_id'])
# df.head()
Task 4.4: Retrieving Data Using MultiIndex

With the MultiIndex created, your next task is to retrieve data for a specific course. Find all employees who completed the course with course_id 'C002'.

🔍 Hint

Use the xs (cross-section) function on the DataFrame to retrieve data for a specific 'course_id'. You'll need to specify the course id (e.g., 'C002') and the level ('course_id') at which to perform the cross-section.
🔑 Solution

df.xs('C002', level='course_id')
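Retrieval at different hierarchy levels can be sketched on a small stand-in frame. The IDs and the score column below are invented; only the employee_id and course_id column names come from the lab. The sketch compares a cross-section on the inner level with .loc lookups on the outer level and on a full key.

```python
import pandas as pd

# Hypothetical rows; employee_id is the outer level, course_id the inner level.
df = pd.DataFrame({
    "employee_id": ["E001", "E001", "E002", "E003"],
    "course_id": ["C001", "C002", "C002", "C003"],
    "score": [88, 92, 79, 85],
}).set_index(["employee_id", "course_id"])

# Inner level: xs selects by course and drops that level from the result.
by_course = df.xs("C002", level="course_id")

# Outer level: a single label passed to .loc selects all rows for that employee.
by_employee = df.loc["E001"]

# Full key: a tuple addresses exactly one row.
one_row = df.loc[("E002", "C002")]

print(by_course.index.tolist())    # ['E001', 'E002']
print(by_employee.index.tolist())  # ['C001', 'C002']
print(one_row["score"])            # 79
```

The xs method is convenient for inner levels because .loc with a single label only matches the outermost level.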

