
Plotting Data with Pandas Hands-on Practice
In this lab, you'll explore data visualization in Jupyter Notebook using Pandas and Matplotlib. You'll create line and scatter plots to understand trends and relationships, a bar plot to compare literacy rates, and a histogram for population distribution. The lab concludes with an area chart showing GDP changes over time. This hands-on lab enhances your skills in visualizing and interpreting data, crucial for effective data analysis.

Jupyter Guide
To get started, open the file on the right entitled "Step 1...". You'll complete each task for Step 1 in that Jupyter Notebook file. Remember, you must run the cells (Ctrl/Cmd + Enter) for each task before moving on to the next task in the Jupyter Notebook. Continue until you have completed all tasks in this step. Then, when you are ready to move on to the next step, come back and click on the file for the next step, until you have completed all tasks in all steps of the lab.
Creating a Basic Line Plot
To review the concepts covered in this step, please refer to the Introducing the Plotting Ecosystem module of the Plotting Data with Pandas course.
A line plot is one of the most common plot types and is often used to show trends over time. Let's get our toes wet and create one! In this step, you will create some data points as lists of integers and plot them as a line graph using the matplotlib.pyplot.plot function.
Task 1.1: Importing Required Libraries
Before we can start plotting, we need to import the necessary libraries. In this task, you will import matplotlib's pyplot for plotting.
Hint
Use the import keyword to import matplotlib's pyplot module, matplotlib.pyplot. Remember to use the as keyword to give it an alias.
Solution
import matplotlib.pyplot as plt
Task 1.2: Creating Sample Data
In this task, you will create some sample data for plotting. Let's define two lists, x representing the x-axis values and y representing the y-axis values.
Hint
You can create lists in Python using square brackets []. For example, x = [1, 2, 3, 4, 5].
Solution
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
Task 1.3: Creating a Line Plot
Now that we have our data, let's create a line plot. In this task, you will create a line plot using the sample data we defined in the previous task.
Hint
Use the plt.plot() function to create a line plot. Pass the x and y lists to this function.
Solution
plt.plot(x, y)
plt.show()
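As an optional extension of the line plot above, the following sketch adds point markers, axis labels, and a title. The label wording and the marker style are illustrative choices, not something the lab prescribes.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

# plt.plot returns a list with one Line2D object per plotted line.
lines = plt.plot(x, y, marker="o")

# Illustrative labels (assumed wording, not part of the lab).
plt.xlabel("x")
plt.ylabel("y = x squared")
plt.title("A basic line plot")
plt.show()
```

Keeping the return value of plt.plot is optional, but it makes it easy to check or restyle the line later.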
Creating a Scatter Plot
To review the concepts covered in this step, please refer to the Plotting Using Pandas module of the Plotting Data with Pandas course.
Let's move on to another type of plot - the scatter plot! Creating a scatter plot is important because it can be used to visualize the relationship between two variables. In this step, you will use pandas to create a scatter plot.
The goal of this step is to create a scatter plot that visualizes the relationship between a country's GDP and its life expectancy. You will use the 'GDP' column as the x-axis and the 'Life_Expectancy' column as the y-axis.
Task 2.1: Import Necessary Libraries
Import the pandas and matplotlib libraries, which will be used for data manipulation and visualization respectively.
Hint
Use the import keyword to import pandas as pd and matplotlib.pyplot as plt.
Solution
import pandas as pd
import matplotlib.pyplot as plt
Task 2.2: Load the Data
Load the 'World_Bank-Fictional.csv' file into a pandas DataFrame.
Hint
Use the pd.read_csv() function to read the CSV file. The file path is 'World_Bank-Fictional.csv'.
Solution
df = pd.read_csv('World_Bank-Fictional.csv')
Task 2.3: Inspect the Data
Inspect the first 5 rows of the DataFrame to understand its structure and contents.
Hint
Use the head() method on the DataFrame to inspect the first 5 rows.
Solution
df.head()
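If you want to try the inspection step without the lab's CSV file, a small stand-in DataFrame works too. In the sketch below, the column names come from this lab, but every value and both country names are made up for illustration. It also prints shape and dtypes, which are not required by the task but can complement head().

```python
import pandas as pd

# Made-up rows standing in for 'World_Bank-Fictional.csv' (values are invented).
df = pd.DataFrame({
    "Country": ["Country A"] * 3 + ["Country B"] * 3,
    "Year": [2000, 2010, 2020, 2000, 2010, 2020],
    "GDP": [1.0, 1.5, 2.1, 0.8, 1.1, 1.9],
    "Life_Expectancy": [70.1, 72.4, 74.0, 65.3, 68.9, 71.2],
})

first_rows = df.head()  # head() returns the first 5 rows by default
print(first_rows)
print(df.shape)         # (number of rows, number of columns)
print(df.dtypes)        # data type of each column
```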
Task 2.4: Create a Scatter Plot
Create a scatter plot using the 'GDP' column as the x-axis and the 'Life_Expectancy' column as the y-axis.
Hint
Use the plot() method on the DataFrame. Set kind='scatter', x='GDP', and y='Life_Expectancy'.
Solution
df.plot(kind='scatter', x='GDP', y='Life_Expectancy')
plt.show()
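The same scatter plot call can be tried on a tiny stand-in DataFrame. This is only a sketch: the numbers below are invented, and the title is an illustrative addition. It relies on the fact that DataFrame.plot returns a matplotlib Axes object, which you can keep to customize the chart.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Invented values; only the column names match the lab's dataset.
df = pd.DataFrame({
    "GDP": [0.8, 1.0, 1.1, 1.5, 1.9, 2.1],
    "Life_Expectancy": [65.3, 70.1, 68.9, 72.4, 71.2, 74.0],
})

# Each row becomes one point: GDP on the x-axis, Life_Expectancy on the y-axis.
ax = df.plot(kind="scatter", x="GDP", y="Life_Expectancy")
ax.set_title("Life expectancy vs. GDP (made-up data)")
plt.show()
```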
Creating a Bar Plot
To review the concepts covered in this step, please refer to the Plotting Using Pandas module of the Plotting Data with Pandas course.
Next up is the bar plot! Creating a bar plot is important because it can be used to compare the values of different categories. In this step, you will create a bar chart in pandas.
The goal of this step is to create a grouped bar chart that compares the literacy rates of different countries over time. You will put the 'Year' values on the x-axis and the 'Literacy_Rate' values on the y-axis, with one bar per country in each year. Don't forget to add a title and labels to your chart!
Task 3.1: Import Necessary Libraries
Import the pandas and matplotlib libraries which will be used for data manipulation and plotting respectively.
Hint
Use the import keyword to import pandas as pd and matplotlib.pyplot as plt.
Solution
import pandas as pd
import matplotlib.pyplot as plt
Task 3.2: Load the Data
Load the 'World_Bank-Fictional.csv' file into a pandas DataFrame.
Hint
Use the pd.read_csv() function to read the CSV file. Assign the DataFrame to a variable named df.
Solution
df = pd.read_csv('World_Bank-Fictional.csv')
Task 3.3: Inspect the Data
Inspect the first 5 rows of the DataFrame to understand the structure of the data.
Hint
Use the head() method on the DataFrame df to display the first 5 rows.
Solution
df.head()
Task 3.4: Create a Grouped Bar Plot
Create a grouped bar plot where each year is a group and, within each group, there is one bar per country. Use the provided code to create the pivot table we will plot, then call the plot(kind='bar') method on the pivot_df object. The x-axis should represent the years, and the y-axis should be the 'Literacy_Rate'. Add a title and labels to your chart.
Hint
In the provided code, we use the pivot_table() method to reshape the DataFrame so that 'Year' becomes the index, each 'Country' becomes a column, and the 'Literacy_Rate' values fill the cells. Inspect this DataFrame using head() if you want to understand this further. Then use pivot_df.plot() to create a grouped bar chart. Use plt.title(), plt.xlabel(), and plt.ylabel() to add a title and labels.
Solution
# Provided code
pivot_df = df.pivot_table(index='Year', columns='Country', values='Literacy_Rate')
# Create bar plot
pivot_df.plot(kind='bar', edgecolor='black')
plt.title('Literacy Rate by Country Over Years')
plt.xlabel('Year')
plt.ylabel('Literacy Rate')
plt.show()
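To see what the pivot step does before plotting, it can help to run it on a few rows you can check by eye. The sketch below uses invented literacy rates and placeholder country names. It shows the long table turning into one row per year and one column per country, which is the shape that produces one group of bars per year.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Invented long-format data: one row per (country, year) pair.
df = pd.DataFrame({
    "Country": ["Country A", "Country B"] * 3,
    "Year": [2000, 2000, 2010, 2010, 2020, 2020],
    "Literacy_Rate": [80.0, 60.0, 85.0, 70.0, 90.0, 78.0],
})

# Wide format: 'Year' is the index, each country is a column.
pivot_df = df.pivot_table(index="Year", columns="Country", values="Literacy_Rate")
print(pivot_df)

# Each column is drawn as its own series of bars, grouped by index value (year).
ax = pivot_df.plot(kind="bar", edgecolor="black")
ax.set_title("Literacy Rate by Country Over Years")
ax.set_xlabel("Year")
ax.set_ylabel("Literacy Rate")
plt.show()
```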
Creating a Histogram
To review the concepts covered in this step, please refer to the Plotting Using Pandas module of the Plotting Data with Pandas course.
Now let's visualize some frequency data with a histogram! Creating a histogram is important because it can be used to visualize the distribution of a dataset. In this step, you will generate a histogram using Pandas.
The goal of this step is to create a histogram that shows the distribution of the 'Population' column. A histogram groups values into bins and shows how many values fall into each bin, which makes the shape of the population data easy to see.
Task 4.1: Import Necessary Libraries
Import the necessary libraries for data manipulation and visualization. In this case, you will need pandas and matplotlib.
Hint
Use the import keyword to import pandas and matplotlib.pyplot. Remember to use pd and plt as aliases for convenience.
Solution
import pandas as pd
import matplotlib.pyplot as plt
Task 4.2: Load the Dataset
Load the 'World_Bank-Fictional.csv' dataset using pandas and assign it to a variable named df.
Hint
Use the pd.read_csv() function to read the CSV file. The file path is 'World_Bank-Fictional.csv'.
Solution
df = pd.read_csv('World_Bank-Fictional.csv')
Task 4.3: Inspect the Dataset
Inspect the first 5 rows of the dataset using the head() method.
Hint
Use the head() method on the DataFrame df to inspect the first 5 rows.
Solution
df.head()
Task 4.4: Plot the Histogram
Plot a histogram of the 'Population' column using pandas. Add a title to your plot.
Hint
Use the Series.plot() method from pandas on the 'Population' column of the DataFrame df. Use the kind='hist' argument. Don't forget to use plt.show() to display the plot.
Solution
df['Population'].plot(kind="hist", title="Distribution of Populations")
plt.show()
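The number of bins strongly affects how a histogram reads. As a hedged illustration, the sketch below passes the bins argument explicitly (pandas uses 10 bins when it is omitted). The population values, the unit, and the x-axis label are invented for this example.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Invented population values (in millions) for eight hypothetical countries.
population = pd.Series([2, 3, 3, 5, 8, 12, 20, 45], name="Population")

# bins=4 splits the value range into four equal-width intervals.
ax = population.plot(kind="hist", bins=4, title="Distribution of Populations")
ax.set_xlabel("Population (millions)")
plt.show()

# The bar heights are counts, so together they account for every value.
total_count = sum(bar.get_height() for bar in ax.patches)
print(total_count)
```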
Creating an Area Chart
To review the concepts covered in this step, please refer to the Plotting Using Pandas module of the Plotting Data with Pandas course.
Last but not least, let's create an area chart! Area charts are useful because they can be used to visualize the change in one or more quantities over time. In this step, you will create an area chart in pandas.
The goal of this step is to create a stacked area plot that visualizes the change in GDP over time for each country. You will use the 'Year' column as the x-axis and the 'GDP' column, pivoted on the 'Country' column, as the y-axis.
Task 5.1: Import Necessary Libraries
Import the pandas and matplotlib libraries which will be used throughout this lab.
Hint
Use the import keyword to import pandas and matplotlib.pyplot. Remember to use the as keyword to give them aliases.
Solution
import pandas as pd
import matplotlib.pyplot as plt
Task 5.2: Load the Data
Load the 'World_Bank-Fictional.csv' file into a pandas DataFrame.
Hint
Use the pd.read_csv() function to read the CSV file. The file path is 'World_Bank-Fictional.csv'.
Solution
df = pd.read_csv('World_Bank-Fictional.csv')
Task 5.3: Inspect the Data
Inspect the first 5 rows of the DataFrame to understand its structure.
Hint
Use the head() method on the DataFrame to inspect the first 5 rows.
Solution
df.head()
Task 5.4: Pivot the Data
Pivot the DataFrame so that each country's GDP is a separate column. Display the first 5 rows of the pivoted data.
Hint
Use the pivot() method on the DataFrame. Set 'Year' as the index argument, 'Country' as the columns argument, and 'GDP' as the values argument. Inspect the first 5 rows of the pivoted DataFrame using head().
Solution
df_pivot = df.pivot(index='Year', columns='Country', values='GDP')
df_pivot.head()
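This step uses pivot(), while the bar plot step used pivot_table(). One practical difference is that pivot() raises a ValueError when the same index and column pair appears more than once, whereas pivot_table() aggregates the duplicates (using the mean by default). The sketch below demonstrates this on invented data that deliberately contains a duplicate; whether the lab's CSV has duplicates is not stated here.

```python
import pandas as pd

# Invented data: "Country A" appears twice for the year 2000.
df = pd.DataFrame({
    "Country": ["Country A", "Country A", "Country B"],
    "Year": [2000, 2000, 2000],
    "GDP": [1.0, 3.0, 0.8],
})

# pivot() cannot decide which of the two values to keep, so it raises.
try:
    df.pivot(index="Year", columns="Country", values="GDP")
    pivot_failed = False
except ValueError as err:
    pivot_failed = True
    print("pivot() failed:", err)

# pivot_table() averages the duplicate entries instead.
df_pivot = df.pivot_table(index="Year", columns="Country", values="GDP")
print(df_pivot)
```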
Task 5.5: Plot the Data
Create a stacked area plot of the pivoted DataFrame.
Hint
Use the plot() method on the pivoted DataFrame. Set kind='area'. To create a stacked area plot, set the stacked parameter to True.
Solution
df_pivot.plot(kind='area', stacked=True)
plt.title('GDP Over Time')
plt.ylabel('GDP')
plt.show()
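In a stacked area plot, each country's band sits on top of the previous one, so the top edge of the chart traces the combined total for each year. The sketch below builds an already pivoted DataFrame from invented GDP figures and placeholder country names, then computes those totals alongside the plot. Note that pandas requires each column to be all positive or all negative when stacked=True.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Invented, already pivoted data: one row per year, one column per country.
df_pivot = pd.DataFrame(
    {"Country A": [1.0, 1.5, 2.1], "Country B": [0.8, 1.1, 1.9]},
    index=pd.Index([2000, 2010, 2020], name="Year"),
)

ax = df_pivot.plot(kind="area", stacked=True)
ax.set_title("GDP Over Time")
ax.set_ylabel("GDP")
plt.show()

# The top edge of the stacked chart corresponds to the row-wise totals.
totals = df_pivot.sum(axis=1)
print(totals)
```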