Plotting Data with Pandas Hands-on Practice

In this lab, you'll explore data visualization in Jupyter Notebook using Pandas and Matplotlib. You'll create line and scatter plots to understand trends and relationships, a bar plot to compare literacy rates, and a histogram for population distribution. The lab concludes with an area chart showing GDP changes over time. This hands-on lab enhances your skills in visualizing and interpreting data, crucial for effective data analysis.

Path Info

Level: Beginner
Duration: 23m
Published: Dec 06, 2023

Table of Contents

  1. Challenge

    Creating a Basic Line Plot

    Jupyter Guide

    To get started, open the file on the right entitled "Step 1...". You'll complete each task for Step 1 in that Jupyter Notebook file. Remember to run the cell for each task (Ctrl/Cmd(⌘) + Enter) before moving on to the next task in the notebook. Continue until you have completed all tasks in this step. When you are ready for the next step, come back and click on the file for that step, and repeat until you have completed all tasks in all steps of the lab.


    Creating a Basic Line Plot

    To review the concepts covered in this step, please refer to the Introducing the Plotting Ecosystem module of the Plotting Data with Pandas course.

    Creating a basic line plot is important because it is one of the most common types of plots and can be used to represent trends over time. In this step, you will use the plot function to create a line graph.

    Let's get our feet wet and create a basic line plot! The goal of this step is to use the plot function to create a line graph that visualizes some basic data points. You will define the data points as lists of integers and plot them using the matplotlib.pyplot.plot function.


    Task 1.1: Importing Required Libraries

    Before we can start plotting, we need to import the necessary libraries. In this task, you will import matplotlib's pyplot for plotting.

    🔍 Hint

    Use the import keyword to import matplotlib.pyplot, and use the as keyword to give it the alias plt.

    🔑 Solution
    import matplotlib.pyplot as plt
    

    Task 1.2: Creating Sample Data

    In this task, you will create some sample data for plotting. Let's define two lists, x representing the x-axis values and y representing the y-axis values.

    🔍 Hint

    You can create lists in Python using square brackets []. For example, x = [1, 2, 3, 4, 5].

    🔑 Solution
    x = [1, 2, 3, 4, 5]
    y = [1, 4, 9, 16, 25]
    

    Task 1.3: Creating a Line Plot

    Now that we have our data, let's create a line plot. In this task, you will create a line plot using the sample data we defined in the previous task.

    🔍 Hint

    Use the plt.plot() function to create a line plot. Pass the x and y lists to this function.

    🔑 Solution
    plt.plot(x, y)
    plt.show()
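The same line plot can also be built through matplotlib's object-oriented interface, which makes it easier to inspect what plot returns. The following is a minimal sketch, not part of the lab's solution; it assumes a headless backend (Agg) so that it also runs outside Jupyter, and the axis labels and title are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs outside Jupyter
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

fig, ax = plt.subplots()
lines = ax.plot(x, y, marker="o")  # plot returns a list of Line2D objects
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Basic line plot")
```

In a notebook, plt.plot(x, y) followed by plt.show() draws the same line; the explicit fig and ax objects are mainly useful once a figure needs more customization.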
    
  2. Challenge

    Creating a Scatter Plot

    To review the concepts covered in this step, please refer to the Plotting Using Pandas module of the Plotting Data with Pandas course.

    Let's move on to another type of plot - the scatter plot! Creating a scatter plot is important because it can be used to visualize the relationship between two variables. In this step, you will use pandas to create a scatter plot.

    The goal of this step is to create a scatter plot that visualizes the relationship between a country's GDP and its life expectancy. You will use the 'GDP' column as the x-axis and the 'Life_Expectancy' column as the y-axis.


    Task 2.1: Import Necessary Libraries

    Import the pandas and matplotlib libraries, which will be used for data manipulation and visualization respectively.

    🔍 Hint

    Use the import keyword to import pandas as pd and matplotlib.pyplot as plt.

    🔑 Solution
    import pandas as pd
    import matplotlib.pyplot as plt
    

    Task 2.2: Load the Data

    Load the 'World_Bank-Fictional.csv' file into a pandas DataFrame.

    🔍 Hint

    Use the pd.read_csv() function to read the csv file. The file path is 'World_Bank-Fictional.csv'.

    🔑 Solution
    df = pd.read_csv('World_Bank-Fictional.csv')
    

    Task 2.3: Inspect the Data

    Inspect the first 5 rows of the DataFrame to understand its structure and contents.

    🔍 Hint

    Use the head() function on the DataFrame to inspect the first 5 rows.

    🔑 Solution
    df.head()
    

    Task 2.4: Create a Scatter Plot

    Create a scatter plot using the 'GDP' column as the x-axis and the 'Life_Expectancy' column as the y-axis.

    🔍 Hint

    Call the plot() method on the DataFrame with kind='scatter', x='GDP', and y='Life_Expectancy'.

    🔑 Solution
    df.plot(kind='scatter', x='GDP', y='Life_Expectancy')
    plt.show()
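The lab's CSV file is only available inside the lab environment, so the sketch below uses a small made-up DataFrame with the same two column names. It is an illustration under those assumptions, not the lab's data: the values are invented, the Agg backend is chosen so the code runs without a display, and df.plot.scatter is the accessor form of df.plot(kind='scatter', ...).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch also runs outside Jupyter
import pandas as pd

# Made-up values for illustration only; not taken from the lab's CSV file
df = pd.DataFrame({
    "GDP": [1.2, 2.5, 3.1, 4.8],
    "Life_Expectancy": [61.0, 68.5, 72.0, 79.5],
})

# Accessor form of df.plot(kind='scatter', ...); returns a matplotlib Axes
ax = df.plot.scatter(x="GDP", y="Life_Expectancy")
ax.set_title("Life expectancy vs. GDP (made-up data)")

# A single number summarizing the linear relationship the plot shows
correlation = df["GDP"].corr(df["Life_Expectancy"])
```

A correlation close to 1 matches what the scatter plot suggests visually for this invented data: points rising from lower left to upper right.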
    
  3. Challenge

    Creating a Bar Plot

    To review the concepts covered in this step, please refer to the Plotting Using Pandas module of the Plotting Data with Pandas course.

    Next up is the bar plot! Creating a bar plot is important because it can be used to compare the values of different categories. In this step, you will create a bar chart in pandas.

    The goal of this step is to create a bar chart that compares the literacy rates of different countries. You will plot the 'Year' on the x-axis, with one bar per 'Country' in each year, and the 'Literacy_Rate' on the y-axis. Don't forget to add a title and labels to your chart!


    Task 3.1: Import Necessary Libraries

    Import the pandas and matplotlib libraries which will be used for data manipulation and plotting respectively.

    🔍 Hint

    Use the import keyword to import pandas as pd and matplotlib.pyplot as plt.

    🔑 Solution
    import pandas as pd
    import matplotlib.pyplot as plt
    

    Task 3.2: Load the Data

    Load the 'World_Bank-Fictional.csv' file into a pandas DataFrame.

    🔍 Hint

    Use the pd.read_csv() function to read the csv file. Assign the DataFrame to a variable named df.

    🔑 Solution
    df = pd.read_csv('World_Bank-Fictional.csv')
    

    Task 3.3: Inspect the Data

    Inspect the first 5 rows of the DataFrame to understand the structure of the data.

    🔍 Hint

    Use the head() function on the DataFrame df to display the first 5 rows.

    🔑 Solution
    df.head()
    

    Task 3.4: Create a Grouped Bar Plot

    Create a grouped bar plot where each year is a group and, within each group, there is one bar per country. Use the provided code to create the pivot table we will plot, then call the plot(kind='bar') method on the pivot_df object.

    The x-axis should represent the years, and the y-axis should be the 'Literacy_Rate'. Add a title and axis labels to your chart.

    🔍 Hint

    In the provided code, we use the pivot_table() method to reshape the DataFrame so that 'Year' becomes the index, 'Country' becomes the columns, and 'Literacy_Rate' supplies the values. Inspect the resulting DataFrame using head() if you want to understand this further. Then use pivot_df.plot() to create a grouped bar chart. Use plt.title(), plt.xlabel(), and plt.ylabel() to add a title and labels.

    🔑 Solution
    # Provided code
    pivot_df = df.pivot_table(index='Year', columns='Country', values='Literacy_Rate')
    
    # Create bar plot
    pivot_df.plot(kind='bar', edgecolor='black')
    plt.title('Literacy Rate by Country Over Years')
    plt.xlabel('Year')
    plt.ylabel('Literacy Rate')
    plt.show()
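To see what the provided pivot_table call does to the data's shape, here is a minimal sketch on a tiny made-up DataFrame. The country names and literacy values are invented for illustration and are not taken from the lab's CSV file.

```python
import pandas as pd

# Made-up values for illustration only; not taken from the lab's CSV file
df = pd.DataFrame({
    "Country": ["A", "A", "B", "B"],
    "Year": [2000, 2001, 2000, 2001],
    "Literacy_Rate": [80.0, 82.0, 90.0, 91.0],
})

# Long format (one row per country and year) becomes wide format:
# one row per year, one column per country
pivot_df = df.pivot_table(index="Year", columns="Country", values="Literacy_Rate")
print(pivot_df)
```

Because each row of the pivoted frame is a year and each column is a country, calling plot(kind='bar') on it draws one group of bars per year with one bar per country.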
    
  4. Challenge

    Creating a Histogram

    To review the concepts covered in this step, please refer to the Plotting Using Pandas module of the Plotting Data with Pandas course.

    Now let's visualize some frequency data with a histogram! Creating a histogram is important because it can be used to visualize the distribution of a dataset. In this step, you will generate a histogram using Pandas.

    The goal of this step is to create a histogram that shows the distribution of the 'Population' column. Remember, a histogram represents frequency data and will help you visualize the distribution of the population data.


    Task 4.1: Import Necessary Libraries

    Import the necessary libraries for data manipulation and visualization. In this case, you will need pandas and matplotlib.

    🔍 Hint

    Use the import keyword to import pandas and matplotlib. Remember to use pd and plt as aliases for convenience.

    🔑 Solution
    import pandas as pd
    import matplotlib.pyplot as plt
    

    Task 4.2: Load the Dataset

    Load the 'World_Bank-Fictional.csv' dataset using pandas and assign it to a variable named df.

    🔍 Hint

    Use the pd.read_csv() function to read the csv file. The file path is 'World_Bank-Fictional.csv'.

    🔑 Solution
    df = pd.read_csv('World_Bank-Fictional.csv')
    

    Task 4.3: Inspect the Dataset

    Inspect the first 5 rows of the dataset using the head() function.

    🔍 Hint

    Use the head() function on the dataframe 'df' to inspect the first 5 rows.

    🔑 Solution
    df.head()
    

    Task 4.4: Plot the Histogram

    Plot a histogram of the 'Population' column using pandas. Add a title to your plot.

    🔍 Hint

    Use the Series.plot() method from pandas on the 'Population' column of the dataframe 'df'. Use the kind='hist' argument. Don't forget to use plt.show() to display the plot.

    🔑 Solution
    df['Population'].plot(kind="hist", title="Distribution of Populations")
    plt.show()
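A histogram is a count of how many values fall into each of a set of equal-width bins. The sketch below computes those counts directly with NumPy on a made-up series, as one way to check what the plot is showing. The population figures are invented, and bins=10 is used because that is the pandas default for kind='hist'.

```python
import numpy as np
import pandas as pd

# Made-up populations (in millions) for illustration only
population = pd.Series([2, 3, 5, 8, 13, 21, 34, 55, 89, 144], name="Population")

# 10 equal-width bins between the minimum and maximum value
counts, edges = np.histogram(population, bins=10)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:7.1f} to {right:7.1f}: {count}")
```

Each printed count corresponds to the height of one bar in the histogram of the same data.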
    
  5. Challenge

    Creating an Area Chart

    To review the concepts covered in this step, please refer to the Plotting Using Pandas module of the Plotting Data with Pandas course.

    Last but not least, let's create an area chart! Area charts are useful because they can be used to visualize the change in one or more quantities over time. In this step, you will create an area chart in pandas.

    The goal of this step is to create a stacked area plot that visualizes the change in GDP over time for each country. You will use the 'Year' column as the x-axis and the 'GDP' column pivoted on the 'Country' column as the y-axis.


    Task 5.1: Import Necessary Libraries

    Import the pandas and matplotlib libraries, which will be used in this step.

    🔍 Hint

    Use the import keyword to import pandas and matplotlib.pyplot. Remember to use the as keyword to give them the aliases pd and plt.

    🔑 Solution
    import pandas as pd
    import matplotlib.pyplot as plt
    

    Task 5.2: Load the Data

    Load the 'World_Bank-Fictional.csv' file into a pandas DataFrame.

    🔍 Hint

    Use the pd.read_csv() function to read the csv file. The file path is 'World_Bank-Fictional.csv'.

    🔑 Solution
    df = pd.read_csv('World_Bank-Fictional.csv')
    

    Task 5.3: Inspect the Data

    Inspect the first 5 rows of the DataFrame to understand its structure.

    🔍 Hint

    Use the head() function on the DataFrame to inspect the first 5 rows.

    🔑 Solution
    df.head()
    

    Task 5.4: Pivot the Data

    Pivot the DataFrame so that each country's GDP is a separate column. Display the first 5 rows of the pivoted data.

    🔍 Hint

    Use the pivot() method on the DataFrame. Pass 'Year' as the index argument, 'Country' as the columns argument, and 'GDP' as the values argument. Inspect the first 5 rows of the pivoted DataFrame using head().

    🔑 Solution
    df_pivot = df.pivot(index='Year', columns='Country', values='GDP')
    df_pivot.head()
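This step uses pivot(), while the bar plot step used pivot_table(). One practical difference is how they treat duplicate index and column pairs: pivot() raises a ValueError, whereas pivot_table() aggregates the duplicates (using the mean by default). The sketch below shows this on made-up data that deliberately contains a duplicate; it is an illustration only and says nothing about whether the lab's CSV file contains duplicates.

```python
import pandas as pd

# Made-up data: country "A" appears twice for the year 2000
df = pd.DataFrame({
    "Country": ["A", "A", "B"],
    "Year": [2000, 2000, 2000],
    "GDP": [1.0, 3.0, 5.0],
})

# pivot() cannot decide which of the two "A" values to keep
try:
    df.pivot(index="Year", columns="Country", values="GDP")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table() aggregates the duplicates instead (mean by default)
aggregated = df.pivot_table(index="Year", columns="Country", values="GDP")
print(pivot_failed)
print(aggregated)
```

When every (Year, Country) pair occurs exactly once, both methods produce the same wide table.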
    

    Task 5.5: Plot the Data

    Create a stacked area plot of the pivoted DataFrame.

    🔍 Hint

    Use the plot() method on the pivoted DataFrame. Set kind='area'. To create a stacked area plot, set the stacked parameter to True.

    🔑 Solution
    df_pivot.plot(kind='area', stacked=True)
    plt.title('GDP Over Time')
    plt.ylabel('GDP')
    plt.show()
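In a stacked area plot, each country's band sits on top of the bands before it, so the upper edge of a band is the running total across columns. The sketch below computes those upper edges with cumsum on a made-up pivoted DataFrame; the values are invented for illustration. Note that pandas area plots are stacked by default, so stacked=True makes the default explicit, and stacked=False would overlay the areas instead.

```python
import pandas as pd

# Made-up pivoted data: one row per year, one column per country
df_pivot = pd.DataFrame(
    {"A": [1.0, 2.0, 3.0], "B": [2.0, 2.0, 2.0]},
    index=pd.Index([2000, 2001, 2002], name="Year"),
)

# Upper edge of each band in the stacked plot: running total across columns
band_tops = df_pivot.cumsum(axis=1)
print(band_tops)
```

The last column of band_tops is the total across all countries for each year, which is the outline of the whole stacked chart.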
    
