Build a Histogram in Python to Visualize Customer Purchase Frequency

In this lab, you'll build histograms in Python to visualize customer purchasing behavior using a real-world e-commerce dataset. You’ll start by plotting basic histograms in Matplotlib, adjusting bin sizes, and adding titles, labels, and styling. Then, you’ll explore overlaying multiple distributions and exporting your charts. You’ll finish by transitioning to Seaborn for grouped histograms, enhanced themes, and accessibility-focused palettes. By the end, you’ll know how to customize and compare histograms to extract insights from customer purchase data.

Path Info

Level: Intermediate
Duration: 48m
Last updated: Jul 23, 2025

Table of Contents

  1. Challenge

    Step 1: Build Your First Histogram with Matplotlib

    In this Code Lab, you'll explore customer purchasing behavior using real-world e-commerce data. You'll begin by building a basic histogram, which groups continuous numeric values into bins to show the distribution of purchase prices.

    Histograms are a foundational chart type for understanding frequency and spread—especially helpful when analyzing how often certain purchase amounts occur.

    You’ll be using Matplotlib to create your first histogram based on purchase price.

    What You’ll Learn in This Step

    • Build a basic histogram using Matplotlib
    • Render the plot inline with plt.show()
    • Understand the role of bins and frequency in visualizing continuous data

    You'll complete all tasks in a Jupyter notebook rather than in standalone .py files.

    Open the Notebook

    • Open the file: step-one.ipynb in the workspace.
    Important: You must save the file with Cmd+Shift+S (Mac) or Ctrl+Shift+S (Windows/Linux) before clicking Validate.

    How to Complete Each Task

    • Find the matching code cell labeled Task 1.1, Task 1.2, etc.
    • Write your code directly in that cell.
    • Run the cell using the Run button or Shift+Enter.
    • Save your progress using the Save icon or File > Save and Checkpoint.

    You do not need to use the terminal, create extra files, or use plt.savefig(); all code and output will appear inline.

    Create a Basic Histogram Using Matplotlib

    A histogram is ideal for showing how frequently values occur within specific ranges. In this task, you'll create your first histogram to visualize customer purchase prices.

    Histograms differ from bar charts:

    • Bar charts visualize categories.
    • Histograms visualize continuous numeric ranges.

    In Matplotlib, you'll use the plt.hist() function to generate histograms easily.

    Syntax Breakdown: plt.hist(data)
    import matplotlib.pyplot as plt
    
    plt.hist(data)
    plt.show()
    
    • data: The numeric column you want to analyze (e.g., df["price"]).
    • plt.show(): Renders the chart directly in your Jupyter notebook.

    You'll work with the transactions.csv file, located in the /data directory. This dataset has already been loaded for you as a Pandas DataFrame called df. It contains customer purchase data from an e-commerce platform.
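
As a minimal sketch of the call above, the cell below uses synthetic prices as a stand-in for df["price"]. The gamma-distributed values and the variable name prices are assumptions for illustration; the lab's real data is not reproduced here. It also captures what plt.hist() returns, which is a simple way to confirm that every value was counted.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic prices standing in for df["price"] (an assumption, not the lab's data).
rng = np.random.default_rng(seed=0)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)

# plt.hist returns the bar heights, the bin edges, and the drawn bar objects.
counts, edges, patches = plt.hist(prices)

print(len(counts))        # number of bins; Matplotlib's default is 10
print(len(edges))         # one more edge than there are bins
print(int(counts.sum()))  # each value lands in exactly one bin
```

In the lab notebook you would pass df["price"] instead of prices and end the cell with plt.show().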

  2. Challenge

    Step 2: Control Binning and Chart Shape

    In this step, you'll learn how to adjust binning parameters in Matplotlib histograms. Binning controls how your continuous data is divided into ranges, which can change the shape and interpretability of your chart.

    By changing the number of bins or using cumulative options, you can fine-tune how purchase price frequencies are displayed for clearer insights.

    What You’ll Learn in This Step

    • Control histogram shape using the bins parameter
    • Generate cumulative histograms
    • Create histograms for different numeric columns

    You'll complete all tasks in a Jupyter notebook rather than in standalone .py files.

    Open the Notebook

    • Open the file: 2-step-two.ipynb in the workspace.
    Important: You must save the file with Cmd+Shift+S (Mac) or Ctrl+Shift+S (Windows/Linux) before clicking Validate.

    How to Complete Each Task

    • Find the matching code cell labeled Task 2.1, Task 2.2, etc.
    • Write your code directly in that cell.
    • Run the cell using the Run button or Shift+Enter.
    • Save your progress using the Save icon or File > Save and Checkpoint.

    You do not need to use the terminal, create extra files, or use plt.savefig(); all code and output will appear inline.

    Adjust the Number of Bins in Your Histogram

    The number of bins determines how the data range is divided into intervals. More bins can reveal detailed patterns, while fewer bins can simplify trends.

    Matplotlib allows you to control binning using the bins parameter inside plt.hist().

    Syntax Example: plt.hist(data, bins=number)
    import matplotlib.pyplot as plt
    
    plt.hist(data, bins=20)
    plt.show()
    
    • bins: The number of equal-width intervals to divide your data into.
    • plt.show(): Displays the chart after binning is applied.
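
To see how the bins value changes the picture, the sketch below draws the same synthetic prices twice, once with 5 bins and once with 20. The data is an assumption standing in for df["price"], and plt.figure() is used only to start a separate chart for each call.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=1)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)  # synthetic stand-in for df["price"]

plt.figure()  # first chart: a coarse view
coarse_counts, coarse_edges, _ = plt.hist(prices, bins=5)

plt.figure()  # second chart: a finer view of the same data
fine_counts, fine_edges, _ = plt.hist(prices, bins=20)

# An integer bins value produces equal-width intervals across the data range.
fine_widths = np.diff(fine_edges)
print(np.allclose(fine_widths, fine_widths[0]))

# Both charts count the same 500 values; only the grouping differs.
print(int(coarse_counts.sum()), int(fine_counts.sum()))
```

Fewer bins give a smoother outline, while more bins expose local peaks at the cost of noisier bars.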

    Create a Cumulative Histogram with Density Normalization

    A cumulative histogram shows the running total of values up to each bin, making it easy to see how data accumulates across the range.

    By combining cumulative=True and density=True, you'll normalize the values and display cumulative proportions rather than raw counts.

    Syntax Example: plt.hist(data, bins, density, cumulative)
    import matplotlib.pyplot as plt
    
    plt.hist(data, bins=20, density=True, cumulative=True)
    plt.show()
    
    • bins=20: Number of bins.
    • density=True: Normalizes the histogram so that the bar areas sum to 1 (a probability density).
    • cumulative=True: Converts to a running total.
    • plt.show(): Displays the cumulative histogram.
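
One way to read a cumulative, normalized histogram is shown in the sketch below: each bar height is the share of values at or below that bin's right edge. The prices are synthetic (an assumption standing in for df["price"]), and the median estimate is only as precise as the bin width.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=2)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)  # synthetic stand-in for df["price"]

heights, edges, _ = plt.hist(prices, bins=20, density=True, cumulative=True)

# The heights never decrease, and the last bar reaches 1.0
# (up to floating-point rounding) because every value has been counted.
print(round(float(heights[-1]), 6))

# First bin whose cumulative share reaches one half: a rough median estimate.
median_bin = int(np.argmax(heights >= 0.5))
print(f"About half of the prices fall below {edges[median_bin + 1]:.2f}")
```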

    Create a Histogram for a Different Numeric Column

    Histograms aren't limited to just prices—you can visualize any numeric column. In this task, you'll create a histogram based on the quantity column, which shows how many items were purchased per transaction.

    By setting fewer bins, you'll simplify the view to highlight overall purchase volume trends.

    Syntax Example: plt.hist(data, bins=number)
    import matplotlib.pyplot as plt
    
    plt.hist(data, bins=5)
    plt.show()
    
    • data: The numeric column you wish to plot (e.g., df["quantity"]).
    • bins=5: Divides data into 5 equal-width bins.
    • plt.show(): Renders the plot.

    You’ll now create a histogram of product quantities purchased.

  3. Challenge

    Step 3: Add Labels and Titles for Clarity

    In this step, you'll enhance your charts by adding labels and titles. Labels help your audience quickly understand what the axes represent, while titles provide essential context for your visualization.

    These small additions dramatically increase the readability and professionalism of your charts.

    What You’ll Learn in This Step

    • Add X-axis and Y-axis labels using plt.xlabel() and plt.ylabel()
    • Add chart titles using plt.title()
    • Rotate tick labels to improve label readability

    You'll complete all tasks in a Jupyter notebook rather than in standalone .py files.

    Open the Notebook

    • Open the file: 3-step-two.ipynb in the workspace.
    Important: You must save the file with Cmd+Shift+S (Mac) or Ctrl+Shift+S (Windows/Linux) before clicking Validate.

    How to Complete Each Task

    • Find the matching code cell labeled Task 3.1, Task 3.2, etc.
    • Write your code directly in that cell.
    • Run the cell using the Run button or Shift+Enter.
    • Save your progress using the Save icon or File > Save and Checkpoint.

    You do not need to use the terminal, create extra files, or use plt.savefig(); all code and output will appear inline.

    Add Axis Labels and a Title to Your Histogram

    While your histogram shows data distribution, it’s missing important context without descriptive labels. You can add text to the x-axis, y-axis, and top of the chart using matplotlib.pyplot functions.

    Syntax Example
    plt.xlabel("x-Axis Label")
    plt.ylabel("y-Axis Label")
    plt.title("Chart Title")
    
    • xlabel(): Adds a label beneath the x-axis.
    • ylabel(): Adds a label beside the y-axis.
    • title(): Adds a descriptive chart title.

    Rotate X-Axis Tick Labels for Better Readability

    When category or numeric labels overlap, rotating them can improve readability. This is especially helpful when dealing with long or closely packed labels.

    You can rotate tick labels using plt.xticks() and its rotation parameter.

    Syntax Example
    plt.xticks(rotation=45)
    
    • rotation=45: Rotates X-axis labels by 45 degrees.
    • You may choose any angle, but 45 degrees is commonly used for short labels.
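
Putting the labeling calls from this step together might look like the sketch below. The label text and the synthetic prices are assumptions for illustration; plt.gca() is used only to read the settings back and confirm they were applied.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=3)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)  # synthetic stand-in for df["price"]

plt.hist(prices, bins=20)
plt.xlabel("Purchase price")            # hypothetical label text
plt.ylabel("Number of transactions")
plt.title("Distribution of purchase prices")
plt.xticks(rotation=45)                 # tilt the tick labels along the x-axis

ax = plt.gca()  # the axes that the plt.* calls above were drawing on
print(ax.get_xlabel(), "|", ax.get_ylabel(), "|", ax.get_title())
```

All of these calls act on the current chart, so they need to run in the same cell as plt.hist() and before plt.show().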

  4. Challenge

    Step 4: Style and Export Your Histogram

    In this step, you’ll enhance your histograms visually by applying color, transparency, and edge styling. Clear, attractive charts are easier to present and interpret.

    You’ll also practice exporting your chart as an image using plt.savefig() for use in reports or presentations.

    What You’ll Learn in This Step

    • Apply color, edge color, and transparency with Matplotlib styling parameters
    • Export charts as PNG image files using plt.savefig()

    You'll complete all tasks in a Jupyter notebook rather than in standalone .py files.

    Open the Notebook

    • Open the file: 4-step-four.ipynb in the workspace.
    Important: You must save the file with Cmd+Shift+S (Mac) or Ctrl+Shift+S (Windows/Linux) before clicking Validate.

    How to Complete Each Task

    • Find the matching code cell labeled Task 4.1, Task 4.2, etc.
    • Write your code directly in that cell.
    • Run the cell using the Run button or Shift+Enter.
    • Save your progress using the Save icon or File > Save and Checkpoint.

    Apart from the export task in this step, which uses plt.savefig(), you do not need to use the terminal or create extra files; all other code and output will appear inline.

    Apply Styling to Your Histogram

    Styling improves both the clarity and appearance of your visualizations. Matplotlib allows you to easily customize color, edge lines, and transparency using optional parameters in plt.hist().

    Styling Parameters
    plt.hist(data, color="skyblue", edgecolor="black", alpha=0.7)
    
    • color="skyblue": Fills bars with light blue.
    • edgecolor="black": Draws black borders around each bar.
    • alpha=0.7: Adjusts transparency (1 is fully opaque, 0 is fully transparent).

    Export Your Histogram as an Image File

    Once your chart looks polished, you may want to save it as an image for reports or presentations. Matplotlib allows you to export plots using plt.savefig().

    Export Syntax
    plt.savefig("filename.png")
    plt.savefig("folder/subfolder/filename.png")
    
    • "filename.png": Saves the image to your current working directory.
    • "folder/subfolder/filename.png": Saves to a specific folder location if provided.
    • Always ensure the directory exists before saving to a subfolder.
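
The styling and export steps can be combined as in the sketch below. The output location is hypothetical (a charts folder inside the system temp directory), and the prices are synthetic. The folder is created first because plt.savefig() does not create missing directories.

```python
import tempfile
from pathlib import Path

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=4)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)  # synthetic stand-in for df["price"]

plt.hist(prices, bins=20, color="skyblue", edgecolor="black", alpha=0.7)

# Hypothetical output location; replace it with any path you like.
out_path = Path(tempfile.gettempdir()) / "charts" / "price_histogram.png"

# savefig does not create missing folders, so create them first.
out_path.parent.mkdir(parents=True, exist_ok=True)

# Save before plt.show(): once a figure has been shown it may be closed,
# and a later savefig would then write an empty image.
plt.savefig(out_path, dpi=150, bbox_inches="tight")
print(out_path.exists())
```

The dpi and bbox_inches arguments are optional; they raise the resolution and trim surrounding whitespace.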

  5. Challenge

    Step 5: Overlay Multiple Histograms and Add Legends

    In this step, you’ll combine multiple histograms on the same plot. Overlaying two distributions allows you to compare related data directly. You’ll also learn how to add legends to clearly label each dataset on the chart.

    What You’ll Learn in This Step

    • Plot multiple histograms on the same axes
    • Use ax.hist() with Matplotlib subplots API
    • Add a legend using ax.legend()

    You'll complete all tasks in a Jupyter notebook rather than in standalone .py files.

    Open the Notebook

    • Open the file: 5-step-five.ipynb in the workspace.
    Important: You must save the file with Cmd+Shift+S (Mac) or Ctrl+Shift+S (Windows/Linux) before clicking Validate.

    How to Complete Each Task

    • Find the matching code cell labeled Task 5.1, Task 5.2, etc.
    • Write your code directly in that cell.
    • Run the cell using the Run button or Shift+Enter.
    • Save your progress using the Save icon or File > Save and Checkpoint.

    You do not need to use the terminal, create extra files, or use plt.savefig(); all code and output will appear inline.

    Overlay Multiple Histograms in One Chart

    Sometimes you'll want to compare two numeric distributions side by side. You can overlay multiple histograms in Matplotlib using the object-oriented API with ax.hist().

    For example, by plotting both purchase price and a derived column, you can compare how frequently different transaction totals occur.

    Syntax Example
    fig, ax = plt.subplots()
    
    ax.hist(first_data, bins=20, color="skyblue", alpha=0.7, label="First")
    ax.hist(second_data, bins=20, color="orange", alpha=0.7, label="Second")
    
    plt.show()
    
    • fig, ax = plt.subplots(): Creates reusable figure and axis objects.
    • ax.hist(): Draws a histogram on a specific axis.
    • label: Assigns a name for each dataset to be used in the legend.

    Add a Legend to Your Overlaid Histogram

    When displaying multiple datasets on one chart, legends help viewers quickly identify which series belongs to which color. You can add legends using ax.legend() on your Matplotlib axis object.

    Syntax Example
    ax.legend()
    ax.legend(loc="upper right")
    
    • loc: Controls where the legend appears on the chart.
    • Common values include "upper right", "lower left", "center", and more.
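
A fuller sketch of an overlaid histogram with a legend follows. The derived column here (price multiplied by quantity) and the label text are assumptions for illustration, and the data is synthetic. Passing the same bin edges to both calls is a design choice, not a requirement: it keeps the two sets of bars aligned so their heights are directly comparable.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=5)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)  # stand-in for df["price"]
quantities = rng.integers(1, 6, size=500)            # stand-in for df["quantity"]
totals = prices * quantities                         # hypothetical derived column

# One set of edges covering both datasets, shared by both histograms.
shared_edges = np.histogram_bin_edges(np.concatenate([prices, totals]), bins=20)

fig, ax = plt.subplots()
ax.hist(prices, bins=shared_edges, color="skyblue", alpha=0.7, label="Unit price")
ax.hist(totals, bins=shared_edges, color="orange", alpha=0.7, label="Order total")

# The legend picks up the label given to each ax.hist call.
legend = ax.legend(loc="upper right")
print([text.get_text() for text in legend.get_texts()])
```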

  6. Challenge

    Step 6: Create a Styled Histogram Using Seaborn

    In this step, you’ll shift from Matplotlib to Seaborn, which simplifies many visualization tasks and provides attractive default styling. You’ll also explore density curves, automatic theming, and custom color palettes.

    What You’ll Learn in This Step

    • Create histograms using Seaborn’s histplot() function
    • Apply kernel density estimates (KDE) for smoothed distributions
    • Style charts using Seaborn themes and color palettes

    You'll complete all tasks in a Jupyter notebook rather than in standalone .py files.

    Open the Notebook

    • Open the file: 6-step-six.ipynb in the workspace.
    Important: You must save the file with Cmd+Shift+S (Mac) or Ctrl+Shift+S (Windows/Linux) before clicking Validate.

    How to Complete Each Task

    • Find the matching code cell labeled Task 6.1, Task 6.2, etc.
    • Write your code directly in that cell.
    • Run the cell using the Run button or Shift+Enter.
    • Save your progress using the Save icon or File > Save and Checkpoint.

    You do not need to use the terminal, create extra files, or use plt.savefig(); all code and output will appear inline.

    Create a Histogram Using Seaborn’s histplot()

    Seaborn simplifies histogram creation while adding automatic styling. Unlike Matplotlib’s imperative style, Seaborn follows a declarative approach, where you pass both the dataset and column name directly.

    Syntax Example
    import matplotlib.pyplot as plt
    import seaborn as sns
    
    sns.histplot(data=df, x="column_name")
    plt.show()
    
    • data: The DataFrame containing your data.
    • x: The column you want to visualize.
    • plt.show(): Renders the chart inline.

    Add a Density Curve to Your Seaborn Histogram

    Seaborn makes it easy to overlay a kernel density estimate (KDE) on top of your histogram. This smooth curve helps visualize the probability distribution of continuous data.

    You can also control the statistic calculation using stat="density" and change the bar style with element="step" to improve how density is displayed.

    Syntax Example
    sns.histplot(data=df, x="price", stat="density", kde=True, element="step")
    plt.show()
    
    • stat="density": Normalizes counts into a probability density.
    • kde=True: Adds the smooth density curve.
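    • element="step": Draws the histogram as a single stepped outline instead of separate bars. The area stays filled by default; pass fill=False for the outline only.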

    Apply a Theme and Palette to Your Seaborn Chart

    Seaborn includes built-in themes and color palettes that improve chart aesthetics automatically. You can control these globally using sns.set_theme() and sns.set_palette().

    Syntax Example
    sns.set_theme(style="whitegrid")
    sns.set_palette("Blues")
    
    • set_theme(): Controls the overall style (gridlines, background, etc.).
    • set_palette(): Defines the color palette used in subsequent plots.

  7. Challenge

    Step 7: Group, Style, and Enhance Seaborn Histograms

    In this step, you’ll take Seaborn histograms to the next level by grouping data with the hue parameter, controlling legends, and applying accessibility-focused color palettes.

    These advanced features allow you to clearly compare grouped distributions and make your charts accessible for all viewers.

    What You’ll Learn in This Step

    • Group histograms using Seaborn’s hue parameter
    • Customize and remove legends
    • Apply colorblind-friendly palettes

    You'll complete all tasks in a Jupyter notebook rather than in standalone .py files.

    Open the Notebook

    • Open the file: 7-step-seven.ipynb in the workspace.
    Important: You must save the file with Cmd+Shift+S (Mac) or Ctrl+Shift+S (Windows/Linux) before clicking Validate.

    How to Complete Each Task

    • Find the matching code cell labeled Task 7.1, Task 7.2, etc.
    • Write your code directly in that cell.
    • Run the cell using the Run button or Shift+Enter.
    • Save your progress using the Save icon or File > Save and Checkpoint.

    You do not need to use the terminal, create extra files, or use plt.savefig(); all code and output will appear inline.

    Create Grouped Histograms Using hue

    Seaborn’s hue parameter allows you to break down a distribution by category, displaying multiple groups within the same histogram.

    When applied, Seaborn automatically assigns different colors to each group, making visual comparisons much easier.

    Syntax Example
    sns.histplot(data=df, x="price", hue="category")
    plt.show()
    
    • hue="category": Splits data into groups based on unique category values.
    • Seaborn automatically assigns distinct colors to each group.

    Remove the Default Legend

    Seaborn automatically adds a legend when using the hue parameter. However, you may want to customize or remove the legend depending on your chart layout.

    You can remove the default legend by accessing the axis object and calling ax.legend_.remove().

    Syntax Example
    ax = sns.histplot(data=df, x="price", hue="category")
    ax.legend_.remove()
    plt.show()
    
    • ax.legend_.remove(): Completely removes the existing legend.

    Re-Add a Custom Legend

    After removing the default legend, you can manually re-add one using ax.legend(). This gives you full control over placement and appearance.

    The most common approach is to simply call ax.legend() after plotting and specify the location.

    Syntax Example
    ax.legend(loc="upper right")
    
    • loc: Controls the legend placement.
    • "upper right" is one of several valid location values.

     

    Matplotlib Legend loc Options

    Here is the full set of built-in strings you can use for loc:

    | Keyword        | Description                                                      |
    | -------------- | ---------------------------------------------------------------- |
    | 'best'         | Matplotlib automatically selects the least obstructive location  |
    | 'upper right'  | Top-right corner inside the axes                                 |
    | 'upper left'   | Top-left corner inside the axes                                  |
    | 'lower left'   | Bottom-left corner                                               |
    | 'lower right'  | Bottom-right corner                                              |
    | 'right'        | Center-right vertically                                          |
    | 'center left'  | Middle-left vertically                                           |
    | 'center right' | Middle-right vertically                                          |
    | 'lower center' | Bottom-center                                                    |
    | 'upper center' | Top-center                                                       |
    | 'center'       | Dead-center of plot                                              |

    Matplotlib also accepts integer codes for these positions:

    | String         | Integer code |
    | -------------- | ------------ |
    | 'best'         | 0            |
    | 'upper right'  | 1            |
    | 'upper left'   | 2            |
    | 'lower left'   | 3            |
    | 'lower right'  | 4            |
    | 'right'        | 5            |
    | 'center left'  | 6            |
    | 'center right' | 7            |
    | 'lower center' | 8            |
    | 'upper center' | 9            |
    | 'center'       | 10           |

    Example:

    ax.legend(loc=1)  # same as 'upper right'
    

    Bonus: Fine-Tune Legend Position with bbox_to_anchor

    You can combine loc with bbox_to_anchor for precise control over legend placement:

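    ax.legend(loc='upper right', bbox_to_anchor=(1.2, 1))
    
    • bbox_to_anchor sets the anchor point for the legend. By default it is given in axes coordinates, where (0, 0) is the bottom-left corner of the axes and (1, 1) is the top-right, so values above 1 fall outside the plot area. The loc value then names which corner of the legend box sits at that point.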
    • Very useful when you want legends outside the chart.
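
To place a legend fully outside the plot area, one common pattern is to anchor its upper-left corner just past the right edge of the axes, as sketched below. The two synthetic datasets and their labels are assumptions for illustration. The final lines compare the legend and axes positions after a draw to confirm the legend sits to the right of the plot.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=8)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)  # stand-in for df["price"]
totals = prices * rng.integers(1, 6, size=500)       # hypothetical derived column

fig, ax = plt.subplots()
ax.hist(prices, bins=20, alpha=0.7, label="Unit price")
ax.hist(totals, bins=20, alpha=0.7, label="Order total")

# Pin the legend's upper-left corner slightly right of the axes (x = 1.02).
legend = ax.legend(loc="upper left", bbox_to_anchor=(1.02, 1.0))

# Draw once so Matplotlib computes final positions, then compare them.
fig.canvas.draw()
legend_box = legend.get_window_extent()
axes_box = ax.get_window_extent()
print(legend_box.x0 > axes_box.x1)  # legend starts to the right of the axes
```

When saving such a chart, bbox_inches="tight" in plt.savefig() helps keep the outside legend from being cropped.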

    Apply a Colorblind-Friendly Palette

    Seaborn includes built-in color palettes designed for accessibility. The "colorblind" palette ensures that your charts remain legible for viewers with common forms of color vision deficiency.

    You can apply this palette using sns.set_palette() before plotting.

    Syntax Example
    sns.set_palette("colorblind")
    sns.histplot(data=df, x="price", hue="category")
    plt.show()
    
    • sns.set_palette("colorblind"): Applies a global colorblind-friendly palette for all plots.

    Lab Complete, Great Work!

    You’ve made it to the end of the lab—nice job!

    In this lab, you’ve worked through a full progression of histogram techniques including:

    • Building histograms with both Matplotlib and Seaborn
    • Controlling bin sizes to shape your charts
    • Adding titles, axis labels, and rotating tick labels for better readability
    • Styling your charts with colors, edges, and transparency
    • Exporting charts as image files for reports or presentations
    • Overlaying multiple datasets for side-by-side comparisons
    • Using Seaborn’s hue parameter to group data by category
    • Customizing legends and applying accessibility-friendly color palettes

    Want to see everything combined?

    Check out the lab-recap.ipynb notebook in your workspace. It pulls together everything you’ve learned into a fully polished, presentation-ready histogram.
