
Build a Histogram in Python to Visualize Customer Purchase Frequency
In this lab, you'll build histograms in Python to visualize customer purchasing behavior using a real-world e-commerce dataset. You’ll start by plotting basic histograms in Matplotlib, adjusting bin sizes, and adding titles, labels, and styling. Then, you’ll explore overlaying multiple distributions and exporting your charts. You’ll finish by transitioning to Seaborn for grouped histograms, enhanced themes, and accessibility-focused palettes. By the end, you’ll know how to customize and compare histograms to extract insights from customer purchase data.

Challenge
Step 1: Build Your First Histogram with Matplotlib
In this Code Lab, you'll explore customer purchasing behavior using real-world e-commerce data. You'll begin by building a basic histogram, which groups continuous numeric values into bins to show the distribution of purchase prices.
Histograms are a foundational chart type for understanding frequency and spread—especially helpful when analyzing how often certain purchase amounts occur.
You’ll be using Matplotlib to create your first histogram based on purchase price.
What You’ll Learn in This Step
- Build a basic histogram using Matplotlib
- Render the plot inline with `plt.show()`
- Understand the role of bins and frequency in visualizing continuous data
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Open the file: `step-one.ipynb` in the workspace.
Important: You must save the file with `Cmd+Shift+S` (Mac) or `Ctrl+Shift+S` (Windows/Linux) before clicking **Validate**.
How to Complete Each Task
> * Find the matching **code cell** labeled **Task 1.1**, **Task 1.2**, etc.
> * Write your code directly in that cell.
> * Run the cell using the **Run** button or `Shift+Enter`.
> * Save your progress using the **Save icon** or **File > Save and Checkpoint**.
>
> You do not need to use the terminal, create extra files, or use `plt.savefig()`; all code and output will appear inline.
A histogram is ideal for showing how frequently values occur within specific ranges. In this task, you'll create your first histogram to visualize customer purchase prices.
Histograms differ from bar charts:
- Bar charts visualize categories.
- Histograms visualize continuous numeric ranges.
In Matplotlib, you'll use the `plt.hist()` function to generate histograms easily.
Syntax Breakdown: `plt.hist(data)`
    import matplotlib.pyplot as plt
    plt.hist(data)
    plt.show()
- `data`: The numeric column you want to analyze (e.g., `df["price"]`).
- `plt.show()`: Renders the chart directly in your Jupyter notebook.
You'll work with the `transactions.csv` file, located in the `/data` directory. This dataset has already been loaded for you as a Pandas DataFrame called `df`. It contains customer purchase data from an e-commerce platform.
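The pattern above can be sketched as follows. The lab's `df` is not reproduced here, so the `prices` array is synthetic stand-in data for `df["price"]` (an assumption made purely for illustration).

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for df["price"]; the real lab data comes from transactions.csv.
rng = np.random.default_rng(seed=0)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)

# plt.hist returns the bar heights, the bin edges, and the drawn bar artists.
counts, bin_edges, bars = plt.hist(prices)
plt.show()

# With no bins argument, Matplotlib falls back to its default of 10 equal-width bins.
print(len(counts), "bins covering", len(prices), "values")
```

Capturing the return values is optional, but it is a convenient way to check that every value landed in a bin.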
Challenge
Step 2: Control Binning and Chart Shape
In this step, you'll learn how to adjust binning parameters in Matplotlib histograms. Binning controls how your continuous data is divided into ranges, which can change the shape and interpretability of your chart.
By changing the number of bins or using cumulative options, you can fine-tune how purchase price frequencies are displayed for clearer insights.
What You’ll Learn in This Step
- Control histogram shape using the `bins` parameter
- Generate cumulative histograms
- Create histograms for different numeric columns
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Open the file: `2-step-two.ipynb` in the workspace.
Important: You must save the file with `Cmd+Shift+S` (Mac) or `Ctrl+Shift+S` (Windows/Linux) before clicking **Validate**.
How to Complete Each Task
> * Find the matching **code cell** labeled **Task 2.1**, **Task 2.2**, etc.
> * Write your code directly in that cell.
> * Run the cell using the **Run** button or `Shift+Enter`.
> * Save your progress using the **Save icon** or **File > Save and Checkpoint**.
>
> You do not need to use the terminal, create extra files, or use `plt.savefig()`; all code and output will appear inline.
Adjust the Number of Bins in Your Histogram
The number of bins determines how the data range is divided into intervals. More bins can reveal detailed patterns, while fewer bins can simplify trends.
Matplotlib allows you to control binning using the `bins` parameter inside `plt.hist()`.
Syntax Example: `plt.hist(data, bins=number)`
    import matplotlib.pyplot as plt
    plt.hist(data, bins=20)
    plt.show()
- `bins`: The number of equal-width intervals to divide your data into.
- `plt.show()`: Displays the chart after binning is applied.
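To see how the bin count changes the picture, here is a small sketch that draws the same data with a coarse and a fine binning. The `prices` array is synthetic stand-in data for `df["price"]`, and the sketch uses `plt.subplots()`, which the lab only introduces in a later step.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for df["price"] (illustrative only).
rng = np.random.default_rng(seed=0)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)

# Two axes side by side: same data, different bin counts.
fig, (ax_coarse, ax_fine) = plt.subplots(1, 2, figsize=(10, 4))
coarse_counts, coarse_edges, _ = ax_coarse.hist(prices, bins=5)
fine_counts, fine_edges, _ = ax_fine.hist(prices, bins=50)
ax_coarse.set_title("bins=5")
ax_fine.set_title("bins=50")
plt.show()

# n bins are always bounded by n + 1 edges.
print(len(coarse_edges), len(fine_edges))
```

Both panels count the same 500 values; only the width of the intervals differs.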
A cumulative histogram shows the running total of values up to each bin, making it easy to see how data accumulates across the range.
By combining `cumulative=True` and `density=True`, you'll normalize the values and display cumulative proportions rather than raw counts.
Syntax Example: `plt.hist(data, bins, density, cumulative)`
    import matplotlib.pyplot as plt
    plt.hist(data, bins=20, density=True, cumulative=True)
    plt.show()
- `bins=20`: Number of bins.
- `density=True`: Normalizes the histogram so that the total area of the bars equals 1 (a probability density).
- `cumulative=True`: Converts each bar to a running total of all bins up to and including it.
- `plt.show()`: Displays the cumulative histogram.
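A quick way to confirm what the two flags do together is to inspect the returned bar heights: they should rise monotonically and end at 1. This sketch again assumes synthetic `prices` in place of `df["price"]`.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for df["price"] (illustrative only).
rng = np.random.default_rng(seed=0)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)

# Cumulative + density: each bar is the share of values at or below that bin.
proportions, edges, _ = plt.hist(prices, bins=20, density=True, cumulative=True)
plt.ylabel("Cumulative proportion")
plt.show()

# The final bar covers every value, so its height is 1 (up to rounding).
print(round(float(proportions[-1]), 6))
```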
Histograms aren't limited to just prices; you can visualize any numeric column. In this task, you'll create a histogram based on the `quantity` column, which shows how many items were purchased per transaction.
By setting fewer bins, you'll simplify the view to highlight overall purchase volume trends.
Syntax Example: `plt.hist(data, bins=number)`
    import matplotlib.pyplot as plt
    plt.hist(data, bins=5)
    plt.show()
- `data`: The numeric column you wish to plot (e.g., `df["quantity"]`).
- `bins=5`: Divides data into 5 equal-width bins.
- `plt.show()`: Renders the plot.
You’ll now create a histogram of product quantities purchased.
Challenge
Step 3: Add Labels and Titles for Clarity
In this step, you'll enhance your charts by adding labels and titles. Labels help your audience quickly understand what the axes represent, while titles provide essential context for your visualization.
These small additions dramatically increase the readability and professionalism of your charts.
What You’ll Learn in This Step
- Add X-axis and Y-axis labels using `plt.xlabel()` and `plt.ylabel()`
- Add chart titles using `plt.title()`
- Rotate tick labels to improve label readability
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Open the file: `3-step-two.ipynb` in the workspace.
Important: You must save the file with `Cmd+Shift+S` (Mac) or `Ctrl+Shift+S` (Windows/Linux) before clicking **Validate**.
How to Complete Each Task
> * Find the matching **code cell** labeled **Task 3.1**, **Task 3.2**, etc.
> * Write your code directly in that cell.
> * Run the cell using the **Run** button or `Shift+Enter`.
> * Save your progress using the **Save icon** or **File > Save and Checkpoint**.
>
> You do not need to use the terminal, create extra files, or use `plt.savefig()`; all code and output will appear inline.
Add Axis Labels and a Title to Your Histogram
While your histogram shows data distribution, it’s missing important context without descriptive labels. You can add text to the x-axis, y-axis, and top of the chart using `matplotlib.pyplot` functions.
Syntax Example
    plt.xlabel("x-Axis Label")
    plt.ylabel("y-Axis Label")
    plt.title("Chart Title")
- `xlabel()`: Adds a label beneath the x-axis.
- `ylabel()`: Adds a label beside the y-axis.
- `title()`: Adds a descriptive chart title.
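Putting the three calls together might look like the sketch below. The label text and the synthetic `prices` array (standing in for `df["price"]`) are illustrative assumptions, not the wording the lab's validator expects.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for df["price"] (illustrative only).
rng = np.random.default_rng(seed=0)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)

plt.hist(prices, bins=20)
plt.xlabel("Purchase price")             # text under the x-axis
plt.ylabel("Number of transactions")     # text beside the y-axis
plt.title("Distribution of purchase prices")

# Keep a handle on the current axes so the labels can be inspected afterwards.
ax = plt.gca()
plt.show()
```

The labelling calls go between `plt.hist()` and `plt.show()`, so they apply to the figure before it is rendered.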
Rotate X-Axis Tick Labels for Better Readability
When category or numeric labels overlap, rotating them can improve readability. This is especially helpful when dealing with long or closely packed labels.
You can rotate tick labels using `plt.xticks()` and its `rotation` parameter.
Syntax Example
    plt.xticks(rotation=45)
- `rotation=45`: Rotates X-axis labels by 45 degrees.
- You may choose any angle, but 45 degrees is commonly used for short labels.
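As a brief sketch of the rotation in context, the example below rotates the tick labels of a histogram built from synthetic `prices` (a stand-in for `df["price"]`). `plt.xticks()` returns the tick positions and label objects, which makes the effect easy to verify.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for df["price"] (illustrative only).
rng = np.random.default_rng(seed=0)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)

plt.hist(prices, bins=20)

# Passing only rotation keeps the existing tick positions and rotates their labels.
tick_positions, tick_labels = plt.xticks(rotation=45)
plt.show()
```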
Challenge
Step 4: Style and Export Your Histogram
In this step, you’ll enhance your histograms visually by applying color, transparency, and edge styling. Clear, attractive charts are easier to present and interpret.
You’ll also practice exporting your chart as an image using `plt.savefig()` for use in reports or presentations.
What You’ll Learn in This Step
- Apply color, edge color, and transparency with Matplotlib styling parameters
- Export charts as PNG image files using `plt.savefig()`
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Open the file: `4-step-four.ipynb` in the workspace.
Important: You must save the file with `Cmd+Shift+S` (Mac) or `Ctrl+Shift+S` (Windows/Linux) before clicking **Validate**.
How to Complete Each Task
> * Find the matching **code cell** labeled **Task 4.1**, **Task 4.2**, etc.
> * Write your code directly in that cell.
> * Run the cell using the **Run** button or `Shift+Enter`.
> * Save your progress using the **Save icon** or **File > Save and Checkpoint**.
>
> You do not need to use the terminal or create extra files. Outside the export task in this step, you also do not need `plt.savefig()`; all code and output will appear inline.
Styling improves both the clarity and appearance of your visualizations. Matplotlib allows you to easily customize color, edge lines, and transparency using optional parameters in `plt.hist()`.
Styling Parameters
    plt.hist(data, color="skyblue", edgecolor="black", alpha=0.7)
- `color="skyblue"`: Fills bars with light blue.
- `edgecolor="black"`: Draws black borders around each bar.
- `alpha=0.7`: Adjusts transparency (1 is fully opaque, 0 is fully transparent).
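The styling parameters can be tried out with a sketch like this one, which uses synthetic `prices` in place of `df["price"]`. The third value returned by `plt.hist()` holds the bar artists, so the applied style can be read back from any bar.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for df["price"] (illustrative only).
rng = np.random.default_rng(seed=0)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)

counts, edges, bars = plt.hist(
    prices,
    bins=20,
    color="skyblue",    # fill color of each bar
    edgecolor="black",  # outline color of each bar
    alpha=0.7,          # 70% opaque
)
plt.show()

# Every bar shares the same styling; inspect the first one.
first_bar = bars[0]
print(first_bar.get_alpha())
```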
Once your chart looks polished, you may want to save it as an image for reports or presentations. Matplotlib allows you to export plots using `plt.savefig()`.
Export Syntax
    plt.savefig("filename.png")
    plt.savefig("folder/subfolder/filename.png")
- `"filename.png"`: Saves the image to your current working directory.
- `"folder/subfolder/filename.png"`: Saves to a specific folder location if provided.
- Always ensure the directory exists before saving to a subfolder.
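One way to make sure the target directory exists is to create it with `pathlib` before saving. In this sketch the folder name `exports`, the file name, and the synthetic `prices` data are all hypothetical choices, not paths the lab prescribes.

```python
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for df["price"] (illustrative only).
rng = np.random.default_rng(seed=0)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)

# Hypothetical output location; mkdir is a no-op if the folder already exists.
output_dir = Path("exports")
output_dir.mkdir(parents=True, exist_ok=True)
output_path = output_dir / "price_histogram.png"

plt.hist(prices, bins=20, color="skyblue", edgecolor="black")
plt.title("Distribution of purchase prices")

# Save before plt.show(): in a notebook the figure is released once it has been shown,
# so saving afterwards can produce a blank image.
plt.savefig(output_path, dpi=150, bbox_inches="tight")
plt.show()
```

The `dpi` and `bbox_inches="tight"` arguments are optional; they raise the resolution and trim surplus whitespace around the chart.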
Challenge
Step 5: Overlay Multiple Histograms and Add Legends
In this step, you’ll combine multiple histograms on the same plot. Overlaying two distributions allows you to compare related data directly. You’ll also learn how to add legends to clearly label each dataset on the chart.
What You’ll Learn in This Step
- Plot multiple histograms on the same axes
- Use `ax.hist()` with Matplotlib's subplots API
- Add a legend using `ax.legend()`
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Open the file: `5-step-five.ipynb` in the workspace.
Important: You must save the file with `Cmd+Shift+S` (Mac) or `Ctrl+Shift+S` (Windows/Linux) before clicking **Validate**.
How to Complete Each Task
> * Find the matching **code cell** labeled **Task 5.1**, **Task 5.2**, etc.
> * Write your code directly in that cell.
> * Run the cell using the **Run** button or `Shift+Enter`.
> * Save your progress using the **Save icon** or **File > Save and Checkpoint**.
>
> You do not need to use the terminal, create extra files, or use `plt.savefig()`; all code and output will appear inline.
Sometimes you'll want to compare two numeric distributions on the same chart. You can overlay multiple histograms in Matplotlib using the object-oriented API with `ax.hist()`.
For example, by plotting both purchase price and a derived column, you can compare how frequently different transaction totals occur.
Syntax Example
    fig, ax = plt.subplots()
    ax.hist(first_data, bins=20, color="skyblue", alpha=0.7, label="First")
    ax.hist(second_data, bins=20, color="orange", alpha=0.7, label="Second")
    plt.show()
- `fig, ax = plt.subplots()`: Creates reusable figure and axis objects.
- `ax.hist()`: Draws a histogram on a specific axis.
- `label`: Assigns a name for each dataset to be used in the legend.
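The overlay can be sketched as below. The source does not say which derived column the lab uses, so `totals = prices * quantities` is a hypothetical example, and all three arrays are synthetic. The sketch also computes one shared set of bin edges with `np.histogram_bin_edges`; this is an addition to the syntax shown above, chosen so that the bars of both series line up.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-ins for df["price"] and df["quantity"] (illustrative only).
rng = np.random.default_rng(seed=0)
prices = rng.gamma(shape=2.0, scale=25.0, size=500)
quantities = rng.integers(1, 6, size=500)
totals = prices * quantities  # hypothetical derived column

# One set of 20 bins spanning both series, so bar widths match across the overlay.
shared_edges = np.histogram_bin_edges(np.concatenate([prices, totals]), bins=20)

fig, ax = plt.subplots()
price_counts, _, _ = ax.hist(prices, bins=shared_edges, color="skyblue", alpha=0.7, label="Price")
total_counts, _, _ = ax.hist(totals, bins=shared_edges, color="orange", alpha=0.7, label="Total")
ax.legend()
plt.show()
```

Passing `bins=20` to each call, as in the syntax example, also works; each series then gets its own bin widths based on its own range.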
When displaying multiple datasets on one chart, legends help viewers quickly identify which series belongs to which color. You can add legends using `ax.legend()` on your Matplotlib axis object.
Syntax Example
    ax.legend()
    ax.legend(loc="upper right")
- `loc`: Controls where the legend appears on the chart.
- Common values include `"upper right"`, `"lower left"`, `"center"`, and more.
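As a small illustration of how `label` and `ax.legend()` fit together, the sketch below reads the legend entries back after drawing. The two arrays and their labels are synthetic, illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np

# Two synthetic series (illustrative only).
rng = np.random.default_rng(seed=0)
first_data = rng.normal(loc=50, scale=10, size=300)
second_data = rng.normal(loc=70, scale=15, size=300)

fig, ax = plt.subplots()
ax.hist(first_data, bins=20, alpha=0.7, label="First")
ax.hist(second_data, bins=20, alpha=0.7, label="Second")

# The legend picks up every artist that was given a label, in plotting order.
legend = ax.legend(loc="upper right")
legend_labels = [text.get_text() for text in legend.get_texts()]
plt.show()

print(legend_labels)
```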
Challenge
Step 6: Create a Styled Histogram Using Seaborn
In this step, you’ll shift from Matplotlib to Seaborn, which simplifies many visualization tasks and provides attractive default styling. You’ll also explore density curves, automatic theming, and custom color palettes.
What You’ll Learn in This Step
- Create histograms using Seaborn’s `histplot()` function
- Apply kernel density estimates (KDE) for smoothed distributions
- Style charts using Seaborn themes and color palettes
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Open the file: `6-step-six.ipynb` in the workspace.
Important: You must save the file with `Cmd+Shift+S` (Mac) or `Ctrl+Shift+S` (Windows/Linux) before clicking **Validate**.
How to Complete Each Task
> * Find the matching **code cell** labeled **Task 6.1**, **Task 6.2**, etc.
> * Write your code directly in that cell.
> * Run the cell using the **Run** button or `Shift+Enter`.
> * Save your progress using the **Save icon** or **File > Save and Checkpoint**.
>
> You do not need to use the terminal, create extra files, or use `plt.savefig()`; all code and output will appear inline.
Seaborn simplifies histogram creation while adding automatic styling. Unlike Matplotlib’s imperative style, Seaborn follows a declarative approach, where you pass both the dataset and column name directly.
Syntax Example
    import seaborn as sns
    sns.histplot(data=df, x="column_name")
    plt.show()
- `data`: The DataFrame containing your data.
- `x`: The column you want to visualize.
- `plt.show()`: Renders the chart inline.
Seaborn makes it easy to overlay a kernel density estimate (KDE) on top of your histogram. This smooth curve helps visualize the probability distribution of continuous data.
You can also control the statistic calculation using `stat="density"` and change the bar style with `element="step"` to improve how density is displayed.
Syntax Example
    sns.histplot(data=df, x="price", stat="density", kde=True, element="step")
    plt.show()
- `stat="density"`: Normalizes counts into a probability density.
- `kde=True`: Adds the smooth density curve.
- `element="step"`: Draws the histogram as one continuous step outline instead of separate bars (add `fill=False` to drop the fill and keep only the outline).
Seaborn includes built-in themes and color palettes that improve chart aesthetics automatically. You can control these globally using `sns.set_theme()` and `sns.set_palette()`.
Syntax Example
    sns.set_theme(style="whitegrid")
    sns.set_palette("Blues")
- `set_theme()`: Controls the overall style (gridlines, background, etc.).
- `set_palette()`: Defines the color palette used in subsequent plots.
Challenge
Step 7: Group, Style, and Enhance Seaborn Histograms
In this step, you’ll take Seaborn histograms to the next level by grouping data with the `hue` parameter, controlling legends, and applying accessibility-focused color palettes.
These advanced features allow you to clearly compare grouped distributions and make your charts accessible for all viewers.
What You’ll Learn in This Step
- Group histograms using Seaborn’s `hue` parameter
- Customize and remove legends
- Apply colorblind-friendly palettes
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Open the file: `7-step-seven.ipynb` in the workspace.
Important: You must save the file with `Cmd+Shift+S` (Mac) or `Ctrl+Shift+S` (Windows/Linux) before clicking **Validate**.
How to Complete Each Task
> * Find the matching **code cell** labeled **Task 7.1**, **Task 7.2**, etc.
> * Write your code directly in that cell.
> * Run the cell using the **Run** button or `Shift+Enter`.
> * Save your progress using the **Save icon** or **File > Save and Checkpoint**.
>
> You do not need to use the terminal, create extra files, or use `plt.savefig()`; all code and output will appear inline.
Seaborn’s `hue` parameter allows you to break down a distribution by category, displaying multiple groups within the same histogram.
When applied, Seaborn automatically assigns different colors to each group, making visual comparisons much easier.
Syntax Example
    sns.histplot(data=df, x="price", hue="category")
    plt.show()
- `hue="category"`: Splits data into groups based on unique category values.
- Seaborn automatically assigns distinct colors to each group.
Seaborn automatically adds a legend when using the `hue` parameter. However, you may want to customize or remove the legend depending on your chart layout.
You can remove the default legend by accessing the axis object and calling `ax.legend_.remove()`.
Syntax Example
    ax = sns.histplot(data=df, x="price", hue="category")
    ax.legend_.remove()
    plt.show()
- `ax.legend_.remove()`: Completely removes the existing legend.
After removing the default legend, you can manually re-add one using `ax.legend()`. This gives you full control over placement and appearance.
The most common approach is to simply call `ax.legend()` after plotting and specify the location.
Syntax Example
    ax.legend(loc="upper right")
- `loc`: Controls the legend placement. `"upper right"` is one of several valid location values.
Matplotlib Legend `loc` Options
Here are the full built-in strings you can use for `loc`:

| Keyword | Description |
| ---------------- | ------------------------------------------------------------ |
| `'best'` | Matplotlib automatically selects the least obstructive location |
| `'upper right'` | Top-right corner inside the axes |
| `'upper left'` | Top-left corner inside the axes |
| `'lower left'` | Bottom-left corner |
| `'lower right'` | Bottom-right corner |
| `'right'` | Center-right vertically |
| `'center left'` | Middle-left vertically |
| `'center right'` | Middle-right vertically |
| `'lower center'` | Bottom-center |
| `'upper center'` | Top-center |
| `'center'` | Dead-center of plot |

Matplotlib also accepts integer codes for these positions:

| String | Integer code |
| -------------- | -------- |
| `'best'` | 0 |
| `'upper right'` | 1 |
| `'upper left'` | 2 |
| `'lower left'` | 3 |
| `'lower right'` | 4 |
| `'right'` | 5 |
| `'center left'` | 6 |
| `'center right'` | 7 |
| `'lower center'` | 8 |
| `'upper center'` | 9 |
| `'center'` | 10 |

Example:
    ax.legend(loc=1)  # same as 'upper right'
Bonus: Fine-Tune Legend Position with `bbox_to_anchor`
You can combine `loc` with `bbox_to_anchor` for precise control over legend placement:
    ax.legend(loc='upper right', bbox_to_anchor=(1.2, 1))
- `bbox_to_anchor` sets the point the legend is anchored to. By default it is given in axes coordinates, where (0, 0) is the bottom-left and (1, 1) the top-right corner of the axes, so values above 1 push the legend outside the plot area.
- Very useful when you want legends outside the chart.
Seaborn includes built-in color palettes designed for accessibility. The `"colorblind"` palette ensures that your charts remain legible for viewers with common forms of color vision deficiency.
You can apply this palette using `sns.set_palette()` before plotting.
Syntax Example
    sns.set_palette("colorblind")
    sns.histplot(data=df, x="price", hue="category")
    plt.show()
- `sns.set_palette("colorblind")`: Applies a global colorblind-friendly palette for all plots.
You’ve made it to the end of the lab—nice job!
In this lab, you’ve worked through a full progression of histogram techniques including:
- Building histograms with both Matplotlib and Seaborn
- Controlling bin sizes to shape your charts
- Adding titles, axis labels, and rotating tick labels for better readability
- Styling your charts with colors, edges, and transparency
- Exporting charts as image files for reports or presentations
- Overlaying multiple datasets for side-by-side comparisons
- Using Seaborn’s `hue` parameter to group data by category
- Customizing legends and applying accessibility-friendly color palettes
Want to see everything combined?
Check out the `lab-recap.ipynb` notebook in your workspace. It pulls together everything you’ve learned into a fully polished, presentation-ready histogram.