Build a Box Plot in Python to Visualize Clinical Drug Trial Effectiveness
In this Code Lab, you'll visualize clinical drug trial effectiveness using both Seaborn and Plotly. You'll begin by building classic box plots in Seaborn to explore treatment group outcomes and learn key statistical features like medians, outliers, and IQRs. Then, you'll transition to Plotly to build interactive versions that allow deeper exploratory analysis. By the end, you'll understand both static and interactive visualization approaches used widely across healthcare, data science, and analytics roles.

Step 1: Build a Basic Box Plot to Explore Data Distributions
Box plots are a powerful tool for visualizing the distribution of continuous data. They help you quickly identify key statistical features such as medians, quartiles, variability, and potential outliers — all critical when comparing groups or evaluating trends.
In this Code Lab, you'll learn how to create your first box plots using Seaborn's `sns.boxplot()` function. While practicing box plot fundamentals, you'll work with real-world data from a clinical drug trial to explore treatment effectiveness across patient groups.
What You’ll Learn in This Step
- Build a basic box plot using Seaborn.
- Understand box plot components: median, quartiles, whiskers, and outliers.
- Assign variables to compare categories visually.
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Navigate to the workspace in the right panel.
- Open the file: `1-step-one.ipynb`.
info> Important: You must save your notebook (Ctrl/Cmd + S) before clicking Validate. Validation checks the most recent saved checkpoint.
How to Complete Each Task
- Find the matching code cell labeled `Task 1.1`, `Task 1.2`, etc.
- Write your code directly in that cell.
- Run the cell using the **Run** button or by pressing `Shift+Enter`.
- Save your progress using the **Save icon** or **File > Save and Checkpoint**.
- You do not need to use the terminal, create additional files, or call `plt.savefig()`. All code and output will appear inline.

Box plots help visualize how numerical data is distributed and compared across categories. They display key summary statistics including:
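- Median: The center line inside the box
- Interquartile Range (IQR): The height of the box, which spans from the first quartile (Q1, bottom edge) to the third quartile (Q3, top edge)
- Whiskers: Lines extending from the box to the most extreme values that still lie within 1.5 × IQR of the box edges (the default rule)
- Outliers: The dots beyond the whiskers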
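The components listed above can be reproduced numerically. The following is a minimal sketch, assuming a small made-up array of scores (not the lab dataset), that computes the median, quartiles, IQR, and the default 1.5 × IQR fences used to flag outliers:

```python
import numpy as np

# Made-up scores for illustration only; one value is deliberately extreme.
scores = np.array([4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3, 6.8, 11.0])

# Quartiles and median (NumPy's default linear interpolation).
q1, median, q3 = np.percentile(scores, [25, 50, 75])
iqr = q3 - q1

# Points outside these fences are drawn as outliers under the 1.5 x IQR rule.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = scores[(scores < lower_fence) | (scores > upper_fence)]

# Whiskers end at the most extreme points that are still inside the fences.
inside = scores[(scores >= lower_fence) & (scores <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr:.2f}")
print(f"whiskers: [{whisker_low}, {whisker_high}], outliers: {outliers}")
```

Note that the whiskers stop at actual data points, not at the fences themselves.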
Seaborn makes it easy to build box plots using the `sns.boxplot()` function. This function requires you to specify:
- `x`: The column containing your categorical groupings
- `y`: The column containing your numeric values to summarize
- `data`: The full DataFrame containing your dataset
In this task, you'll build your very first box plot using Seaborn. The dataset has already been loaded for you.
You'll plot:
- `treatment_group` on the x-axis
- `effectiveness_score` on the y-axis
This will allow you to visualize how the different treatment groups performed during the clinical trial.
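As a rough sketch of what such a plot call can look like, the example below builds a synthetic DataFrame as a stand-in for the preloaded lab dataset. The group names (`Placebo`, `Drug A`, `Drug B`) and the random scores are assumptions made for illustration; only the column names come from this lab.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the lab dataset (the real one is preloaded in the notebook).
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "treatment_group": np.repeat(["Placebo", "Drug A", "Drug B"], 50),
    "effectiveness_score": rng.normal(loc=5, scale=1.5, size=150),
})

# One box per treatment group, summarizing the distribution of scores.
ax = sns.boxplot(x="treatment_group", y="effectiveness_score", data=df)
ax.set_title("Effectiveness by treatment group")
```

In a notebook, the resulting axes render inline when the cell runs.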
Step 2: Customize Box Plot Styling and Grouping with Seaborn
Now that you’ve built your first basic box plot, it’s time to explore how to customize its appearance. Being able to control the visual styling of box plots is critical for making them both more readable and more insightful.
What You’ll Learn in This Step
- Apply Seaborn themes with `set_theme()`.
- Apply color palettes with `set_palette()`.
- Customize box width, whiskers, and outlier markers using plot arguments.
- Add subgroups using the `hue` parameter for grouped box plots.
These are common tasks you’ll perform when building box plots for presentations, reports, and exploratory data analysis — especially in fields like healthcare and clinical trials where subgroup comparison is essential.
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Navigate to the workspace in the right panel.
- Open the file: `2-step-two.ipynb`.
info> Important: You must save your notebook (Ctrl/Cmd + S) before clicking Validate. Validation checks the most recent saved checkpoint.
How to Complete Each Task
- Find the matching code cell labeled `Task 2.1`, `Task 2.2`, etc.
- Write your code directly in that cell.
- Run the cell using the **Run** button or by pressing `Shift+Enter`.
- Save your progress using the **Save icon** or **File > Save and Checkpoint**.
- You do not need to use the terminal, create additional files, or call `plt.savefig()`. All code and output will appear inline.

Seaborn's `set_theme()` function controls many default settings that affect the appearance of all plots you create. This includes:
- Background style
- Gridlines
- Axis tick styling
- Context scaling (for presentations vs papers)
You typically call `set_theme()` once at the beginning of your notebook to apply global styling.
`set_theme()` Syntax
`sns.set_theme(context="notebook", style="darkgrid", palette="deep", font="sans-serif", font_scale=1, color_codes=True, rc=None)`
- `style`: Controls background and gridlines (common values: `"whitegrid"`, `"darkgrid"`, `"white"`, `"dark"`, `"ticks"`; default: `"darkgrid"`)
- `palette`: Sets default colors for all categorical data (default: `"deep"`)
- `context`: Adjusts scaling (default: `"notebook"`; other options: `"paper"`, `"talk"`, `"poster"`)
- `font_scale`: Multiplier to control overall font size (default: `1`)
- `rc`: Dictionary for fine-grained Matplotlib overrides
Example
sns.set_theme(style="whitegrid", palette="pastel", context="notebook", font_scale=1.2)
`set_palette()` Function
While you can pass `palette=` directly to `set_theme()`, you can also call `set_palette()` separately:
`sns.set_palette("pastel")`
This changes the colors applied to categorical variables.
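A short sketch of how the two calls might be combined at the top of a notebook: the theme is set first, then the palette is adjusted on its own. The specific style and palette names are just example choices.

```python
import matplotlib as mpl
import seaborn as sns

# Global look: white background with gridlines, slightly larger fonts.
sns.set_theme(style="whitegrid", context="notebook", font_scale=1.2)

# Adjust only the colors afterwards, leaving the rest of the theme alone.
sns.set_palette("pastel")

# The active color cycle that later plots will draw from.
current_colors = sns.color_palette()
print(f"{len(current_colors)} colors in the active palette")
```

Because both functions change global defaults, every plot created afterwards in the same session picks up these settings.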
Here is a quick rule of thumb for most data visualization tasks:
- Use `set_theme()` for global appearance.
- Use `set_palette()` if you want to adjust colors separately after setting the theme.

### Customizing Box Appearances and Outlier Markers
Box plots in Seaborn aren’t just fixed visuals — you can control many aspects of their appearance directly through function parameters. This allows you to fine-tune the plot for clarity, aesthetics, or specific analytic goals.
Box Width (`width`)
The `width` parameter controls how wide each box appears. Narrower boxes can be useful when comparing many categories.
`sns.boxplot(..., width=0.5)`
- Default width is `0.8`.
- Useful values typically fall between `0.2` and `1.0`, depending on spacing.
Whisker Length (`whis`)
The `whis` parameter controls how far the whiskers can extend beyond the box.
`sns.boxplot(..., whis=1.5)`
- The default `whis=1.5` means each whisker extends to the most extreme data point within 1.5 times the interquartile range (IQR) of the box edge.
- Lower values produce shorter whiskers, so more points are drawn as outliers.
- Higher values include more data within the whiskers.
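To see the effect of `whis` in numbers rather than pixels, the sketch below uses Matplotlib's `cbook.boxplot_stats` helper, which to my understanding applies the same whisker rule that box plots are drawn with. The scores are synthetic, with a few extreme values added on purpose.

```python
import numpy as np
from matplotlib import cbook

# Synthetic scores plus three deliberately extreme values (illustration only).
rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(5, 1, 200), [0.5, 9.8, 11.2]])

flier_counts = {}
for whis in (1.0, 1.5, 3.0):
    # boxplot_stats returns one dict of summary statistics per dataset.
    stats = cbook.boxplot_stats(scores, whis=whis)[0]
    flier_counts[whis] = len(stats["fliers"])
    print(
        f"whis={whis}: whiskers at [{stats['whislo']:.2f}, {stats['whishi']:.2f}], "
        f"{flier_counts[whis]} outliers"
    )
```

The outlier count can only stay the same or shrink as `whis` grows, since longer whiskers absorb points that were previously flagged.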
Outlier Marker Styling (`flierprops`)
Outliers (or “fliers”) are drawn by Matplotlib as line markers, so they accept Matplotlib line properties. You can customize how outlier markers look using the `flierprops` argument.
`flier_props = dict(marker='o', markerfacecolor='red', markersize=6, linestyle='none')`
`sns.boxplot(..., flierprops=flier_props)`
- `marker`: Shape of the outlier marker (such as `'o'`, `'x'`, `'^'`)
- `markerfacecolor`: Fill color
- `markersize`: Size of outlier markers
- `linestyle='none'`: Disables connecting lines
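Putting the three styling arguments together might look like the sketch below. The DataFrame is a synthetic stand-in for the lab dataset, and the group names are assumptions; only the column names come from this lab.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the preloaded lab dataset.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "treatment_group": np.repeat(["Placebo", "Drug A", "Drug B"], 50),
    "effectiveness_score": rng.normal(loc=5, scale=1.5, size=150),
})

# Red circular outlier markers with no connecting line.
flier_props = dict(marker="o", markerfacecolor="red", markersize=6, linestyle="none")

ax = sns.boxplot(
    x="treatment_group",
    y="effectiveness_score",
    data=df,
    width=0.5,   # narrower boxes than the 0.8 default
    whis=1.5,    # default whisker rule, stated explicitly
    flierprops=flier_props,
)
```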
Note: If you omit `flierprops`, Seaborn uses default outlier styling.

### Grouping Categorical Data with the Hue Parameter
Box plots often become even more powerful when you compare multiple subgroups within each category. In Seaborn, the `hue` parameter allows you to split each box into subgroups based on a second categorical variable.
Instead of drawing a single box for each main category, Seaborn will draw multiple boxes side-by-side for each subgroup level.
Hue Syntax
`sns.boxplot(x="category_col", y="numeric_col", hue="subgroup_col", data=df)`
- `hue` must be set to a column that contains a second categorical variable.
- The unique values in the `hue` column define how many subgroup boxes get drawn within each main category.
Example
`sns.boxplot(x="treatment_group", y="effectiveness_score", hue="gender", data=df)`
This would draw multiple boxes for each `treatment_group`, split by `gender`.
Behind the Scenes
- Seaborn automatically assigns different colors (from the current palette) to each subgroup level.
- Grouped box plots help you quickly evaluate whether different populations respond differently across your main categories.
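A grouped box plot along these lines can be sketched as follows. The data is synthetic and the category values (`Placebo`, `Drug A`, `Drug B`, `Female`, `Male`) are assumptions standing in for whatever the lab dataset contains.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the preloaded lab dataset.
rng = np.random.default_rng(2)
df = pd.DataFrame({
    "treatment_group": np.repeat(["Placebo", "Drug A", "Drug B"], 50),
    "gender": np.tile(["Female", "Male"], 75),
    "effectiveness_score": rng.normal(loc=5, scale=1.5, size=150),
})

# Each treatment group is split into one box per gender, colored from the palette.
ax = sns.boxplot(x="treatment_group", y="effectiveness_score", hue="gender", data=df)

# The legend lists the subgroup levels that Seaborn found in the hue column.
legend_labels = sorted(t.get_text() for t in ax.get_legend().get_texts())
print(legend_labels)
```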
Step 3: Build and Customize Interactive Box Plots with Plotly
So far, you’ve created static box plots using Seaborn. While static plots are useful for many tasks, sometimes you need more interactivity when exploring or presenting your data. That’s where Plotly comes in.
In this step, you’ll use Plotly Express to build interactive box plots. With just a few lines of code, you can:
- Hover over data points to see exact values.
- Add subgroup comparisons automatically.
- Dynamically adjust orientation, ordering, and more.
Plotly integrates well with Pandas DataFrames and gives you immediate access to highly interactive visuals — perfect for clinical trial data exploration, presentations, or dashboards.
What You’ll Learn in This Step
- Create your first interactive box plot using `px.box()`.
- Control the orientation (horizontal or vertical) of your plot.
- Customize category ordering using `category_orders` to control how groups appear.
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Navigate to the workspace in the right panel.
- Open the file: `3-step-three.ipynb`.
info> Important: You must save your notebook (Ctrl/Cmd + S) before clicking Validate. Validation checks the most recent saved checkpoint.
How to Complete Each Task
- Find the matching code cell labeled `Task 3.1`, `Task 3.2`, etc.
- Write your code directly in that cell.
- Run the cell using the **Run** button or by pressing `Shift+Enter`.
- Save your progress using the **Save icon** or **File > Save and Checkpoint**.
- You do not need to use the terminal, create additional files, or call `plt.savefig()`. All code and output will appear inline.

`plotly.express.box()` creates interactive box plots directly from your DataFrame in a single function call. Compared to Seaborn, Plotly gives you full interactivity by default, with no additional configuration required.
You can immediately:
- Hover to see exact values.
- Zoom, pan, and export your plots.
- Display multiple subgroup comparisons.
`px.box()` Syntax
`px.box(data_frame, x=None, y=None, color=None, facet_col=None, orientation=None, points=None, category_orders=None, title=None, labels=None)`
Key Parameters You'll Use
- `data_frame`: The full Pandas DataFrame
- `x`: Categorical column for group comparisons
- `y`: Numeric column for measurement values
- `color`: Optional second categorical column to split colors
- `orientation`: `'v'` or `'h'` to control vertical/horizontal
- `points`: Controls whether individual data points are shown (`"all"`, `"outliers"`, `"suspectedoutliers"`, or the boolean `False`)
Controlling Box Plot Orientation in Plotly
By default, box plots in Plotly Express are drawn vertically — with categories on the x-axis and values on the y-axis. But sometimes, a horizontal orientation makes your chart easier to read, especially if:
- Category labels are long
- You have many groups to display
- The numeric range fits better horizontally
Plotly makes this change simple using the `orientation` parameter inside `px.box()`.
`orientation` Parameter
`orientation = "v"  # vertical (default)`
`orientation = "h"  # horizontal`
- `"v"` (vertical): Categories on x-axis, values on y-axis
- `"h"` (horizontal): Categories on y-axis, values on x-axis
Key Rule
When switching orientations, you also swap your `x` and `y` arguments.
`px.box(df, x="category_col", y="value_col")  # Vertical (default)`
`px.box(df, x="value_col", y="category_col", orientation="h")  # Horizontal`
Recent versions of Plotly Express can often infer the orientation from the column types, but setting `orientation` explicitly and placing your numeric column on the matching axis keeps the result unambiguous.

### Customizing Category Order with Plotly Express
By default, Plotly determines category order based on how they appear in your dataset. This might work for simple plots, but often you’ll want full control over how groups are displayed:
- Logical sorting (e.g. control group before treatment groups)
- Clinical priority
- Presentation-ready ordering
Plotly Express allows you to fully control this order using the `category_orders` parameter.
`category_orders` Parameter
`category_orders = {"column_name": ["first_value", "second_value", "third_value", ...]}`
- The key is the column you want to control.
- The list defines the exact order of categories.
- This ensures categories appear exactly in the order you specify, regardless of how they're ordered in the DataFrame.
Important
- `category_orders` accepts multiple columns if you need to control order for multiple dimensions (useful in faceting).
- Category names inside the list must exactly match the text values in your dataset.
- Category names are case-sensitive. `"Control"` and `"control"` are treated as different categories.
Controlling Point Display in Plotly Box Plots
Plotly box plots have the ability to show individual data points along with the boxes themselves. This can be helpful for:
- Showing the actual distribution of your observations
- Highlighting outliers
- Revealing clusters or gaps in the data
You control how points are displayed using the `points` argument inside `px.box()`.
`points` Options
| Value | What It Does |
| --------------------- | --------------------------------------------------------- |
| `"outliers"` | Show only points beyond the whiskers (default). |
| `"all"` | Show all individual data points. |
| `"suspectedoutliers"` | Show the points beyond the whiskers and highlight the most extreme ones (those more than 3×IQR beyond the box). |
| `False` | Do not show any individual points. |
Example
`points="all"`
This shows every observation as a jittered dot.
Tip: Showing all points can be helpful for presentations, but can look cluttered if you have very large datasets.

### Refining Box Width, Jitter, and Marker Opacity
When you display all data points on a box plot, you can improve readability by customizing:
- Box Width: Controls the width of each box relative to spacing between categories
- Jitter: Adds horizontal spread to individual points so they don’t overlap
- Opacity: Makes points partially transparent, reducing visual clutter when many observations overlap
These parameters help create clearer, presentation-ready visuals — especially when you have lots of data.
Parameters to Use:
| Parameter | What It Does |
| --------- | ----------------------------------------------------------------- |
| `boxmode` | Controls whether boxes are grouped side by side (`"group"`) or drawn on top of one another (`"overlay"`) |
| `jitter` | Controls how far apart individual points are spread horizontally |
| `opacity` | Controls how transparent the points are |
Note that `boxmode` is an argument of `px.box()`, while `jitter` and marker opacity are trace properties that you set after the figure is created, for example with `fig.update_traces()`.
Tip: A jitter between `0.2` and `0.4` is usually enough to reduce overlap without creating noise.

### Applying Titles, Labels, and Styling for Presentation
Once your plot shows all points and adjusted spacing, the last step is to prepare it for communication. This includes:
- Titles: Provide clear context about what the chart shows
- Axis Labels: Rename axes to be more descriptive and readable
- Templates: Apply a cohesive style for fonts, gridlines, and colors
Common Arguments for Presentation Styling
| Parameter | Purpose |
| ---------- | --------------------------------------- |
| `title` | Sets the main chart title |
| `labels` | Re-maps column names to friendly labels |
| `template` | Applies a built-in visual style |
Step 4: Add Advanced Interactivity to Plotly Box Plots
You've built interactive box plots using Plotly Express — now it's time to take advantage of some of Plotly's more advanced interactivity features.
These options allow you to:
- Control exactly what appears in tooltips when users hover over data points
- Display additional fields directly in the hover text
- Customize how quartiles are calculated
- Apply built-in visual templates for consistent presentation styling
These features are widely used in interactive dashboards, presentations, and exploratory data apps — helping users understand patterns more deeply by surfacing contextual details on demand.
What You’ll Learn in This Step
- Customize hover tooltips using `hovertemplate` for full text control.
- Add multiple fields to the hover data using `hover_data`.
- Control quartile calculation behavior with `quartilemethod`.
- Apply pre-built Plotly templates to quickly style your charts.
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Navigate to the workspace in the right panel.
- Open the file: `4-step-four.ipynb`.
info> Important: You must save your notebook (Ctrl/Cmd + S) before clicking Validate. Validation checks the most recent saved checkpoint.
How to Complete Each Task
- Find the matching code cell labeled `Task 4.1`, `Task 4.2`, etc.
- Write your code directly in that cell.
- Run the cell using the **Run** button or by pressing `Shift+Enter`.
- Save your progress using the **Save icon** or **File > Save and Checkpoint**.
- You do not need to use the terminal, create additional files, or call `plt.savefig()`. All code and output will appear inline.

When you hover over a box plot in Plotly, a tooltip appears with information about the data point. This tooltip is useful, but you may want more control over what appears and how it's displayed.
The `hovertemplate` property lets you define the entire content of the tooltip using placeholder variables.
`hovertemplate` Syntax:
`fig.update_traces(hovertemplate="Label: %{x}<br>Value: %{y}<extra></extra>")`
- `%{x}` and `%{y}` insert the values from the x and y axes.
- `<br>` adds a line break.
- `<extra></extra>` removes the default trace label (which you often don't need in box plots).
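Supported Placeholders
- `%{x}`: Category label or axis value
- `%{y}`: Numeric value or axis value
- Grouping label: when `color` is applied, each group is drawn as its own trace, and the group label appears as the trace name in the secondary tooltip box rather than through a `%{color}` placeholder
- `%{customdata[i]}`: Value from a custom data array (advanced)
- `<extra></extra>`: Hides the trace label in the tooltip

### Add Extra Fields to Hover Tooltips with `hover_data`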
While `hovertemplate` gives you full control over tooltip formatting, sometimes you simply want to include extra data fields without customizing every line. That's where the `hover_data` parameter comes in.
This parameter lets you specify which columns from your dataset should be included in the tooltip, and whether each one is shown or hidden.
`hover_data` Syntax
`px.box(data_frame=df, x="group_col", y="value_col", hover_data=["field1", "field2", "field3"])`
- Provide a list of column names to include in the tooltip.
- Columns must exist in your DataFrame.
- Plotly will automatically format and display them as additional lines.
Advanced Usage with Formatting
You can pass a dictionary instead of a list to control formatting and visibility:
`hover_data={"field1": True, "field2": False, "field3": ":.2f"}`
- `True`: Show this field
- `False`: Hide but keep data accessible
- `":.2f"`: Format numbers to two decimals

### Control Quartile Calculation and Apply Built-in Templates
In a box plot, the box represents the middle 50% of values — between the first and third quartile. The whiskers extend outward to capture variability beyond that.
Plotly lets you control how these boundaries are calculated with the `quartilemethod` trace property, which you can set with `fig.update_traces()`.
`quartilemethod` Options
- `quartilemethod="linear"`: Default method, using linear interpolation between data points
- `quartilemethod="inclusive"`: Splits the data at the median; with an odd sample size, the median is included in both halves
- `quartilemethod="exclusive"`: Splits the data at the median; with an odd sample size, the median is left out of both halves
These options change how the box edges are computed, which in turn shifts the whiskers and the shape of the box plot. Most of the time, you'll use `"linear"`, but other methods are helpful in regulated or statistical settings.
Plotly Templates
Plotly also supports built-in themes to instantly change the look of your chart — background, fonts, gridlines, and color cycles.
Use the `template` argument inside your `px.box()` call to apply one:
`template="plotly_dark"`
Popular Built-in Templates
- `"plotly"` (default)
- `"plotly_white"`
- `"plotly_dark"`
- `"ggplot2"`
- `"seaborn"`
- `"simple_white"`
- `"presentation"`
Step 5: Customize Layout and Axis Behavior in Plotly
Once your box plots are built and styled, you may need to refine their layout — especially for presentations, dashboards, or scientific reports. This step teaches you how to go beyond default styles and take full control over how your charts behave and appear.
You’ll learn how to annotate, arrange, and fine-tune axis behavior in ways that increase clarity and impact.
What You’ll Learn in This Step
- Add annotations to highlight key insights.
- Use facet panels to compare distributions side by side.
- Adjust axis ranges for more readable scaling.
- Apply multiple layout customizations in one plot.
You'll complete all tasks in a Jupyter notebook rather than in standalone `.py` files.
Open the Notebook
- Navigate to the workspace in the right panel.
- Open the file: `5-step-five.ipynb`.
info> Important: You must save your notebook (Ctrl/Cmd + S) before clicking Validate. Validation checks the most recent saved checkpoint.
How to Complete Each Task
- Find the matching code cell labeled `Task 5.1`, `Task 5.2`, etc.
- Write your code directly in that cell.
- Run the cell using the Run button or by pressing `Shift+Enter`.
- Save your progress using the Save icon or File > Save and Checkpoint.
You do not need to use the terminal or create additional files. All code and output will appear inline.
Annotations help you call out specific data points, groupings, or trends directly on a chart. This is especially useful in dashboards, presentations, or scientific communication where clarity is critical.
Plotly supports multiple ways to add annotations, but the most flexible is `add_annotation()`, a method of the figure object. You can also pass a list of annotations to `fig.update_layout()`.
Common Use Cases for Annotations:
- Highlighting the median or outlier in a group
- Calling attention to a specific treatment group
- Labeling a notable range in your plot
`add_annotation()` Syntax
`fig.add_annotation(x=..., y=..., text="Label", showarrow=True, arrowhead=1)`
- `x`, `y`: Coordinates of the annotation, in data units by default
- `text`: Text to display
- `showarrow`: Whether to draw an arrow
- `arrowhead`: Style of the arrowhead
Annotation Position Tips
- `x` and `y` should match values from your data (e.g. `"Group A"`, `4.5`).
- Use `xref="x"` and `yref="y"` to pin to data coordinates.
- Use `xref="paper"` and `yref="paper"` to anchor by plot size (from 0 to 1).
- Arrows are optional; they help when pointing at small outliers.

### Split Your Plot into Panels by Category with Facets
Sometimes it’s better to show multiple side-by-side plots rather than combine everything into one. Plotly lets you do this using facets — mini plots split by the values in a column.
This is ideal for visually comparing subgroups using consistent axes and styling.
How Faceting Works
Facets are created using:
- `facet_col=`: Splits the figure into columns, with panels placed side by side
- `facet_row=`: Splits the figure into rows, with panels stacked vertically
You provide the name of a categorical column, and Plotly makes a subplot for each unique value.
Facet Plot Parameters
`facet_col="columnName"`
`facet_row="columnName"`
`facet_col_wrap=2`
- `facet_col`: Assigns one panel per category along columns
- `facet_row`: Does the same, but along rows
- `facet_col_wrap`: Controls how many panels go in each row before wrapping

### Adjust Axis Ranges for Better Focus and Readability
By default, Plotly automatically sets the axes to fit your data. But sometimes, auto-scaling can distract or distort how data is perceived — especially if you want to zoom in on key ranges or create consistent axes across multiple charts.
This is where `range_x` and `range_y` come in handy.
What You Can Control
With `range_x` and `range_y`, you can:
- Zoom into a specific value range (e.g., `[2, 8]`).
- Create consistent scaling across multiple charts.
- Limit visual clutter when outliers are irrelevant.
Axis Range Parameters
`range_y=[lower_limit, upper_limit]`
`range_x=[lower_limit, upper_limit]`
These parameters accept a list of two numeric values.
Example context only:
`range_y=[3, 7]`

### Stack Multiple Layout Options in One Plot
As your visualizations evolve, you’ll often want to combine multiple layout-level controls into a single figure. This helps align your plot with its communication purpose — whether for dashboards, reports, or presentations.
In this final task, you’ll apply everything you’ve learned to produce a polished, custom box plot.
What Can Be Combined
Plotly Express lets you pass layout arguments directly into `px.box()`, including:
- Axis range: `range_y=[...]`, `range_x=[...]`
- Figure style: `template="..."`, `title="..."`
- Box display: `points=`, `notched=`, etc.
Example Parameters You Might Combine
`range_y=[2, 8]`
`template="plotly_dark"`
`title="Trial Response by Group"`
`points="all"`
Note: This is illustrative, not prescriptive.

## Lab Summary: Review What You Built
You’ve now completed a full data visualization pipeline using both Seaborn and Plotly Express — two very powerful libraries for visualizing statistical distributions in Python.
Across this lab, you:
- Learned how box plots reveal medians, quartiles, outliers, and overall spread
- Customized chart styling and layout using Seaborn themes and Plotly templates
- Used interactivity to explore subgroup effects and annotate key insights
- Practiced visual storytelling techniques relevant to real-world healthcare and analytics workflows
Explore the Final Chart
If you'd like to see how all of these techniques can be combined in a professional-quality chart, open the file below:
File: `lab-recap.ipynb`
This recap notebook shows a fully customized interactive box plot that combines color, layout, interactivity, and styling — all using the same dataset you've worked with throughout the lab.