Analyze Trends with Aggregates in T-SQL

Libraries: If you want this lab, consider one of these libraries.
Data

Analyze Trends with Aggregates in T-SQL

In this Code Lab, you will perform advanced data aggregation and analysis in SQL Server using only executable T-SQL queries. You will start by practicing core aggregate functions such as COUNT, SUM, AVG, MIN, and MAX to summarize data. Next, you will group your data using GROUP BY and filter aggregated results with the HAVING clause. Then, you will combine grouping, filtering, and sorting to create targeted reports. After that, you will work with window functions to calculate running totals and rankings while preserving row-level detail. Finally, you will build a multi-step query using Common Table Expressions (CTEs) to uncover top-performing products within underperforming regions.

Get started Contact sales

Lab Info

Level

Intermediate

Last updated

Oct 16, 2025

Duration

55m

Challenge

Step 1: Foundations of Aggregation
Step 1: Foundations of Aggregation

Raw datasets can contain thousands of rows, which makes them difficult to interpret at a glance. Aggregate functions provide a way to summarize this data into meaningful insights. Instead of analyzing individual rows, you can calculate totals, averages, minimums, maximums, or distinct counts.

In this step, you will practice using core aggregate functions against the dbo.SalesOrderHeader table. These functions will allow you to answer essential questions such as: How many rows are in a table? How many orders have a shipping date? How much revenue has been generated? What is the average order value?

By the end of this step, you’ll be comfortable using aggregate functions to summarize raw sales data into high-level metrics.

What You’ll Learn in This Step

You’ll learn how to:
- Use COUNT(*) to return the total number of rows in a table
- Use COUNT(column) to exclude nulls from a count
- Use COUNT(DISTINCT column) to measure unique values
- Use SUM() and AVG() to calculate totals and averages
- Use MIN() and MAX() to identify boundary values in a column
info> You will write all of your T-SQL answers directly in the 1-step-one.sql file provided in this environment.

When you are ready to run the query, simply press Run in the Terminal tab.
This will automatically execute the contents of 1-step-one.sql against the database and display the results.
You do not need to type any commands in the terminal manually.
Optional: How to Log Into the Database Manually

You don’t need to do this for this lab, as it will not be able to be validated. However, if you’d like to explore connecting to the database directly using the mssql-cli, here’s how to do it using the sqlcmd tool.

Database Login and Basics

To connect to the SQL Server manually, run this command in the terminal:

sqlcmd -S localhost -U SA -P 'P@ssword1'

#### What Happens Next

Once connected, you’ll see a prompt that looks like this:

1>

This means SQL Server is ready to accept commands.

Every query you type must end with GO on its own line. Example:

SELECT @@VERSION; GO

To exit, type:

QUIT

Again, this is purely optional and not needed to complete the tasks, it’s just here if you’d like to explore running SQL queries directly.
### Understanding COUNT(*): Counting All Rows

The COUNT() function in T-SQL is used to return the number of rows in a result set. When used with an asterisk, COUNT(*), it counts every row in the table, regardless of whether any columns contain null values.

This function is the simplest way to measure table size and verify that rows exist. It can be used with or without a WHERE clause.

Syntax
```
SELECT COUNT(*)
FROM table_name;
GO
```
- COUNT(*), counts all rows in the table
- FROM table_name, specifies the table to query
- GO, signals the end of the batch for execution in SQL Server
In this task, you’ll use COUNT(*) against the dbo.SalesOrderHeader table to determine the total number of orders in the database. ### Using COUNT(column): Count Only Non-NULL Values

The COUNT() function can count all rows or only those with a non-null value in a specific column. When written as COUNT(column_name), SQL Server ignores rows where column_name is NULL, and counts only rows with actual values.

Syntax
```
SELECT COUNT(column_name)
FROM table_name;
GO
```
- COUNT(column_name), returns the number of rows where the column is not null
- GO, ends the batch for execution in SQL Server
In this task, you will use the ShipDate field from dbo.SalesOrderHeader. ShipDate is a DATETIME column, and some orders may have a null value if a shipment has not been recorded. Using COUNT(ShipDate) returns the count of orders that have a non-null shipping date. ### Using COUNT(DISTINCT column): Counting Unique Values

The DISTINCT keyword removes duplicate values from a result set. When combined with the COUNT() function, it allows you to count the number of unique, non-null values in a column.

This pattern is helpful when you need to measure uniqueness, such as the number of unique customers or territories represented in a dataset.

Syntax
```
SELECT COUNT(DISTINCT column_name)
FROM table_name;
GO
```
- DISTINCT column_name, ensures duplicate values are eliminated before counting
- COUNT(...), counts the distinct, non-null values
- GO, ends the batch for execution in SQL Server
In this task, you will use COUNT(DISTINCT CustomerID) on the dbo.SalesOrderHeader table. The CustomerID column is an INT representing the customer who placed each order. Using this query shows how many unique customers exist in the data. ### Using SUM(): Calculating Totals

The SUM() function in T-SQL returns the total of all non-null values in a numeric column. It adds together the values row by row and produces a single aggregated result.

This is most often used for calculating revenue, quantities, or any cumulative metric.

Syntax
```
SELECT SUM(column_name)
FROM table_name;
GO
```
- SUM(column_name), returns the total of all non-null values in the column
- GO, marks the end of the batch in SQL Server
In this task, you will use SUM(TotalDue) on the dbo.SalesOrderHeader table. The TotalDue column is a DECIMAL field representing the total amount due for each order. Summing this column calculates the overall revenue across all orders in the dataset. ### Using AVG(): Calculating Averages

The AVG() function in T-SQL calculates the arithmetic mean of all non-null values in a numeric column. It adds all the values together, then divides the total by the number of non-null entries.

This function is used to measure typical values, such as the average transaction size or the mean quantity sold.

Syntax
```
SELECT AVG(column_name)
FROM table_name;
GO
```
- AVG(column_name), calculates the mean value of the column, ignoring nulls
- GO, ends the batch for execution in SQL Server
In this task, you will use AVG(TotalDue) on the dbo.SalesOrderHeader table. The TotalDue column is a DECIMAL that stores the total amount due for each order. Applying AVG() will return the average order value across all orders. ### Using MIN() and MAX(): Finding Boundary Values

The MIN() and MAX() functions in T-SQL return the lowest and highest non-null values from a column, respectively. They are often used on numeric or date columns to find boundaries in the data.

Both functions return a single value, and they can be combined in the same query to report both extremes.

Syntax
```
SELECT MIN(column_name), MAX(column_name)
FROM table_name;
GO
```
- MIN(column_name), returns the smallest non-null value
- MAX(column_name), returns the largest non-null value
- GO, ends the batch for execution in SQL Server
In this task, you will use MIN() and MAX() on the OrderDate column of the dbo.SalesOrderHeader table. OrderDate is a DATETIME field that stores when each order was placed. By applying these functions, you can identify the earliest and latest order dates in the dataset.
Challenge

Step 2: Dynamic Summaries with Conditional Logic
Step 2: Conditional Aggregation with CASE

Sometimes, not all rows in a dataset should be treated the same. Instead of returning a single overall total, you may need to separate values into categories and calculate different results depending on conditions. T-SQL provides this flexibility through the CASE expression.

By embedding CASE inside aggregate functions like SUM() or COUNT(), you can build conditional aggregations. This allows you to calculate totals for categories such as sales in different territories or counts of rows that meet certain criteria, all in one query.

In this step, you’ll work with conditional aggregation to:
- Use CASE expressions inside aggregate functions
- Write conditional SUM() queries
- Write conditional COUNT() queries
- Compare subtotals across different categories
By the end of this step, you will be able to apply logic to aggregations and create queries that separate data into meaningful conditional summaries.

What You’ll Learn in This Step

You’ll learn how to:
- Write CASE expressions inside SUM() and COUNT()
- Apply conditional logic to aggregate calculations
- Return multiple category-based results in a single query
info> You will write all of your T-SQL answers directly in the 2-step-two.sql file provided in this environment.

When you are ready to run the query, simply press Run in the Terminal tab.
This will automatically execute the contents of 2-step-two.sql against the database and display the results.
You do not need to type any commands in the terminal manually.

### Using CASE with SUM(): Conditional Totals

The CASE expression in T-SQL provides row-level conditional logic, similar to an IF statement. When used inside an aggregate function such as SUM(), it allows you to total values only for rows that meet specific conditions.

This makes it possible to calculate multiple subtotals in a single query, without having to write separate queries for each condition.

Syntax
```
SELECT
  SUM(CASE WHEN condition THEN value ELSE 0 END)
FROM table_name;
GO
```
- CASE WHEN condition THEN value ELSE 0 END, evaluates each row and returns the chosen value if the condition is true, otherwise returns 0
- SUM(...), adds together all returned values
- GO, ends the batch for execution
In this task, you will compare sales across regions by using SUM() with CASE. The dbo.SalesOrderHeader table contains TotalDue values, and it links to the dbo.SalesTerritory table through TerritoryID. By joining these tables, you can calculate total revenue for North America and Europe in the same query. ### Using CASE with COUNT(): Conditional Counts

The COUNT() function can be combined with a CASE expression to count only the rows that meet a specific condition. By returning a non-null value inside the CASE expression, the row will be included in the count. If the CASE returns NULL, the row will not be counted.

This allows you to create multiple conditional counts in a single query.

Syntax
```
SELECT
  COUNT(CASE WHEN condition THEN 1 END),
  COUNT(CASE WHEN other_condition THEN 1 END)
FROM table_name;
GO
```
- CASE WHEN condition THEN 1 END, returns 1 for matching rows and null otherwise
- COUNT(...), counts all non-null values, effectively counting only the rows that meet the condition
- GO, ends the batch for execution
In this task, you will use the DiscountAmount column in the dbo.SalesOrderDetail table to count order lines that include a discount and order lines that do not.
Challenge

Step 3: Multi-level Grouping and Reporting Structures
Step 3: Multi-level Grouping and Reporting Structures

When you use GROUP BY, aggregate functions like SUM() or COUNT() return totals for each group. However, in many cases, you need to filter those grouped results after aggregation. This is done with the HAVING clause.

The HAVING clause allows you to filter entire groups based on aggregate conditions, such as keeping only territories with revenue above a threshold or categories that meet a minimum number of orders. When combined with WHERE, you can apply conditions both before and after aggregation.

In this step, you’ll work with grouped queries to:
- Filter results using HAVING
- Apply thresholds to totals and counts
- Combine WHERE and HAVING to manage data at both the row level and the group level
By the end of this step, you’ll be able to create multi-level filtering logic that produces reports focused only on meaningful groups.

What You’ll Learn in This Step

You’ll learn how to:
- Filter grouped results using HAVING
- Apply conditions to aggregates like totals and counts
- Combine WHERE and HAVING for more precise reporting
info> You will write all of your T-SQL answers directly in the 3-step-three.sql file provided in this environment.

When you are ready to run the query, simply press Run in the Terminal tab.
This will automatically execute the contents of 3-step-three.sql against the database and display the results.
You do not need to type any commands in the terminal manually.

### Filtering Groups with HAVING

The HAVING clause filters groups after aggregation is applied. While WHERE removes rows before grouping, HAVING is used to remove entire groups that do not meet specified aggregate conditions.

This makes it possible to show only those groups that exceed a threshold.

Syntax
```
SELECT column_name, AGGREGATE_FUNCTION(other_column)
FROM table_name
GROUP BY column_name
HAVING AGGREGATE_FUNCTION(other_column) condition;
GO
```
- GROUP BY, creates groups for aggregation
- HAVING, applies conditions to the aggregate results
- GO, ends the batch for execution
In this task, you will calculate sales totals per territory, then filter so only territories where total sales are greater than 1,000,000 are returned. ### Combining WHERE and HAVING

WHERE and HAVING can be used together.
- WHERE filters rows before they are grouped
- HAVING filters the groups after aggregation
This allows precise filtering at both the row and group level.

Syntax
```
SELECT column_name, AGGREGATE_FUNCTION(other_column)
FROM table_name
WHERE row_condition
GROUP BY column_name
HAVING AGGREGATE_FUNCTION(other_column) condition;
GO
```
In this task, you will use WHERE to restrict rows by year, then use HAVING to return only those groups with sufficient order counts.
Challenge

Step 4: Post-aggregation Filtering Techniques
Step 4: Post-aggregation Filtering Techniques

Once you’ve grouped and aggregated data, the next step is to create polished reports that highlight meaningful results. This often requires applying additional filtering and ordering so the output is focused, relevant, and easy to interpret.

The HAVING clause can filter aggregated groups, while the ORDER BY clause can sort results in ascending or descending order. By combining these with TOP, you can highlight the most important categories or territories.

In this step, you’ll refine grouped queries to:
- Filter categories by sales totals using HAVING
- Sort results in descending order by revenue
- Use TOP with ORDER BY to display the highest-performing groups
By the end of this step, you’ll be able to produce polished, ordered reports that highlight key trends across categories and territories.

What You’ll Learn in This Step

You’ll learn how to:
- Apply HAVING to filter grouped results
- Use ORDER BY to sort aggregated data
- Combine TOP with ORDER BY to highlight top results
info> You will write all of your T-SQL answers directly in the 4-step-four.sql file provided in this environment.

When you are ready to run the query, simply press Run in the Terminal tab.
This will automatically execute the contents of 4-step-four.sql against the database and display the results.
You do not need to type any commands in the terminal manually.

### Combining GROUP BY, HAVING, and ORDER BY

After grouping data, you can apply filtering and sorting to refine your results.
- GROUP BY groups rows into categories.
- HAVING filters out groups that do not meet a threshold.
- ORDER BY sorts the aggregated results, usually in descending order to show the highest values first.
These three clauses are often used together to create ranked and focused reports.

Syntax
```
SELECT group_column, AGGREGATE_FUNCTION(value_column)
FROM table_name
GROUP BY group_column
HAVING AGGREGATE_FUNCTION(value_column) condition
ORDER BY AGGREGATE_FUNCTION(value_column) DESC;
GO
```
In this task, you will group sales by product category, filter to include only categories with total sales greater than 200,000, and then order the results in descending order of sales totals. ### Using TOP with ORDER BY for Ranked Results

The TOP keyword limits the number of rows returned by a query. When combined with ORDER BY, it allows you to highlight the top results based on a specific metric.

This is especially useful for ranking scenarios, such as displaying the top-performing territories by revenue.

Syntax
```
SELECT TOP (n) group_column, AGGREGATE_FUNCTION(value_column)
FROM table_name
GROUP BY group_column
ORDER BY AGGREGATE_FUNCTION(value_column) DESC;
GO
```
- TOP (n), limits results to the first n rows
- GROUP BY, groups data by the specified column
- ORDER BY ... DESC, sorts results from highest to lowest
- GO, ends the batch for execution
In this task, you will calculate sales totals by territory, then return only the top 5 highest-revenue territories using TOP and ORDER BY.
Challenge

Step 5: Window Functions for Comparative Analysis
Step 5: Window Functions for Comparative Analysis

Aggregate functions summarize data across groups, but they remove row-level detail. Window functions extend this idea by applying aggregate and ranking logic across partitions of rows while still returning individual rows. This allows you to compare each row against group totals, assign rankings, or calculate running values.

Window functions are written with an OVER() clause, which can include:
- ORDER BY to define row order
- PARTITION BY to divide rows into groups
In this step, you’ll use window functions to:
- Assign row numbers within groups
- Rank products by sales totals
- Calculate running totals across all orders
- Reset running totals by partitioning data by year
By the end of this step, you’ll be able to apply ranking, ordering, and cumulative calculations that preserve row-level detail while incorporating group metrics.

What You’ll Learn in This Step

You’ll learn how to:
- Use ROW_NUMBER() with PARTITION BY and ORDER BY
- Apply RANK() to assign rankings within partitions
- Calculate cumulative totals with SUM() OVER(ORDER BY ...)
- Reset running totals using PARTITION BY in window functions
info> You will write all of your T-SQL answers directly in the 5-step-five.sql file provided in this environment.

When you are ready to run the query, simply press Run in the Terminal tab.
This will automatically execute the contents of 5-step-five.sql against the database and display the results.
You do not need to type any commands in the terminal manually.

### Using ROW_NUMBER() with PARTITION BY

The ROW_NUMBER() function assigns a unique sequential number to each row within a partition. The numbering is determined by the ORDER BY clause inside OVER().

This function is useful for creating ordered lists, identifying top records, or referencing row positions within a group.

Syntax
```
SELECT ROW_NUMBER() OVER(PARTITION BY column_name ORDER BY value_column DESC) AS RowNum
FROM table_name;
GO
```
- ROW_NUMBER(), assigns sequential numbers starting from 1
- PARTITION BY, divides rows into groups (the numbering restarts for each group)
- ORDER BY, defines the order of rows within each group
- GO, ends the batch for execution
In this task, you will assign row numbers to orders within each territory. The results will be partitioned by TerritoryID and ordered by TotalDue in descending order. ### Using RANK() with PARTITION BY

The RANK() function assigns a ranking number to rows within a partition, based on the specified ORDER BY. Rows with the same value receive the same rank, but the next rank will be skipped. For example, if two rows share rank 1, the following row will be ranked 3.

This makes RANK() ideal for comparing items within categories, such as ranking products by sales within each territory.

Syntax
```
SELECT column_name,
       RANK() OVER(PARTITION BY group_column ORDER BY value_column DESC) AS Rank
FROM table_name;
GO
```
- RANK(), assigns ranking values starting at 1
- PARTITION BY, restarts the ranking for each group
- ORDER BY, determines the order of ranking within each partition
- GO, ends the batch for execution
In this task, you will calculate total sales per product and rank products by sales within each territory using PARTITION BY TerritoryID. ### Calculating Running Totals with SUM() OVER(ORDER BY ...)

A running total is a cumulative sum that adds values in sequence. In T-SQL, you can calculate running totals using SUM() as a window function with OVER(ORDER BY ...).

This allows you to track cumulative metrics, such as sales over time, without collapsing the rows into groups.

Syntax
```
SELECT column_name,
       SUM(value_column) OVER(ORDER BY column_name) AS RunningTotal
FROM table_name;
GO
```
- SUM(value_column), performs the aggregation
- OVER(ORDER BY column_name), defines the order for the cumulative calculation
- GO, ends the batch for execution
In this task, you will calculate a running total of sales across all orders by ordering results by OrderDate. ### Resetting Running Totals with PARTITION BY

Window functions can be partitioned so that calculations restart for each group. By adding PARTITION BY inside the OVER() clause, you can reset rankings, running totals, or other calculations whenever the partition value changes.

When combined with a running total, this ensures each group starts its cumulative calculation from zero.

Syntax
```
SELECT column_name,
       SUM(value_column) OVER(PARTITION BY group_column ORDER BY order_column) AS RunningTotal
FROM table_name;
GO
```
- PARTITION BY group_column, restarts the calculation for each group
- ORDER BY, defines the order of rows within each partition
- SUM(value_column), accumulates totals within each partition
- GO, ends the batch for execution
In this task, you will calculate a running total of sales that resets for each year using PARTITION BY YEAR(OrderDate).
Challenge

Step 6: Synthesizing Insights Through Complex Queries
Step 6: Synthesizing Insights Through Complex Queries

Individually, aggregate functions, grouping, filtering, and window functions provide powerful insights. Combined, they allow you to design advanced queries that deliver multi-layered analysis. Common Table Expressions (CTEs) provide a way to structure these complex queries so they remain readable and reusable.

A CTE is defined with the WITH clause and acts like a temporary result set. You can use it to prepare aggregated data, then apply additional calculations such as ranking or percentages in the outer query.

In this step, you’ll bring together everything you’ve learned to:
- Build a CTE that joins multiple tables and calculates grouped sales data
- Apply WHERE and HAVING for multi-level filtering
- Use ranking functions like DENSE_RANK() to order categories by sales
- Calculate percentages using windowed functions to compare categories within territories
By the end of this step, you’ll be able to construct advanced queries that combine aggregation, grouping, filtering, and window functions into a single solution.

What You’ll Learn in This Step

You’ll learn how to:
- Write a Common Table Expression (CTE) with WITH
- Aggregate and filter sales data across multiple tables
- Apply ranking with DENSE_RANK()
- Calculate percentages of group totals using window functions
info> You will write all of your T-SQL answers directly in the 6-step-six.sql file provided in this environment.

When you are ready to run the query, simply press Run in the Terminal tab.
This will automatically execute the contents of 6-step-six.sql against the database and display the results.
You do not need to type any commands in the terminal manually.

### Using Common Table Expressions (CTEs)

A Common Table Expression (CTE) is a temporary result set defined using the WITH clause. CTEs are useful for breaking down complex queries into smaller, manageable steps. You can join multiple tables, aggregate data, and filter results inside the CTE, then select from it in the outer query.

Syntax
```
WITH cte_name AS (
    SELECT column_name, AGGREGATE_FUNCTION(value_column)
    FROM table_name
    WHERE condition
    GROUP BY column_name
    HAVING AGGREGATE_FUNCTION(value_column) condition
)
SELECT *
FROM cte_name;
GO
```
- WITH cte_name AS (...), defines the CTE
- Inside the parentheses, you can use joins, grouping, filtering, and aggregation
- The outer SELECT retrieves results from the CTE
- GO, ends the batch for execution
In this task, you will create a CTE that joins orders, order details, products, and territories. You will group by both territory and product category, calculate sales totals and order counts, and filter for orders in 2023 and categories with at least 10 orders. ### Applying Rankings and Percentages to Aggregated Results

Once a CTE has been created to aggregate sales data, you can apply additional calculations in the outer query to compare categories and highlight top performers.

Two common techniques are:
- DENSE_RANK() → Assigns rankings without gaps, restarting for each partition. This allows you to order categories by sales within each territory.
- Percentage of totals → Using a windowed SUM() with PARTITION BY to calculate each category’s share of the overall total in its territory.
Syntax
```
SELECT column_name,
       DENSE_RANK() OVER(PARTITION BY group_column ORDER BY value_column DESC) AS Rank,
       value_column * 1.0 / SUM(value_column) OVER(PARTITION BY group_column) AS Percentage
FROM cte_name;
GO
```
- DENSE_RANK(), assigns a rank to each row in a partition without skipping ranks
- PARTITION BY, defines the group in which ranks or percentages are calculated
- SUM(value_column) OVER(PARTITION BY group_column), computes the group total to use in a percentage calculation
- Multiplying by 1.0 ensures correct decimal division
- GO, ends the batch for execution
In this task, you will apply ranking and percentage calculations to the results of your aggregated CTE, ordering by territory and sales rank. ### Conclusion

In this lab, you practiced core T-SQL techniques for aggregating and filtering data:
- Calculated metrics using functions like COUNT(), SUM(), AVG(), MIN(), and MAX()
- Applied conditional logic inside aggregate functions
- Grouped data using GROUP BY and filtered results with HAVING
- Used ROW_NUMBER(), RANK(), and DENSE_RANK() to assign row-level orderings
- Created running totals and percentage calculations using window functions
- Built Common Table Expressions (CTEs) to structure multi-step queries
These skills support data analysis tasks such as summarizing trends, ranking results, and segmenting data. Review each concept as needed to reinforce your understanding.

About the author

Pluralsight Code Lab Authors

Pluralsight Code Labs offer an opportunity to get hands-on learning in real-time. Be it a Challenge, Sandbox, or Guided Lab, these provide the real world experience needed to succeed in your development role.

Real skill practice before real-world application

Hands-on Labs are real environments created by industry experts to help you learn. These environments help you gain knowledge and experience, practice without compromising your system, test without risk, destroy without fear, and let you learn from your mistakes. Hands-on Labs: practice your skills before delivering in the real world.

Learn by doing

Engage hands-on with the tools and technologies you’re learning. You pick the skill, we provide the credentials and environment.

Follow your guide

All labs have detailed instructions and objectives, guiding you through the learning process and ensuring you understand every step.

Turn time into mastery

On average, you retain 75% more of your learning if you take time to practice. Hands-on labs set you up for success to make those skills stick.