In other guides, you have learned how to write basic SQL queries to retrieve data from a table. In real-life applications, you would need to fetch data from multiple tables to achieve your goals. To do so, you would need to use SQL joins. In this guide, you will learn how to query data from multiple tables using joins.
In order to understand joins, you should first understand the concept of a Cartesian product. In mathematics, this is the product of two sets. For example, a set of two items multiplied by a set of six items will produce a set of twelve items.
In databases, a Cartesian product is the result of joining every row of one input table to all rows of another table. Therefore, the product of a table of two rows and a table of three rows would be a set of six rows.
Example: A Cartesian product of orders and items
Table 1: Orders
orderid |
---|
1 |
2 |
Table 2: Items
itemid |
---|
1 |
2 |
3 |
Result: Each row of orders would multiply by each row of items, and the result would be as below.
orderID | itemid |
---|---|
1 | 1 |
1 | 2 |
1 | 3 |
2 | 1 |
2 | 2 |
2 | 3 |
In SQL, to fetch data from multiple tables, the join operator is used. The join operator adds or removes rows in the virtual table that is used by SQL server to process data before the other steps of the query consume the data. Joins can be of the following categories:
A cross join, also known as a Cartesian product, adds all possible combinations of the two input tables' rows to the virtual table. All data filtering of the rows will be done in the where
clause.
The inner join operator first creates a Cartesian product, and then filters the results using the predicate supplied in the ON clause, removing any rows from the virtual table that do not satisfy the predicate. It is the most commonly used join type.
Inner joins are used to retrieve data that has been stored across multiple tables. As discussed earlier, an inner join begins its logical processing phase as a Cartesian product that is then filtered to remove any rows that don't match the predicate.
It is important to note that INNER JOINS returns only rows where a match is found in both input tables.
The inner join syntax is as follows:
1SELECT o.orderid, o.amount, i.description
2FROM orders o
3INNER JOIN items i
4ON o.itemid = i.itemid
Some key elements to note from the query above:
o
and is used in the ON statement as well as in the SELECT statement.Consider the tables below.
Table Orders
orderid | itemid | qty |
---|---|---|
1 | 1 | 3 |
2 | 2 | 4 |
3 | 2 | 1 |
4 | 3 | 1 |
Table Items
itemid | itemprice | itemdesc |
---|---|---|
1 | 10 | stuff |
2 | 5 | lorum |
In the tables above, if we needed to retrieve orderid
, order quantity
, item price
, and itemdesc
for all orders that have an item, the query would be as follows.
1SELECT o.orderid, o.qty, i.itemprice, i.itemdesc
2FROM orders o
3INNER JOIN items i
4on o.itemid = i.itemid
The result set would be as below.
orderid | qty | itemprice | itemdesc |
---|---|---|---|
1 | 3 | 10 | stuff |
2 | 4 | 5 | lorum |
3 | 1 | 5 | lorum |
From the result above, not that orderid 4 did not come into the result set because itemid 3 in the orders table is not found in the items table. Since INNER JOIN only returns rows that exists in both tables, orderid 4 was filtered out from the result set.
So far, you've learned how to use inner joins to match rows in separate tables. SQL Server built the results of an inner join query by filtering out rows that failed to meet the conditions expressed in the ON clause predicate. The result is that only rows that matched from both tables were displayed. With an outer join, you may choose to display all the rows from one table, along with those that match from the second table.
Here are some important notes about outer joins:
The following will return all rows from the first table (a) and only matching rows from the second table (b).
1FROM a LEFT OUTER JOIN b
2on a.id = b.id
The following will return all rows from the second table (b) and only matching rows from the first table (a).
1FROM a RIGHT OUTER JOIN b
2on a.id = b.id
Consider the tables of orders and items used in the inner join example above.
Let's say we need to write a query to retrieve the orderid, order quantity, item price and itemdesc of all orders. If the order does not have an itemid in the items table, the order details should be returned, but the item details should be displayed as NULL.
In such a scenario, a LEFT (outer) JOIN would be preferred because it will return all rows from the first table and only matching rows from the second table.
The query would be as follows.
1SELECT o.orderid, o.qty, i.itemprice, i.itemdesc
2FROM orders o
3LEFT JOIN items i
4on o.itemid = i.itemid
The result set would be as below.
orderid | qty | itemprice | itemdesc |
---|---|---|---|
1 | 3 | 10 | stuff |
2 | 4 | 5 | lorum |
3 | 1 | 5 | lorum |
4 | 3 | NULL | NULL |
Cross join queries create a Cartesian product, which you have learned about earlier in this guide.
To explicitly create a Cartesian product, you would use the CROSS JOIN operator. This will create a result set with all possible combinations of input rows.
When writing queries with CROSS JOIN, remember that there is no matching of rows performed, and therefore no ON clause is required.
Consider the example below.
Table A
A |
---|
1 |
2 |
3 |
Table B
B |
---|
X |
Y |
Z |
1SELECT *
2FROM A
3CROSS JOIN B
The result would be
A | B |
---|---|
1 | X |
1 | Y |
1 | Z |
2 | X |
2 | Y |
2 | Z |
3 | X |
3 | Y |
3 | Z |
So far, you have learned about joins that involve separate multiple tables. There may be scenarios in which you need to compare and retrieve data stored in the same table.
For example, in a human resources application, an Employees table might include data about the supervisor of each employee in the employee's own row. Each supervisor is also listed as an employee. The table below could demonstrate such a scenario.
empid | empname | supervisorid |
---|---|---|
1 | emp a | 2 |
2 | emp b | 3 |
3 | emp c | 4 |
4 | emp d | 1 |
To retrieve the employee information and match it to the related supervisor, you can use the table twice in your query, joining it to itself for the purposes of the query.
To accomplish tasks like this, you should consider the following guidelines:
1SELECT e.empid ,e.empname AS empname, s.empname AS supervisorname
2FROM HR.Employees AS e
3JOIN HR.Employees AS s
4ON e.supervisorid=s.empid
This would return each employee with their respective supervisor.
empid | empname | supervisorname |
---|---|---|
1 | emp a | emp b |
2 | emp b | emp c |
3 | emp c | emp d |
4 | emp d | emp a |
In this article, you learned how to query query data using joins in SQL Server. In the next guide, we'll discuss how to handle errors in SQL Server.
Happy coding!
Explore these SQL Server courses from Pluralsight to continue learning: