SELECT statements are the most popular commands among SQL developers. SELECT statements can be used to query tables and views. Moreover, you can manipulate the data received from SQL Server using the SELECT statement.
In this guide, you will learn:
How to write simple SELECT statements.
How to eliminate duplicates using the DISTINCT clause.
How to use column and table aliases.
How to use case expressions.
The elements of a select statement include:
SELECT: specifies which column to return.
FROM: specifies from which table to fetch the data.
WHERE: specifies how to filter the data.
GROUP BY: arranges the data to be returned by groups. ‘
HAVING: filters groups by predicates.
ORDER BY: sorts the results.
While a query is generally written in the sequence above, it is important to be mindful of the sequence that SQL Server executes it in. Keep in mind that the FROM, WHERE, GROUP BY, and HAVING clauses are evaluated by the query engine before the contents of the SELECT clause. Consequently, elements that you write in the SELECT clause will not be visible to the other clauses; this is particularly important for calculated columns and aliases. To learn more about the order of execution of SQL queries, please read my previous guide here.
The SELECT clause is where you specify the columns that will be returned from your table or view. The FROM clause determines from where SQL Server would find those columns. Generally, you would select data from a table of a view. However, in SQL Server, you can have several tables or views that share the same name; this can cause confusion and SQL Server will return an error. To avoid those errors, considering querying a table by using both the SCHEMA name and the table or view name.
In the query below, “sales" is the schema name and “orders” is the table name.
SELECT itemid, itemprice, itemquantity FROM sales.orders
As shown above, to display columns in your query, you need to delimit the column names by a comma. The order of the columns in your list will determine how they are displayed in the output, regardless of how they are listed in the tables.
You can also select all columns from a table by using the SELECT * clause.
SELECT * FROM sales.orders
This is suitable for a quick test and as a best practice, but you should avoid using it in production work. Changes made to the table will cause the query to retrieve all current columns in the table’s current defined order. This could cause bugs or other failures in reports or applications expecting a known number of columns returned in a defined order.
The SELECT clause not only helps you to retrieve columns from the source tables, it also allows you to make calculations and create calculated columns on the fly.
1 2 3 4 5
Using the table above, you could easily calculate the total price (itemprice x itemquantity)l by multiplying the column itemprice by itemquantity. The result would be as follows: ```sql SELECT itemid, itemprice, itemquantity, (itemprice*itemquantity) FROM sales.orders
The result would appear in a new column which would be repeated once per row of the result set. Calculated expressions in a SELECT clause must be scalar - they must return only a single value. The following operations can be used in calculated columns.
1 2 3 4 5 6 7
| Operator | Description | |----------|-------------------| | + | Add/Concatenate | | - | Subtract | | * | Multiply | | / | Divide | | % | Modulo |
Even if a table enforces uniqueness by using primary keys, T-SQL queries may display duplicate rows. This is because, by default, a SELECT statement fetches all of the rows from a table.
For example, consider a customer table.
SELECT city FROM customer
Below is a partial result of the table. As you may notice, even if the customer table enforces uniqueness using a primary key, in this case, the “city” column has duplicates because several customers may live in the same city.
1 2 3 4 5 6 7
| City | |-------------------| | New York | | Cape Town | | New York | | Delhi | | New York |
To get a unique list of cities in the customer table, we should use the keyword DISTINCT. By replacing the default SELECT ALL clause with SELECT DISTINCT, all of the duplicates will be removed in the resultset.
SELECT DISTINCT specifies that the result set must contain only unique rows. However, it is important to note that the DISTINCT option only works on the set of columns returned by the SELECT clause. It does not take into consideration any other unique columns in the table.
Please see below an example of how DISTINCT can be used.
SELECT DISTINCT city FROM customer
1 2 3 4 5
| City | |-------------------| | New York | | Cape Town | | Delhi |
Moreover, DISTINCT also works with multiple columns. In these cases, DISTINCT will return the unique combination of each column.
For example, consider a values table.
SELECT category, subcategory FROM values
1 2 3 4 5 6 7
| Category | Subcategory | |---------------|-------------------| | A | X | | A | X | | B | Y | | C | Y | | B | Y |
Therefore, selecting distinct categories and subcategories will return the unique combination of those columns.
SELECT DISTINCT category, subcategory FROM values
1 2 3 4 5
| Category | Subcategory | |---------------|-------------------| | A | X | | B | Y | | C | Y |
When selecting data from a table or a view, SQL Server will name each column the same as its source. You can however, rename each column with the name of your choice by using aliases. Concerning tables, you can use aliases in the FROM clause to provide a convenient way of referring to a table elsewhere in the query; thus enhancing readability.
There are three ways that aliases can be used in a query:
Using the AS keyword SELECT city AS location FROM customer
Using the equal sign SELECT location = city FROM customer
Using the alias following the column name
SELECT city location FROM customer
All of the above would have the result as follows
1 2 3 4 5 6 7
| Location | |-------------------| | New York | | Cape Town | | New York | | Delhi | | New York |
Moreover, aliases can also be used for calculated fields. If you remember, in the previous example where we used a calculated column, the header of the result was blank. To solve this issue, consider using an alias for the calculated column.
SELECT itemid, itemprice, itemquantity, (itemprice*itemquantity) AS total FROM sales.orders
1 2 3 4
| itemid | itemprice | itemquantity | total | |---------|-----------|--------------|---------| | 1 | 10 | 3 | 30 | | 2 | 5 | 4 | 20 |
Aliases can also be used in the FROM clause to refer to a table. This is often used to improve readability and as a shorthand for SQL developers when they need to reference long table names several times in a query.
A table alias can be used as follows:
Using the AS keyword SELECT city FROM customer as c
Without the AS keyword SELECT city FROM customer c
Combining table and column alias SELECT c.city as location FROM customer as c
When using an alias, it is important to pay attention to the logical processing order of queries on SQL Server. This can cause some issues if it is not used correctly.
You may recall from my previous guide that SQL Server executes queries in the following order:
Consider the query below. It will not execute because the alias total is in the SELECT statement which executes after the WHERE clause. That is, when the WHERE clause is being executed, SQL Server won't know the alias "total".
1 2 3
SELECT itemid, itemprice, itemquantity, (itemprice*itemquantity) AS total FROM sales.orders WHERE total > 10
However, the following will work fine because ORDER BY is executed last when the SELECT has been executed. In this example, SQL Server knows the alias "total".
1 2 3
SELECT itemid, itemprice, itemquantity, (itemprice*itemquantity) AS total FROM sales.orders ORDER BY total
A CASE expression enhances the ability of a SELECT clause to manipulate data when it is retrieved. When writing queries, you often need to substitute a value in a column with another value; this is where CASE expressions are used.
For example, assume you are selecting data from the sales.orders table and, for some reason, when the price is less than 50 you need to display "low" and when the price is greater than or equals 50 you need to display "high". You would then use the CASE expression as follows:
1 2 3 4 5 6
SELECT itemid, itemprice, itemquantity, CASE itemquantity WHEN < 50 THEN 'LOW' ELSE 'HIGH' END AS price_category FROM sales.orders
The results would then look like:
1 2 3 4
| itemid | itemprice | itemquantity | price_category | |---------|-----------|--------------|----------------| | 1 | 10 | 350 | HIGH | | 2 | 5 | 4 | LOW |
In this guide, you learned how to query data in SQL Server.