For writing a correct and robust T-SQL code, it’s crucial to first understand the roots of the language, as well as a concept called logical query processing. You also need to understand the structure of the SELECT statement and how to use its clauses to perform data manipulation tasks like filtering and sorting. We often are required to combine data from different sources, and one of the ways to achieve this in T-SQL is using set operators.

Select (Transact-SQL)

Retrieves rows from the database and enables the selection of one or many rows or columns from one or many tables in SQL Server. The full syntax of the SELECT statement is complex, but the main clauses can be summarized as –

[ WITH { [ XMLNAMESPACES ,] [ <common_table_expression> ] } ]

SELECT select_list [ INTO new_table ] [ FROM table_source ] [ WHERE search_condition ] [ GROUP BY group_by_expression ] [ HAVING search_condition ] [ ORDER BY order_expression [ ASC | DESC ] ]

The UNION, EXCEPT, and INTERSECT operators can be used between queries to combine or compare their results into one result set.

Syntax

Due to the complexity of the SELECT statement, detailed syntax elements and arguments are shown by clause –

WITH XMLNAMESPACES

The order of the clauses in the SELECT statement is significant. Any one of the optional clauses can be omitted, but when the optional clauses are used, they must appear in the appropriate order.

SELECT statements are permitted in user-defined functions only if the select lists of these statements contain expressions that assign values to variables that are local to the functions.

A four-part name constructed with the OPENDATASOURCE function as the server-name part can be used as a table source wherever a table name can appear in a SELECT statement. A four-part name cannot be specified for Azure SQL Database.

Some syntax restrictions apply to SELECT statements that involve remote tables.

Logical Processing Order of the SELECT statement

The following steps show the logical processing order, or binding order, for a SELECT statement. This order determines when the objects defined in one step are made available to the clauses in subsequent steps. For example, if the query processor can bind to (access) the tables or views defined in the FROM clause, these objects and their columns are made available to all subsequent steps. Conversely, because the SELECT clause is step 8, any column aliases or derived columns defined in that clause cannot be referenced by preceding clauses. However, they can be referenced by subsequent clauses such as the ORDER BY clause. The actual physical execution of the statement is determined by the query processor and the order may vary from this list.

  1. FROM
  2. ON
  3. JOIN
  4. WHERE
  5. GROUP BY
  6. WITH CUBE or WITH ROLLUP
  7. HAVING
  8. SELECT
  9. DISTINCT
  10. ORDER BY
  11. TOP

Permissions

Selecting data requires SELECT permission on the table or view, which could be inherited from a higher scope such as SELECT permission on the schema or CONTROL permission on the table. Or requires membership in the db_datareader or db_owner fixed database roles, or the sysadmin fixed server role. Creating a new table using SELECTINTO also requires both the CREATETABLE permission, and the ALTERSCHEMA permission on the schema that owns the new table.

Illustration

A. Using SELECT to retrieve rows and columns

This section shows three code examples. This first code example returns all rows (no WHERE clause is specified) and all columns (using the *) from the DimEmployee table.

SELECT *
FROM DimEmployee
ORDER BY LastName;

This next example using table aliasing to achieve the same result.

SELECT e.*
FROM DimEmployee AS e
ORDER BY LastName;

This example returns all rows (no WHERE clause is specified) and a subset of the columns (FirstName, LastName, StartDate) from the DimEmployee table in the AdventureWorksPDW2012 database. The third column heading is renamed to FirstDay.

SELECT FirstName, LastName, StartDate AS FirstDay
FROM DimEmployee
ORDER BY LastName;

This example returns only the rows for DimEmployee that have an EndDate that is not NULL and a MaritalStatus of ‘M’ (married).

SELECT FirstName, LastName, StartDate AS FirstDay
FROM DimEmployee
WHERE EndDate IS NOT NULL
AND MaritalStatus = ‘M’
ORDER BY LastName;

B. Using SELECT with column headings and calculations

The following example returns all rows from the DimEmployee table, and calculates the gross pay for each employee based on their BaseRate and a 40-hour work week.

SELECT FirstName, LastName, BaseRate, BaseRate * 40 AS GrossPay
FROM DimEmployee
ORDER BY LastName;

C. Using DISTINCT with SELECT

The following example uses DISTINCT to generate a list of all unique titles in the DimEmployee table.

SELECT DISTINCT Title
FROM DimEmployee
ORDER BY Title;

D. Using GROUP BY

The following example finds the total amount for all sales on each day.

SELECT OrderDateKey, SUM(SalesAmount) AS TotalSales
FROM FactInternetSales
GROUP BY OrderDateKey
ORDER BY OrderDateKey;

Because of the GROUP BY clause, only one row containing the sum of all sales is returned for each day.

E. Using GROUP BY with multiple groups

The following example finds the average price and the sum of Internet sales for each day, grouped by order date and the promotion key.

SELECT OrderDateKey, PromotionKey, AVG(SalesAmount) AS AvgSales, SUM(SalesAmount) AS TotalSales
FROM FactInternetSales
GROUP BY OrderDateKey, PromotionKey
ORDER BY OrderDateKey;

F. Using GROUP BY and WHERE

The following example puts the results into groups after retrieving only the rows with order dates later than August 1, 2002.

SELECT OrderDateKey, SUM(SalesAmount) AS TotalSales
FROM FactInternetSales
WHERE OrderDateKey > ‘20020801’
GROUP BY OrderDateKey
ORDER BY OrderDateKey;

G. Using GROUP BY with an expression

The following example groups by an expression. You can group by an expression if the expression does not include aggregate functions.

SELECT SUM(SalesAmount) AS TotalSales
FROM FactInternetSales
GROUP BY (OrderDateKey * 10);

H. Using GROUP BY with ORDER BY

The following example finds the sum of sales per day, and orders by the day.

SELECT OrderDateKey, SUM(SalesAmount) AS TotalSales
FROM FactInternetSales
GROUP BY OrderDateKey
ORDER BY OrderDateKey;

I. Using the HAVING clause

This query uses the HAVING clause to restrict results.

SELECT OrderDateKey, SUM(SalesAmount) AS TotalSales
FROM FactInternetSales
GROUP BY OrderDateKey
HAVING OrderDateKey > 20010000
ORDER BY OrderDateKey;

SQL Best Practices

We will now Here are some SQL programming guidelines and best practices we collected, keeping quality, performance and maintainability in mind. This list is not complete at this moment, and will be constantly updated.

  • Do not use SELECT * in your queries.
  • Always use table aliases when your SQL statement involves more than one source.
  • Use the more readable ANSI-Standard Join clauses instead of the old style joins.
  • Do not use column numbers in the ORDER BY clause.
  • Always use a column list in your INSERT statements.
  • Don’t ever use double quotes in your T-SQL code.
  • Do not prefix your stored procedure names with “sp_”.
  • Always use a SQL formatter to format your sql like Instant SQL Formatter(Free and Online)

Do not use SELECT * in your queries, write out the full syntax.

Always write the required column names after the SELECT statement, like:

SELECT CustomerID, CustomerFirstName, City from Emp;

This technique results in reduced disk I/O and better performance.

Always use table aliases when your SQL statement involves more than one source
If more than one table is involved in a from clause, each column name must be qualified using either the complete table name or an alias. The alias is preferred. It is more human readable to use aliases instead of writing columns with no table information.

Use the more readable ANSI-Standard Join clauses instead of the old style joins
With ANSI joins, the WHERE clause is used only for filtering data.
Where as with older style joins, the WHERE clause handles both the join condition and filtering data.

Furthermore ANSI join syntax supports the full outer join.

The first of the following two queries shows the old style join,
while the second one shows the new ANSI join syntax:

— old style join

SELECT a.Au_id,

   t.Title

FROM TITLES t,

   AUTHORS a,

   TITLEAUTHOR ta

WHERE a.Au_id = ta.Au_id

   AND ta.Title_id = t.Title_id

   AND t.Title LIKE ‘%Computer%’ 

–ANSI join syntax

SELECT a.Au_id,

   t.Title

FROM AUTHORS a

   INNER JOIN TITLEAUTHOR ta

     ON a.Au_id = ta.Au_id

   INNER JOIN TITLES t

     ON ta.Title_id = t.Title_id

WHERE t.Title LIKE ‘%Computer%’

Do not use column numbers in the ORDER BY clause
Always use column names in an order by clause. Avoid positional references. Consider the following example in which the second query is more readable than the first one:

SELECT OrderID, OrderDate

FROM Orders

ORDER BY 2
SELECT OrderID, OrderDate

FROM Orders

ORDER BY OrderDate

Always use a column list in your INSERT statements
Always specify the target columns when executing an insert command. This helps in avoiding problems when the table structure changes (like adding or dropping a column).

Consider the following table:

CREATE TABLE EUROPEANCOUNTRIES

(

 Countryid   INT PRIMARY KEY,

 Countryname VARCHAR(25)

)
Here’s an INSERT statement without a column list , that works perfectly:

INSERT INTO EuropeanCountries

VALUES (1, ‘Ireland’)

Now, let’s add a new column to this table:

ALTER TABLE EuropeanCountries

ADD EuroSupport bit

Now run the above INSERT statement. You get the following error from SQL Server:

Server: Msg 213, Level 16, State 4, Line 1

Insert Error: Column name or number of supplied values does not match table definition.

This problem can be avoided by writing an INSERT statement with a column list as shown below:

INSERT INTO EuropeanCountries

(CountryID, CountryName)

VALUES (1, ‘England’)

Don’t ever use double quotes in your T-SQL code
Use single quotes for string constants. If it’s necessary to qualify an object name, use (non-ANSI SQL standard) brackets around the name, like table name: ORDER DETAILS in this SQL.

SELECT od.[Discount],

   od.[Quantity],

   od.[Unitprice]

FROM [northwind].[dbo].[ORDER DETAILS] AS od

Do not prefix your stored procedure names with “sp_”
The prefix sp_ is reserved for system stored procedure that ship with SQL Server. Whenever SQL Server encounters a procedure name starting with sp_, it first tries to locate the procedure in the master database, then it looks for any qualifiers (database, owner) provided, then it tries dbo as the owner.

So you can really save time in locating the stored procedure by avoiding the “sp_” prefix.

Always use a SQL formatter to format your sql like Instant SQL Formatter(Free and Online)
The formatting of SQL code may not seem that important,but consistent formatting makes it easier for others to scan and understand your code.

  • SQL statements have a structure, and having that structure be visually evident makes it much easier to locate and verify various parts of the statements.
  • Uniform formatting also makes it much easier to add sections to and remove them from complex T-SQL statements for debugging purposes.
  • Instant SQL Formatter is a free online SQL tidy tool that makes your SQL script readable instantly.

Set Operators (Transact-SQL)

SQL Server provides the following set operators. Set operators combine results from two or more queries into a single result set.

EXCEPT and INTERSECT (Transact-SQL)

EXCEPT returns any distinct values from the left query that are not also found on the right query.

INTERSECT returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operand.

The basic rules for combining the result sets of two queries that use EXCEPT or INTERSECT are the following –

  • The number and the order of the columns must be the same in all queries.
  • The data types must be compatible.

Syntax

Arguments

Is a query specification or query expression that returns data to be compared with the data from another query specification or query expression. The definitions of the columns that are part of an EXCEPT or INTERSECT operation do not have to be the same, but they must be comparable through implicit conversion. When data types differ, the type that is used to perform the comparison and return results is determined based on the rules for data type precedence.

When the types are the same but differ in precision, scale, or length, the result is determined based on the same rules for combining expressions. For more information, see Precision, Scale, and Length (Transact-SQL).

The query specification or expression cannot return xml, text, ntext, image, or nonbinary CLR user-defined type columns because these data types are not comparable.

EXCEPT
Returns any distinct values from the query to the left of the EXCEPT operand that are not also returned from the right query.

INTERSECT
Returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operand.

Note – When the data types of comparable columns that are returned by the queries to the left and right of the EXCEPT or INTERSECT operands are character data types with different collations, the required comparison is performed according to the rules of collation precedence. If this conversion cannot be performed, the SQL Server Database Engine returns an error.

When you compare rows for determining distinct values, two NULL values are considered equal.

The column names of the result set that are returned by EXCEPT or INTERSECT are the same names as those returned by the query on the left side of the operand.

Column names or aliases in ORDER BY clauses must reference column names returned by the left-side query.

The nullability of any column in the result set returned by EXCEPT or INTERSECT is the same as the nullability of the corresponding column that is returned by the query on the left side of the operand.

If EXCEPT or INTERSECT is used together with other operators in an expression, it is evaluated in the context of the following precedence:

  • Expressions in parentheses
  • The INTERSECT operand

EXCEPT and UNION evaluated from left to right based on their position in the expression

If EXCEPT or INTERSECT is used to compare more than two sets of queries, data type conversion is determined by comparing two queries at a time, and following the previously mentioned rules of expression evaluation.

EXCEPT and INTERSECT cannot be used in distributed partitioned view definitions, query notifications.

EXCEPT and INTERSECT may be used in distributed queries, but are only executed on the local server and not pushed to the linked server. Therefore, using EXCEPT and INTERSECT in distributed queries may affect performance.

Fast forward-only and static cursors are fully supported in the result set when they are used with an EXCEPT or INTERSECT operation. If a keyset-driven or dynamic cursor is used together with an EXCEPT or INTERSECT operation, the cursor of the result set of the operation is converted to a static cursor.

When an EXCEPT operation is displayed by using the Graphical Showplan feature in SQL Server Management Studio, the operation appears as a left anti semi join, and an INTERSECT operation appears as a left semi join.

UNION (Transact-SQL)

Combines the results of two or more queries into a single result set that includes all the rows that belong to all queries in the union. The UNION operation is different from using joins that combine columns from two tables.

The following are basic rules for combining the result sets of two queries by using UNION:

  • The number and the order of the columns must be the same in all queries.
  • The data types must be compatible.

Syntax

Arguments

Is a query specification or query expression that returns data to be combined with the data from another query specification or query expression. The definitions of the columns that are part of a UNION operation do not have to be the same, but they must be compatible through implicit conversion. When data types differ, the resulting data type is determined based on the rules for data type precedence. When the types are the same but differ in precision, scale, or length, the result is determined based on the same rules for combining expressions. For more information, see Precision, Scale, and Length (Transact-SQL).

Columns of the xml data type must be equivalent. All columns must be either typed to an XML schema or untyped. If typed, they must be typed to the same XML schema collection.

UNION
Specifies that multiple result sets are to be combined and returned as a single result set.

ALL
Incorporates all rows into the results. This includes duplicates. If not specified, duplicate rows are removed.

Menu