1. Indexes
When it comes to query performance tuning, adding an index to a table can be a game changer. In this lesson we'll examine a brief overview of indexes and how they can be used to improve query performance.
2. What is an index?
An index is a structure added to a table to improve the speed of accessing data. Indexes are used to locate data quickly without having to scan or search every row in a table, this makes them particularly useful for improving performance of queries with filter conditions. Indexes are applied to table columns and they can be added at any time. Adding an index to a table is typically done by the database administrator. SQL Server supports several different index types. The two most common table indexes are Clustered and Nonclustered.
3. Clustered and nonclustered indexes
A good analogy for a clustered index is a dictionary where words are stored alphabetically. With a clustered index the data, in the data pages, are stored and ordered by the column(s) with the index. Because the data is ordered one way, a table can only have one clustered index. Clustered indexes reduce the number of data page reads by a query which helps speed up search operations.
4. Clustered and nonclustered indexes
For a nonclustered index a good analogy is a text book with an index at the back. Data in the book is unordered and the index at the back indicates the page numbers containing a search condition. Like a text book a table with a nonclustered index allows data to be unordered in the table data pages. Another layer in the index structure contains ordered pointers to the data pages. A table can contain more than one nonclustered index. Nonclustered indexes are commonly used to improve table insert and update operations.
5. Clustered index: B-tree structure
Let's take a look at how a clustered index can improve performance. A clustered index creates what is called a B-tree structure on a table. We can think of a B-tree as an upside tree where the trunk is called the root node, branches the branch nodes, and leaves the page nodes.
6. Clustered index: B-tree structure
The root node contains ordered pointers to branch nodes which in turn contain ordered pointers to page nodes. The page node level contains all the 8 kilobyte data pages from the table with the data physically ordered by the column(s) with the clustered index. Without a clustered index, there is no guarantee the data is ordered in the table data pages.
7. Customers table without clustered index
Using our Customers table as an example, if we queried for all rows where CustomerID is equal to PARIS, without an index, all pages in the table would be scanned for a match.
8. Customers table without clustered index
9. Customers table without clustered index
10. Customers table without clustered index
11. Customers table with clustered index
If our Customers table has a clustered index on CustomerID, then the search would enter the B-tree at pointers nearest to our match,
12. Customers table with clustered index
in this case, between OLDWO and WOLZA.
13. Customers table with clustered index
The ordered pointers at each branch node would in turn direct the search to the
14. Customers table with clustered index
page, or pages,
15. Customers table with clustered index
containing the CustomerID we’re searching for.
16. Clustered index: example
Let’s look at this in action. We’ll query the PlayerStats table for all rows where the Team is Oklahoma City, team code OKC. We’ll turn on STATISTICS IO so we can determine the number of page reads used by our query. Without an index the whole table is scanned for rows that meet the filter condition, logical reads indicate twelve pages read. When we add a clustered index to the Team column, the number of page reads drops to two. In general, the fewer pages read to complete the query, the faster it will run.
17. Let's practice!
Let's practice.