Execution Plans can be very useful in debugging why a query is running slowly and can give insight into what SQL Server is doing behind the scenes to return the data. Depending on the complexity of the query, Execution Plans will show where the data is being pulled from, how much data is being pulled, what manipulations are being performed, and what the cost is for each of those operations. There are many more things that a complex Execution Plan can, and will, show; but for simple, one table queries like the ones below, the Execution Plan is pretty straight forward.
To start, we’ll create a small employee table like so:
CREATE TABLE Employees
(
EmployeeID INT IDENTITY(1, 1)
,FirstName VARCHAR(50)
,MiddleInitial CHAR(1)
,LastName
VARCHAR(50)
,Address1
VARCHAR(50)
,Address2
VARCHAR(50)
,City VARCHAR(50)
,[State]
CHAR(2)
,Zip CHAR(10)
,CONSTRAINT
PK_Employees PRIMARY KEY
CLUSTERED (EmployeeID) WITH FILLFACTOR = 100)
And
add a small nonclustered index like so:
CREATE NONCLUSTERED INDEX
NCL_Emp_FName_LName ON Employees(FirstName, LastName) WITH FILLFACTOR = 100
Next,
I populated the table with about 500,000 employees, each with unique values.
Now, entering in 500,000 employee records isn't really feasible, so I used Red Gate’s SQL Data Generator. After the data has been
loaded, we can start looking at Execution Plans.
This
exercise assumes that we will be using SQL Server Management Studio (SSMS) to
generate and review Execution Plans and that all the statistics on the table are up to date.
There are several ways to extract and view Execution Plans, but we will stick
with the simpliest which is viewing the actual Execution Plan in SSMS after the
query executes. This can be acomplished by keying CTRL+M in the query window.
Now, after the query executes, the actual Execution Plan will be shown beside
the messages tab.
Now
that we have SSMS setup to show the Execution Plan, we can run our query that
was running well prior to the modification.
SELECT FirstName
,LastName
FROM Employees
WHERE LastName = 'Whitehead'
AND FirstName = 'Aisha'
This query returned 3 rows and
generated a nice Index Seek on the nonclustered index that we defined on the
FirstName and LastName column. Now, let’s say that a developer gets a request to return FirstName, LastNane, and MiddleInitial. Using the following query, we can return that information.
,LastName
,MiddleInitial
FROM Employees
WHERE LastName = 'Whitehead'
AND FirstName = 'Aisha'
The
problem is now, what once was a sub-second query is now taking much longer.
Why? All the developer did is was add 1 column to the query. The answer is in
the Execution Plan.
What we see now is two added steps, a Key Lookup and Nested Loop Join. In this instance, the Key Lookup is the operator that is causing the issue and you can see this by looking at the actual cost listed under the operator, which is 52% of the entire query. This is because the nonclustered index that we created at the beginning of the project has most, but not all, of the columns the modified query is looking for. In order the satisfy the request, the query processor must go outside the index and back to the clustered index in order to find MiddleInitial and return it as a part of the query. As you can see, this is an expensive operation and one that could be averted if only the nonclustered index included the MiddleInitial column.
For this exercise, we will add the MiddleInitial to the index as a part of the composite index rather than adding it as an Included Column in the index.
IF EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[Employees]') AND name = N'NCL_Emp_FName_LName')
DROP INDEX [NCL_Emp_FName_LName] ON
[dbo].[Employees] GO
CREATE NONCLUSTERED INDEX
[NCL_Emp_FName_LName] ON [dbo].[Employees]
([FirstName] ASC,
[LastName] ASC,
[MiddleInitial] ASC
)WITH (FILLFACTOR = 100) ON [PRIMARY]
GO
Now
when we go back and run our modified query, we will see the following Execution
Plan.
And our query is back to running like it was before… fast.
No comments:
Post a Comment