Tuesday, September 13, 2011

Help! My Query Is Running Slow! How Execution Plans Can Save The Day.

Have you ever said this or has a developer ever come running to you in a panic saying what happened? How many times have you heard: “We just promoted this small report change and now it’s taking twice as long to run”? Probably more times than you would like to admit. When this happens, where do you start looking for issues? The answer can likely be found in Execution Plans.
Execution Plans can be very useful in debugging why a query is running slowly and can give insight into what SQL Server is doing behind the scenes to return the data.  Depending on the complexity of the query, Execution Plans will show where the data is being pulled from, how much data is being pulled, what manipulations are being performed, and what the cost is for each of those operations. There are many more things that a complex Execution Plan can, and will, show; but for simple, one table queries like the ones below, the Execution Plan is pretty straight forward.

To start, we’ll create a small employee table like so:

CREATE TABLE Employees
       (
        EmployeeID INT IDENTITY(1, 1)
        ,FirstName VARCHAR(50)
       ,MiddleInitial CHAR(1)
       ,LastName VARCHAR(50)
       ,Address1 VARCHAR(50)
       ,Address2 VARCHAR(50)
       ,City VARCHAR(50)
       ,[State] CHAR(2)
       ,Zip CHAR(10)
       ,CONSTRAINT PK_Employees PRIMARY KEY CLUSTERED (EmployeeID) WITH FILLFACTOR = 100)
And add a small nonclustered index like so:
CREATE NONCLUSTERED INDEX NCL_Emp_FName_LName ON Employees(FirstName, LastName) WITH FILLFACTOR = 100
Next, I populated the table with about 500,000 employees, each with unique values. Now, entering in 500,000 employee records isn't really feasible, so I used Red Gate’s SQL Data Generator. After the data has been loaded, we can start looking at Execution Plans.
This exercise assumes that we will be using SQL Server Management Studio (SSMS) to generate and review Execution Plans and that all the statistics on the table are up to date. There are several ways to extract and view Execution Plans, but we will stick with the simpliest which is viewing the actual Execution Plan in SSMS after the query executes. This can be acomplished by keying CTRL+M in the query window. Now, after the query executes, the actual Execution Plan will be shown beside the messages tab.
Now that we have SSMS setup to show the Execution Plan, we can run our query that was running well prior to the modification.
SELECT  FirstName
       ,LastName
FROM    Employees
WHERE   LastName = 'Whitehead'
        AND FirstName = 'Aisha'
This query returned 3 rows and generated a nice Index Seek on the nonclustered index that we defined on the FirstName and LastName column.








Now, let’s say that a developer gets a request to return FirstName, LastNane, and MiddleInitial. Using the following query, we can return that information.
SELECT  FirstName
       ,LastName
       ,MiddleInitial   
FROM    Employees
WHERE   LastName = 'Whitehead'
        AND FirstName = 'Aisha'

The problem is now, what once was a sub-second query is now taking much longer. Why? All the developer did is was add 1 column to the query. The answer is in the Execution Plan.









What we see now is two added steps, a Key Lookup and Nested Loop Join. In this instance, the Key Lookup is the operator that is causing the issue and you can see this by looking at the actual cost listed under the operator, which is 52% of the entire query. This is because the nonclustered index that we created at the beginning of the project has most, but not all, of the columns the modified query is looking for. In order the satisfy the request, the query processor must go outside the index and back to the clustered index in order to find MiddleInitial and return it as a part of the query. As you can see, this is an expensive operation and one that could be averted if only the nonclustered index included the MiddleInitial column.

For this exercise, we will add the MiddleInitial to the index as a part of the composite index rather than adding it as an Included Column in the index.

IF  EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[Employees]') AND name = N'NCL_Emp_FName_LName')
DROP INDEX [NCL_Emp_FName_LName] ON [dbo].[Employees]
GO

CREATE NONCLUSTERED INDEX [NCL_Emp_FName_LName] ON [dbo].[Employees]
(
      [FirstName] ASC,
      [LastName] ASC,
      [MiddleInitial] ASC
)WITH (FILLFACTOR = 100) ON [PRIMARY]
GO

Now when we go back and run our modified query, we will see the following Execution Plan.

 




And our query is back to running like it was before… fast.

No comments:

Post a Comment