Sql-server – Confirmation on query optimization and performance understanding (seek and scan)

sql serversql-server-2012

I am following along with this video on SQL Server query optimization and I would like to check my understanding of the results I am seeing.

I am using SQL Server 2012 with the AdventureWorks 2012 database.

Testing the first three examples

Example one:

SELECT Customer.ModifiedDate
FROM Sales.Customer
WHERE CAST(CONVERT(VARCHAR(10), Customer.ModifiedDate, 101) AS DATETIME) = '20081013'

SELECT Customer.ModifiedDate
FROM Sales.Customer
WHERE Customer.ModifiedDate >= '20081013' AND Customer.ModifiedDate < '20081014'
GO

-- Table 'Customer'. Scan count 1, logical reads 47, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- Table 'Customer'. Scan count 1, logical reads 47, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Example two:

SELECT Customer.ModifiedDate
FROM Sales.Customer
WHERE DATEADD(YEAR, 6, Customer.ModifiedDate) < '20131231'

SELECT Customer.ModifiedDate
FROM Sales.Customer
WHERE Customer.ModifiedDate < DATEADD(YEAR, -6, '20131231') 
GO

-- Table 'Customer'. Scan count 1, logical reads 47, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- Table 'Customer'. Scan count 1, logical reads 47, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Example three:

SELECT Customer.ModifiedDate
FROM Sales.Customer
WHERE YEAR(Customer.ModifiedDate) = 2008

SELECT Customer.ModifiedDate
FROM Sales.Customer
WHERE Customer.ModifiedDate >= '20081101' AND Customer.ModifiedDate < '20090101'
GO

-- Table 'Customer'. Scan count 1, logical reads 47, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- Table 'Customer'. Scan count 1, logical reads 47, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Unlike the video, there is no difference between the number of logical reads between any of the queries as is. But, after playing around with the criteria and comparing results my reasoning for not seeing a difference comes down to

'In all the queries, all the records meet the criteria and so all of them have 
 the same number of logical reads. When there is a predicate that excludes records 
 from the result then you will see an improvement in the number of logical reads. 
 This is due to SQL Server being able to utilize an index seek and thus only read 
 the records that meat the criteria. If the indexed column is used in an expression
 than this will prevent SQL Server from being able to use an index seek.'

And this idea is also why I always see an improvement with the fourth example

SELECT COUNT(ProductId)
FROM Sales.SalesOrderDetail
WHERE ProductID=710

SELECT COUNT(ProductId)
FROM Sales.SalesOrderDetail
WHERE ProductID*2=1420
GO

-- Table 'SalesOrderDetail'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
-- Table 'SalesOrderDetail'. Scan count 1, logical reads 276, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The first query in this example utilizes a seek but the second query in this has to scan and thus read more (all?) of the index.

Questions:

  1. Is my understanding and explanation of what is happening correct?
  2. Are there any occasions when an indexed column can be involved in expression and still have its index 'seeked'?
  3. Does not having a clustered index on a table have any effect on the ability to use a seek?

Best Answer

To answer your questions:

  1. Since you say 'all the records meet the criteria' I would advise doing tests on a larger set of rows where some rows are outside the criteria.

  2. There are some circumstances where using an expression against an indexed field will result in the index being seeked, such as WHERE [MyField] LIKE 'SomeString%'. However, using an expression directly in the field will result in the query not being sargable, and will induce an index scan. For instance, WHERE LEFT([MyField], 1) = 'T'.

  3. Not having a clustered index on a table (meaning the table is a heap) does not affect the ability for a particular SQL query to use a seek operation on an index. Not having a clustered index on a table has its own set of issues that is way beyond the scope of this answer. Check this blog post for more info on heaps vs clustered indexes.