Sometimes I wonder if SHORT scripts really are the best thing to focus on.
The size of a script has little to do with how efficiently the query will execute. A more compact statement will likely consume fewer resources in terms of compilation, but (re)compilation is usually a rare occurrence in a live system.
Fewer table accesses are usually desirable, though, and that does tend to lead to more compact code.
Very generally speaking, a smaller execution plan and a lower estimated cost will both tend to yield better results. Again, though, it's highly situational; cost estimates in particular can be way off in some cases. It's important to measure the actual execution time, because at the end of the day that's what matters.
With left joins I can achieve what I want in just a few lines. But then I tried a longer script using unions. Which is the better method?
First of all, we need to know how much data will be in these tables in a real system. Right now there's so little that it will be difficult to use the STATISTICS TIME performance metrics to pick a winner -- the results that come back will be dominated by factors other than the query execution itself. With more data, it's likely the plans will change, rendering any comparison made now moot.
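For reference, those metrics come from the session options below, which print elapsed/CPU time and per-table read counts to the Messages tab:

-- Enable per-statement timing and I/O metrics for the session:
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- ... run the statements being compared here ...

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;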
Having said that, by looking at the query plans as they are now from a logical point of view, the first one is the winner.
You can see that the Clustered Index Scan of quantities appears once in the first plan, while it appears four times in the second one. The second plan also contains an expensive Distinct Sort as a result of the UNIONs (this operator could be eliminated by using UNION ALL instead, which won't change the results here).
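To illustrate (the column names here are only guesses at your schema), the change is just the ALL keyword:

-- UNION deduplicates its inputs, which typically appears in the plan
-- as a Distinct Sort or Hash Aggregate:
SELECT product_id FROM dbo.quantities WHERE color_id = 1
UNION
SELECT product_id FROM dbo.quantities WHERE color_id = 2;

-- UNION ALL skips the deduplication step; it is safe whenever the
-- branches cannot return overlapping rows, as in this query:
SELECT product_id FROM dbo.quantities WHERE color_id = 1
UNION ALL
SELECT product_id FROM dbo.quantities WHERE color_id = 2;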
The first query could also probably be improved by getting index seeks on the colors and sizes tables instead of table scans. It might be worth trying a hash match plan as well (which is what you'll probably see when quantities and products are larger), but for tables this small, the startup cost may be too much overhead to be of benefit.
What I would suggest you do is run each of the statements you want to test 10,000+ times in a loop, figure out the average execution time, and then compare.
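A minimal harness for that might look like this (the query under test and table name are placeholders):

DECLARE @i int = 0,
        @dummy int,
        @start datetime2 = SYSUTCDATETIME();

WHILE @i < 10000
BEGIN
    -- Assign into a variable so result-set transfer to the client
    -- doesn't dominate the measurement:
    SELECT @dummy = COUNT(*) FROM dbo.quantities;
    SET @i += 1;
END;

SELECT DATEDIFF(millisecond, @start, SYSUTCDATETIME()) / 10000.0
       AS avg_ms_per_execution;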
Short of restarting the SQL Server service or the machine itself, that's usually enough. If you want a completely fresh start, you might also want to drop the auto-created statistics left over from your last run, which you can find with:
SELECT stats.name AS stats_name,
       objects.name AS object_name,
       schemas.name AS schema_name,
       objects.create_date
FROM sys.stats
JOIN sys.objects ON stats.object_id = objects.object_id
JOIN sys.schemas ON objects.schema_id = schemas.schema_id
WHERE stats.stats_id > 0
  AND stats.stats_id < 255
  AND objects.is_ms_shipped = 0
  AND stats.auto_created = 1
ORDER BY objects.create_date DESC;
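From there, each row can be removed with DROP STATISTICS; the same query can generate the statements for you:

-- Generate the corresponding DROP STATISTICS statements
-- (same filters as the query above):
SELECT 'DROP STATISTICS '
     + QUOTENAME(schemas.name) + '.'
     + QUOTENAME(objects.name) + '.'
     + QUOTENAME(stats.name) + ';'
FROM sys.stats
JOIN sys.objects ON stats.object_id = objects.object_id
JOIN sys.schemas ON objects.schema_id = schemas.schema_id
WHERE stats.stats_id > 0
  AND stats.stats_id < 255
  AND objects.is_ms_shipped = 0
  AND stats.auto_created = 1;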
Best Answer
A logical read is counted when a single page is retrieved from buffer cache during query execution. This is counted regardless of whether a physical read or read-ahead read was used to bring the page into cache, or whether the page was already there. Consequently, logical reads measure how many times pages were actually touched in memory during query execution.
A physical read is counted when the storage engine reads one or more pages from storage because the requested page is not already in cache. This excludes physical IO done by read-ahead scans.
A read-ahead read is counted when the storage engine reads one or more full extents (each extent being 8 contiguous pages) from storage during sequential scans. Read-ahead aggressively prefetches data into memory so that pages are likely to already be cached by the time the query needs them. The number of extents read by read-ahead scans varies by SQL Server version and edition, as well as fragmentation. I've seen read-ahead read as much as 4MB at once using a single scatter-gather IO during scans in Enterprise Edition on the latest SQL Server version.
Note that the SQL Server storage engine behaves differently with a cold cache (e.g. after a restart or DBCC DROPCLEANBUFFERS). When the cache is not yet warmed up (the Buffer Manager's Target Pages not yet met), the storage engine performs full 64K extent reads instead of single 8K page reads to bring pages into cache. This warms the cache much more quickly than would otherwise occur.

I prefer to focus on logical reads rather than physical reads for the purpose of query and index tuning, because that's a better measure of the work performed by the query. It is only happenstance whether data is cached or not. Testing with a cold cache is more a measure of storage performance than query performance, IMHO, and isn't representative of what will occur under a production workload with a warm cache.
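In case it's useful, here's how I set up those two kinds of test (the query is a placeholder, and the counter values in the comment are illustrative; run the cache flush on a test box only):

-- Warm-cache measurement: focus on the logical reads counter.
SET STATISTICS IO ON;
SELECT COUNT(*) FROM dbo.quantities;  -- placeholder query
-- The Messages tab reports, per table, something like:
-- Table 'quantities'. Scan count 1, logical reads 42, physical reads 0,
-- read-ahead reads 0, ...

-- Cold-cache test (test systems only): flush the buffer cache first.
CHECKPOINT;             -- write dirty pages to disk
DBCC DROPCLEANBUFFERS;  -- then evict the clean pages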