It all depends on the definitions and the key (and non-key) columns defined in the nonclustered index. The clustered index is the actual table data: its leaf pages contain every column of every row. The nonclustered index, on the other hand, contains only the columns defined in the index creation DDL (plus the clustering key used to locate the row).
Let's set up a test scenario:
use testdb;
go

if exists (select 1 from sys.tables where name = 'TestTable')
begin
    drop table dbo.TestTable;
end
go

create table dbo.TestTable
(
    id int identity(1, 1) not null
        constraint PK_TestTable_Id primary key clustered,
    some_int int not null,
    some_string char(128) not null,
    some_bigint bigint not null
);
go

create unique index IX_TestTable_SomeInt
on dbo.TestTable(some_int);
go

-- load 1000 rows of test data
declare @i int;
set @i = 0;
while @i < 1000
begin
    insert into dbo.TestTable(some_int, some_string, some_bigint)
    values(@i, 'hello', @i * 1000);

    set @i = @i + 1;
end
go
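As an optional sanity check (not part of the original setup), you can confirm the row count and the two indexes before running anything else:

select count(*) as row_count
from dbo.TestTable;

select name, type_desc
from sys.indexes
where object_id = object_id('dbo.TestTable');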
So we've got a table loaded with 1000 rows, a clustered index (PK_TestTable_Id), and a nonclustered index (IX_TestTable_SomeInt). As you've seen in your testing, but just for thoroughness:
set statistics io on;
set statistics time on;
select some_int
from dbo.TestTable -- with(index(PK_TestTable_Id));
set statistics io off;
set statistics time off;
-- nonclustered index scan (IX_TestTable_SomeInt)
-- logical reads: 4
Here we have a nonclustered index scan on the IX_TestTable_SomeInt index, at a cost of 4 logical reads. Now let's force the clustered index to be used.
set statistics io on;
set statistics time on;
select some_int
from dbo.TestTable with(index(PK_TestTable_Id));
set statistics io off;
set statistics time off;
-- clustered index scan (PK_TestTable_Id)
-- logical reads: 22
Here, with the clustered index scan, we have 22 logical reads. Why? It comes down to how many pages SQL Server has to read to return the entire result set. Get the average row count per page:
select
    object_name(i.object_id) as object_name,
    i.name as index_name,
    i.type_desc,
    ips.page_count,
    ips.record_count,
    ips.record_count / ips.page_count as avg_rows_per_page
from sys.dm_db_index_physical_stats(db_id(), object_id('dbo.TestTable'), null, null, 'detailed') ips
inner join sys.indexes i
    on ips.object_id = i.object_id
    and ips.index_id = i.index_id
where ips.index_level = 0;
Take a look at my result set of the above query:
As we can see here, the clustered index averages 50 rows per leaf page, while the nonclustered index averages 500 rows per leaf page. More pages therefore have to be read from the clustered index to satisfy the same query.
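The rows-per-page difference follows directly from row width. A clustered index leaf row here carries every column (4 + 4 + 128 + 8 bytes of data plus row overhead, roughly 150 bytes), while a nonclustered leaf row carries only some_int plus the clustering key id (roughly 20 bytes with overhead), so many more of them fit on an 8KB page. As a sketch, the measured widths can be pulled from the same DMV with avg_record_size_in_bytes (a small variation on the query above):

select
    i.name as index_name,
    ips.avg_record_size_in_bytes,
    ips.page_count
from sys.dm_db_index_physical_stats(db_id(), object_id('dbo.TestTable'), null, null, 'detailed') ips
inner join sys.indexes i
    on ips.object_id = i.object_id
    and ips.index_id = i.index_id
where ips.index_level = 0; -- leaf level only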
The percentage costs shown on an execution plan come from the optimizer's estimates, even in an actual execution plan. The actual execution plan does use the plan that really ran, and it includes both the estimated and the actual row counts. Discrepancies between the two row counts are useful for judging how accurate the estimate was.
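One way to see both numbers in T-SQL (a sketch; here the earlier dbo.TestTable stands in for the real query) is to capture the actual plan XML with SET STATISTICS XML ON and compare EstimateRows against ActualRows for each operator:

set statistics xml on;

-- stand-in query; the statement you are investigating goes here
select some_int
from dbo.TestTable
where some_bigint > 500000;

set statistics xml off;
-- the extra XML column returned with the results is the actual plan;
-- each RelOp element carries an EstimateRows attribute, and its
-- RunTimeInformation element carries the ActualRows counts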
Somewhere between the subquery and comparing it to the column in derived, the optimizer wasn't able to correctly estimate how many rows would match. It guessed that there would be 18 rows from derived when there were actually over 220,000. An additional clue is the Cardinality Estimate warning on the SELECT node for CONVERT(nvarchar(35),[mssqlsystemresource].[sys].[spt_values].[name],0).
If you were to measure the queries' run time with something else, such as SET STATISTICS TIME, I would expect the numbers to be much closer, with the second query likely running faster.
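As a sketch of that check (reusing the two scan queries from the earlier test table purely to show the mechanics; the actual queries from the question would go in their place):

set statistics time on;

select some_int from dbo.TestTable;                               -- query 1 (placeholder)
select some_int from dbo.TestTable with(index(PK_TestTable_Id));  -- query 2 (placeholder)

set statistics time off;
-- compare the "CPU time" and "elapsed time" lines reported for each statement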
Here's another plan analysis with a somewhat similar situation. (SQL Server Plan Explorer) (with hat tip to Kendra Little on how to fool the optimizer)
The estimates show a 93%/7% cost split, but looking at the actual CPU, duration, or IO, the difference is nowhere near that extreme. IO is about 75%/25% and CPU is roughly 60%/40%. (I tried to come up with something more even, but wasn't able to.)
If SQL Server reads 10 pages, they don't all have to be in memory at the same time. It can also count reads of the same page multiple times.
To simplify, let's say SQL Server has limited the buffer pool to 8KB (conveniently, the size of a single page). Now you run a query that requires SQL Server to read 3 pages. It can't fit all the pages into memory at once, so it reads page 1 (one logical read), evicts it to make room, reads page 2, evicts that, and reads page 3: three logical reads against a buffer pool that only ever held one page at a time.
Now, let's say you use a CTE or self-join or something else. If SQL Server isn't able to re-use the page while it is in memory, it will have to do all of that over again. Even if it can, accessing the same page twice while it is in memory counts as two logical reads. If an operator in a query plan (like a key lookup) has 10 executions, those are additional logical reads. If a foreign key has to be validated, there are some more. If a trigger runs as part of the transaction, worktables are involved, or a sort spills to disk, the reads just keep piling on. Just because all those reads can't fit into memory at once doesn't mean they don't still have to happen.
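For a rough illustration of reads piling up (a sketch reusing dbo.TestTable from the first example; exact counts will vary), force a nested loops join so the inner side is touched once per outer row:

set statistics io on;

select t1.id, t2.some_string
from dbo.TestTable t1
inner join dbo.TestTable t2
    on t2.id = t1.some_int
option (loop join);

set statistics io off;
-- the inner seek executes roughly 1000 times, so logical reads on TestTable
-- far exceed its ~20 leaf pages: a page is counted every time it is touched,
-- not once per query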
There is no direct correlation between read activity and max server memory / buffer pool size. Just remember that page reads aren't distinct and they don't all have to happen at the same instant in time, especially if you have a lot of churn in your buffer pool, in which case they have to be spread out.
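If you want to see how much of a table is actually sitting in the buffer pool at a given moment (as opposed to how many reads were counted), sys.dm_os_buffer_descriptors can be queried; here's a sketch against the earlier test table:

select
    count(*) as cached_pages,
    count(*) * 8 as cached_kb
from sys.dm_os_buffer_descriptors bd
inner join sys.allocation_units au
    on bd.allocation_unit_id = au.allocation_unit_id
inner join sys.partitions p
    on au.container_id = p.hobt_id -- fine here: the table has only in-row data
where bd.database_id = db_id()
    and p.object_id = object_id('dbo.TestTable');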