Sql-server – Avoiding code duplication for counting and selecting the same resultset

Tags: paging, sql-server, sql-server-2014, stored-procedures

I have a requirement to return the total number of records and paged data from a stored procedure. The row count should only be computed if a boolean parameter is true. How do I perform the 2 queries (counting + paged result) without having to replicate the entire FROM + WHERE sections?

My FROM and WHERE are non-trivial, so I'd like to avoid having to duplicate them across the 2 queries. My primary goal is avoiding duplicate code as opposed to improving performance.

This is what I have currently:

CREATE PROCEDURE [dbo].[myProcedure] 
    @includeCount bit,
    --... where parameters
    @skip int,
    @take int,
    @totalCount int OUTPUT
AS
BEGIN  
    IF (@includeCount = 1)
        SELECT @totalCount = COUNT(*) FROM <Complex joins + where>

    SELECT alias1.field1, alias2.field2, alias3.field3
    FROM <Complex joins + where>
    ORDER BY [alias1].[Key]
    OFFSET @skip ROWS
    FETCH NEXT @take ROWS ONLY
END
GO

Is there a way to "extract" the common parts (the FROM + WHERE clauses) so as to avoid the massive duplication while retaining the performance characteristics of the original queries? Would it pose a problem for my ORDER BY and SELECT sections, which depend on tables aliased inside the FROM? How would I access those?

Best Answer

I was checking ways to make this work and it appears that an inline table-valued function is a decent option (even though I'd prefer something that didn't add extra objects to the database).

CREATE FUNCTION [dbo].[myFunction] 
(   
    @filter1 int,
    @filter2 int
)
RETURNS TABLE 
AS
RETURN 
(
    SELECT alias1.field1 AS [Key], alias2.field2, alias3.field3
    FROM <Complex joins + where @filter1/@filter2>
)

By extracting the main complexity into the function, I can now reuse it for both queries easily enough:

CREATE PROCEDURE [dbo].[myProcedure] 
    @includeCount bit,
    --... where parameters
    @skip int,
    @take int,
    @totalCount int OUTPUT
AS
BEGIN  
    IF (@includeCount = 1)
        SELECT @totalCount = COUNT(*) FROM [dbo].[myFunction](where parameters)

    SELECT *
    FROM [dbo].[myFunction](where parameters)
    ORDER BY [Key]
    OFFSET @skip ROWS
    FETCH NEXT @take ROWS ONLY
END
GO

There is still some form of duplication due to the parameter passing, but I guess that's unavoidable as there is no such thing as a closure in SQL.
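For a concrete picture, here is a minimal self-contained sketch of the same pattern. The tables (dbo.Customers, dbo.Orders), the function dbo.FilteredOrders, the procedure dbo.GetOrdersPage and the @region/@minAmount filters are all hypothetical names invented for illustration; they stand in for the real "complex joins + where" and are not part of my actual schema.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE dbo.Customers
(
    CustomerId int NOT NULL PRIMARY KEY,
    Region     int NOT NULL
);
CREATE TABLE dbo.Orders
(
    OrderId    int NOT NULL PRIMARY KEY,
    CustomerId int NOT NULL REFERENCES dbo.Customers (CustomerId),
    Amount     int NOT NULL
);
GO

-- The shared FROM + WHERE is written once, inside the inline function.
CREATE FUNCTION dbo.FilteredOrders (@region int, @minAmount int)
RETURNS TABLE
AS
RETURN
(
    SELECT o.OrderId AS [Key], o.Amount, c.Region
    FROM dbo.Orders AS o
    INNER JOIN dbo.Customers AS c
        ON c.CustomerId = o.CustomerId
    WHERE c.Region = @region
      AND o.Amount >= @minAmount
);
GO

CREATE PROCEDURE dbo.GetOrdersPage
    @includeCount bit,
    @region int,
    @minAmount int,
    @skip int,
    @take int,
    @totalCount int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;

    -- The count runs only when requested; the filters are passed a first time.
    IF (@includeCount = 1)
        SELECT @totalCount = COUNT(*)
        FROM dbo.FilteredOrders(@region, @minAmount);

    -- The paged result passes the same filters a second time.
    SELECT [Key], Amount, Region
    FROM dbo.FilteredOrders(@region, @minAmount)
    ORDER BY [Key]
    OFFSET @skip ROWS
    FETCH NEXT @take ROWS ONLY;
END
GO

-- Usage: first page of 10 rows plus the total count.
DECLARE @total int;
EXEC dbo.GetOrdersPage
    @includeCount = 1, @region = 1, @minAmount = 100,
    @skip = 0, @take = 10, @totalCount = @total OUTPUT;
SELECT @total AS TotalCount;
```

The only thing repeated is the argument list of the function call. When @includeCount is 0 the output parameter is left unassigned, so the caller sees whatever value it passed in (NULL in the usage above if the count is skipped).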

Having the ORDER BY outside the function is also not very intuitive, but SQL Server doesn't allow an ORDER BY inside an inline function unless it is accompanied by TOP, OFFSET/FETCH or FOR XML, so keeping it in the function would mean moving the paging clauses there as well.
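The ORDER BY restriction can be sketched as follows. The table dbo.Items and the functions dbo.ItemsOrdered and dbo.ItemsPage are hypothetical names made up for this illustration.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE dbo.Items
(
    Id   int          NOT NULL PRIMARY KEY,
    Name nvarchar(50) NOT NULL
);
GO

-- Rejected if uncommented: an inline function cannot contain a bare ORDER BY
-- (SQL Server requires TOP, OFFSET or FOR XML alongside it).
-- CREATE FUNCTION dbo.ItemsOrdered ()
-- RETURNS TABLE
-- AS
-- RETURN (SELECT Id, Name FROM dbo.Items ORDER BY Id);
-- GO

-- Accepted: the paging clauses travel together with the ORDER BY.
CREATE FUNCTION dbo.ItemsPage (@skip int, @take int)
RETURNS TABLE
AS
RETURN
(
    SELECT Id, Name
    FROM dbo.Items
    ORDER BY Id
    OFFSET @skip ROWS
    FETCH NEXT @take ROWS ONLY
);
GO
```

A function shaped like dbo.ItemsPage is no use for the COUNT(*) query, since it only ever returns one page, which is why the ordering and paging stay in the procedure. Note also that the ORDER BY inside such a function only decides which rows make up the page; a caller that needs a guaranteed presentation order still has to add its own ORDER BY.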

This works for me so I'm marking my own answer as the answer for now. Obviously open to better suggestions if there are any.

Related Solutions

Sql-server – Combine Table Hints INDEX and FORCESEEK with Two Joins Not On PK

I would also test this rewrite (assuming that sID is the primary key of docSVsys):

SELECT COUNT(*)              
FROM [docSVsys] AS d
WHERE EXISTS
      ( SELECT * 
        FROM [docSVtext] AS sv
        WHERE sv.[sID] = d.[sID] 
          AND sv.[value] = 'doug'
      )
   OR EXISTS
      ( SELECT * 
        FROM [docMVtext] AS mv
        WHERE mv.[sID] = d.[sID] 
          AND mv.[value] = 'doug'
      ) ;

Sql-server – Logical reads amount difference with almost identical indexes

It all depends on the key (and non-key) columns defined in the nonclustered index. The clustered index is the actual table data, so its leaf pages contain every column of every row, whereas the nonclustered index contains only the columns specified in its creation DDL (plus the clustering key, which serves as the row locator).

Let's set up a test scenario:

use testdb;
go

if exists (select 1 from sys.tables where name = 'TestTable')
begin
    drop table TestTable;
end
create table dbo.TestTable
(
    id int identity(1, 1) not null
        constraint PK_TestTable_Id primary key clustered,
    some_int int not null,
    some_string char(128) not null,
    some_bigint bigint not null
);
go

create unique index IX_TestTable_SomeInt
on dbo.TestTable(some_int);
go

declare @i int;
set @i = 0;

while @i < 1000
begin
    insert into dbo.TestTable(some_int, some_string, some_bigint)
    values(@i, 'hello', @i * 1000);

    set @i = @i + 1;
end

So we've got a table loaded with 1000 rows, and a clustered index (PK_TestTable_Id) and a nonclustered index (IX_TestTable_SomeInt). As you've seen in your testing, but just for thoroughness:

set statistics io on;
set statistics time on;

select some_int
from dbo.TestTable -- with(index(PK_TestTable_Id));

set statistics io off;
set statistics time off;
-- nonclustered index scan (IX_TestTable_SomeInt)
-- logical reads: 4

Here we have a nonclustered index scan on the IX_TestTable_SomeInt index. We have 4 logical reads for this operation. Now let's force the clustered index to be used.

set statistics io on;
set statistics time on;

select some_int
from dbo.TestTable with(index(PK_TestTable_Id));

set statistics io off;
set statistics time off;
-- clustered index scan (PK_TestTable_Id)
-- logical reads: 22

Here with the clustered index scan we have 22 logical reads. Why? It all comes down to how many pages SQL Server has to read to retrieve the entire result set. Get the average row count per page:

select
    object_name(i.object_id) as object_name,
    i.name as index_name,
    i.type_desc,
    ips.page_count,
    ips.record_count,
    ips.record_count / ips.page_count as avg_rows_per_page
from sys.dm_db_index_physical_stats(db_id(), object_id('dbo.TestTable'), null, null, 'detailed') ips
inner join sys.indexes i
on ips.object_id = i.object_id
and ips.index_id = i.index_id
where ips.index_level = 0;

Take a look at my result set of the above query:

[Image: result set of the query above]

As we can see here, there is an average of 50 rows per page on the leaf pages of the clustered index, and an average of 500 rows per page on the leaf pages of the nonclustered index. Therefore, to satisfy the query, more pages need to be read from the clustered index.
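As a rough cross-check, the page counts per index can also be read from sys.dm_db_partition_stats, which is cheaper than a 'detailed' physical stats scan. This is only a sketch against the dbo.TestTable created above; a full scan's logical reads should land close to the in-row data page count of the index being scanned, typically with a few extra pages on top.

```sql
-- Leaf-level (in-row data) pages and total used pages for each index on the test table.
select
    i.name as index_name,
    i.type_desc,
    ps.row_count,
    ps.in_row_data_page_count,
    ps.used_page_count
from sys.dm_db_partition_stats ps
inner join sys.indexes i
on ps.object_id = i.object_id
and ps.index_id = i.index_id
where ps.object_id = object_id('dbo.TestTable');
```

The index with the narrower rows (here the nonclustered one, which carries only some_int and the clustering key) needs fewer pages to hold the same 1000 rows, which lines up with the lower logical read count.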
