SQL Server – How to Get Different Metrics for Same Table with One Query

sql server

I have a lot of queries to the same table. All queries looking like this

SELECT COUNT(*) FROM <sameTable> WHERE <whereClause> GROUP BY <groupBy>

Let's the amount of such queries is 40. So the table is iterates 40 times, I'm trying to reduce amount of iterations. I have tried different approaches and finished with this:

SELECT
        SUM(CASE WHEN ((p.statusId = 9)) THEN 1 ELSE 0 END) as metric1
        ,SUM(CASE WHEN ((p.statusId IN (10, 1088))) THEN 1 ELSE 0 END) as metric2
        ,SUM(CASE WHEN ((p.statusId = 11)) THEN 1 ELSE 0 END) as metric3
        ,SUM(CASE WHEN ((p.statusId = 20)) THEN 1 ELSE 0 END) as metric4
        ,(SELECT TOP 1 COUNT(DISTINCT p.CaseId) FROM vw_DashboardWorkbench p
LEFT JOIN vw_DashboardCaseStatusHistory history on p.CaseId = history.CaseId
LEFT JOIN vw_EzProviderUser provider on p.SecondaryPhysicianAdvisorId = provider.ProviderId
WHERE p.statusId = 12
GROUP BY history.AssignedByUser ORDER BY COUNT(DISTINCT p.CaseId) DESC) as metric5
FROM
    vw_DashboardWorkbench p
LEFT JOIN vw_DashboardCaseStatusHistory history on p.CaseId = history.CaseId
LEFT JOIN vw_EzProviderUser provider on p.SecondaryPhysicianAdvisorId = provider.ProviderId

There are two issues:

  1. Summ works on all records, but some metrics need unique count of ids matching the expression.
  2. The metric5 is subquery, I have to use it because I have failed to use aggregation function to get max value of cases in status 12 by user.

Best Answer

Another way of writing the query (without using the vw_EzProviderUser provider table because it is unused in the example would be:

;WITH CTE AS
(
SELECT COUNT(DWB.CaseId) as CountingCaseId, DWB.CaseId,DWB.statusId
FROM
dbo.DashboardWorkbench DWB
GROUP BY DWB.CaseId,DWB.statusId
),
CTE2 AS
(
SELECT  COUNT(DISTINCT DWB.CaseId) as CountingCaseIdDistinct, 
        DWB.statusId
FROM dbo.DashboardWorkbench DWB
LEFT JOIN dbo.DashboardCaseStatusHistory DCSH on DWB.CaseId = DCSH.CaseId
GROUP BY DCSH.AssignedByUser,DWB.statusId
)
SELECT  SUM(CASE WHEN CTE.statusId = 9 THEN CountingCaseId ELSE 0 END) as metric1,
        SUM(CASE WHEN CTE.statusId IN (10, 1088) THEN CountingCaseId ELSE 0 END) as metric2,
        SUM(CASE WHEN CTE.statusId = 11 THEN CountingCaseId ELSE 0 END) as metric3,
        SUM(CASE WHEN CTE.statusId = 20 THEN CountingCaseId ELSE 0 END) as metric4,
        (SELECT MAX(CountingCaseIdDistinct) FROM CTE2
        WHERE CTE2.statusId = 12) as metric5,
        (SELECT MAX(CountingCaseIdDistinct) FROM CTE2
        WHERE CTE2.statusId = 10) as metric6
FROM CTE;

Performance

As of performance this could help depending on your data due to the early grouping on DWB.CaseId,DWB.statusId.

YMMV

There should be better solutions as to make it more performant such as columnstore indexes / better rewrites / indexing / ....

Temp tables

I would advise storing the result of CTE2 (or your subquery) in a temporary table if you are calling it multiple times.

This way, you are only reading from the resultset, not evaluating the same query each time.

(Metric5 and Metric6 in the example).


Testing

This was tested on SQL Server 2017

When adding some indexes to cover all options:

CREATE INDEX IX_statusId_CaseId
ON dbo.DashboardWorkbench(statusId,CaseId);
CREATE INDEX IX_CaseId_statusId
ON dbo.DashboardWorkbench(CaseId,statusId);

CREATE INDEX IX_AssignedByUser_CaseId
ON dbo.DashboardCaseStatusHistory(AssignedByUser,CaseId);
CREATE INDEX IX_CaseId_AssignedByUser
ON dbo.DashboardCaseStatusHistory(CaseId,AssignedByUser);

The only real 'benefit' compared to your plan is not having the compute scalar operator or the multiple SUM(CASE WHEN StatusId = 9 THEN 1 ELSE 0), statusid = 10, ... in your query plan on all data.

In my rewrite, it is only doing that on the grouped count of DWB.CaseId & DWB.statusId.

Part of the plan of your query on my test YMMV

enter image description here

The compute scalar from the part above (second from the right)

enter image description here

Wheras there is earlier grouping in my plan + count(DWB.CaseId)

enter image description here

This part of the plan represents the T-SQL statements in CTE (1)

SELECT COUNT(DWB.CaseId) as CountingCaseId, DWB.CaseId,DWB.statusId
FROM
dbo.DashboardWorkbench DWB
GROUP BY DWB.CaseId,DWB.statusId

the computes are done afterwards, on the result of the above query:

enter image description here

The rest of the plan is virtually the same.

Extra info

There will definetely be better solutions, but somebody might be able to use the DB<>Fiddle below and test for something better.

DB<>Fiddle with test data, execution plans etc.

Temp tables can help if CTE2 or your subquery is evaluated multiple times, indexes matter. And as mentioned in the comments there are other options to explore, but we would need more information.