SQL Server – How to optimize dashboard performance

database-design, performance, dashboard, sql-server

I have a table like

FieldA, FieldB, FieldC, FieldD, TheDate, Count

I have a web app which provides a dashboard of the top few Count by each of the fields. Not wanting to prematurely optimize, the original, brute force queries for these charts are like:

SELECT TOP 10
    FieldA,
    SUM([Count]) AS Counts
FROM TheTable
WHERE @StartDate <= TheDate AND TheDate <= @EndDate
GROUP BY FieldA
ORDER BY Counts DESC

And the same for the other fields. But the server ends up scanning the date range independently for each chart, and when there is a lot of data the system bogs down.

It seems wrong to pull all the data into the app (once) and then do the summaries by column locally. And maybe the RDBMS caches a lot of the pages, so the second through fourth charts are more efficient than the first.

I'm using Azure SQL and neither SQL Server Management Studio nor DataGrip suggest any missing indices which could help.

Any thoughts on techniques to do multiple, similar aggregations on the same data? In general, or for Azure SQL specifically. Thanks.

Best Answer

suggest any missing indices which could help

You need to show what indexes you currently have defined, and the size & shape of your data (approximately how many rows historically? how many new rows are added each day/hour? how much do the fields vary? how often are they NULL?), for us to give absolutely relevant advice.

Any thoughts on

As most of your queries follow the stated pattern, you are always filtering on a range of dates, so at the very least you need an index on [date]. Furthermore, it would probably be a good idea for this to be your clustered index[†], to reduce the page accesses needed for such range queries.
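As a sketch, using the column and table names from the question (and assuming no existing clustered index is in the way), that could look like:

```sql
-- Cluster on the date column so that range predicates on TheDate
-- read contiguous pages instead of scattered ones.
CREATE CLUSTERED INDEX CIX_TheTable_TheDate
    ON TheTable (TheDate);
```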

If you can afford the extra space (and the extra RAM needed for your common working set to stay cached and avoid IO thrashing), then you may get a noticeable boost from having indexes on ([date], FieldA), ([date], FieldB), and so on, so that the grouping does not have to perform a sort operation (once the data is found by date, it is already in order in the index used). If particular fields are queried much more often than the others, then perhaps only do this for those fields instead of spending the resources doing it for all of them.
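A minimal sketch of those per-field indexes, again assuming the question's column names; adding [Count] as an INCLUDE column makes each index covering for its chart query, so no key lookups are needed:

```sql
-- One index per charted field; INCLUDE ([Count]) lets each chart
-- query be answered entirely from its index.
CREATE NONCLUSTERED INDEX IX_TheTable_TheDate_FieldA
    ON TheTable (TheDate, FieldA) INCLUDE ([Count]);

CREATE NONCLUSTERED INDEX IX_TheTable_TheDate_FieldB
    ON TheTable (TheDate, FieldB) INCLUDE ([Count]);

-- ...and similarly for FieldC / FieldD, if those charts are hot paths.
```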

[†] even if you have a unique integer as your primary key[‡] (or something else like a UUID)
[‡] and you should have a surrogate key in this example, as otherwise you could have rows that are otherwise identical, which doesn't fit the relational model and could cause issues
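Both footnotes combined might look like the following sketch (names are illustrative): a surrogate identity key declared as a NONCLUSTERED primary key, leaving the clustered index free for the date column:

```sql
-- Surrogate key so otherwise-identical rows stay distinguishable;
-- NONCLUSTERED so the clustered index can go on TheDate instead.
ALTER TABLE TheTable
    ADD Id bigint IDENTITY(1,1) NOT NULL;

ALTER TABLE TheTable
    ADD CONSTRAINT PK_TheTable PRIMARY KEY NONCLUSTERED (Id);
```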

As a side note: date is a keyword, as it is a type name, so I would avoid using it as a column name, even in examples.