SQL Server – Improving Query Performance with Window Functions

sortingsql serverwindow functions

We write a query that include unpivot, partition by and order by.
Query is:

SELECT PersonId
    ,SalaryDate
    ,ID
    ,Type
    ,SalaryValue
    ,ROW_NUMBER() OVER (
        PARTITION BY PersonId ORDER BY SalaryValue
        ) AS rn
FROM (
    SELECT lp.PersonId
        ,lp.SalaryDate
        ,lp.Salary1
        ,lp.Salary2
        ,lp.Salary3
        ,lp.ID
    FROM rdd.Salaries AS lp WITH (NOLOCK)
    WHERE lp.SalaryDate > DATEADD(day, - 31, getdate())
    ) AS t
unpivot(SalaryValue FOR Type IN (
            lp.Salary1
            ,lp.Salary2
            ,lp.Salary3
            )) AS UnpivotTable

enter image description here

The query returns about 68.000.000 rows and execution time is 20 minutes.

Can I improve the query's performance or rewrite effectively?
What is the alternative of partition by?

Best Answer

You may find that the following index and query rewrite performs better, because it sorts per person rather than once over the whole set, and row estimates are more likely to be accurate:

-- Index
CREATE INDEX IX_Salaries_PersonId_SalaryDate_Inc_ID_Salary1_Salary2_Salary3
ON rdd.Salaries (PersonId, SalaryDate)
INCLUDE (ID, Salary1, Salary2, Salary3);

-- Query
WITH People AS
(
    SELECT DISTINCT
        S.PersonId
    FROM rdd.Salaries AS S
    WHERE 
        S.SalaryDate > DATEADD(DAY, -31, GETDATE())
)
SELECT 
    P.PersonId, 
    CA.SalaryDate, 
    CA.ID, 
    CA.SalaryValue, 
    CA.rn
FROM People AS P
CROSS APPLY
(
    SELECT
        S.SalaryDate, 
        S.ID, 
        V.SalaryValue, 
        rn = ROW_NUMBER() OVER (ORDER BY V.SalaryValue)
    FROM rdd.Salaries AS S
    CROSS APPLY
    (
        SELECT S.Salary1 WHERE S.Salary1 IS NOT NULL
        UNION ALL
        SELECT S.Salary2 WHERE S.Salary2 IS NOT NULL
        UNION ALL
        SELECT S.Salary3 WHERE S.Salary3 IS NOT NULL
    ) AS V (SalaryValue)
    WHERE 
        S.PersonId = P.PersonId
        AND S.SalaryDate > DATEADD(DAY, -31, GETDATE())
) AS CA
ORDER BY
    P.PersonId,
    CA.rn
OPTION (QUERYTRACEON 8649);

You can omit the OPTION clause if you find a parallel query is generated naturally, or if you find non-parallel performance is good enough. The desired plan shape is roughly as follows:

Plan Shape