Sql-server – Better approach to optimize query

ctesql serversql-server-2012

We have a previously designed query that functions fairly well but I am thinking there has got to be a better way of performing the following. Here is the query…

WITH CTE1 ( COL1, COL2)
AS
  (SELECT TOP 10 COL1, COL2
   FROM TABLE 1
   WHERE CREATEDATE < DATEADD(HOUR, -1, GETDATE())
   ORDER BY COL2
  ),
WITH CTE2 ( COL1, COL2)
AS
  (SELECT COL1,MAX(COL2)
   FROM CTE1
   GROUP BY COL1
  )
SELECT TABLE2.*
FROM TABLE2
WHERE TABLE2.IDCOL IN
   (SELECT COL2 
    FROM CTE2)

So CTE1 can have results like this that require the grouping in CTE2.
CTE1 results look like this:

COL1    |   COL22
_________________
10005   |  500
10005   |  501
42344   |  502
12345   |  503

SO CTE2 is designed to consolidate CTE1 down to the single row by grouping to

COL1    |  COL2
_________________
10005   |  501
42344   |  502
12345   |  503
ETC.

I am trying to figure out if I can accomplish this same task without taking CTE1 into CTE2 and then using CTE2 results in the SELECT query.

Best Answer

As I understand your question (and let me know if I have misunderstood your requirements):

  1. Using the TOP 10 rows from Table1 where they fall into some date criteria and are ordered by COL2
  2. Get the MAX(COL2) per COL1 group
  3. Then, select all columns from Table2 where IDCOL = MAX(COL2) from the previous grouping.

I think this solution addresses all of those items. Basically, CTE2 is encapsulated into CTE1.

--Demo setup
Declare @Table1 TABLE (COL1 INT, COL2 INT)
INSERT INTO @Table1(COL1, COL2) VALUES

(10005,500),
(10005,501),
(10005,521),
(10005,521),

(42344,502),
(42344,550),

(12345,503),
(12345,575),
(12345,575)

Declare @Table2 TABLE (IDCOL INT,OtherStuff varchar(10))
INSERT INTO @Table2(IDCOL,OtherStuff) VALUES
(575,'Other1'),(575,'Other2'),(580,'Other3')

--The solution
;WITH CTE1 (COL1,COL2,Rn)
AS (
    SELECT TOP 10 COL1
        ,COL2
        ,ROW_NUMBER() OVER (
            PARTITION BY COL1 ORDER BY COL2 DESC
            ) AS Rn
    FROM @Table1
    --   WHERE CREATEDATE < DATEADD(HOUR, -1, GETDATE())
    ORDER BY COL2
    )
--select * from CTE1 order by col1, rn
SELECT T2.*
FROM CTE1 c1
JOIN @Table2 T2
    ON t2.IDCOL = c1.COL2
        AND c1.Rn = 1

Rows coming out of CTE1 have an additional rn column.

| COL1  | COL2 | Rn |
|-------|------|----|
| 10005 | 521  | 1  |
| 10005 | 521  | 2  |
| 10005 | 501  | 3  |
| 10005 | 500  | 4  |
| 12345 | 575  | 1  |
| 12345 | 575  | 2  |
| 12345 | 503  | 3  |
| 42344 | 550  | 1  |
| 42344 | 502  | 2  |

The last part of the query joins Table2 with the CTE1 where only the rows from CTE1 have rn = 1 and Table2 (IDCOL) = COL2 value for rn = 1.

| IDCOL | OtherStuff |
|-------|------------|
| 575   | Other1     |
| 575   | Other2     |

Ultimately, you will have to judge whether it is a Better approach to optimize query.