How to Use CTE to Fetch Most Recent Row for Each ID in SQL Server

sql serversql-server-2016

I just recently started learning about the wonders of CTEs and how they can be used, instead of WHILE loops, to optimize performance.

For context, I am automating the process of updating firmware on devices in my company. The devices send UDP data to the server and store the messages into a single table, so the table is quite large

There are 2 modules on each device, high and low frequency and the messages are very similar

Previously, I had temp tables, counters, and a while loop to get the most recent firmware message for each device on each module. I have since used the following CTE to extract the same information (only high frequency code shown, table and column names changed for confidentiality)

;WITH HFCTE (DIdentifier, CachedHFFWVersion, CurrentHFFWVersion, HFFWMessage, HFFWMessageDate) AS
(
    --Get all HF Firmware Messages for each device
    SELECT  udp.DIdentifier,
            LEFT(RIGHT(UDPMessage,7),3),
            RIGHT(UDPMessage, 3),
            UDPMessage,
            DateRecorded
        FROM UDPTable udp WITH (NOLOCK)
            INNER JOIN FirmwareInfoTable fw ON udp.DIdentifier = fw.DIdentifier
        WHERE UDPMessage LIKE '%FWMessage:%' 
            AND UDPMessage NOT LIKE '%FWMessage:LF%' 
            AND fw.Status = 'HF Pending'
)

-- Get the most recent HF Firmware Message for each device
SELECT cte.DIdentifier, 
cte.CachedHFFWVersion, 
cte.CurrentHFFWVersion, 
cte.HFFWMessage, 
cte.HFFWMessageDate
FROM HFCTE cte 
    INNER JOIN (
                SELECT DIdentifier,
                    MAX(HFFWMessageDate) AS MaxDate
                FROM HFCTE
                GROUP BY DIdentifier
               ) AS gcte
        ON cte.DIdentifier= gcte.DIdentifierAND cte.HFFWMessageDate = gcte.MaxDate
ORDER BY cte.DIdentifier DESC

The main problem is that before picking out the most recent firmware message using the last SELECT statement, the CTE has to get all FWMessage records for each device, which could be very bad for performance during peak hours.

Is there any way to only pull the one record I need for each device in my initial WITH CTE query? Would it even optimize the overall performance of the query? Should I be using a different method entirely?

P.S. DIdentifier is a PRIMARY_KEY in the FirmwareInfo table and a FOREIGN_KEY (Not to the Firmware table but to the table that holds devices) in the UDP table. The UDP table also has 2 relevant indexes, both non-clustered & non-unique. One includes DIdentifier and UDPMessage, the other includes DIdentifier and DateRecorded. I cannot create a new index for UDP table due to performance constraints

Best Answer

You can try using a window function like ROW_NUMBER() or DENSE_RANK() to generate an ID that represents the latest rows per your sort criteria. Here's an example:

;WITH HFCTE (DIdentifier, CachedHFFWVersion, CurrentHFFWVersionn, HFFWMessage, HFFWMessageDate, SortId) AS
(
    --Get all HF Firmware Messages for each device
    SELECT  udp.DIdentifier,
        LEFT(RIGHT(UDPMessage,7),3),
         RIGHT(UDPMessage, 3),
         UDPMessage,
         DateRecorded,
         DENSE_RANK() OVER (PARTITION BY udp.DIdentifier ORDER BY DateRecorded DESC) AS SortId
    FROM UDPTable udp WITH (NOLOCK)
        INNER JOIN FirmwareInfoTable fw ON udp.DIdentifier = fw.DIdentifier
    WHERE UDPMessage LIKE '%FWMessage:%' 
        AND UDPMessage NOT LIKE '%FWMessage:LF%' 
        AND fw.Status = 'HF Pending'
)
     
SELECT cte.DIdentifier, 
    cte.CachedHFFWVersion, 
    cte.CurrentHFFWVersion, 
    cte.HFFWMessage, 
    cte.HFFWMessageDate
FROM HFCTE cte 
WHERE cte.SortId = 1
ORDER BY cte.DIdentifier DESC

This eliminates the need to join your CTE to itself in a subquery and further upstreams the logic for determining the latest records by date. Also the window functions tend to be rather performant for this kind of problem.