Sql-server – How to query for records based on most recent version

sql servert-sql

I have the following table structure

ID, ORDERNUMBER, DATE, STATUS
1, 1, 1-1-01, proposal
2, 1, 1-2-01, proposal
3, 1, 1-3-01, active
4, 2, 1-1-02, proposal
5, 2, 1-2-02, cancelled
6, 3, 1-1-03, proposal
7, 3, 1-2-03, proposal
8, 4, 1-1-04, proposal
9, 4, 1-2-04, active
10, 4, 1-3-04, active

Here is the SQL to recreate the table.

CREATE TABLE [dbo].[tableX](
    [ID] [int] NOT NULL,
    [Ordernumber] [int] NULL,
    [Date] [date] NULL,
    [Status] [nvarchar](50) NULL
) ON [PRIMARY]


INSERT INTO [tableX]
    ([ID]
    ,[Ordernumber]
    ,[Date]
    ,[Status])
VALUES
    (1, 1, '1-1-01', 'proposal'),
    (2, 1, '1-2-01', 'proposal'),
    (3, 1, '1-3-01', 'active'),
    (4, 2, '1-1-02', 'proposal'),
    (5, 2, '1-2-02', 'cancelled'),
    (6, 3, '1-1-03', 'proposal'),
    (7, 3, '1-2-03', 'proposal'),
    (8, 4, '1-1-04', 'proposal'),
    (9, 4, '1-2-04', 'active'),
    (10, 4, '1-3-04', 'active')

I need to query for all records that have a most recent state of active, in the case of the example it is ordernumber 1 and 4.
Input parameter: Status = active
Result:

ID, ORDERNUMBER, DATE, STATUS
1, 1, 1-1-01, proposal
2, 1, 1-2-01, proposal
3, 1, 1-3-01, active
8, 4, 1-1-04, proposal
9, 4, 1-2-04, active
10, 4, 1-3-04, active

Further I need to split this into history and current statuses so for state active and current the Id's 3 and 10. For state active and history the Id's 1, 2, 8 and 9.
Input parameters: Status = Active, Date = Most current
Result:

ID, ORDERNUMBER, DATE, STATUS
3, 1, 1-3-01, active
10, 4, 1-3-04, active

Input parameters: Status = Active, Date = Not most current
Result:

ID, ORDERNUMBER, DATE, STATUS
1, 1, 1-1-01, proposal
2, 1, 1-2-01, proposal
8, 4, 1-1-04, proposal
9, 4, 1-2-04, active

I've tried a lot of possibilties, but I cannot find a query that works. Can someone help me out or point me in the right direction?

I am working in a SQL Server environment.

Best Answer

You can use ROW_NUMBER() and OVER clause to "group" (partition) the rows per ordernumber, then order them by date (descending) and keep only the first rows (WHERE rn = 1):

WITH cte AS
  ( SELECT  id, ordernumber, [date], status,
            rn = ROW_NUMBER() OVER (PARTITION BY ordernumber
                                    ORDER BY date DESC)
    FROM    tableX
  )
SELECT  id, ordernumber, [date], status,
FROM    cte
WHERE   rn = 1
  AND   status = 'active' ;

And I really do hope that the date column is a DATE column and not a CHAR one. Otherwise the above ORDER BY will not work as you expect.


After the last, edit, it seems that the date column is actually of DATE dataype, which is good.

But really, you should't use '1-3-04' format for input. For output it may be fine (the users of the application may like that dubious format) but for input, no. Because you can never be sure if the date that will actually be stored in the database will be 2004-01-03 or 2004-03-01 or 2001-03-04 (or some other.)

Use a format that is unambiguous, like '2004-01-03' (the ISO YYYY-MM-DD format) or even better (for SQL-Server specifically) the YYYYMMDD format: '20040103'. See the blog by Aaron Bertrand: Bad habits to kick : mis-handling date / range queries