Consider the following table,
CREATE TABLE temp (
[TabName] VARCHAR(255),
[ID] VARCHAR(255),
[AsOfDate] DATE)
INSERT INTO temp VALUES
('TAB1', 'C103', '2019-05-01'),
('TAB1', 'C103', '2019-06-01'),
('TAB2', 'C103', '2019-06-01'),
('TAB2', 'C103', '2019-07-01'),
('TAB1', 'C103', '2019-09-01'),
('TAB1', 'C104', '2019-06-01'),
('TAB1', 'C104', '2019-08-01')
TabName ID AsOfDate
TAB1 C103 '2019-05-01'
TAB1 C103 '2019-06-01'
TAB2 C103 '2019-06-01'
TAB2 C103 '2019-07-01'
TAB1 C103 '2019-09-01'
TAB1 C104 '2019-06-01'
TAB1 C104 '2019-08-01'
TAB1 C105 '2019-04-01'
TAB1 C105 '2019-05-01'
I am wanting to find the IDs from the table that have skipped dates. In this table, I would want to identify that ID C103 and C104 have skipped dates as they have jumped from '2019-07-01' to '2019-09-01' and '2019-06-01' to '2019-8-01' respectively.
I have found the following question asked previously Find Missing Dates in Data which I believe gives some potential clues on the approach, i.e. using CTE, however, I am unsure how to apply them to this question where the dates are not consecutive across the whole table.
Should we be looking at partitioning here?
Best Answer
You can use a recursive CTE to achieve this, in conjunction with two windowing functions (ROW_NUMBER and RANK), to produce a running total of the date difference between the current date value and the previous value (by tabname & id group). You then SELECT only those rows with a date difference higher than 1.
If you're on SQL Server 2012 or higher, there is a much simpler way. Use the LAG function to retrieve the previous value (grouped by tabname and id) and performt he DATEDIFF on that.
Examples are included below, and you can see them in action in this db<>fiddle.
LAG Example:
Recursive CTE example: