I want to get a list of dates where there are consecutive dates and it's corresponding count.
For example, if the I have the following data set
Date
2021-07-28
2021-07-27
2021-07-26
2021-07-25
2021-07-24
2021-07-23
2021-07-22
2021-07-21
2021-07-18
2021-07-17
2021-07-14
2021-07-11
2021-07-09
2021-07-06
2021-07-04
2021-07-03
2021-07-02
The result I would like is where consecutive date count > x
2021-07-21 8
2021-07-17 2
2021-07-02 3
I'm not really sure how to approach this problem. If an explanation could be provided with the query that would be great, although not required.
Best Answer
As correctly noted by Charlieface, this is a Gaps and Islands problem. Another way of solving this specific variation – also involving a window function, though a different one this time – would go like this:
This solution relies on the fact that the difference between a representation of a date as an integer (
DATEDIFF(...)
) and the date's numerical position in an ordered sequence (ROW_NUMBER() OVER ...
) is a constant value. If we looked at the intermediate values returned by the functions in thePartID
expression, we would find the following:As you can see, the difference between
DATEDIFF
andROW_NUMBER
(represented by the columnPartID
) is the same where dates are consecutive, and it is different for different sequences, which makes it a perfect candidate for a GROUP BY criterion. And that is exactly what the query is using it for. By the way, the date1970-01-01
has no specific meaning in this case. Any date could be used instead of it as long as it is a constant value.Another important note to make – and it makes this answer substantially different from Charlieface's suggestion – is that all the dates must be unique for the method to work as expected.
A live demo of this solution can be found at db<>fiddle.