MySQL – Get Count of Consecutive Dates

group by, mariadb, MySQL

I want to get a list of the dates that start a run of consecutive dates, together with the length of each run.

For example, if I have the following data set

The result I would like is where consecutive date count > x

2021-07-21  8
2021-07-17  2
2021-07-02  3

I'm not really sure how to approach this problem. If an explanation could be provided with the query that would be great, although not required.

Best Answer

As correctly noted by Charlieface, this is a Gaps and Islands problem. Another way of solving this specific variation – also involving a window function, though a different one this time – would go like this:

WITH
  partitioned AS
  (
    SELECT
      *
    , DATEDIFF(Date, '1970-01-01') - ROW_NUMBER() OVER (ORDER BY Date ASC) AS PartID
    FROM
      YourTable
  )
SELECT
  MIN(Date) AS StartDate
, COUNT(*)  AS DayCount
FROM
  partitioned
GROUP BY
  PartID
HAVING
  COUNT(*) > 1
ORDER BY
  PartID
;

This solution relies on the fact that, within a run of consecutive dates, the difference between the date represented as an integer (DATEDIFF(...)) and the date's position in the ordered sequence (ROW_NUMBER() OVER ...) stays constant. If we looked at the intermediate values returned by the functions in the PartID expression, we would find the following:

Date	DATEDIFF(Date, '1970-01-01')	ROW_NUMBER() OVER (ORDER BY Date ASC)	PartID
2021-07-02	18810	1	18809
2021-07-03	18811	2	18809
2021-07-04	18812	3	18809
2021-07-06	18814	4	18810
2021-07-09	18817	5	18812
2021-07-11	18819	6	18813
2021-07-14	18822	7	18815
2021-07-17	18825	8	18817
2021-07-18	18826	9	18817
2021-07-21	18829	10	18819
2021-07-22	18830	11	18819
2021-07-23	18831	12	18819
2021-07-24	18832	13	18819
2021-07-25	18833	14	18819
2021-07-26	18834	15	18819
2021-07-27	18835	16	18819
2021-07-28	18836	17	18819

As you can see, the difference between DATEDIFF and ROW_NUMBER (represented by the column PartID) is the same where dates are consecutive, and it is different for different sequences, which makes it a perfect candidate for a GROUP BY criterion. And that is exactly what the query is using it for. By the way, the date 1970-01-01 has no specific meaning in this case. Any date could be used instead of it as long as it is a constant value.
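To see the same arithmetic outside the database, here is a minimal Python sketch of the idea. It is only an illustration of the logic, not part of the answer's query: the function name islands, the min_count parameter and the SAMPLE list (the dates from the table above) are assumptions made for this example.

```python
from datetime import date
from itertools import groupby

EPOCH = date(1970, 1, 1)  # arbitrary constant reference date, as in the query

def islands(dates, min_count=2):
    """Return (start_date, day_count) for each run of consecutive dates.

    Assumes the dates are unique, as the SQL query does.
    min_count=2 mirrors HAVING COUNT(*) > 1.
    """
    ordered = sorted(dates)
    # PartID = day number minus 1-based row number; constant within a run.
    keyed = [((d - EPOCH).days - row, d) for row, d in enumerate(ordered, start=1)]
    result = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        run = [d for _, d in group]
        if len(run) >= min_count:
            result.append((run[0], len(run)))
    return result

# The dates from the table above (all in July 2021).
SAMPLE = [date(2021, 7, day)
          for day in (2, 3, 4, 6, 9, 11, 14, 17, 18, 21, 22, 23, 24, 25, 26, 27, 28)]

for start, count in islands(SAMPLE):
    print(start, count)
# 2021-07-02 3
# 2021-07-17 2
# 2021-07-21 8
```

Raising min_count plays the role of the "> x" threshold from the question, in the same way as changing the HAVING condition would.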

Another important note to make – and it makes this answer substantially different from Charlieface's suggestion – is that all the dates must be unique for the method to work as expected.
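The uniqueness requirement can be checked with a small Python sketch. The helper part_ids and the sample list are hypothetical, chosen only to show what a repeated date does to the PartID values; removing duplicates before numbering the rows (in SQL, presumably something like selecting DISTINCT dates in a derived table first) restores a single group.

```python
from datetime import date

EPOCH = date(1970, 1, 1)  # same arbitrary reference date as in the query

def part_ids(dates):
    """PartID per row: day number minus 1-based row number over the sorted input."""
    return [(d - EPOCH).days - row for row, d in enumerate(sorted(dates), start=1)]

# One run of three consecutive days, but 2021-07-03 appears twice.
with_duplicate = [date(2021, 7, 2), date(2021, 7, 3), date(2021, 7, 3), date(2021, 7, 4)]

raw = part_ids(with_duplicate)           # duplicate shifts the row numbers
deduped = part_ids(set(with_duplicate))  # duplicates removed first

print(raw)      # [18809, 18809, 18808, 18808] -> the run is split into two groups
print(deduped)  # [18809, 18809, 18809]        -> one group, as expected
```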

A live demo of this solution can be found at db<>fiddle.

Related Solutions

SQL – How to Get Max Date for Each Year from List of Dates

It is possible that I am missing something, but you should be able to get the result easily using the max() aggregate and a GROUP BY:

select
  max(date) maxdate,
  year
from yourtable
group by year;

See SQL Fiddle with Demo
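For readers who want to check the logic by hand, the GROUP BY year with max(date) amounts to keeping the latest date seen for each year. A minimal Python sketch, assuming rows arrive as (date, year) pairs; the function name and the ROWS sample are made up for illustration:

```python
from datetime import date

def max_date_per_year(rows):
    """rows: iterable of (date, year) pairs; returns {year: latest date}."""
    latest = {}
    for d, year in rows:
        # Keep the date only if it is the first or the latest for this year.
        if year not in latest or d > latest[year]:
            latest[year] = d
    return latest

# Hypothetical sample data.
ROWS = [
    (date(2012, 3, 31), 2012),
    (date(2012, 11, 5), 2012),
    (date(2013, 1, 9), 2013),
]

print(max_date_per_year(ROWS))
```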

Mysql – How to get the SUM between dates in a WHERE clause

You might laugh when you hear this, but there is an aggregate clause that produces summary rows at each break in the grouped values. The clause is WITH ROLLUP. (Look in the MySQL Documentation under GROUP BY Modifiers.)

Let's take your query and make the following changes

Remove ORDER BY
Substitute WITH ROLLUP
Remove name

You get this

SELECT
cdr.datefield AS 'date',
a.id AS 'id',
SUM(t.debit_amount - t.credit_amount) AS 'balance'
FROM calendar cdr
JOIN transactions t ON (cdr.datefield >= t.value_date)
JOIN accounts a ON (a.id = t.account_id)
WHERE cdr.datefield IN ('2014-03-31', '2013-03-31', '2012-03-31')
GROUP BY cdr.datefield, a.id
WITH ROLLUP;

Here is what should happen

After the rows for each datefield, there will be a summary row with
- datefield
- id NULL
- a sum of balance for that date
The last row returns
- datefield NULL
- id NULL
- a sum of balance for all dates
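The shape of that result can be mimicked with a short Python sketch. It emulates GROUP BY datefield, id WITH ROLLUP over hypothetical (datefield, id, amount) rows, using None where SQL would return NULL. The function name, the sample rows and the sorted output order are assumptions made for readability, not MySQL behaviour you should rely on.

```python
from collections import defaultdict

def rollup(rows):
    """Emulate GROUP BY datefield, id WITH ROLLUP for (datefield, id, amount) rows."""
    detail = defaultdict(int)
    for datefield, acc_id, amount in rows:
        detail[(datefield, acc_id)] += amount

    out = []
    for datefield in sorted({key[0] for key in detail}):
        subtotal = 0
        for (d, acc_id), total in sorted(detail.items()):
            if d == datefield:
                out.append((d, acc_id, total))  # ordinary grouped row
                subtotal += total
        out.append((datefield, None, subtotal))  # per-date summary row (id NULL)
    out.append((None, None, sum(detail.values())))  # grand total (both NULL)
    return out

# Hypothetical sample data.
ROWS = [
    ("2013-03-31", 1, 100),
    ("2013-03-31", 2, -40),
    ("2014-03-31", 1, 50),
    ("2013-03-31", 1, 25),
]

for row in rollup(ROWS):
    print(row)
# ('2013-03-31', 1, 125)
# ('2013-03-31', 2, -40)
# ('2013-03-31', None, 85)
# ('2014-03-31', 1, 50)
# ('2014-03-31', None, 50)
# (None, None, 135)
```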

Based on the desired output, you want Cash, Payables and Issued Capital to show as zero when there is no transaction of that type for the given date. You also don't want the last row, because you don't want the grand total. What you do is take the new query, make it a subquery, and LEFT JOIN the accounts table to the subquery, returning zero for balance if it is NULL and excluding any row whose date is NULL.

Here is the new query

SELECT
    BB.date,AA.id,AA.name,IFNULL(BB.balance,0) balance
FROM accounts AA LEFT JOIN
(SELECT
cdr.datefield AS 'date',
a.id AS 'id',
SUM(t.debit_amount - t.credit_amount) AS 'balance'
FROM calendar cdr
JOIN transactions t ON (cdr.datefield >= t.value_date)
JOIN accounts a ON (a.id = t.account_id)
WHERE cdr.datefield IN ('2014-03-31', '2013-03-31', '2012-03-31')
GROUP BY cdr.datefield, a.id
WITH ROLLUP) BB USING (id) WHERE BB.date IS NOT NULL
ORDER BY BB.date DESC,AA.id;

I have used WITH ROLLUP to answer many posts on DBA StackExchange:

Jun 20, 2014 : Get only overall summary WITH ROLLUP and GROUP BY for multiple fields
Aug 12, 2014 : Fetch data from same table using two group by clauses in mysql
Apr 11, 2014 : Why is MySQL not using the index with the higher cardinality?
Sep 21, 2014 : Monthly report by time
Jul 31, 2013 : TokuDB database size unknown in phpmyadmin
Jul 10, 2013 : How to estimate/predict data size and index size of a table in MySQL
Jul 03, 2013 : Information about Disk Storage MySQL
Apr 25, 2013 : WITH ROLLUP WHERE x IS NULL

Give it a Try !!!
