Mysql – How to collapse contiguous single day start/end date ranges in MySQL

gaps-and-islandsMySQLquery

(edited – I oversimplified the original problem)

I'm trying to figure out a MySQL query that will collapse one-day long contiguous start/end date ranges. Here is my sample input data:

INSERT INTO my_table(tag, start_date, end_date) VALUES ('AA', '2020-07-01', '2020-07-02');
INSERT INTO my_table(tag, start_date, end_date) VALUES ('XB', '2020-07-02', '2020-07-03');
INSERT INTO my_table(tag, start_date, end_date) VALUES ('YC', '2020-07-03', '2020-07-04');
INSERT INTO my_table(tag, start_date, end_date) VALUES ('ZD', '2020-07-04', '2020-07-05');
INSERT INTO my_table(tag, start_date, end_date) VALUES ('0E', '2020-07-05', '2020-07-06');

INSERT INTO my_table(tag, start_date, end_date) VALUES ('CA', '2020-07-06', '2020-07-15');

INSERT INTO my_table(tag, start_date, end_date) VALUES ('CA', '2020-07-20', '2020-07-21');
INSERT INTO my_table(tag, start_date, end_date) VALUES ('CA', '2020-07-21', '2020-07-22');
INSERT INTO my_table(tag, start_date, end_date) VALUES ('CA', '2020-07-22', '2020-07-23');

INSERT INTO my_table(tag, start_date, end_date) VALUES ('EA', '2020-07-23', '2020-07-31');

Here is the output I want:

AA,2020-07-01,2020-07-06
CA,2020-07-06,2020-07-15
CA,2020-07-20,2020-07-23
EA,2020-07-23,2020-07-31

It seems like it should be simple, but I can't figure out how to do it in pure MySQL. I've tried joining the table to itself and using various combinations of MIN() and MAX(), but the solution eludes me. I can do it in PHP code, but would prefer to let MySQL do the work.

I can't rely on the tag field for grouping (it can be anything). The date collapse should be done only on the date ranges that are 1-day long and contiguous. Date ranges longer than one day should not be collapsed. The selected tag should be the value associated with the first date in the range

Best Answer

After a little more research, I figured out the solution to this "gaps and islands" problem:

SELECT t.tag, begin AS start_date, MAX(end) AS end_date
FROM (
    SELECT
    a.*
    CASE WHEN (DATEDIFF(start_date, @prev_date) = 1 AND DATEDIFF(end_date, start_date) = 1)
        THEN @begin := @begin
        ELSE @begin := start_date
    END AS begin,
    end_date AS end,
    @prev_date := start_date
    FROM (SELECT @prev_date := NULL, @begin := NULL) AS vars
    JOIN my_table a
    ORDER BY start_date
) AS t
GROUP BY begin;