Mysql – identify multiple gaps and islands for n columns

gaps-and-islandsMySQL

In this previous question, I asked for a way of identifying some gaps and islands. with a table t below;

+---------------------+------------+---------------+
| ts                  | EngOilP_sd | CompOilLVL_sd |
+---------------------+------------+---------------+
| 2015-06-24 20:28:07 |          0 |             0 |
| 2015-06-24 20:30:20 |          1 |             0 |
| 2015-06-24 20:36:47 |          1 |             0 |
| 2015-06-24 20:41:11 |          1 |             0 |
| 2015-06-24 20:43:29 |          1 |             0 |
| 2015-06-24 20:45:42 |          0 |             0 |
| 2015-06-24 20:47:51 |          0 |             0 |
| 2015-06-24 20:49:59 |          0 |             1 |
| 2015-06-24 20:52:01 |          0 |             1 |
| 2015-06-24 20:54:17 |          0 |             0 |
+---------------------+------------+---------------+

The following view will give me the islands of data where values are successively true;

create view t_pivot as
    select 'EngOilP_sd' as sd, ts, EngOilP_sd as val from t
    union all
    select 'CompOilLVL_sd', ts, CompOilLVL_sd from t;

# explain
select sd, timediff(stop, start) as duration, start, stop
from (
    select x.sd, x.start, min(case y.val when 0 then y.ts end) as stop
    from (
        select sd, min(case val when 1 then ts end) as start
        from t_pivot
        group by sd
    ) as x
    join t_pivot as y
        on x.sd = y.sd
       and y.ts > x.start
    group by x.sd, x.start
) as z
order by sd desc;

Gives

+---------------+----------+---------------------+---------------------+
| sd            | duration | start               | stop                |
+---------------+----------+---------------------+---------------------+
| EngOilP_sd    | 00:15:22 | 2015-06-24 20:30:20 | 2015-06-24 20:45:42 |
| CompOilLVL_sd | 00:04:18 | 2015-06-24 20:49:59 | 2015-06-24 20:54:17 |
+---------------+----------+---------------------+---------------------+
2 rows in set (0.00 sec)

Now, I am stumped as how to look at the following table for n gaps

+---------------------+------------+---------------+---------------+
| ts                  | EngOilP_sd | CompOilLVL_sd | Brake_sd      |
+---------------------+------------+---------------+---------------+
| 2015-06-24 20:28:07 |          0 |             0 |             0 |
| 2015-06-24 20:30:20 |          1 |             0 |             0 |
| 2015-06-24 20:36:47 |          1 |             1 |             0 |
| 2015-06-24 20:41:11 |          1 |             1 |             1 |
| 2015-06-24 20:43:29 |          1 |             1 |             1 |
| 2015-06-24 20:45:42 |          1 |             0 |             1 |
| 2015-06-24 20:47:51 |          0 |             0 |             1 |
| 2015-06-24 20:49:59 |          0 |             1 |             0 |
| 2015-06-24 20:52:01 |          1 |             1 |             0 |
| 2015-06-24 20:53:01 |          1 |             1 |             0 |
| 2015-06-24 20:54:01 |          1 |             1 |             1 |
| 2015-06-24 20:56:01 |          1 |             1 |             1 |
| 2015-06-24 20:57:01 |          1 |             1 |             1 |
| 2015-06-24 20:58:01 |          1 |             1 |             1 |
| 2015-06-24 21:02:01 |          1 |             1 |             1 |
| 2015-06-24 21:32:01 |          0 |             1 |             1 |
| 2015-06-24 21:52:01 |          0 |             0 |             1 |
| 2015-06-24 22:12:01 |          0 |             0 |             0 |
| 2015-06-24 22:20:01 |          0 |             0 |             0 |
| 2015-06-24 22:49:01 |          0 |             1 |             1 |
| 2015-06-24 23:52:01 |          0 |             1 |             1 |
| 2015-06-24 23:54:17 |          0 |             0 |             0 |
+---------------------+------------+---------------+---------------+

The result set I am looking for here would be as below

+---------------+----------+---------------------+---------------------+
| sd            | duration | start               | stop                |
+---------------+----------+---------------------+---------------------+
| EngOilP_sd    | 00:15:22 | 2015-06-24 20:30:20 | 2015-06-24 20:45:42 |
| EngOilP_sd    | 00:10:00 | 2015-06-24 20:52:20 | 2015-06-24 21:02:42 |
| CompOilLVL_sd | 00:06:42 | 2015-06-24 20:36:47 | 2015-06-24 20:43:29 |
| CompOilLVL_sd | 00:42:02 | 2015-06-24 20:49:59 | 2015-06-24 21:32:01 |
| CompOilLVL_sd | 00:03:00 | 2015-06-24 20:49:59 | 2015-06-24 20:54:17 |
| Brake_sd      | 00:06:40 | 2015-06-24 20:41:11 | 2015-06-24 20:47:51 |
| Brake_sd      | 00:58:01 | 2015-06-24 20:54:01 | 2015-06-24 21:52:01 |
| Brake_sd      | 01:02:00 | 2015-06-24 22:49:01 | 2015-06-24 23:52:01 |
+---------------+----------+---------------------+---------------------+
8 rows in set (0.00 sec)

EDIT Initially, I did not have the column Brake_sd in the question. User @tombom's answer correctly describes the case where the Brake_sd column is not present, although that is the only difference

Best Answer

Sample data:

CREATE TABLE t
    (`ts` varchar(19), `EngOilP_sd` int, `CompOilLVL_sd` int)
;

INSERT INTO t
    (`ts`, `EngOilP_sd`, `CompOilLVL_sd`)
VALUES
    ('2015-06-24 20:28:07', 0, 0),
    ('2015-06-24 20:30:20', 1, 0),
    ('2015-06-24 20:36:47', 1, 1),
    ('2015-06-24 20:41:11', 1, 1),
    ('2015-06-24 20:43:29', 1, 1),
    ('2015-06-24 20:45:42', 1, 0),
    ('2015-06-24 20:47:51', 0, 0),
    ('2015-06-24 20:49:59', 0, 1),
    ('2015-06-24 20:52:01', 1, 1),
    ('2015-06-24 20:53:01', 1, 1),
    ('2015-06-24 20:54:01', 1, 1),
    ('2015-06-24 20:56:01', 1, 1),
    ('2015-06-24 20:57:01', 1, 1),
    ('2015-06-24 20:58:01', 1, 1),
    ('2015-06-24 21:02:01', 1, 1),
    ('2015-06-24 21:32:01', 0, 1),
    ('2015-06-24 21:52:01', 0, 0),
    ('2015-06-24 22:12:01', 0, 0),
    ('2015-06-24 22:20:01', 0, 0),
    ('2015-06-24 22:49:01', 0, 1),
    ('2015-06-24 23:52:01', 0, 1),
    ('2015-06-24 23:54:17', 0, 0)
;

create view t_pivot as
    select 'EngOilP_sd' as sd, ts, EngOilP_sd as val from t
    union all
    select 'CompOilLVL_sd', ts, CompOilLVL_sd from t;

Query:

SELECT sd, TIMEDIFF(MAX(ts), MIN(ts)) AS duration, MIN(ts) AS startime, MAX(ts) AS endtime
FROM (
    SELECT 
    tp.*
    , @val_change := IF(@prev_val != tp.val, @val_change + 1, @val_change) AS vc
    , @prev_val := tp.val
    FROM
    t_pivot tp
    , (SELECT @prev_val := NULL, @val_change := 0) var_init_subquery
    ORDER BY sd, ts
) sq
WHERE val = 1
GROUP BY sd, vc;

Result:

|            sd |                  duration |            startime |             endtime |
|---------------|---------------------------|---------------------|---------------------|
| CompOilLVL_sd | January, 01 1970 00:06:42 | 2015-06-24 20:36:47 | 2015-06-24 20:43:29 |
| CompOilLVL_sd | January, 01 1970 00:42:02 | 2015-06-24 20:49:59 | 2015-06-24 21:32:01 |
| CompOilLVL_sd | January, 01 1970 01:03:00 | 2015-06-24 22:49:01 | 2015-06-24 23:52:01 |
|    EngOilP_sd | January, 01 1970 00:15:22 | 2015-06-24 20:30:20 | 2015-06-24 20:45:42 |
|    EngOilP_sd | January, 01 1970 00:10:00 | 2015-06-24 20:52:01 | 2015-06-24 21:02:01 |

Explanation:

With the subquery in the FROM clause we are initializing our variables. The ORDER BY is very important! Here's why: In the SELECT clause each line is processed. The order in the SELECT clause is also important. Here

, @val_change := IF(@prev_val != tp.val, @val_change + 1, @val_change) AS vc

we increment our variable @val_change if the value in the column val differs from the "previous" row (that's why ORDER BY is so important). The "previous row" is in the variable @prev_val. The value of the current row is assigned to it in this line

, @prev_val := tp.val

Since MySQL processes each line in the SELECT clause in the order the SELECT clause is written, the @prev_val variable holds the value of the previous row (or NULL when no row has been processed yet) in the previous line of the SELECT clause.

Now all we have to do is GROUP BY sd and this @val_change variable in the outer query.

  • Read more about how to use user-defined variables in the manual here.

CAVEAT:

This will result in a full table scan.

sorry for bad english, hope it's understandable

UPDATED QUESTION:

The query still is the same. You just have to adjust the view t_pivot

create view t_pivot as
    select 'EngOilP_sd' as sd, ts, EngOilP_sd as val from t
    union all
    select 'CompOilLVL_sd', ts, CompOilLVL_sd from t
    union all
    select 'Brake_sd', ts, Brake_sd from t;

to include Brake_sd.