MySQL 5.5 – Find Total Duration of Each Consecutive Series of Rows

innodbMySQLmysql-5.5

MySQL Version

The code will run in MySQL 5.5

Background

I have a table like the following one

CREATE TABLE t
( id INT NOT NULL AUTO_INCREMENT
, patient_id INT NOT NULL
, bed_id INT NOT NULL
, ward_id INT NOT NULL
, admitted DATETIME NOT NULL
, discharged DATETIME
, PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

This table is about patients in a hospital and it stores the beds where each patient spent some time while being hospitalized.

Each ward may have multiple beds and each patient may move to a different bed within the same ward.

Objective

What I want to do is to find how much time each patient spent in a specific ward without having moved to a different ward. I.e I want to find the total duration of the consecutive time he spent within the same ward.

Test case

-- Let's assume that ward_id = 1 corresponds to ICU (Intensive Care Unit)
INSERT INTO t
  (patient_id, bed_id, ward_id, admitted, discharged)
VALUES

-- Patient 1 is in ICU, changes some beds, then he is moved 
-- out of ICU, back in and finally he is out.
(1, 1, 1, '2015-01-06 06:05:00', '2015-01-07 06:04:00'),
(1, 2, 1, '2015-01-07 06:04:00', '2015-01-07 07:08:00'),
(1, 1, 1, '2015-01-07 07:08:00', '2015-01-08 08:11:00'),
(1, 4, 2, '2015-01-08 08:11:00', '2015-01-08 09:11:00'),
(1, 1, 1, '2015-01-08 09:11:00', '2015-01-08 10:11:00'),
(1, 3, 1, '2015-01-08 10:11:00', '2015-01-08 11:11:00'),
(1, 1, 2, '2015-01-08 11:11:00', '2015-01-08 12:11:00'),

-- Patient 2 is out of ICU, he gets inserted in ICU, 
-- changes some beds and he is back out
(2, 1, 2, '2015-01-06 06:00:00', '2015-01-07 06:04:00'),
(2, 1, 1, '2015-01-07 06:04:00', '2015-01-07 07:08:00'),
(2, 3, 1, '2015-01-07 07:08:00', '2015-01-08 08:11:00'),
(2, 1, 2, '2015-01-08 08:11:00', '2015-01-08 09:11:00'),

-- Patient 3 is not inserted in ICU
(3, 1, 2, '2015-01-08 08:10:00', '2015-01-09 09:00:00'),
(3, 2, 2, '2015-01-09 09:00:00', '2015-01-10 10:01:00'),
(3, 3, 2, '2015-01-10 10:01:00', '2015-01-11 12:34:00'),
(3, 4, 2, '2015-01-11 12:34:00', NULL),

-- Patient 4 is out of ICU, he gets inserted in ICU without changing any beds
-- and goes back out.
(4, 1, 2, '2015-01-06 06:00:00', '2015-01-07 06:04:00'),
(4, 2, 1, '2015-01-07 06:04:00', '2015-01-07 07:08:00'),
(4, 1, 2, '2015-01-07 07:08:00', '2015-01-08 09:11:00'),

-- Patient 5 is out of ICU, he gets inserted in ICU without changing any beds
-- and he gets dismissed.
(5, 1, 2, '2015-01-06 06:00:00', '2015-01-07 06:04:00'),
(5, 3, 2, '2015-01-07 06:04:00', '2015-01-07 07:08:00'),
(5, 1, 1, '2015-01-07 07:08:00', '2015-01-08 09:11:00'),

-- Patient 6 is inserted in ICU and he is still there
(6, 1, 1, '2015-01-11 12:34:00', NULL);

In the real table the rows are not consecutive but for each patient the discharge timestamp from one row == the admission timestamp of the next row.

SQLFiddle

http://sqlfiddle.com/#!2/b5fe5

Expected Result

I would like to write something like the following:

SELECT pid, ward_id, admitted, discharged
FROM  (....)
WHERE ward_id = 1;

(1, 1, '2015-01-06 06:05:00', '2015-01-08 08:11:00'),
(1, 1, '2015-01-08 09:11:00', '2015-01-09 11:11:00'),
(2, 1, '2015-01-07 06:04:00', '2015-01-08 08:11:00'),
(4, 1, '2015-01-07 06:04:00', '2015-01-07 07:08:00'),
(5, 1, '2015-01-07 07:08:00', '2015-01-08 09:11:00'),
(6, 1, '2015-01-11 12:34:00', NULL);

Please, do take note that we can't group by patient_id. We must retrieve a separate record for each ICU visit.

To put it more plainly, if a patient spends time in ICU, then moves out of it and then returns back there, I need to retrieve the total time he spent in each ICU visit (i.e. two records)

Best Answer

Query 1, tested in SQLFiddle-1

SET @ward_id_to_check = 1 ;

SELECT
    st.patient_id,
    st.bed_id AS starting_bed_id,          -- the first bed a patient uses
                                           -- can be omitted
    st.admitted,
    MIN(en.discharged) AS discharged
FROM
  ( SELECT patient_id, bed_id, admitted, discharged
    FROM t 
    WHERE t.ward_id = @ward_id_to_check
      AND NOT EXISTS
          ( SELECT * 
            FROM t AS prev 
            WHERE prev.ward_id = @ward_id_to_check
              AND prev.patient_id = t.patient_id
              AND prev.discharged = t.admitted
          )
  ) AS st
JOIN
  ( SELECT patient_id, admitted, discharged
    FROM t 
    WHERE t.ward_id = @ward_id_to_check
      AND NOT EXISTS
          ( SELECT * 
            FROM t AS next 
            WHERE next.ward_id = @ward_id_to_check
              AND next.patient_id = t.patient_id
              AND next.admitted = t.discharged
          )
  ) AS en
    ON  st.patient_id = en.patient_id
    AND st.admitted <= en.admitted
GROUP BY
    st.patient_id,
    st.admitted ;

Query 2, which is the same as 1 but without the derived tables. This will probably have better execution plan, with proper indexes. Test in SQLFiddle-2:

SET @ward_id_to_check = 1 ;

SELECT
    st.patient_id,
    st.bed_id AS starting_bed_id,
    st.admitted,
    MIN(en.discharged) AS discharged
FROM
    t AS st    -- starting period
  JOIN
    t AS en    -- ending period
      ON  en.ward_id = @ward_id_to_check
      AND st.patient_id = en.patient_id
      AND NOT EXISTS
          ( SELECT * 
            FROM t AS next 
            WHERE next.ward_id = @ward_id_to_check
              AND next.patient_id = en.patient_id
              AND next.admitted = en.discharged
          )
      AND st.admitted <= en.admitted
WHERE 
      st.ward_id = @ward_id_to_check
  AND NOT EXISTS
      ( SELECT * 
        FROM t AS prev 
        WHERE prev.ward_id = @ward_id_to_check
          AND prev.patient_id = st.patient_id
          AND prev.discharged = st.admitted
      )
GROUP BY
    st.patient_id,
    st.admitted ;

Both queries assume that there is a unique constraint on (patient_id, admitted). If the server runs with strict ANSI settings, the bed_id should be added in the GROUP BY list.