Since gadget_id is a vehicle, you need to monitor two things as you look at each row
- when a gadget_id switches location
- when a gadget_id switches to another gadget_id
The solution lies in organizing a set of user variables to monitor that change. Please forgive you are about to see:
First, let's load your data in the test database in a table called gadget_location
:
mysql> use test
Database changed
mysql> drop table gadget_location;
Query OK, 0 rows affected (0.07 sec)
mysql> create table gadget_location
-> (
-> id int not null auto_increment,
-> gadget_id int,
-> location varchar(30),
-> submitted_date date,
-> primary key (id)
-> );
Query OK, 0 rows affected (0.13 sec)
mysql> insert into gadget_location (gadget_id,location,submitted_date) values
-> (1,'calicut' , '2012-07-15'), (1,'calicut' , '2012-07-14'),
-> (1,'calicut' , '2012-07-13'), (2,'thrissur', '2012-07-12'),
-> (1,'calicut' , '2012-07-11'), (1,'kannur' , '2012-07-10'),
-> (2,'thrissur', '2012-07-09'), (1,'calicut' , '2012-07-08');
Query OK, 8 rows affected (0.05 sec)
Records: 8 Duplicates: 0 Warnings: 0
mysql> select * from gadget_location;
+----+-----------+----------+----------------+
| id | gadget_id | location | submitted_date |
+----+-----------+----------+----------------+
| 1 | 1 | calicut | 2012-07-15 |
| 2 | 1 | calicut | 2012-07-14 |
| 3 | 1 | calicut | 2012-07-13 |
| 4 | 2 | thrissur | 2012-07-12 |
| 5 | 1 | calicut | 2012-07-11 |
| 6 | 1 | kannur | 2012-07-10 |
| 7 | 2 | thrissur | 2012-07-09 |
| 8 | 1 | calicut | 2012-07-08 |
+----+-----------+----------+----------------+
8 rows in set (0.00 sec)
mysql>
OK, here comes the mess:
SET @dupcount = 0;
SET @group_number = 0;
SET @cur_gadget_id = 0;
SET @cur_location = MD5("1");
SET @cur_gadget_location = MD5("1");
SELECT gadget_id,location,GroupNumber,COUNT(1) DaysStopped FROM
(
SELECT
*,
@dupcount := IF(@cur_gadget_location=gadget_location,@dupcount+1,1) Dup_Count,
@group_number := IF(@cur_gadget_location=gadget_location,@group_number,@group_number+1) GroupNumber,
@cur_gadget_location := gadget_location
FROM
(
SELECT *,CONCAT(gadget_id,'-',cur_loc) gadget_location FROM
(
SELECT *,
@cur_location := IF(MD5(location)=@cur_location,@cur_location,MD5(location)) cur_loc,
@cur_gadget_id := IF(gadget_id=@cur_gadget_id,@cur_gadget_id,gadget_id) cur_gadget
FROM gadget_location
) AAA
) AA ) A GROUP BY gadget_id,location,GroupNumber;
Want to see it work ??? Here it goes:
mysql> SET @dupcount = 0;
Query OK, 0 rows affected (0.00 sec)
mysql> SET @group_number = 0;
Query OK, 0 rows affected (0.00 sec)
mysql> SET @cur_gadget_id = 0;
Query OK, 0 rows affected (0.00 sec)
mysql> SET @cur_location = MD5("1");
Query OK, 0 rows affected (0.00 sec)
mysql> SET @cur_gadget_location = MD5("1");
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT gadget_id,location,GroupNumber,COUNT(1) DaysStopped FROM
-> (
-> SELECT
-> *,
-> @dupcount := IF(@cur_gadget_location=gadget_location,@dupcount+1,1) Dup_Count,
-> @group_number := IF(@cur_gadget_location=gadget_location,@group_number,@group_number+1) GroupNumber,
-> @cur_gadget_location := gadget_location
-> FROM
-> (
-> SELECT *,CONCAT(gadget_id,'-',cur_loc) gadget_location FROM
-> (
-> SELECT *,
-> @cur_location := IF(MD5(location)=@cur_location,@cur_location,MD5(location)) cur_loc,
-> @cur_gadget_id := IF(gadget_id=@cur_gadget_id,@cur_gadget_id,gadget_id) cur_gadget
-> FROM gadget_location
-> ) AAA
-> ) AA ) A GROUP BY gadget_id,location,GroupNumber;
+-----------+----------+-------------+-------------+
| gadget_id | location | GroupNumber | DaysStopped |
+-----------+----------+-------------+-------------+
| 1 | calicut | 1 | 3 |
| 1 | calicut | 3 | 1 |
| 1 | calicut | 6 | 1 |
| 1 | kannur | 4 | 1 |
| 2 | thrissur | 2 | 1 |
| 2 | thrissur | 5 | 1 |
+-----------+----------+-------------+-------------+
6 rows in set (0.02 sec)
mysql>
According to this output, here is what you have:
Gadget 1
- Group 1 : stopped at
calicut
for 3 days
- Group 3 : left and came back for 1 day
- Group 4 : left
calicut
and went to kannur
for 1 day
- Group 6 : left
kannur
and went to calicut
for 1 day
Gadget 2
- Group 2 : stopped for
thrissur
for 1 day
- Group 5 : left and came back for 1 day
For some reason, the GroupNumbers came out different in SQLFiddle
`. Notwithstanding, the rest of the output is the same.
I hope this is right ...
Query 1, tested in SQLFiddle-1
SET @ward_id_to_check = 1 ;
SELECT
st.patient_id,
st.bed_id AS starting_bed_id, -- the first bed a patient uses
-- can be omitted
st.admitted,
MIN(en.discharged) AS discharged
FROM
( SELECT patient_id, bed_id, admitted, discharged
FROM t
WHERE t.ward_id = @ward_id_to_check
AND NOT EXISTS
( SELECT *
FROM t AS prev
WHERE prev.ward_id = @ward_id_to_check
AND prev.patient_id = t.patient_id
AND prev.discharged = t.admitted
)
) AS st
JOIN
( SELECT patient_id, admitted, discharged
FROM t
WHERE t.ward_id = @ward_id_to_check
AND NOT EXISTS
( SELECT *
FROM t AS next
WHERE next.ward_id = @ward_id_to_check
AND next.patient_id = t.patient_id
AND next.admitted = t.discharged
)
) AS en
ON st.patient_id = en.patient_id
AND st.admitted <= en.admitted
GROUP BY
st.patient_id,
st.admitted ;
Query 2, which is the same as 1 but without the derived tables. This will probably have better execution plan, with proper indexes. Test in SQLFiddle-2:
SET @ward_id_to_check = 1 ;
SELECT
st.patient_id,
st.bed_id AS starting_bed_id,
st.admitted,
MIN(en.discharged) AS discharged
FROM
t AS st -- starting period
JOIN
t AS en -- ending period
ON en.ward_id = @ward_id_to_check
AND st.patient_id = en.patient_id
AND NOT EXISTS
( SELECT *
FROM t AS next
WHERE next.ward_id = @ward_id_to_check
AND next.patient_id = en.patient_id
AND next.admitted = en.discharged
)
AND st.admitted <= en.admitted
WHERE
st.ward_id = @ward_id_to_check
AND NOT EXISTS
( SELECT *
FROM t AS prev
WHERE prev.ward_id = @ward_id_to_check
AND prev.patient_id = st.patient_id
AND prev.discharged = st.admitted
)
GROUP BY
st.patient_id,
st.admitted ;
Both queries assume that there is a unique constraint on (patient_id, admitted)
. If the server runs with strict ANSI settings, the bed_id
should be added in the GROUP BY
list.
Best Answer
Here's how it works. We define three variables. One holding a group number and two holding the values of the previous row values of status and userId. Note, that in a relational database there is no order unless you specify it. This is very important. In the select clause, we first check, if the variable values differ from the current row. If yes, we increment the group number, if not we leave it as it is. After that, we assign the values of the current row. So when the variables are evaluated when the next row is processed, they still hold the values of the previous row. So the order is important here, too. In the outer query we simply can group by this group number to get the min and max values of the timestamp.