MySQL – How to Execute Complex Nested Select SQL

MySQL

I have two tables, table1 is as following, id is primary key, table2_id is foreign key

id date                 aid table2_id message
1  2014-03-19 16:21:02  121   1       login from xxxx
2  2014-03-20 14:00:32  123   2       decrease budget
3  2014-03-20 18:00:32  121   2       increase budget
4  2014-03-21 16:21:02  121   3       login from xxxx
5  2014-03-21 16:21:02  121   3       login from xxxx
6  2014-03-22 10:21:02  121   2       increase budget

table2 is a type table

id type
1  login
2  change_budget
3  search

I need to analyse the relationship between activity change_budget and other activity. I am not sure how could I deal with the big data solution, but, anyway, following is an idea of mine:

I try to write SQL for each time one aid change_budget, count the all other actions of this aid within one day. if one aid have a change_budget at 2014-03-14 16:21:02, I need all the action of this aid from 2014-03-13 16:21:02 to 2014-03-14 16:21:02. Then, group by the action and count.and selected table will be look as following:

aid login search change_budget
121 1     0      1
123 0     0      -1
121 0     2      1 

-1 means change_budget is decrease,1 means increase.
Other number is how many time this action of this aid has happen within oneday before this aid change_budget happens.

Anyone can teach me how to write this?

Best Answer

I have tested this query:

SELECT budget.aid,
       sum(CASE change_type.table2_id WHEN 1 THEN 1 ELSE 0 END) as login,
       sum(CASE change_type.table2_id WHEN 3 THEN 1 ELSE 0 END) as search,
       CASE budget.table2_id WHEN 2 THEN IF(LEFT(budget.message, 8) = 'increase', 1, -1) ELSE 0 END as budget
FROM table1 budget  
LEFT JOIN table1 change_type 
ON budget.aid = change_type.aid 
AND change_type.date BETWEEN budget.date - INTERVAL 1 DAY AND budget.date
AND change_type.table2_id <> 2
WHERE budget.table2_id = 2
GROUP BY budget.aid, budget.date;

It was a fun query, I made it with joins, maybe the subselect version could be easier to understand. I perform SELF join to match budgets with other actions in the range requested. Please note that it filters other budget actions and that you will get an overlap if at least 2 budget actions are performed within 1 day.

To fill the second and third column, a sum of a CASE is performed after the grouping by aid and the reference date.

This is the result:

+-----+-------+--------+--------+
| aid | login | search | budget |
+-----+-------+--------+--------+
| 121 |     0 |      0 |      1 |
| 121 |     0 |      2 |      1 |
| 123 |     0 |      0 |     -1 |
+-----+-------+--------+--------+
3 rows in set (0.00 sec)

Please note that no login for the aid 123 so this query returns the right results. There is a mistake on your example. Alternatively, if you do not want to have into account the aid for the actions, then the result would be:

+-----+-------+--------+--------+
| aid | login | search | budget |
+-----+-------+--------+--------+
| 121 |     0 |      0 |      1 |
| 121 |     0 |      2 |      1 |
| 123 |     1 |      0 |     -1 |
+-----+-------+--------+--------+
3 rows in set (0.00 sec)

To obtain this, remove the budget.aid = change_type.aid AND from my query.

Also please note that no query has been done on table2, so I assume values are hardcoded and do not change. You can expand the query to join the other table if that is not correct.