MySQL JOIN two tables and get latest result

MySQLoptimizationperformancequery-performanceschema

I want to JOIN two tables and get the latest result from each one of those two in a single table. I'm kind of concerned about speed too since tables are growing kinda fast. Close to 60-70k records a day. Later on i will go into partitioning but that is another issue. Now i have a main table with the devices info.

+--------+-----------+---------+
|     id |    Name   | type    |
+--------+-----------+---------+
|      1 | Oh        | A       |
|      2 | This      | A       |
|      3 | Is        | B       |
|      4 | Hard      | A       |
+--------+-----------+---------+

According to the type they have some data in different tables
Type A is

+--------+-----------+------------------+---------+---------+
|     id | device_id |   stats_time     | status  |  noise  |
+--------+-----------+------------------+---------+---------+
|      1 | 1         | 2012-10-23 07:50 | foo     |   10    |
|      2 | 1         | 2012-10-23 16:59 | bar     |   12    |
|      3 | 2         | 2012-10-23 15:11 | bar     |   0     |
|      4 | 4         | 2012-10-23 23:23 | foo     |   25    |
+--------+-----------+------------------+---------+---------+

Type B is

+--------+-----------+------------------+---------+---------+
|     id | device_id |   stats_time     | status  |  signal |
+--------+-----------+------------------+---------+---------+
|      1 | 3         | 2012-10-23 04:50 | foo     |  1000   |
|      2 | 3         | 2012-10-23 05:59 | bar     |  450    |
|      3 | 3         | 2012-10-23 09:11 | bar     |  980    |
|      4 | 3         | 2012-10-23 10:23 | foo     |   0     |
+--------+-----------+------------------+---------+---------+

I've been busting my head for a query to end up with something like this

+--------+-----------+------------------+---------+---------+---------+
|     id | device_id |   stats_time     | status  |  signal |   noise |
+--------+-----------+------------------+---------+---------+---------+
|      1 | 1         | 2012-10-23 16:59 | bar     |  12     |         |
|      2 | 2         | 2012-10-23 15:11 | bar     |  0      |         |
|      3 | 3         | 2012-10-23 10:23 | foo     |         |    0    |
|      4 | 4         | 2012-10-23 23:23 | foo     |  25     |         |
+--------+-----------+------------------+---------+---------+---------+

Using the below query is not good since i get two columns of stats_time

SELECT devices.id AS id, A.stats_time , B.stats_time
FROM devices 
LEFT JOIN A ON devices.id = A.device_id 
LEFT JOIN B ON devices.id = B.device_id 
GROUP BY devices.id

Before i ended up using different tables for the device types i used to get the results with the following but ended up going real slow

SELECT *
FROM (
    SELECT *
    FROM A
    ORDER BY stats_time DESC, id ASC
) AS d
RIGHT JOIN devices ON A.device_id = devices.id
GROUP BY devices.id

Best Answer

I see it as 2 steps:

  1. Build tables with just the latest signal (or noise) for each device
  2. JOIN or UNION the two tables.

Step 1 is a variant of groupwise max:

SELECT  device_id, stats_time, status, noise -- The desired columns
    FROM  ( SELECT  @prev := '' ) init
    JOIN  ( SELECT
                device_id != @prev AS first, -- `device_id` is the 'GROUP BY'
                @prev := device_id,          -- the 'GROUP BY'
                device_id, stats_time, status, noise -- Also the desired columns
            FROM  TableA -- The table
            ORDER BY  device_id  DESC, -- The 'GROUP BY'
                      stats_time DESC  -- to get latest
      ) x
    WHERE  first; 

This may be beneficial to performance:

INDEX(device_id, stats_time)

Ditto for TableB and signal. Manually run them to see if I got them right.

Your example does not show a case where both signal and noise exist for the same device_id. I will assume that is really the case, hence UNION:

Step 2:

SELECT device_id, stats_time, status, signal, noise
    FROM
    ( SELECT device_id, stats_time, status, signal, '' AS noise
        ... (the rest of the signal query)
    )
    UNION ALL
    ( SELECT device_id, stats_time, status, '' AS signal, noise
        ... (the rest of the noise query)
    );