Mysql – Table for actual values

data-warehousejoin;MySQL

I have a database where logging happens and is processed in a data-warehouse styled environment. This means there are staging cycles where the data comes in and then is processed. For a reporting system, some log entries for certain parameters need to read, although it can happen that there are no entries in the staging cycle. Therefore there is a table which contains a value for every parameter – the last one which has been read. After the processing finished, the table is updated via

UPDATE jobcontrol.actual_values a
JOIN jobcontrol.processing p
SET a.value = p.value 
WHERE p.t_ns = (select max(p.t_ns) from jobcontrol.processing p WHERE a.para_id = p.para_id) 
AND a.para_id = p.para_id;  

This seems improvable to me, although my try to simplify it via a real join didn't quite work.

Do you have any suggestions on how this could work more easily (and faster)?

-> Explain Statement (This was run on a test table with 36 rows for 8 different parameters)

+----+--------------------+-------+--------+---------------+---------+---------+----------------------+------+-------------+
| id | select_type        | table | type   | possible_keys | key     | key_len | ref                  | rows | Extra       |
+----+--------------------+-------+--------+---------------+---------+---------+----------------------+------+-------------+
|  1 | PRIMARY            | p     | ALL    | PRIMARY       | NULL    | NULL    | NULL                 |   36 | NULL        |
|  1 | PRIMARY            | a     | eq_ref | PRIMARY       | PRIMARY | 4       | jobcontrol.p.para_id |    1 | Using where |
|  2 | DEPENDENT SUBQUERY | p     | ref    | PRIMARY       | PRIMARY | 4       | jobcontrol.a.para_id |    2 | Using index |
+----+--------------------+-------+--------+---------------+---------+---------+----------------------+------+-------------+

-> SHOW CREATE TABLE

CREATE TABLE `processing` (
  `para_id` int(10) unsigned NOT NULL DEFAULT '0',
  `t_ns` bigint(20) unsigned NOT NULL DEFAULT '0',
  `id_inst` smallint(6) NOT NULL DEFAULT '1',
  `value` varchar(255) NOT NULL DEFAULT '',
  `isanchor` tinyint(4) unsigned NOT NULL DEFAULT '1',
  PRIMARY KEY (`para_id`,`t_ns`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 

MySQL Version is 5.6

Best Answer

I edited the Query to the following:

EXPLAIN UPDATE jobcontrol.actual_values a
    JOIN jobcontrol.processing p ON a.para_id = p.para_id
    SET a.value = (select p.value from jobcontrol.processing p WHERE a.para_id = p.para_id ORDER BY t_ns DESC LIMIT 1) 
  • The JOIN ON makes the recordset smaller (DB checks only the parameters which are also in processing, not every single one. The EXPLAIN however indicates that we at least use an index now)
  • In the SELECT : Replacing the MAX() with a ORDER BY ... DESC LIMIT 1 should boost the performance a bit, especially for big tables. If MAX() uses a filesort (like ORDER BY ... DESC does) nothing should change.

    +----+--------------------+-------+--------+---------------+---------+---------+----------------------+------+-----------------------------+ 
    | id | select_type        | table | type   | possible_keys | key     | key_len | ref                  | rows | Extra                       |
    +---+--------------------+-------+--------+---------------+---------+---------+----------------------+------+-----------------------------+
    |  1 | PRIMARY            | p     | index  | PRIMARY       | PRIMARY | 12      | NULL                 |   36 | Using index                 |
    |  1 | PRIMARY            | a     | eq_ref | PRIMARY       | PRIMARY | 4       | jobcontrol.p.para_id |    1 | NULL                        |
    |  2 | DEPENDENT SUBQUERY | p     | ref    | PRIMARY       | PRIMARY | 4       | jobcontrol.a.para_id |    2 | Using where; Using filesort |
    +----+--------------------+-------+--------+---------------+---------+---------+----------------------+------+-----------------------------+