Mysql – Fetch the newest row grouped by a column in MySQL

MySQLquery-refactorselect

My problem seems like it should have a much simpler solution than what I have come up with. Starting with this data set:

log_table

+--------+-----------+------------------+---------+
| log_id | entity_id |       date       | comment |
+--------+-----------+------------------+---------+
|      1 | A         | 2012-10-23 07:50 | foo     |
|      2 | B         | 2012-10-23 07:59 | bar     |
|      3 | B         | 2012-10-23 08:11 | baz     |
|      4 | A         | 2012-10-23 08:23 | bat     |
+--------+-----------+------------------+---------+

Say I wanted to get the latest date of log entries for each entity so that the result looked like:

Results:
+-----------+------------------+--------------+
| entity_id |  last_log_date   | last_comment |
+-----------+------------------+--------------+
| B         | 2012-10-23 08:11 | baz          |
| A         | 2012-10-23 08:23 | bat          |
+-----------+------------------+--------------+

I'm currently using MySQL that looks something like:

SELECT
  `entity_id`,
  `date` AS last_log_date,
  `comment` AS last_comment
FROM (
  SELECT *
  FROM `log_table`
  ORDER BY `date` DESC, log_id ASC
) AS `ordered_log`
GROUP BY `entity_id`

This works fine but it doesn't seem very efficient to me, there has to be a better way of doing this, surely?

Best Answer

Create a Subquery that Gathers Keys from the the log_table with Maximum Date Per Entity. Then, perform an INNER JOIN of that Subquery back to the log_table.

SELECT
    B.entity_id,B.last_log_date,B.last_comment
FROM
(
    SELECT entity_id,MAX(last_log_date) last_log_date
    FROM log_table GROUP BY entity_id
) A INNER JOIN B USING (entity_id,last_log_date);

Give it a Try !!!

You can really speed this up if you have a compound index like this

ALTER TABLE log_table ADD INDEX entity_date_ndx (entity_id,last_log_date);

Indexes on each separate column may yield index merges. This compound index will bypass that.

Please try LEFT JOIN instead of INNER JOIN

SELECT
    B.entity_id,B.last_log_date,B.last_comment
FROM
(
    SELECT entity_id,MAX(last_log_date) last_log_date
    FROM log_table GROUP BY entity_id
) A LEFT JOIN B USING (entity_id,last_log_date);