Mysql – Finding most recent entry per group running very slowly

greatest-n-per-groupMySQL

I have a table (notifications) containing about 300k entries. Each entry relates to a price change on one of about 500 different products (Trigger_ASIN).

I need to frequently pull the most recent notifications entry for each Trigger_ASIN. (I'll be using the resulting query to look up related details in another table)

My query as it currently stands is:

SELECT n1.Trigger_ASIN, n1.UniqueId
from notifications n1
  left outer join notifications n2 on n1.Trigger_ASIN = n2.Trigger_ASIN
and n1.Trigger_TimeOfOfferChange < n2.Trigger_TimeOfOfferChange
WHERE n2.UniqueId is NULL
limit 20;

When limited to 20 records, it runs in 5secs (0 secs to fetch).
Limited to 40 records it takes 15 secs.
Limited to 80 records it take 125 secs.
By the time I get to my c.500 different products it's so slow it's unusable.

The EXPLAIN output on the query shows that an index is being used (one one side of the join at least).

1   SIMPLE  n1  index       notifications__index_most_recent2   221     302974  Using index
1   SIMPLE  n2  ref ix_notifications_Trigger_ASIN,notifications_Trigger_TimeOfOfferChange_index,notifications__index_most_recent_entry,notifications__index_most_recent2    notifications__index_most_recent_entry  63  amz.n1.Trigger_ASIN 240 Using where; Not exists; Using index

I've looked through a number of related questions, but still haven't managed to find the reason for the query running so slowly.

Can anyone see what I'm doing wrong to cause this to run so slowly?

For reference, the notifications table and associated indices is are described below.

Notifications Table:

    -- auto-generated definition
CREATE TABLE notifications
(
  UniqueId                  VARCHAR(50) NOT NULL
    PRIMARY KEY,
  PublishTime               DATETIME    NULL,
  SellerId                  VARCHAR(30) NULL,
  MarketplaceId             VARCHAR(20) NULL,
  Trigger_MarketplaceId     VARCHAR(20) NULL,
  Trigger_ASIN              VARCHAR(20) NULL,
  Trigger_ItemCondition     VARCHAR(50) NULL,
  Trigger_TimeOfOfferChange DATETIME    NULL
)
  ENGINE = InnoDB
  CHARSET = utf8;

CREATE INDEX ix_notifications_PublishTime
  ON notifications (PublishTime);

CREATE INDEX ix_notifications_SellerId
  ON notifications (SellerId);

CREATE INDEX notifications__index_most_recent_entry
  ON notifications (Trigger_ASIN, Trigger_TimeOfOfferChange, UniqueId, Trigger_ItemCondition);

CREATE INDEX ix_notifications_Trigger_ASIN
  ON notifications (Trigger_ASIN);

CREATE INDEX ix_notifications_Trigger_ItemCondition
  ON notifications (Trigger_ItemCondition);

CREATE INDEX notifications__index_most_recent2
  ON notifications (Trigger_TimeOfOfferChange, Trigger_ASIN, UniqueId);

CREATE INDEX notifications_Trigger_TimeOfOfferChange_index
  ON notifications (Trigger_TimeOfOfferChange);

Best Answer

SELECT n1.Trigger_ASIN
     , n1.UniqueId
FROM notifications n1
JOIN ( SELECT n2.Trigger_ASIN
            , MAX(n2.Trigger_TimeOfOfferChange) Trigger_TimeOfOfferChange 
       FROM notifications n2 
       GROUP BY n2.Trigger_ASIN ) n3
WHERE n1.Trigger_ASIN = n3.Trigger_ASIN
  AND n1.Trigger_TimeOfOfferChange = n3.Trigger_TimeOfOfferChange
/* ORDER BY something -- obligatorily, limitation does not make sense without it!!! */ 
LIMIT 20;

And ensure the notifications__index_most_recent_entry index is used.