MySQL – Best Indexing Strategy for a Query with Equality [A], Range [B], Group By [C], and Order By [count(P)]

index, index-tuning, mariadb, MySQL, performance, query-performance

I have a poorly performing query:

SELECT  user_id, count(item_id) as count
FROM items
WHERE category = 'magazine'
AND created_at > 1384754400
GROUP BY user_id
ORDER BY count(item_id) desc
LIMIT 100

What's the optimal indexing strategy for this query?

Table Details

500 million records with the following structure / cardinalities:

  • PRIMARY KEY (item_id) – cardinality: 500 M
  • user_id – cardinality: ~ 25 M
  • category – cardinality: ~ 2.5 M
  • created_at – cardinality: ~ 150 M

Indexing:

  • I have individual indexes on each of the user_id, category and created_at fields

I also have the following composite (multi-column) indexes:

  • (category, user_id) – this is the one the query optimizer defaults to when running explain
  • (category, created_at)
  • (category, created_at, user_id) – this is one I attempted to create in order to optimize this query, however, it doesn't seem to be working very well.

Best Answer

If you ONLY want to optimise for this query, this is the best index:

ALTER TABLE items ADD INDEX (category, created_at, user_id)

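This makes the most of the filters: the equality column (category) comes first and the range column (created_at) second, which reduces the total amount of data you touch. By adding user_id at the end of the index, you make the index covering, which saves you a lookup into the primary index. item_id does not need to be listed explicitly: assuming the table is InnoDB, every secondary index already carries the primary key columns.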
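To check that the optimiser really uses this index and that it is covering, something like the sketch below can help. The index name idx_cat_created_user is hypothetical (the ALTER TABLE above lets MySQL choose a name), and FORCE INDEX is only there to take the optimiser's own choice out of the picture while testing.

```sql
-- Find the real name of the (category, created_at, user_id) index first.
SHOW INDEX FROM items;

-- idx_cat_created_user is a hypothetical name; substitute the Key_name reported above.
EXPLAIN
SELECT user_id, COUNT(item_id) AS count
FROM items FORCE INDEX (idx_cat_created_user)
WHERE category = 'magazine'
  AND created_at > 1384754400
GROUP BY user_id
ORDER BY count DESC
LIMIT 100;
```

In the EXPLAIN output, the key column should show the index, and "Using index" in the Extra column indicates that the query is answered from the index alone. "Using temporary" and "Using filesort" will most likely still appear for the GROUP BY and ORDER BY steps.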

We can assume that item_id is NOT NULL (as it is the PRIMARY KEY), so COUNT(item_id) returns the same value as COUNT(*).
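The NOT NULL point matters because COUNT(column) skips NULL values while COUNT(*) counts every row. A minimal illustration on a throwaway table (demo_counts and its columns are hypothetical names, not part of the original schema):

```sql
-- Throwaway table: id can never be NULL, note can.
CREATE TEMPORARY TABLE demo_counts (
  id   INT NOT NULL PRIMARY KEY,
  note VARCHAR(10) NULL
);

INSERT INTO demo_counts (id, note) VALUES (1, 'a'), (2, NULL), (3, 'c');

SELECT COUNT(*)    AS all_rows,        -- 3: every row
       COUNT(id)   AS non_null_ids,    -- 3: same as COUNT(*), id is NOT NULL
       COUNT(note) AS non_null_notes   -- 2: the row with a NULL note is skipped
FROM demo_counts;
```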

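However, because the MySQL optimiser is pretty stupid, you may need to rewrite the query like this, so that the inner GROUP BY follows the column order of the index and the outer query only has to aggregate the smaller intermediate result: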

SELECT user_id, SUM(cnt) AS count
FROM
(
  -- Pre-aggregate in index order: (category, created_at, user_id)
  SELECT category, created_at, user_id, COUNT(*) AS cnt
  FROM items
  WHERE category = 'magazine'
  AND created_at > 1384754400
  GROUP BY category, created_at, user_id
) AS d
-- Roll the partial counts up to one row per user
GROUP BY user_id
ORDER BY count DESC
LIMIT 100
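As a sanity check that the two-level aggregation returns the same counts as the original query, here is a small sketch on a throwaway table. demo_items and its six rows are made up for illustration; only the column names follow the question.

```sql
CREATE TEMPORARY TABLE demo_items (
  item_id    INT NOT NULL PRIMARY KEY,
  user_id    INT NOT NULL,
  category   VARCHAR(20) NOT NULL,
  created_at INT NOT NULL
);

INSERT INTO demo_items (item_id, user_id, category, created_at) VALUES
  (1, 10, 'magazine', 1384754401),
  (2, 10, 'magazine', 1384754401),
  (3, 10, 'magazine', 1384754500),
  (4, 20, 'magazine', 1384754600),
  (5, 20, 'book',     1384754700),  -- wrong category, filtered out
  (6, 30, 'magazine', 1384754000);  -- too old, filtered out

-- Original form: one row per user.
SELECT user_id, COUNT(item_id) AS count
FROM demo_items
WHERE category = 'magazine'
  AND created_at > 1384754400
GROUP BY user_id
ORDER BY count DESC;

-- Rewritten form: partial counts per (category, created_at, user_id), then summed per user.
SELECT user_id, SUM(cnt) AS count
FROM (
  SELECT category, created_at, user_id, COUNT(*) AS cnt
  FROM demo_items
  WHERE category = 'magazine'
    AND created_at > 1384754400
  GROUP BY category, created_at, user_id
) AS d
GROUP BY user_id
ORDER BY count DESC;
```

Both statements return user 10 with a count of 3 and user 20 with a count of 1. The toy table says nothing about performance; on the real table, EXPLAIN and timing are the way to compare the two forms.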