I have an InnoDB MySQL table of this structure:
column | type |
---|---|
service | varchar(16) |
url | varchar(255) |
datetime | datetime |
It contains logs for requests made to one of multiple services. It contains ~3 million rows, and gets ~3 million more rows a month (but I will delete old data if need be).
I am trying to generate a report, and get the number of requests made to each service within a date range.
Here is my current query:
SELECT service, COUNT(*) as lastMonthCount
FROM request_logs
WHERE datetime > '2021-02-16 10:51:05'
GROUP BY service
This works, but is painfully slow (~28 seconds).
It outputs this:
service | lastMonthCount |
---|---|
API A | 3056752 |
API B | 38451 |
I have indexes on datetime
and service
. I can see they are of type BTREE
.
How can I radically speed up this query, or restructure my table/indexes so I can achieve the same use case another way?
Best Answer
For a relatively simple query like that there isn't a lot you can do to optimise it as there is little room for change (there isn't a simple way of asking for that data).
You can probably reduce the amount of pages being touched as it runs significantly by having an index on
datetime
andservice
. This way the data it needs to group by will already be available in what it has read to perform the filter on the date. This will increase the amount of data on disk as the index will be larger, and slow down writes a touch for the same reason. You would probably want to replace the existing index ondatetime
with the new composite index instead of just adding it, for those reasons.A little more detail on why this will be faster:
With the index just on the
datetime
, it will need to read the base table data pages to get theservice
column for each matching row which as well as being extra page reads just because of referencing the other structure it will possibly be many more page reads because those pages contain less rows each due to the larger data per row (they include the URL column and any other properties that you add, or might already have, for the services).If that still isn't fast enough...
You may need to look at some form of caching for the counts. There are several options here: