I have a query to bin some log lines on a pre-processed "timeslice" field (this is unique every 15 minutes, for example "2018-04-11-15-3-3"
) There are approximately 6,000,000 rows in the database.
The performance seems slow and inconsistent; sometimes it will return a single timeslice and take 0.5 seconds, sometimes it will return thousands and take 0.0002 seconds.
SELECT count(*), time_slice
FROM logs
WHERE deployment_id = 123
and type="FOO"
and received_at BETWEEN "2019-04-01 00:00:00" AND "2019-04-30 23:59:59"
GROUP BY time_slice
timeslice
is aVARCHAR(18)
deployment_id
is anINT(10)
type
is aVARCHAR(16)
received_at
is aDATETIME
I've created an index on each of those columns, which helps. It's still very inconsistent.
EXPLAIN SELECT
tells me that it's using the deployment_id
and type
indexes.
Alternatively, is there a better way to bin lines on a 15 minute interval?
Best Answer
Most likely, the EXPLAIN plan is probably stating that the query will perform an index merge. That's when two indexes are searched and the intersection between them yields the results.
SUGGESTION
You can probably achieve slight better performance by creating one index as follows:
This will read from the index only to generate results. It should not have to read from the table. The reason ? Every column in the query is in the index.
TRADEOFF
This will make the table bigger but can improve read speed.
If the performance is the same after creating the index, you can probably do without the index at this time.
GIVE IT A TRY !!!