MySQL – Efficiently Deleting Large Amounts of Data Regularly

MySQLPHP

We have a mysql table that at any given time has about 12 million rows. We need to delete old data to keep the size of the table somewhat manageable.

Right now we are running this query daily, at midnight, using a cron job:

DELETE FROM table WHERE endTime < '1393632001'

The last time the query ran it examined 4,602,400, took over 3 minutes and CPU went through the roof.

CPU spiking at midnight

What can we do to keep the CPU, synchronous db connections, disk cue depth etc from spiking unreasonably while still clearing old data?

PS: You will notice that the query is actually happening at a fairly inopportune time in our usage cycle. Assume that we have already shifted the query's timing to occur at the lowest point of usage each day. Also, there is no index on "endTime" and I would prefer to keep it that way if possible because there is a ton of data being inserted very regularly, and not much lookup.

Best Answer

The solution to your problem is a MySQL capability called "partitioning". The documentation is here.

What partitioning does is store a single table in separate "partitions". These are defined by a particular expression, usually a column value or range. In your case, this would probably be based on endTime -- assuming that it is known when a record is created and it doesn't change.

You would store a day's worth of endTime in each partition. Then the deletion step would be truncating a partition rather than deleting a bunch of rows in a big table. The partition truncation would be a much faster method.