Mysql – How to efficiently partition a large MySQL (Aurora) table with multiple search columns

MySQLpartitioning

I have a large and rapidly growing table. It takes in about 5k rows / second and that will be doubling in about a month. The table is Amazon's Aurora (heavily modified InnoDB.)

As of today, total aggregate rows is around 4B.

Data is mostly searched on in 6 ways:

  1. count(distinct col1,col2) where timestamp range (1 day range)
  2. count(distinct col1,col2) where timestamp range (30 day range)
  3. count(1) where country IN (list,of,countries) and timestamp range (30 day range)
  4. count(1) where foreign_key = int and timestamp range (1 day range)
  5. count(1) where foreign_key = int and timestamp range (30 day range)
  6. select * where timestamp range (1 day range)

The foreign_key, timestamp and country columns are all indexed.

Data is only really "active" for a 90 day periods with most select's happening between -42 and -12 days.

How do I effectively partition this able to speed up those 6 queries?

And, almost as important, how do I accomplish this with minimal downtime?
(assumption here is to create a new table, move all inserts to the new table & copy old data in… like I would do with any new index)

Best Answer

I would consider using a date range as the partitioning scheme, and splitting by week which you can do using a function against the timestamp field. Using days would create too many partitions, and months would not help your queries much especially when the range spans two months.

Using the range partition, you can create mutliple partitions covering your active time period, and out into the future for a year, then nearer the end of that range you can add partitions to the schema as required.
If necessary, you can also drop the old partitions when they are no longer needed.

I can't prepare a sample partition schema for you, as I'm running short of time at the moment, but the MySQL docs have pretty good coverage for this type of partitioning.

Hope that helps,
Dave