Innodb – Partitioning and the InnoDB Buffer Pool

buffer-poolinnodbpartitioning

I am trying to maintain a database that contains a table with 40 billion rows (7.2 terabytes) on a server using InnoDB as a storage engine and MariaDB with MySQL 5.5.

When my database reaches about 2.5 terabytes, I can no longer insert data into the table at the rate required in production. Data in the table is rarely queried after 24 hours. The table has a primary key and one secondary index. After doing quite a bit of research, it seems like understanding the InnoDB buffer pool will be critical if I'm going to solve this problem. This is obviously too much data to fit into the buffer pool. I have a few ideas about how I can improve performance by increasing the probability data from the last 24 hours will be in the buffer pool, but it's difficult to test them all with such a large amount of data. How will the InnoDB buffer pool behave in each of the following situations? Is one idea obviously better? Or are they all bad?

  1. Split the large table into partitions by time such that each partition's data and index fit into the buffer pool. – https://mariadb.com/kb/en/partition-maintenance/ suggests that this should improve performance, but I've seen conflicting information about how indexing works for partitioned tables. Is it one massive index? Or several smaller indexes that will fit into the buffer pool? If it's one large index, it's hard to see how this would help.
  2. Create 2 time-partitioned tables. One table will be a large table of archived partitions, and one table will hold only one "active" partition with data that is likely to be queried (maybe a week). When I transition to the next partition in the active table (next week's data), exchange the recently active partition (last week's data) into the archive table. – This seems advantageous because the active table is guaranteed to fit into the buffer pool and queries that might execute a full table scan will not read data that will purge the active data from the buffer pool because the old data is in a different table. However, I'm assuming that when I exchange the recently active partition into the archive table, everything will come to a halt while the index for the large table is read from disk into the buffer pool and recalculated. Then there will be some time where performance suffers after that until the active data makes its way back into RAM.
  3. Create 1 time-partitioned table that holds archived data, and one small table that is minimal in size (probably 24 hours worth of data). Then copy the data more than 24 hours old out of the small table into the partitioned archive table. – It's hard for me to imagine how this could be a good option unless copying data is somehow faster than moving an entire partition.

Any insight is greatly appreciated!

Best Answer

From InnoDB perspective there are no partitions. Partitioning is implemented on the server level and for InnoDB each partition looks like a separate table.

You're going right direction for optimizing writes. If you partition by time the most recent partition will be one that gets new writes. It will be cached in the buffer pool and that'll be good for performance. If the partition is small enough to fit into the buffer pool you will see no reads and there will be as many writes as commits (if innodb_flush_log_at_trx_commit=1) plus some amount for flushing dirty pages. That'll be your capacity limit.

Before you partition do a research with your read queries. The secondary index was created for some reason and I assume the read queries benefit from it. After you partition, and if the read query doesn't change, MySQL will have to do as many index searches as number of partitions. So, the read query will be N times slower (I assume an index search time on a partition and the big table is same which is not strictly true, but ok for rough estimation).

Use EXPLAIN PARTITIONS SELECT... to see what partitions MySQL hits to execute the query. If you can change the SELECT so it hits only one partition (by adding PK clause) that'd be ideal situation.