Partitioning a Large Table vs. Splitting into Multiple Tables in MySQL 5.7

mysql-5.7partitioning

We had a MySQL server old enough to not have partitioning enabled, so we decided to take our largest tables and move all the old rows to another table. In case they were needed at some point, they'd be moved to the "recent table", to make its usage faster. Overall, it would mean we'd have:

  • table_old: holding about 25Gb
  • table_recent: holding about 1Gb

We would keep this proportion forever, trying to balance the amount of old data vs new data and keeping only data used last week onwards in the recent one.

This proved to be a little bit difficult to develop in a clean way, but it's working well now. By just editing the "recent date" threshold, we can balance how much data the recent table will hold. At some point, old rows would go to the old table, and if some old data was needed, it would be moved around beforehand.

Now, we moved to MySQL 5.7, and partitioning is a thing. It seems cleaner at a developer side, but I don't know if it would be as efficient as what we had now, meaning we could update one row's last_used_time field and pop it to the main table.

  1. Would MySQL still move those fields between partitions, if my partition is based on a varying field like this?
  2. Would partitioning the table make the system run faster, or keeping the data in different tables is better?

Best Answer

I doubt anyone can tell you whether the performance of your home-grown system will be better or worse than the performance of MySQL built-in partitioning.

In general, some thoughts:

  • Using MySQL's partitioning, you would get the benefits of any improvements simply be upgrading to a newer version; in most cases, your code would probably not need to change.
  • MySQL's partitioning might allow more flexibility in what you already do, making some things that are impractical today work adequately.
  • You've got to support your home-grown version yourself, and make it work with other changes to your system.
  • Your home-grown version (presumably) works exactly how you need it to, and works adequately well with the rest of your application(s).
  • If you have a special need, you have the option of modifying your home-grown system; while modifying MySQL is possible, it's a bit more involved.
  • To move your current system to MySQL's partitioning would probably require some code changes in your system, and might require tuning of multiple queries (to avoid hitting more partitions than necessary).

Therefore:

  • It might be worthwhile to test switching your table to MySQL partitioning, to get an idea of how it performs with your data (switching and seeing a major performance degradation would be a bad thing).
  • In the long run, you would probably want to try to switch to using MySQL's partitioning (as it's likely to get better with time, and maintaining your solution could grow more complicated with time)...
  • ...unless it turns out that some features of your home-grown solution are absent in MySQL partitioning and are critical to your application(s).
  • In any case, I'd allow for at least as much time and effort in switching to MySQL partitioning as you spent in developing and perfecting your own solution. It may be much simpler, but you may have inadvertently built other parts of your system around your solution that would have to be reworked to function with MySQL partitioning.

This is all completely generic advice - I don't know anything about MySQL partitioning (or about your solution, for that matter). I have built some system where we created our own solution instead of using existing products, though. I know that sometimes they work much better for your needs - but they do require maintenance, and switching to using a tool less tailored to your specifics can eventually save you that maintenance time, but can cost you in terms of flexibility, and can take a while to understand as well as you understand your solution.

If your solution is low-maintenance and works well for you, with no major performance concerns, I'd be inclined to maintain it for now at least. However, given the time to explore and the fact that there may now be a built-in solution as good or better than what you have, I'd want to check it out and see if it'd work for you.