Proper Database Partitioning – Best Practices

database-designdatabase-recommendationdatabase-tuningpartitioning

I had asked a previous question "Proper technique for storing users event data" and the correct answer in my opinion was to create a database partition. Now from what I have read on it there are different ways to partition, but for this question we are going to assume we are doing a horizontal key partition using a date field on a RDBMS such as MySQL… (if you have objection or argument to this, by all means contribute).

The base question is how do you know how many partitions to create?

I understand this is a pretty open question being that it would also heavily rely on what hardware you are running this on, yet either way there should be some guidelines that would point to where the better performance would be, or a proper way of doing so, or even how would you judge something like this? Most all of the documentation I have found use terms such as "large", "big", "a lot"… what do these terms mean when it comes to access speed, or row count, efficiency vs storage or required hardware. Is it something where from trial and error or observed performance where if things start getting a little rough you would just add a partition or two?

I'm very interested in opinions and contradictions to this seemingly common hurdle in large DB schemes.

Thanks

Best Answer

I will tell you my experience in defining the terms of "large", "big", "a lot":

  • a lot is a database that takes around 400 GB for the data of a full month (custom logging information from all our web apps)

  • big is a table in this database that contains much of this space :-).. about half of this size (200 GB a table, from what I kinda remember)

  • large is the data in a full day (about 12-15 GB) - which means a partition of that magic table

This is not a rule or a best practice.. but when you feel that your current indexing strategy is a bit lost and nothing seems to make your queries faster, I believe that it's time to take into account the partitioning.