Sql-server – Partitioning in SQL Server 2008

partitioningsql-server-2008

I want to partition some of our tables in a SQL Server 2008 database. The partition key will be date based.

I have several questions:

  1. Is there a rule of thumb regarding the recommended number of rows per partition?
  2. Is there any special problem creating a partition function with different intervals?
    I mean for example, that the data from 2008, 2009, 2010 will be based on yearly partitions, and from 2011 and on it will be on a monthly based partitions.
  3. Is it recommended to create a different partition function/scheme for each of the tables (they will be identical except their name) or can I create one monthly and one yearly partition function/scheme and assign the tables to them? What are the consequences for each option?
  4. We have around 15 tables that we are considering partitioning. We want to keep all the data, hence only a split mechanism is needed. I thought to create a management table that will contain all the information on the different partition tables and to create one procedure that will run on all the partition functions and will split them. Another option is to create a different procedure for each of the partitioned tables. I prefer the first option because of its manageability, but I don't know if it will cause issues/contentions.

Best Answer

For what you want to do, I would recommend the following (which is pretty much what you were thinking).

1> Create history tables for the historic data you have - keep the schemas as similar as possible. Split up by some logical grouping (such as year/month) based on how they are going to be queried (say you need to report with in month/year as well as all). Do not worry about the table size of the splits unless they are getting into the TB size range (your dbms should handle it) just make sure that they are appropriately indexed for the queries that need to be run. You should consider putting these onto a different disk to the active data if performance is an issue.

2> Create a routine to move data from the active table to the relevant historic table. Run this periodically. As a practice rebuild the indexes on the table that has had the data removed from it, and maybe update the table statistics. Easiest way to do this is to write a sql script.

3> Consider the reporting you want to do. If you want to only have to deal with 1 table when writing queries, create a view that joins the archived tables together. Create indexes on all the tables to suite the view. This way if you want all the data, select from the view. If you want data from a specific year/month, query that table. The view will look something like:

create view view_all_data as
select "Jan12" as month,a.* from data_Jan12 a
union
select "Feb12" as month,b.* from data_Feb12 b
....

I am assuming here that the system is not a highly used transactional system and that you have windows of low usage to run the analysis queries. If you need to maintain high levels of performance, you may like to do the above in a separate database (separate hardware) and port across the new data that you get from backups.