I have a SQL Server database which contains two tables — Acks
and Logs
.
These two tables are related in logic but not in a relational database way. Basically, every message that comes in gets saved in the Log
table and, if our server acknowledges it, then that ack gets stored in the Ack
table.
We are storing around 5 million Acks and 3 million Logs a day. I am trying to partition these two tables on a daily boundary so that we can easily remove older partitions from the table, and also improve query performance.
I haven't done table partitioning before, so I have been reading some on-line tutorials, however I am stuck on one thing. All the tutorials I have followed seem to manually add filegroups and manually add boundaries.
I want SQL Server somehow to do this daily, and this is what my question is about. I need it to create the new filegroups for the next day, every day at, say, 22:00. Then at 24:00 the inserts should start filling up the new day's partition.
Can anyone point me in the right direction on how to achieve this? Either a comprehensive tutorial or some good old advice will do as well.
My second question: can I somehow apply the same partition function to the two different tables?
They both have a datetime(2)
column on which I want to partition, and the same rules
will apply.
How does that then fit in with my filegroups? Do I need a single filegroup for the day? Will each table have a file in that filegroup, or will both tables save to the same file in the filegroup?
Do I have to make a .mdf
and .ldf
for each filegroup? Or is there still one log file for the entire database?
Best Answer
From SQL Server 2008 SP2 and SQL Server 2008 R2 SP1 there is support for 15,000 partitions so to be honest, you don't really need to do that much dynamically. Instead of having a complex daily process (Dynamically add filegroup and boundaries) with an opportunity to fail, simply create the partitions up-front from now to the year 2020, and you're well within your limits and pretty future-proof.
You could assign all partitions to one filegroup (ok not necessarily a great pattern), or round-robin between a limited number. To put it another way, there is no technical need to have a filegroup per day, eg
Obviously use Excel or some tool to generate the scripts for you - no need to type them out : )
Use DMV sys.partition_range_values and metadata function $PARTITION to work out information about what data is where. Create a daily job to switch out and truncate your oldest partition. I would regard this as lower risk than the daily add.
Warning!! Carefully read the whitepaper as this needs to be enabled and there are some issues with this approach ( eg Creating and rebuilding nonaligned indexes on a table with more than 1,000 partitions is not supported ). If you're feeling risk averse the standard limit of 1,000 partitions would still allow you to pre-allocate just under 3 years.
As you really want to partition by DATE not DATETIME2, consider a computed column. I would probably want to performance test this first though.
There is also a tool on codeplex (SQL Server Partition Management) which might be worth a look although I haven't used it.
To answer your other questions, there should only be one log file for the database. Add other files as .ndf not .mdf. To use the same partition scheme (not function) on the same tables, simply create them with the same scheme, they will divide the data up into the files beneath the filegroups eg
OK, this will make for a long answer but I've knocked up a demo of how something like this might work. It does remind me that the great thing about partition switching is that as a metadata-only operation it's instant. Just to be clear, this is a demo to show off principles and some example "how-to" code, it is not production quality. Work through it and make sure you understand it before running in a dev or test environment. You'll need about 200MB space.