Here are two best practices for partitioning that pertain to the question:
- Keep an empty staging partition at the leftmost and rightmost ends
of the partition range to ensure that the partitions split when
loading in new data, and merge, after unloading old data, do not
cause data movement.
- Do not split or merge a partition already populated with data
because this can cause severe locking and explosive log growth.
http://www.informit.com/articles/article.aspx?p=1946159&seqNum=5
If the leftmost end of your partition is empty, use ALTER PARTITION FUNCTION SPLIT RANGE to add new ranges to the partition function.
To check if the leftmost partition is empty, use a query like the following:
DECLARE @PartitionFunctionName sysname = 'YourPartitionFunctionNameHere';
SELECT
p.partition_number, SUM(pst.row_count) RowCountInPartition, pf.name PartitionFunction, ps.name PartitionScheme
FROM sys.dm_db_partition_stats pst
INNER JOIN sys.partitions p ON pst.partition_id = p.partition_id
INNER JOIN sys.indexes i ON p.object_id = i.object_id AND p.index_id = i.index_id
INNER JOIN sys.partition_schemes ps ON ps.data_space_id = i.data_space_id
INNER JOIN sys.partition_functions pf ON ps.function_id = pf.function_id
WHERE pf.name = @PartitionFunctionName
GROUP BY p.partition_number, pf.name, ps.name;
If the first partition is not empty, the best practices recommend that you create a new function with all values, create a new table on that function, then insert the data to the new table.
Also, if the left partition just has a few records, a split may be fine. Not sure on that as I've never tried it.
Whatever you do, make sure to leave some empty partitions at the leftmost and rightmost partition when you're finished. I might even go so far as to creating partition ranges for 0 and 1, then add a check constraint to prevent the first partition from getting data in it. Do the same thing for the end.
What might cause the big discrepancy in disk space?
Typically this is caused by rows that have been deleted. Heaps do not reclaim space for data pages emptied by DELETE
operations unless a table lock is taken at the time of the deletion. Even then, other factors like the possibility of row-overflow data, or an enabled row-versioning isolation level can prevent space being reclaimed.
Another factor that usually has a much smaller impact on space usage is that rows with variable-length columns that expand may not fit on the original heap page, in which case the row is forwarded to another page. Forwarded rows are more usually associated with scan performance issues than pure space usage, however.
Overall, tables that experience significant deletions (without the space being reused quickly by new rows) and/or variable column updates that do not fit in place are often better configured with a clustered index. Clustered tables usually deallocate empty pages very quickly.
You can use the sys.dm_db_index_physical_stats
DMV to see physical details for a heap or index:
SELECT
DDIPS.index_id,
DDIPS.partition_number,
DDIPS.index_type_desc,
DDIPS.alloc_unit_type_desc,
DDIPS.avg_fragmentation_in_percent,
DDIPS.fragment_count,
DDIPS.avg_fragment_size_in_pages,
DDIPS.page_count,
DDIPS.avg_page_space_used_in_percent,
DDIPS.record_count,
DDIPS.avg_record_size_in_bytes,
DDIPS.forwarded_record_count
FROM sys.dm_db_index_physical_stats
(
DB_ID(),
OBJECT_ID(N'dbo.t1', N'U'),
0,
NULL,
'DETAILED'
) AS DDIPS;
The following (trivial) example shows a heap DELETE
not releasing any empty pages:
SET NOCOUNT ON;
CREATE TABLE t1 (c1 char(8000) DEFAULT 'a');
GO
INSERT INTO t1 DEFAULT VALUES;
GO 1000
-- 1000 pages allocated
SELECT
DDIPS.index_id,
DDIPS.partition_number,
DDIPS.page_count
FROM sys.dm_db_index_physical_stats
(
DB_ID(),
OBJECT_ID(N'dbo.t1', N'U'),
0,
NULL,
'DETAILED'
) AS DDIPS;
-- Delete all the data from the heap
DELETE FROM t1;
-- Still 1000 pages allocated
-- Even though the table is empty
SELECT
DDIPS.index_id,
DDIPS.partition_number,
DDIPS.page_count
FROM sys.dm_db_index_physical_stats
(
DB_ID(),
OBJECT_ID(N'dbo.t1', N'U'),
0,
NULL,
'DETAILED'
) AS DDIPS;
GO
DROP TABLE dbo.t1;
If you try the example again, but with a table lock (DELETE FROM t1 WITH (TABLOCK)
) the deletion frees all empty pages (assuming the database does not have the READ_COMMITTED_SNAPSHOT
option set to ON
etc.)
Best Answer
I'm fairly sure, according to the 5.1 documentation, that you'll have to alter table add partition to handle future years.
Are you needing to use the 'old' table after you get this set up? If not, I would follow these steps:
INSERT INTO mytable_new SELECT * FROM mytable
RENAME TABLE mytable TO mytable_old
RENAME TABLE mytable_new TO mytable
This should avoid the need to alter your queries. If your table is extremely write-heavy (I assume it must be if you have the need to partition it!), you will probably want to have a maintenance timeframe where it's offline until the operation is complete.
Hope this helps.