I am creating a table for testing purposes:
begin try
drop table countries
end try
begin catch
end catch
go
create table countries(
name nvarchar(100) not null,
visit date not null,
constraint pk__countries primary key clustered (visit,name))
set nocount on
insert into countries VALUES ('Portugal-Lisbon','20111101')
insert into countries VALUES ('Spain-Madrid','20111101')
insert into countries VALUES ('Italia-Milano','20120101')
insert into countries VALUES ('Norway-Thromso','20121201')
insert into countries VALUES ('USA-California','20160110')
insert into countries VALUES ('Brasil-Porto Alegre','20131101')
insert into countries VALUES ('Lithuania-Vilnius','20131101')
insert into countries VALUES ('France-Paris','20131101')
insert into countries VALUES ('Russia-Moscow','20141101')
insert into countries VALUES ('Germany-Liepzig','20140901')
insert into countries VALUES ('Germany-Hamburg','20100901')
insert into countries VALUES ('Estonia-Viru','20151101')
insert into countries VALUES ('Sweden-Gotemborg','20151101')
insert into countries VALUES ('Latvia-Riga','20141101')
insert into countries VALUES ('India-Vrindavana','20151101')
insert into countries VALUES ('Switzerland-Lugano','20161101')
insert into countries VALUES ('USA-New York','20160110')
insert into countries VALUES ('Russia-Cheliabinsky','20170110')
insert into countries VALUES ('Italia-Bergamo','20170101')
insert into countries VALUES ('UK-Telford','20170405')
insert into countries VALUES ('Spain-Barcelona','20101101')
insert into countries VALUES ('Spain-Fuerteventura','20101001')
then I create my partition function and partition schema:
---------------------------------------------------------------------------------------
--- create a partition function
-- https://technet.microsoft.com/en-us/library/ms186307(v=sql.110).aspx
---------------------------------------------------------------------------------------
use radhe
go
IF EXISTS (SELECT * FROM sys.partition_functions
WHERE name = 'PF_year')
DROP PARTITION FUNCTION PF_year ;
GO
CREATE PARTITION FUNCTION PF_year (date)
AS RANGE RIGHT
FOR VALUES ( '20100101',
'20110101',
'20120101',
'20130101',
'20140101',
'20150101',
'20160101');
---------------------------------------------------------------------------------------
--- Create a Partition Scheme
---------------------------------------------------------------------------------------
use radhe
go
DROP PARTITION SCHEME PSC_YEAR
GO
CREATE PARTITION SCHEME PSC_YEAR AS
PARTITION PF_year TO
([PRIMARY],[F0],[F1], [F2], [F3], [F4], [F5], [F6])
GO
After that I make my table to use the partition schema:
--================================================================================
--================================================================================
use radhe
go
ALTER TABLE countries
DROP CONSTRAINT [pk__countries]
go
CREATE UNIQUE CLUSTERED INDEX pk__countries
ON dbo.countries (visit desc ,name asc)
WITH (DROP_EXISTING=off,ONLINE=Off,FILLFACTOR=100,DATA_COMPRESSION=PAGE) ON PSC_YEAR (visit)
go
All that done – and to verify that it is all done alright I have the following script to check how many rows in each of the partitions:
SELECT [table_name] = SCHEMA_NAME([schema_id]) + '.' + t.[name]
,i.[name] AS [index_name]
,i.[type_desc] AS [index_type]
,ps.[name] AS [partition_scheme]
,pf.[name] AS [partition_function]
,p.[partition_number]
,r.[value] AS [current_partition_range_boundary_value]
,p.[rows] AS [partition_rows]
,p.[data_compression_desc]
FROM sys.tables t
INNER JOIN sys.partitions p ON p.[object_id] = t.[object_id]
INNER JOIN sys.indexes i ON p.[object_id] = i.[object_id]
AND p.[index_id] = i.[index_id]
INNER JOIN sys.data_spaces ds ON i.[data_space_id] = ds.[data_space_id]
INNER JOIN sys.partition_schemes ps ON ds.[data_space_id] = ps.[data_space_id]
INNER JOIN sys.partition_functions pf ON ps.[function_id] = pf.[function_id]
LEFT JOIN sys.partition_range_values AS r ON pf.[function_id] = r.[function_id]
AND r.[boundary_id] = p.[partition_number]
where t.object_id = object_id('dbo.countries')
GROUP BY SCHEMA_NAME([schema_id])
,t.[name]
,i.[name]
,i.[type_desc]
,ps.[name]
,pf.[name]
,p.[partition_number]
,r.[value]
,p.[rows]
,p.[data_compression_desc]
ORDER BY SCHEMA_NAME([schema_id])
,t.[name]
,i.[name]
,p.[partition_number];
and the above script gives me the following result:
but when I run the following script, to see how many rows I have in each year (because I partitioned per year) I get the following results:
select c.name,
[year of visit] = year(visit),
[number of visits] = count(*) over (partition by year(visit))
from countries c
order by visit desc
how is this that my records of 2010 are not on partition 1?
can I find out exactly on which partition a record is?
how do I add another partition, let's say next year (2018) I will add the partition for 2017, without having to rebuild the clustered index?
last but not least, let's say in a few year time down the line, I will not be interested anymore in the rows of 2010. how can I get rid of all that partition?
here below is how this database is created, basically my goal was to put each partition on a different file.
CREATE DATABASE [radhe]
CONTAINMENT = NONE
ON PRIMARY
( NAME = N'radhe', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.SQL2014\MSSQL\DATA\radhe.mdf' , SIZE = 70254592KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB ),
FILEGROUP [F0]
( NAME = N'F0', FILENAME = N'C:\sql2014_data\RADHE0.ndf' , SIZE = 1048576KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB ),
FILEGROUP [F1]
( NAME = N'F1', FILENAME = N'C:\sql2014_data\RADHE1.ndf' , SIZE = 1048576KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB ),
FILEGROUP [F2]
( NAME = N'F2', FILENAME = N'C:\sql2014_data\RADHE2.ndf' , SIZE = 3145728KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB ),
FILEGROUP [F3]
( NAME = N'F3', FILENAME = N'C:\sql2014_data\RADHE3.ndf' , SIZE = 3145728KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB ),
FILEGROUP [F4]
( NAME = N'F4', FILENAME = N'C:\sql2014_data\RADHE4.ndf' , SIZE = 3145728KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB ),
FILEGROUP [F5]
( NAME = N'F5', FILENAME = N'C:\sql2014_data\RADHE5.ndf' , SIZE = 3145728KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB ),
FILEGROUP [F6]
( NAME = N'F6', FILENAME = N'C:\sql2014_data\RADHE6.ndf' , SIZE = 3145728KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB ),
FILEGROUP [F7]
( NAME = N'F7', FILENAME = N'C:\sql2014_data\RADHE7.ndf' , SIZE = 1048576KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB ),
FILEGROUP [Facts ]
( NAME = N'Facts', FILENAME = N'C:\sql2014_data\facts.ndf' , SIZE = 45613056KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB ),
FILEGROUP [Facts2]
( NAME = N'Facts2', FILENAME = N'C:\sql2014_data\facts2.ndf' , SIZE = 59244544KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1048576KB )
LOG ON
( NAME = N'radhe_log', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL12.SQL2014\MSSQL\DATA\radhe_log.ldf' , SIZE = 2097152KB , MAXSIZE = 2048GB , FILEGROWTH = 1048576KB )
END
these links below are interesting references regarding table partitioning:
How to partition an existing non-partitioned table?
What's the best way to archive all but current year and partition the table at the same time?
Best Answer
With a
RANGE RIGHT
function, values greater than or equal to the first boundary ('20100101') and less than the second boundary ('20110101') are stored in partition 2. Only rows less than first '20100101' boundary are stored in partition 1. No rows in your test data meet this criteria.Invoke the partition function using
$PARTITION
along with the partitioning column value:Set the partition scheme
NEXT USED
filegroup to the desired filegroup andSPLIT
the function with the new boundary. This should be done before 2018 data exists, ideally with the existing partition (2017) empty, to avoid costly data movement.SWITCH
the partition into an aligned staging table of the same schema and remove the partition boundary withMERGE
. I suggest partitioning the staging table using the same scheme as the source to ensure alignment and truncating the staging table immediately after theSWITCH
.