An option you can try is to combine partitioned views with the table level partitioning. By having a partitioned view which is current and historic tables unioned together, you will be able to have different table options i.e. fill factor etc. This also with give better statistics on the active portion of the data as it's no longer skewed with all the historic data.
USE master;
GO
DROP DATABASE PartitionDemo
GO
CREATE DATABASE PartitionDemo
GO
USE PartitionDemo;
GO
CREATE PARTITION FUNCTION everyHundredThousand(int) AS RANGE RIGHT FOR VALUES(100000,200000,300000,400000,500000,600000,700000,800000,900000,1000000)
CREATE PARTITION SCHEME allPrimary AS PARTITION everyHundredThousand ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.HistoryTable
(
ID int NOT NULL /*Note not an identity*/,
Filler char(200) NOT NULL,
CONSTRAINT PK_HistoryTable PRIMARY KEY CLUSTERED(ID) WITH (FILLFACTOR=100)
) ON allPrimary(ID)
GO
CREATE TABLE dbo.CurrentTable
(
ID int NOT NULL IDENTITY(0,1),
Filler char(200) NOT NULL,
CONSTRAINT PK_CurrentTable PRIMARY KEY CLUSTERED(ID) WITH (FILLFACTOR=80)
) ON allPrimary(ID)
GO
/* Now generate some data */
INSERT CurrentTable
(
Filler
)
SELECT TOP 1000000
''
FROM sys.objects a
CROSS JOIN sys.objects b
CROSS JOIN sys.objects c
CROSS JOIN sys.objects d
GO
/* Rebuild Indexes to set Fill Factor */
ALTER INDEX ALL ON dbo.CurrentTable REBUILD
GO
/* Create Partitioned View */
CREATE VIEW CurrentPlusHistory
AS
SELECT ID,
Filler
FROM HistoryTable
UNION ALL
SELECT ID,
Filler
FROM CurrentTable
GO
/* Move some data to History Table */
ALTER TABLE CurrentTable SWITCH PARTITION 1 TO HistoryTable PARTITION 1
ALTER TABLE CurrentTable SWITCH PARTITION 2 TO HistoryTable PARTITION 2
ALTER TABLE CurrentTable SWITCH PARTITION 3 TO HistoryTable PARTITION 3
ALTER TABLE CurrentTable SWITCH PARTITION 4 TO HistoryTable PARTITION 4
ALTER TABLE CurrentTable SWITCH PARTITION 5 TO HistoryTable PARTITION 5
ALTER TABLE CurrentTable SWITCH PARTITION 6 TO HistoryTable PARTITION 6
ALTER TABLE CurrentTable SWITCH PARTITION 7 TO HistoryTable PARTITION 7
ALTER TABLE CurrentTable SWITCH PARTITION 8 TO HistoryTable PARTITION 8
GO
/* For the Partitioed View to work we need a check constraint to define the key range */
/* as we've not created one this query will look at both tables */
SET STATISTICS IO ON
SELECT *
FROM CurrentPlusHistory
WHERE ID < 100000
AND Filler = 'One'
SET STATISTICS IO OFF
GO
SELECT MAX(ID) FROM dbo.HistoryTable
/* Notice the range for the each table is define so SQL Server can use this value to select the target table */
ALTER TABLE dbo.HistoryTable ADD CONSTRAINT HistoryTable_ID_Range CHECK(ID<800000)
ALTER TABLE dbo.CurrentTable ADD CONSTRAINT CurrentTable_ID_Range CHECK(ID>=800000)
GO
/* Notice in the output this time SQL Server doesn't look into the current table */
SET STATISTICS IO ON
SELECT *
FROM CurrentPlusHistory
WHERE ID < 100000
AND Filler = 'One'
SET STATISTICS IO OFF
GO
/* CHECK FILLFACTORS FOR TABLES\PARTITIONS */
SELECT OBJECT_SCHEMA_NAME(o.object_id)+'.'+o.name as Table_Name,
s.*
FROM sys.objects o
INNER JOIN sys.dm_db_index_physical_stats(DB_ID(),NULL,NULL,NULL,'DETAILED') s
ON o.object_id = s.object_id
AND s.index_id = 1
AND s.index_level = 0
AND s.page_count <>0
WHERE o.name IN ('CurrentTable','HistoryTable')
GO
/* Now rebuild HistoryTable to see that its Partitions take the FillFactor */
ALTER INDEX ALL ON dbo.HistoryTable REBUILD
GO
/* CHECK FILLFACTORS FOR TABLES\PARTITIONS */
SELECT OBJECT_SCHEMA_NAME(o.object_id)+'.'+o.name as Table_Name,
s.*
FROM sys.objects o
INNER JOIN sys.dm_db_index_physical_stats(DB_ID(),NULL,NULL,NULL,'DETAILED') s
ON o.object_id = s.object_id
AND s.index_id = 1
AND s.index_level = 0
AND s.page_count <>0
WHERE o.name IN ('CurrentTable','HistoryTable')
GO
/* Move some more data into the History Table, the CHECK CONSTRAINT range has to be managed as well so this is a coding overhead */
ALTER TABLE HistoryTable DROP CONSTRAINT HistoryTable_ID_Range
ALTER TABLE HistoryTable ADD CONSTRAINT HistoryTable_ID_Range CHECK(ID<900000)
ALTER TABLE CurrentTable SWITCH PARTITION 9 TO HistoryTable PARTITION 9
ALTER TABLE CurrentTable DROP CONSTRAINT CurrentTable_ID_Range
ALTER TABLE CurrentTable ADD CONSTRAINT CurrentTable_ID_Range CHECK(ID>=900000)
GO
Expanding upon your list, here are a couple potential downsides that we have come across in real production workloads:
Seeking into multiple partitions
Expanding on the queries that do not use partition elimination could take longer to execute point, there is a specific pattern that is particularly affected: singleton seeks. This operation will become much slower if all (or even a modest subset of) partitions need to be accessed. The skip scan operation essentially performs a seek into every partition that cannot be eliminated.
Let's say that you have a billion row table (N = 1,000,000,000
) with rows divided equally into 1,000 partitions (P = 1,000
). A single seek is roughly O(log(N)) ~ 30
in a non-partitioned table. However, this same seek operation becomes roughly O(P*log(N/P)) ~ 20,000
in this hypothetical partitioned table. So the seek now performs over 500x more work if data from all partitions is needed (or sometimes even if it isn't needed, but SQL can't prove that based on your query).
Note that this can come up both when you explicitly query the table for one row (or a small range of rows) and in more complex queries when the partitioned table appears in the innder side of a loop join. The good news is that SQL Server is reasonably good about taking this into account in cost-based optimization, but that still typically means that you get a hash join when a loop-seek into a non-partitioned table would have been far more optimal.
Thread skew in parallel query execution
In parallel query plans, threads are allocated to partitions. If there is one partition that is much larger than the others, queries against the table may be particularly susceptible to thread skew. It's possible that one thread gets too high a proportion of rows and is processing long after the other threads have done their work. This situation can happen with non-partitioned tables as well, but any partition functions that do not equally distribute rows are particularly vulnerable.
See Parallel Query Execution Strategy for Partitioned Objects for a more detailed description of the allocation of threads to partitions. For example:
The query processor uses a parallel execution strategy for queries that select from partitioned objects. As part of the execution strategy, the query processor determines the table partitions required for the query and the proportion of threads to allocate to each partition. In most cases, the query processor allocates an equal or almost equal number of threads to each partition, and then executes the query in parallel across the partitions.
Best Answer
Workaround: