Partitioning is not typically a performance enhancer. Normally, partitioning is employed to enable efficient data management. For instance, you can employ partitioning to easily swap old rows out of a table into an archive table, one partition at a time.
Querying against a partitioned table where your query does not include the partitioning key forces SQL Server to look at the entire table. This is less efficient than simply scanning or seeking into a non-partitioned table because each partition is essentially a table, necessitating a UNION ALL
style query against all the partitions. This is not particularly obvious when looking at a query plan for a partitioned table, although you can see it if you look carefully at the pop-up properties for an index or table seek or scan:
To make this more obvious, lets set up a simple test rig in tempdb, with a date-based partitioning key:
USE tempdb;
GO
IF OBJECT_ID(N'dbo.Tab', N'U') IS NOT NULL
BEGIN
DROP TABLE dbo.Tab;
DROP PARTITION SCHEME PartScheme;
DROP PARTITION FUNCTION PartFun;
END
GO
CREATE PARTITION FUNCTION PartFun (datetime)
AS RANGE LEFT
FOR VALUES (
N'2012-01-01T00:00:00'
, N'2012-04-01T00:00:00'
, N'2012-07-01T00:00:00'
, N'2012-10-01T00:00:00'
, N'2013-01-01T00:00:00'
, N'2013-04-01T00:00:00'
, N'2013-07-01T00:00:00'
, N'2013-10-01T00:00:00'
, N'2014-01-01T00:00:00'
, N'2014-04-01T00:00:00'
, N'2014-07-01T00:00:00'
, N'2014-10-01T00:00:00'
);
CREATE PARTITION SCHEME PartScheme
AS PARTITION PartFun
ALL TO ([PRIMARY]);
IF OBJECT_ID(N'dbo.Tab', N'U') IS NOT NULL
DROP TABLE dbo.Tab;
CREATE TABLE dbo.Tab
(
TabID int NOT NULL
, CreateDate datetime NOT NULL
, Data1 varchar(100) NOT NULL
, Data2 varchar(10) NOT NULL
, Data3 varchar(1000) NOT NULL
, CONSTRAINT PK_Tab
PRIMARY KEY CLUSTERED
(CreateDate, TabID)
) ON [PartScheme](CreateDate);
This will populate the table with 1,000,000 rows with randomly generated data evenly spread over all partitions:
;WITH Ten AS
(
SELECT v.Num
FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9))v(Num)
)
, Million AS
(
SELECT Num = (t6.Num * POWER(10, 5))
+ (t5.Num * POWER(10, 4))
+ (t4.Num * POWER(10, 3))
+ (t3.Num * POWER(10, 2))
+ (t2.Num * POWER(10, 1))
+ (t1.Num)
FROM Ten t1
CROSS JOIN Ten t2
CROSS JOIN Ten t3
CROSS JOIN Ten t4
CROSS JOIN Ten t5
CROSS JOIN Ten t6
)
INSERT INTO dbo.Tab (TabID, CreateDate, Data1, Data2, Data3)
SELECT m.Num, DATEADD(DAY, m.Num % 1000, N'2012-01-01T00:00:00')
, CONVERT(varchar(100), CRYPT_GEN_RANDOM(100))
, CONVERT(varchar(10), CRYPT_GEN_RANDOM(10))
, CONVERT(varchar(1000), CRYPT_GEN_RANDOM(1000))
FROM Million m;
Let's go ahead and create a non-clustered index using our partitioning scheme:
CREATE NONCLUSTERED INDEX IX_Tab
ON dbo.Tab(Data1)
ON [PartScheme](CreateDate);
Now, let's look at three "typical" queries you might execute (the first query just gets an actual value out of the table, which is then used in the WHERE
clause for the three queries):
DECLARE @d varchar(100);
SELECT TOP(1) @d = t.Data1
FROM dbo.Tab t
WHERE t.CreateDate > N'2013-08-24T00:00:00';
SELECT Data1
FROM dbo.Tab
WHERE Tab.Data1 = @d;
SELECT Data1
FROM dbo.Tab
WHERE Tab.Data1 = @d
AND tab.CreateDate >= N'2013-08-23T00:00:00'
SELECT Data1
FROM dbo.Tab
WHERE Tab.Data1 = @d
AND tab.CreateDate >= N'2013-08-23T00:00:00'
AND tab.CreateDate <= N'2013-08-29T00:00:00';
Here is a quick screenshot of the execution plans so you can see the difference adding the partitioning key to the WHERE
clause can make:
The "query costs relative to the batch" numbers are instructive here. The first query consumes 50% of the batch cost since it performed a scan of all 13 partitions. It does that since it has no idea where in the table the particular value of Data1
exists. The second query consumes only 23% by virtue of the "Start Date" parameter, which can eliminate half the partitions. Finally, the third query only consumes 4% since it only needs to look at a single partition, because we included a "start date" and an "end date".
The "statistics" for the 3 queries are:
Table 'Tab'. Scan count 13, logical reads 36, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tab'. Scan count 6, logical reads 16, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tab'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
What if we add one of the other columns to these queries:
DECLARE @d varchar(100);
SELECT TOP(1) @d = t.Data1
FROM dbo.Tab t
WHERE t.CreateDate > N'2013-08-24T00:00:00';
SELECT Data1
, Data2
FROM dbo.Tab
WHERE Tab.Data1 = @d;
SELECT Data1
, Data2
FROM dbo.Tab
WHERE Tab.Data1 = @d
AND tab.CreateDate >= N'2013-08-23T00:00:00'
SELECT Data1
, Data2
FROM dbo.Tab
WHERE Tab.Data1 = @d
AND tab.CreateDate >= N'2013-08-23T00:00:00'
AND tab.CreateDate <= N'2013-08-29T00:00:00';
The execution plans:
The statistics for those:
Table 'Tab'. Scan count 13, logical reads 39, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tab'. Scan count 6, logical reads 19, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Tab'. Scan count 1, logical reads 7, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
So, lets just do away with the partitioning, and re-run all those tests:
IF OBJECT_ID(N'dbo.TabU', N'U') IS NOT NULL
DROP TABLE dbo.TabU;
CREATE TABLE dbo.TabU
(
TabID int NOT NULL
, CreateDate datetime NOT NULL
, Data1 varchar(100) NOT NULL
, Data2 varchar(10) NOT NULL
, Data3 varchar(1000) NOT NULL
, CONSTRAINT PK_TabU
PRIMARY KEY CLUSTERED
(CreateDate, TabID)
) ON [PRIMARY];
;WITH Ten AS
(
SELECT v.Num
FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9))v(Num)
)
, Million AS
(
SELECT Num = (t6.Num * POWER(10, 5))
+ (t5.Num * POWER(10, 4))
+ (t4.Num * POWER(10, 3))
+ (t3.Num * POWER(10, 2))
+ (t2.Num * POWER(10, 1))
+ (t1.Num)
FROM Ten t1
CROSS JOIN Ten t2
CROSS JOIN Ten t3
CROSS JOIN Ten t4
CROSS JOIN Ten t5
CROSS JOIN Ten t6
)
INSERT INTO dbo.TabU (TabID, CreateDate, Data1, Data2, Data3)
SELECT m.Num, DATEADD(DAY, m.Num % 1000, N'2012-01-01T00:00:00')
, CONVERT(varchar(100), CRYPT_GEN_RANDOM(100))
, CONVERT(varchar(10), CRYPT_GEN_RANDOM(10))
, CONVERT(varchar(1000), CRYPT_GEN_RANDOM(1000))
FROM Million m;
CREATE NONCLUSTERED INDEX IX_TabU
ON dbo.TabU(Data1)
ON [PRIMARY];
Now, the first set of queries:
DECLARE @d varchar(100);
SELECT TOP(1) @d = t.Data1
FROM dbo.TabU t
WHERE t.CreateDate > N'2013-08-24T00:00:00';
SELECT Data1
FROM dbo.TabU
WHERE TabU.Data1 = @d;
SELECT Data1
FROM dbo.TabU
WHERE TabU.Data1 = @d
AND TabU.CreateDate >= N'2013-08-23T00:00:00'
SELECT Data1
FROM dbo.TabU
WHERE TabU.Data1 = @d
AND TabU.CreateDate >= N'2013-08-23T00:00:00'
AND TabU.CreateDate <= N'2013-08-29T00:00:00';
And the plans:
And the stats:
Table 'TabU'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TabU'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TabU'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
All three queries now have the same "cost", which is substantially lower than the partitioned version.
The queries with Data1
and Data2
columns:
DECLARE @d varchar(100);
SELECT TOP(1) @d = t.Data1
FROM dbo.TabU t
WHERE t.CreateDate > N'2013-08-24T00:00:00';
SELECT Data1
, Data2
FROM dbo.TabU
WHERE TabU.Data1 = @d;
SELECT Data1
, Data2
FROM dbo.TabU
WHERE TabU.Data1 = @d
AND TabU.CreateDate >= N'2013-08-23T00:00:00'
SELECT Data1
, Data2
FROM dbo.TabU
WHERE TabU.Data1 = @d
AND TabU.CreateDate >= N'2013-08-23T00:00:00'
AND TabU.CreateDate <= N'2013-08-29T00:00:00';
and the stats:
Table 'TabU'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TabU'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TabU'. Scan count 1, logical reads 8, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Clearly, if you can't do the queries the correct way, by including the partitioning key, then you are going to see worse performance with partitioning than without.
Best Answer
SQL Version:
Microsoft SQL Server 2012 - 11.0.5343.0 (X64)
Response:
The pointer to the LOB_DATA page (your 24 byte pointer) is only used when the row containing your VARBINARY(MAX) column (s) no longer can fit on the page.
I am still working on matching the exact scenario you mentioned above.
Reference(s):
Slot Array and Total Page Size