Sql-server – SQL Server 2008 R2 Partitioning – same FileGroup, 1 File, 2 partition_numbers – HELP

partitioning, sql server, sql-server-2008-r2

This is my first go at partitioning in SQL Server. I learned from the Brent Ozar guide, which is fantastic 🙂

A few times I have run into a weird scenario. When I run:

SELECT *
FROM ph.FileGroupDetail
ORDER BY partition_number
GO

the same filegroup shows up twice with 2 different partition_numbers: one correctly at the end with a range value, the other at the start with a NULL range_value.

A couple of questions:

How is this happening, and where have I gone wrong?
How do I resolve the issue? That is, how do I get rid of the entry at the start, given that I already have an empty partition at the beginning?

I've tried deleting the file (which worked when it was empty) and the filegroup, but SQL Server said the filegroup couldn't be deleted.

Can someone please explain how this happened and how to get rid of the partition 2 entry?

Best Answer

The results indicate that at some point an explicit NULL partition boundary was added to the function while the partition scheme's NEXT USED filegroup was set to DailyAlbertFG30. Also, I don't see DailyAlbertFG2 in use. Perhaps there was once a partition on that filegroup that was subsequently merged.

Below is a script that shows how a FG30 partition with the NULL boundary can be created. The NULL boundary might have been added accidentally.

CREATE PARTITION FUNCTION DailyAlbertPF1 (datetime2(3)) AS RANGE RIGHT FOR VALUES();
GO
CREATE PARTITION SCHEME DailyAlbertPS1 AS PARTITION DailyAlbertPF1 ALL TO ([DailyAlbertFG1]);
GO
CREATE TABLE dbo.FactAgentAlbertPortalSessionEntries
    (
      DateTimeColumn datetime2(3)
    )
ON  DailyAlbertPS1(DateTimeColumn);
GO

DECLARE @FileGroupNumber int = 1;
DECLARE @DateTimeBoundary datetime2(3) = '2015-04-15T00:00:00.000';
DECLARE @SQL nvarchar(MAX);
WHILE @DateTimeBoundary <= '2015-05-14T00:00:00.000'
BEGIN
    SET @SQL = N'ALTER PARTITION SCHEME DailyAlbertPS1 NEXT USED DailyAlbertFG' + CAST(@FileGroupNumber AS nvarchar(5)) + N';';
    EXEC(@SQL);
    ALTER PARTITION FUNCTION DailyAlbertPF1() SPLIT RANGE(@DateTimeBoundary);
    SET @DateTimeBoundary = DATEADD(day, 1, @DateTimeBoundary);
    SET @FileGroupNumber += 1;
END;
--add NULL boundary on DailyAlbertFG30
SET @DateTimeBoundary = NULL;
ALTER PARTITION SCHEME DailyAlbertPS1 NEXT USED DailyAlbertFG30;
ALTER PARTITION FUNCTION DailyAlbertPF1() SPLIT RANGE(@DateTimeBoundary);
GO
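
The script above reproduces the problem but stops short of the cleanup. As a rough sketch, assuming the function, scheme, and table names from that script, the first batch below lists every partition with its filegroup and lower boundary using the catalog views (rather than the ph.FileGroupDetail helper view), and the second batch merges the NULL boundary away. Merging a boundary moves any rows held by the dropped partition, so it is worth trying on a copy of the database first.

```sql
-- List each partition with its filegroup and lower boundary.
-- For a RANGE RIGHT function, partition N starts at boundary_id N - 1.
SELECT  p.partition_number,
        fg.name          AS filegroup_name,
        prv.boundary_id,                     -- NULL only for partition 1 (no lower boundary)
        prv.value        AS lower_boundary,  -- NULL here with a non-NULL boundary_id = explicit NULL boundary
        p.rows
FROM    sys.partitions AS p
JOIN    sys.indexes AS i
        ON  i.object_id = p.object_id
        AND i.index_id  = p.index_id
JOIN    sys.partition_schemes AS ps
        ON  ps.data_space_id = i.data_space_id
JOIN    sys.destination_data_spaces AS dds
        ON  dds.partition_scheme_id = ps.data_space_id
        AND dds.destination_id      = p.partition_number
JOIN    sys.filegroups AS fg
        ON  fg.data_space_id = dds.data_space_id
LEFT JOIN sys.partition_range_values AS prv
        ON  prv.function_id = ps.function_id
        AND prv.boundary_id = p.partition_number - 1
WHERE   p.object_id = OBJECT_ID(N'dbo.FactAgentAlbertPortalSessionEntries')
  AND   p.index_id IN (0, 1)                 -- heap or clustered index only
ORDER BY p.partition_number;
GO

-- Remove the accidental NULL boundary. The partition that held the boundary
-- (partition 2 for RANGE RIGHT) is dropped and its rows, if any, are merged
-- into the neighbouring partition.
DECLARE @NullBoundary datetime2(3) = NULL;
ALTER PARTITION FUNCTION DailyAlbertPF1() MERGE RANGE (@NullBoundary);
GO
```

After the merge, rerunning the first query should show DailyAlbertFG30 only once, on the last partition. The filegroup itself stays in the scheme because that last partition still uses it.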

Related Solutions

Sql-server – SQL Server 2008 Partitioning Evaluation

What do you expect your real life volume of data to be?

For 10 million rows, I wouldn't bother with partitioning. The overhead far outweighs the benefits: partitioning isn't a silver bullet to cure performance issues.

To answer,

Point 1: on the first run, data needs to be loaded into memory (the "buffer pool") and will stay cached until evicted based on memory pressure and usage. Personally, I'd test with the cache filled, because you'd expect your app to require that data very often, especially if you think partitioning is the solution to some problem.

For point 2, what queries do you expect to run in production? The test queries should be representative of this production load. At a minimum, they should cover different realistic filter combinations, both with and without the partition key.
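
To see the cold-versus-warm difference for yourself, something along these lines can be run on a test server. This is only a sketch: dbo.SalesFact and its SaleDate column are hypothetical names standing in for your own table, and DBCC DROPCLEANBUFFERS empties the buffer pool for the whole instance, so it does not belong on a production box.

```sql
-- TEST SERVER ONLY: DROPCLEANBUFFERS affects every database on the instance.
CHECKPOINT;               -- flush dirty pages of the current database so they become clean
DBCC DROPCLEANBUFFERS;    -- drop clean pages from the buffer pool (cold cache)
GO

SET STATISTICS IO ON;     -- report logical and physical reads per table
SET STATISTICS TIME ON;   -- report CPU and elapsed time
GO

-- First run: cold cache, so expect physical reads in the Messages tab.
-- dbo.SalesFact / SaleDate are hypothetical; substitute your own table.
SELECT COUNT(*)
FROM   dbo.SalesFact
WHERE  SaleDate >= '20150101' AND SaleDate < '20150201';
GO

-- Second run: the same pages are now cached, so reads should be logical only.
SELECT COUNT(*)
FROM   dbo.SalesFact
WHERE  SaleDate >= '20150101' AND SaleDate < '20150201';
GO

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
GO
```

Comparing the two sets of statistics gives a feel for how much of the first-run time is just disk I/O rather than anything partitioning would change.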
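
As an illustration of "with and without partition key", here is a minimal sketch. All object names (pfDemoMonth, psDemoMonth, dbo.DemoSales) are made up for the example, and everything is placed on PRIMARY to keep it short.

```sql
-- Hypothetical monthly partitioning, all partitions on PRIMARY for brevity.
CREATE PARTITION FUNCTION pfDemoMonth (date)
    AS RANGE RIGHT FOR VALUES ('20150101', '20150201', '20150301');
GO
CREATE PARTITION SCHEME psDemoMonth
    AS PARTITION pfDemoMonth ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.DemoSales
(
    SaleID     int   NOT NULL,
    CustomerID int   NOT NULL,
    SaleDate   date  NOT NULL,
    Amount     money NOT NULL
) ON psDemoMonth (SaleDate);
GO
CREATE CLUSTERED INDEX CIX_DemoSales
    ON dbo.DemoSales (SaleDate, SaleID)
    ON psDemoMonth (SaleDate);
GO

-- Filter includes the partition key: the optimizer can restrict the
-- query to the single partition covering January 2015.
SELECT SUM(Amount)
FROM   dbo.DemoSales
WHERE  SaleDate >= '20150101' AND SaleDate < '20150201'
  AND  CustomerID = 42;
GO

-- Filter without the partition key: every partition has to be visited.
SELECT SUM(Amount)
FROM   dbo.DemoSales
WHERE  CustomerID = 42;
GO
```

With the actual execution plan enabled, the partition count reported on the seek or scan operator should differ between the two queries, which is the effect a realistic test needs to capture.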

Sql-server – SQL Server 2008 – Partitioning and Clustered Indexes

A partitioned table is really more like a collection of individual tables stitched together. So in your example of clustering by IncidentKey and partitioning by IncidentDate, say that the partitioning function splits the table into two partitions, so that 1/1/2010 is in partition 1 and 7/1/2010 is in partition 2. The data will be laid out on disk as:

Partition 1:
IncidentKey    Date
ABC123        1/1/2010
ABC123        1/1/2011
XYZ999        1/1/2010

Partition 2:
IncidentKey    Date
ABC123        7/1/2010
XYZ999        7/1/2010

At a low level there really are two distinct rowsets. It is the query processor that gives the illusion of a single table, by creating plans that seek, scan and update all rowsets together, as one.

Any row in any non-clustered index will have the clustered index key to which it corresponds, say ABC123,7/1/2010. Since the clustered index key always contains the partitioning key column, the engine will always know in which partition (rowset) of the clustered index to search for this value (in this case, partition 2).
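
The layout can be checked with the $PARTITION function. The following is a small sketch with invented names (pfIncidentDemo, psIncidentDemo, dbo.IncidentDemo) and a single assumed boundary of 2010-04-01, chosen only so that January falls in partition 1 and July in partition 2.

```sql
-- One boundary => two partitions: dates before 2010-04-01, and the rest.
CREATE PARTITION FUNCTION pfIncidentDemo (date)
    AS RANGE RIGHT FOR VALUES ('20100401');
GO
CREATE PARTITION SCHEME psIncidentDemo
    AS PARTITION pfIncidentDemo ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.IncidentDemo
(
    IncidentKey  char(6) NOT NULL,
    IncidentDate date    NOT NULL
) ON psIncidentDemo (IncidentDate);
GO
-- Clustered by IncidentKey; the partitioning column is listed explicitly
-- so it is visibly part of the clustered key.
CREATE CLUSTERED INDEX CIX_IncidentDemo
    ON dbo.IncidentDemo (IncidentKey, IncidentDate)
    ON psIncidentDemo (IncidentDate);
GO
INSERT INTO dbo.IncidentDemo (IncidentKey, IncidentDate)
VALUES ('ABC123', '20100101'),
       ('XYZ999', '20100101'),
       ('ABC123', '20100701'),
       ('XYZ999', '20100701');
GO
-- Show which partition (rowset) each row lives in.
SELECT  $PARTITION.pfIncidentDemo(IncidentDate) AS partition_number,
        IncidentKey,
        IncidentDate
FROM    dbo.IncidentDemo
ORDER BY partition_number, IncidentKey;
GO
```

The January rows report partition 1 and the July rows partition 2, and within each partition the rows are ordered by IncidentKey, matching the two separate rowsets described above.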

Now whenever you're dealing with partitioning you must consider if your NC indexes will be aligned (NC index is partitioned exactly the same as the clustered index) or non-aligned (NC index is non-partitioned, or partitioned differently from clustered index). Non-aligned indexes are more flexible, but they have some drawbacks:

non-aligned indexes require large amounts of memory for certain query plans
non-aligned indexes prevent efficient partition switch operations
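
The difference comes down to the ON clause of the index. A minimal sketch, using made-up names (pfAlignDemo, psAlignDemo, dbo.AlignDemo):

```sql
CREATE PARTITION FUNCTION pfAlignDemo (date)
    AS RANGE RIGHT FOR VALUES ('20150101', '20150201');
GO
CREATE PARTITION SCHEME psAlignDemo
    AS PARTITION pfAlignDemo ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.AlignDemo
(
    OrderID    int  NOT NULL,
    CustomerID int  NOT NULL,
    OrderDate  date NOT NULL
) ON psAlignDemo (OrderDate);
GO
CREATE CLUSTERED INDEX CIX_AlignDemo
    ON dbo.AlignDemo (OrderDate, OrderID)
    ON psAlignDemo (OrderDate);
GO
-- Aligned: stored on the same partition scheme as the clustered index.
CREATE NONCLUSTERED INDEX IX_AlignDemo_Customer_Aligned
    ON dbo.AlignDemo (CustomerID)
    ON psAlignDemo (OrderDate);
GO
-- Non-aligned: stored on a plain filegroup, i.e. not partitioned at all.
CREATE NONCLUSTERED INDEX IX_AlignDemo_Customer_NonAligned
    ON dbo.AlignDemo (CustomerID)
    ON [PRIMARY];
GO
-- Check where each index lives: PARTITION_SCHEME vs ROWS_FILEGROUP.
SELECT  i.name       AS index_name,
        ds.name      AS data_space_name,
        ds.type_desc AS data_space_type
FROM    sys.indexes AS i
JOIN    sys.data_spaces AS ds
        ON ds.data_space_id = i.data_space_id
WHERE   i.object_id = OBJECT_ID(N'dbo.AlignDemo');
GO
```

If the ON clause is omitted on a partitioned table, the new index normally lands on the table's partition scheme, so non-aligned indexes tend to be a deliberate choice rather than an accident.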

Using aligned indexes solves these issues, but brings its own set of problems, because this physical storage design choice ripples into the data model:

aligned indexes mean unique constraints can no longer be created/enforced unless the key includes the partitioning column
all foreign keys referencing the partitioned table must include the partitioning key in the relation (since the partitioning key is, due to alignment, in every index), and this in turn requires that all tables referencing the partitioned table contain the partitioning key column value. Think Orders->OrderDetails: if Orders has OrderID but is partitioned by OrderDate, then OrderDetails must contain not only OrderID but also OrderDate, in order to properly declare the foreign key constraint.
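
The Orders->OrderDetails case might look like the sketch below. The object names (pfOrdersDemo, psOrdersDemo, dbo.DemoOrders, dbo.DemoOrderDetails) are invented for the illustration.

```sql
CREATE PARTITION FUNCTION pfOrdersDemo (date)
    AS RANGE RIGHT FOR VALUES ('20150101', '20150201');
GO
CREATE PARTITION SCHEME psOrdersDemo
    AS PARTITION pfOrdersDemo ALL TO ([PRIMARY]);
GO
-- The aligned primary key has to carry the partitioning column, so the
-- key becomes (OrderID, OrderDate) instead of OrderID alone.
CREATE TABLE dbo.DemoOrders
(
    OrderID   int  NOT NULL,
    OrderDate date NOT NULL,
    CONSTRAINT PK_DemoOrders
        PRIMARY KEY CLUSTERED (OrderID, OrderDate)
) ON psOrdersDemo (OrderDate);
GO
-- An aligned unique index on OrderID alone is expected to be rejected,
-- because the partitioning column is missing from the key:
-- CREATE UNIQUE NONCLUSTERED INDEX UX_DemoOrders_OrderID
--     ON dbo.DemoOrders (OrderID) ON psOrdersDemo (OrderDate);

-- The child table must now store OrderDate as well, purely so the
-- foreign key can reference the two-column key.
CREATE TABLE dbo.DemoOrderDetails
(
    OrderID   int  NOT NULL,
    OrderDate date NOT NULL,   -- duplicated from DemoOrders for the FK
    LineNo    int  NOT NULL,
    Quantity  int  NOT NULL,
    CONSTRAINT PK_DemoOrderDetails
        PRIMARY KEY CLUSTERED (OrderID, OrderDate, LineNo),
    CONSTRAINT FK_DemoOrderDetails_DemoOrders
        FOREIGN KEY (OrderID, OrderDate)
        REFERENCES dbo.DemoOrders (OrderID, OrderDate)
);
GO
```

Note that nothing in this design stops the same OrderID from appearing under two different OrderDate values, which is exactly the loss of uniqueness described in the first point.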

I have found these effects are seldom called out at the beginning of a project that deploys partitioning, but they exist and have serious consequences.

If you think aligned indexes are a rare or extreme case, then consider this: in many cases the cornerstone of ETL and partitioning solutions is the fast switch in of staging tables. Switch in operations require aligned indexes.
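
For reference, a switch-in has roughly the following shape. This is a sketch under simplifying assumptions: the names (pfDemoSwitch, psDemoSwitch, dbo.DemoFact, dbo.DemoStage) are invented, everything sits on PRIMARY, and the target partition is empty.

```sql
CREATE PARTITION FUNCTION pfDemoSwitch (date)
    AS RANGE RIGHT FOR VALUES ('20150101', '20150201');
GO
CREATE PARTITION SCHEME psDemoSwitch
    AS PARTITION pfDemoSwitch ALL TO ([PRIMARY]);
GO
CREATE TABLE dbo.DemoFact
(
    SaleID   int   NOT NULL,
    SaleDate date  NOT NULL,
    Amount   money NOT NULL
) ON psDemoSwitch (SaleDate);
GO
CREATE CLUSTERED INDEX CIX_DemoFact
    ON dbo.DemoFact (SaleDate, SaleID)
    ON psDemoSwitch (SaleDate);
GO
-- Staging table: same columns and indexes, on the same filegroup as the
-- target partition, with a CHECK constraint proving that every row fits
-- partition 2, i.e. [2015-01-01, 2015-02-01).
CREATE TABLE dbo.DemoStage
(
    SaleID   int   NOT NULL,
    SaleDate date  NOT NULL,
    Amount   money NOT NULL,
    CONSTRAINT CK_DemoStage_Jan2015
        CHECK (SaleDate >= '20150101' AND SaleDate < '20150201')
) ON [PRIMARY];
GO
CREATE CLUSTERED INDEX CIX_DemoStage
    ON dbo.DemoStage (SaleDate, SaleID)
    ON [PRIMARY];
GO
INSERT INTO dbo.DemoStage (SaleID, SaleDate, Amount)
VALUES (1, '20150115', 10.00),
       (2, '20150120', 25.50);
GO
-- Metadata-only move of the staged rows into partition 2 of the fact table.
ALTER TABLE dbo.DemoStage SWITCH TO dbo.DemoFact PARTITION 2;
GO
-- The staging table is now empty and the rows show up in the fact table.
SELECT  $PARTITION.pfDemoSwitch(SaleDate) AS partition_number,
        COUNT(*)                          AS row_count
FROM    dbo.DemoFact
GROUP BY $PARTITION.pfDemoSwitch(SaleDate);
GO
```

A non-aligned index on dbo.DemoFact would be expected to make the SWITCH statement fail, which is why switch-based ETL designs end up requiring aligned indexes.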

Oh, one more thing: all my argument about foreign keys and the ripple effect of adding the partitioning column value to other tables applies equally to joins.
