Let's say I have a table Foo
with columns ID1, ID2
and a composite primary key defined over ID2, ID1
. (I'm currently working with a System Center product that has several tables defined this way with the primary key columns listed in the opposite order they appear in the table definition.)
CREATE TABLE dbo.Foo(
ID1 int NOT NULL,
ID2 int NOT NULL,
CONSTRAINT [PK_Foo] PRIMARY KEY CLUSTERED (ID2, ID1)
);
GO
-- Add a row and update stats so that histogram isn't empty
INSERT INTO Foo (ID1, ID2) VALUES (1,2);
UPDATE STATISTICS dbo.Foo;
The key_ordinal
column in sys.index_columns
shows the index columns in the same order they were declared in the composite primary key:
SELECT t.name, i.name, c.column_id, c.name, ic.index_column_id, ic.key_ordinal
FROM sys.tables AS t
JOIN sys.indexes AS i
ON t.[object_id] = i.[object_id]
JOIN sys.index_columns AS ic
ON ic.[object_id] = i.[object_id]
AND ic.index_id = i.index_id
JOIN sys.columns AS c
ON ic.column_id = c.column_id
AND ic.[object_id] = c.[object_id]
WHERE t.name = 'Foo';
The histogram also shows the statistics in the same order:
DBCC SHOW_STATISTICS ('Foo',PK_Foo);
However, sys.stats_columns
shows the columns listed in the inverse order (ID1, ID2
).
SELECT s.name, sc.stats_column_id, c.name
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
ON s.stats_id = sc.stats_id
AND s.[object_id] = sc.[object_id]
JOIN sys.columns AS c
ON c.[object_id] = s.[object_id]
AND c.column_id = sc.column_id
JOIN sys.objects AS o
ON o.[object_id] = c.[object_id]
WHERE o.name = 'Foo'
AND s.name = 'PK_Foo';
Books Online says stats_column_id
is a "1-based ordinal within set of stats columns," so I was expecting the value 1 to point to the first column in the statistics object.
Is this a bug in sys.stats_columns
or a misunderstanding on my part?
I've verified this behaviour occurs on current builds of SQL Server 2005, 2008, 2008 R2, 2012, and 2014.
sys.stats_columns
seems to reflect the order within the statistics object in other situations, for example:
CREATE TABLE dbo.Foo2(
ID1 int NOT NULL,
ID2 int NOT NULL,
ID3 int NULL,
String VARCHAR(10) NULL,
CONSTRAINT [PK_Foo2] PRIMARY KEY CLUSTERED (ID2, ID1)
);
GO
INSERT INTO Foo2 (ID1, ID2, ID3, String) VALUES (1,2,3,'String');
CREATE STATISTICS ST_Test ON Foo2 (ID3, String);
CREATE STATISTICS ST_Test2 ON Foo2 (String, ID3);
DBCC SHOW_STATISTICS ('Foo2',ST_Test);
DBCC SHOW_STATISTICS ('Foo2',ST_Test2);
SELECT s.name, sc.stats_column_id, c.name
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
ON s.stats_id = sc.stats_id
AND s.[object_id] = sc.[object_id]
JOIN sys.columns AS c
ON c.[object_id] = s.[object_id]
AND c.column_id = sc.column_id
JOIN sys.objects AS o
ON o.[object_id] = c.[object_id]
WHERE o.name = 'Foo2'
AND s.name LIKE 'ST_Test%';
Here is another example where sys.stats_columns
appears to return the correct data, this time for statistics on an index:
--drop table dbo.Foo3
CREATE TABLE dbo.Foo3(
ID1 int NOT NULL,
ID2 int NOT NULL,
ID3 int NULL,
String VARCHAR(10) NULL,
CONSTRAINT [PK_Foo3] PRIMARY KEY CLUSTERED (ID2, ID1)
);
GO
INSERT INTO Foo3 (ID1, ID2, ID3, String) VALUES (1,2,3,'String');
UPDATE STATISTICS Foo3;
CREATE INDEX IX_Test ON Foo3 (ID3, String);
CREATE INDEX IX_Test2 ON Foo3 (String, ID3);
DBCC SHOW_STATISTICS ('Foo3',IX_Test);
DBCC SHOW_STATISTICS ('Foo3',IX_Test2);
SELECT s.name, sc.stats_column_id, c.name
FROM sys.stats AS s
JOIN sys.stats_columns AS sc
ON s.stats_id = sc.stats_id
AND s.[object_id] = sc.[object_id]
JOIN sys.columns AS c
ON c.[object_id] = s.[object_id]
AND c.column_id = sc.column_id
JOIN sys.objects AS o
ON o.[object_id] = c.[object_id]
WHERE o.name = 'Foo3'
AND s.name LIKE 'IX_Test%';
Best Answer
This seems to be a long standing error:
swasheck - March 5, 2015 posted:
https://connect.microsoft.com/SQLServer/feedback/details/1163126
Max Vernon and James Lupolt seem to agree based on their comments/encouragement.