Sql-server – Using an index when leading column is not in the predicate

clustered-index, index, nonclustered-index, sql-server

Are there any circumstances in which an index can be used when not all of the index columns are included in the predicate? In my case the leading column in the index is unique and is part of the unique primary key, so I am wondering whether SQL Server can still use the index, as it strikes me it has all the information to know it's unique.

e.g.

CREATE TABLE x (a int, b int, c int, CONSTRAINT y PRIMARY KEY CLUSTERED (a ASC, b ASC))

SELECT * FROM x WHERE b = 1

This does not use the clustered index (it uses some other NC index). Of course, if I specify the leading column, it does:

SELECT * FROM x WHERE a = 1 AND b = 1

Best Answer

In my case the leading column in the index is unique ... it strikes me it has all the information to know it's unique.

It does not. That primary key does not guarantee that either a or b is unique on its own, only that each combination of a and b is. There could be many rows for which b = 1 is true: maybe all of them, maybe none of them. When you search for a specific combination of a and b, it can use the index to do a simple seek.
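A minimal sketch of that point, using a temporary table #x that mirrors the shape of x from the question (the temp table and the sample values are assumptions made purely for illustration):

```sql
-- Illustration only: #x mirrors the shape of x from the question.
CREATE TABLE #x
(
    a int NOT NULL,
    b int NOT NULL,
    c int NULL,
    PRIMARY KEY CLUSTERED (a ASC, b ASC)
);

-- Three rows share b = 1. The key accepts them because each (a, b) pair differs.
INSERT INTO #x (a, b, c) VALUES (1, 1, 10), (2, 1, 20), (3, 1, 30);

-- Returns three rows: b = 1 alone does not identify a single row.
SELECT a, b, c FROM #x WHERE b = 1;

-- Returns one row: the full key (a, b) identifies it.
SELECT a, b, c FROM #x WHERE a = 1 AND b = 1;

BEGIN TRY
    -- Repeating an existing (a, b) pair is what the key rejects.
    INSERT INTO #x (a, b, c) VALUES (1, 1, 99);
END TRY
BEGIN CATCH
    PRINT ERROR_MESSAGE();  -- primary key violation
END CATCH;

DROP TABLE #x;
```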

With that table definition, your first query is like asking "find all words in the dictionary where the second letter is 'a'". You can't answer that with a single seek. Your second query is like asking for "words starting 'aa'" which is easy to answer with that index.

Note, though, that some database systems are able to perform a skip-search to speed up the first query, which would help here. Essentially this looks for a=1 and b=1, then a=2 and b=1, then a=3 and b=1, and so on (it isn't quite this, but it is as near to it as makes little odds). If the operation is supported, it may be used when the datatype of a and the selectivity indicated by the index stats suggest it might be appropriate. No version of SQL Server supports this operation, though. Oracle does, unless you have a really old version (more than a decade or two old), as does SQLite; IIRC both call the operation a "skip scan".

Further note that in every circumstance where a skip-scan is better than a full index scan, having an index on the second column would be more efficient, often significantly so. So while the feature might make some queries better in databases that are not optimised for them, if you expect such queries you should optimise your table(s) for them by adding the extra index, even if your DBMS supports skip-searches.
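For SQL Server, an index on the second column might look like the following sketch. The index name ix_x_b is hypothetical, and including c is an assumption made so that the index covers the query from the question; whether that is worthwhile depends on the real table and workload.

```sql
-- Hypothetical index name; b is the key column, so WHERE b = 1 can be
-- answered with a seek instead of a scan.
-- The clustered key columns (a, b) are carried in every nonclustered index
-- row, and INCLUDE (c) adds the remaining column so no lookup is needed.
CREATE NONCLUSTERED INDEX ix_x_b
    ON x (b)
    INCLUDE (c);

SELECT a, b, c
FROM   x
WHERE  b = 1;
```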

it uses some other NC index

To explain why that particular index is selected and used instead of any other, we would need to know how that index and all the other indexes are defined, and perhaps see the query plan. Also, you don't say how the index is used; I assume it was scanned rather than being used for seeks.
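One way to gather that information is to list every index on the table together with its key and included columns. This is only a sketch: it assumes the table lives in the dbo schema, so adjust the name passed to OBJECT_ID if it does not.

```sql
-- Lists each index on dbo.x (assumed schema) with its columns in key order;
-- included (non-key) columns are flagged and sorted after the key columns.
SELECT i.name                 AS index_name,
       i.type_desc            AS index_type,
       i.is_unique,
       ic.key_ordinal,
       ic.is_included_column,
       c.name                 AS column_name
FROM   sys.indexes AS i
       INNER JOIN sys.index_columns AS ic
               ON ic.[object_id] = i.[object_id]
              AND ic.index_id = i.index_id
       INNER JOIN sys.columns AS c
               ON c.[object_id] = ic.[object_id]
              AND c.column_id = ic.column_id
WHERE  i.[object_id] = OBJECT_ID(N'dbo.x')
ORDER  BY i.index_id, ic.is_included_column, ic.key_ordinal;
```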

Related Solutions

Sql-server – Seek predicate not using all available columns

1 Pages / 245 rows

This plan has a seek on A=1 AND B=2 with a residual predicate on (C=@C1 OR C=@C2) AND D=5.

Plan 1

2 leaf Pages / 246 rows

Plan 2

In the second plan the extra operators are responsible for removing any duplicates from @C1 and @C2 before performing the seek(s).

The seek in the second plan is actually a range seek between A=1 AND B=2 AND C > Expr1010 and A=1 AND B=2 AND C < Expr1011 with a residual predicate on D=5. It still isn't an equality seek on all 4 columns. More information about the additional plan operators can be found here.

Adding OPTION (RECOMPILE) does allow it to inspect the parameter values for duplicates at compile time and produces a plan with two equality seeks.
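As a sketch, the hint goes at the end of the statement. The query itself is not shown above, so the one below is reconstructed from the predicates mentioned in the plan descriptions; the int type and the values of @C1 and @C2 are assumptions.

```sql
-- Assumed parameter type and values (not taken from the original question).
DECLARE @C1 int = 3,
        @C2 int = 4;

SELECT *
FROM   T
WHERE  A = 1
       AND B = 2
       AND (C = @C1 OR C = @C2)
       AND D = 5
OPTION (RECOMPILE);  -- the variable values are visible to the optimizer at compile time
```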

You could also achieve that with

;WITH CTE
     AS (SELECT DISTINCT ( C )
         FROM   (VALUES (@C1),
                        (@C2)) V(C))
SELECT CA.*
FROM   CTE
       CROSS APPLY (SELECT *
                    FROM   T
                    WHERE A=1 AND B=2 AND D=5  AND C = CTE.C) CA

Plan 3

But in this test case it would likely be counterproductive, as having two seeks into the single-page index rather than one increases the logical IO.

Sql-server – Unable to drop non-PK index because it is referenced in a foreign key constraint

Because a foreign key can point to either a primary key or a unique constraint, and whoever created that foreign key possibly created it before the primary key existed (or they shifted the FK to point to the unique index while they changed something else about the primary key). This is easy to repro:

CREATE TABLE dbo.MyTable(MyTableID INT NOT NULL, CONSTRAINT myx UNIQUE(MyTableID));

CREATE TABLE dbo.OtherTable1(ID INT FOREIGN KEY REFERENCES dbo.MyTable(MyTableID));

ALTER TABLE dbo.MyTable ADD CONSTRAINT PKmyx PRIMARY KEY(MyTableID);

CREATE TABLE dbo.OtherTable2(ID INT FOREIGN KEY REFERENCES dbo.MyTable(MyTableID));

In fact, both of these foreign keys will point to the first unique constraint defined on that column (myx).
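To see which index each foreign key is actually bound to, something like the following can be run after the repro. It is a sketch that reads key_index_id from sys.foreign_keys; going by the statement above, both rows would be expected to show myx rather than PKmyx.

```sql
-- For every foreign key that references dbo.MyTable, show the index
-- (primary key or unique constraint/index) it is bound to.
SELECT fk.name                          AS foreign_key_name,
       OBJECT_NAME(fk.parent_object_id) AS referencing_table,
       i.name                           AS referenced_index,
       i.is_primary_key,
       i.is_unique_constraint
FROM   sys.foreign_keys AS fk
       INNER JOIN sys.indexes AS i
               ON i.[object_id] = fk.referenced_object_id
              AND i.index_id = fk.key_index_id
WHERE  fk.referenced_object_id = OBJECT_ID(N'dbo.MyTable');
```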

You can fix the foreign key on the other table by dropping it and re-creating it. You will need to repeat that process for any other tables that point to this column. You can find these easily:

SELECT s.name,t.name,fk.name
FROM sys.foreign_key_columns AS fkc
INNER JOIN sys.foreign_keys AS fk
ON fkc.constraint_object_id = fk.[object_id]
INNER JOIN sys.tables AS t
ON fkc.parent_object_id = t.[object_id]
INNER JOIN sys.schemas AS s
ON t.[schema_id] = s.[schema_id]
INNER JOIN sys.columns AS c1
ON c1.[object_id] = fkc.referenced_object_id
AND c1.column_id = fkc.referenced_column_id
AND c1.name = N'MyTableID'
WHERE fkc.referenced_object_id = OBJECT_ID('dbo.MyTable');

Results:

dbo    OtherTable1    FK__OtherTable1__ID__32E0915F
dbo    OtherTable2    FK__OtherTable2__ID__35BCFE0A

You can even generate a script to drop and re-create them (dropping the redundant unique constraint in between):

DECLARE 
  @sql1 NVARCHAR(MAX) = N'', 
  @sql2 NVARCHAR(MAX) = N'ALTER TABLE dbo.MyTable DROP CONSTRAINT myx;', 
  @sql3 NVARCHAR(MAX) = N'';

SELECT 
  @sql1 += N'
ALTER TABLE ' + QUOTENAME(s.name) + '.' + QUOTENAME(t.name)
  + ' DROP CONSTRAINT ' + QUOTENAME(fk.name) + ';',
  @sql3 += N'
ALTER TABLE ' + QUOTENAME(s.name) + '.' + QUOTENAME(t.name)
  + ' ADD CONSTRAINT ' + QUOTENAME(fk.name) + ' FOREIGN KEY '
  + '(' + QUOTENAME(c2.name) + ') REFERENCES dbo.MyTable(MyTableID);'
FROM sys.foreign_key_columns AS fkc
INNER JOIN sys.foreign_keys AS fk
ON fkc.constraint_object_id = fk.[object_id]
INNER JOIN sys.tables AS t
ON fkc.parent_object_id = t.[object_id]
INNER JOIN sys.schemas AS s
ON t.[schema_id] = s.[schema_id]
INNER JOIN sys.columns AS c1
ON c1.[object_id] = fkc.referenced_object_id
AND c1.column_id = fkc.referenced_column_id
AND c1.name = N'MyTableID'
INNER JOIN sys.columns AS c2
ON c2.[object_id] = fkc.parent_object_id
AND c2.column_id = fkc.parent_column_id
WHERE fkc.referenced_object_id = OBJECT_ID('dbo.MyTable');

PRINT @sql1;
PRINT @sql2;
PRINT @sql3;
-- EXEC sp_executesql @sql1;
-- EXEC sp_executesql @sql2;
-- EXEC sp_executesql @sql3;

Results:

ALTER TABLE [dbo].[OtherTable1] DROP CONSTRAINT [FK__OtherTable1__ID__32E0915F];
ALTER TABLE [dbo].[OtherTable2] DROP CONSTRAINT [FK__OtherTable2__ID__35BCFE0A];

ALTER TABLE dbo.MyTable DROP CONSTRAINT myx;

ALTER TABLE [dbo].[OtherTable1] ADD CONSTRAINT [FK__OtherTable1__ID__32E0915F] 
  FOREIGN KEY ([ID]) REFERENCES dbo.MyTable(MyTableID);
ALTER TABLE [dbo].[OtherTable2] ADD CONSTRAINT [FK__OtherTable2__ID__35BCFE0A] 
  FOREIGN KEY ([ID]) REFERENCES dbo.MyTable(MyTableID);

This explicitly handles this case, where the constraint only involves a single column. It gets a little more complex if there are multiple columns involved (and this answer is not meant to solve that problem). I also didn't test if this works exactly as coded if the foreign keys point to a redundant unique index (which has the same underlying structure but is created with slightly different DDL). Exercise for the reader. :-)
