SQL Server Constraint – Enforce a Column as Unique Based on Another Column Value

constraintsql serversql-server-2008-r2t-sql

I want to make a column unique, but only if a different column is a specific value.

Consider the following table:

CREATE TABLE [SampleTable]
(
     [Id] INTEGER NOT NULL IDENTITY(1,1)
    ,CONSTRAINT [PK_SampleTable]
        PRIMARY KEY ([Id])

    ,[Code]      NVARCHAR(255) NOT NULL
    ,[Deleted]   BIT           NOT NULL DEFAULT 0
    ,[CreatedOn] DATETIME      NOT NULL DEFAULT GETDATE()
);

The intention is that an item can be 'deleted' by setting the [Deleted] column to 1.

What I want is for the [Code] column to be enforced as unique, but only among non-deleted rows.

Enforcing data correctness at the database level is generally my strong preference. However, I've used this pattern a lot in the past, but have never been sure if it were possible to enforce this kind of constraint at the database level. Deadline pressures being what they are, I never bothered to find out. So I've always just enforced them at the application level.

If there's a way to do it though, I'd really like to know what it is.

To be clear, I can't just use a combined unique constraint, because I need to be able to support the following data:

    [ID]   [Code]   [Deleted]  [CreatedOn]
=====================================================================
     1     'ABC'    1          Ages ago
     2     'ABC'    1          A while ago
     3     'ABC'    1          Quite recently, actually
     4     'ABC'    0          Just a moment ago!

A combined unique constraint won't work, because for my purposes three different 'deleted' entries with the same code is valid.

I ask this question because the 'enforcing this at the application level' policy has recently bitten me in the ass when it came to integrating data from a third-party application. It would have been nice if the database had rejected the bad integration data outright, because fixing the integration before it happened would have been a lot easier than cleansing the data after it happened incorrectly.

I'm using SQL Server 2008 R2. But I don't mind upgrading if I have to to get this functionality, I've been meaning to upgrade anyway.

Aaron answered this pretty much immediately. I needed to be using a filtered unique index.

The following code demonstrates the solution.

IF EXISTS (SELECT * FROM SYS.TABLES WHERE [name] = 'SampleTable')
BEGIN
    PRINT 'Dropping Table [SampleTable]';
    DROP TABLE [SampleTable];
END;
GO

PRINT 'Creating Table [SampleTable]';
CREATE TABLE [SampleTable]
(
     [Id] INTEGER NOT NULL IDENTITY(1,1)
    ,CONSTRAINT [PK_SampleTable]
        PRIMARY KEY ([Id])

    ,[Code]      NVARCHAR(255) NOT NULL
    ,[Deleted]   BIT           NOT NULL DEFAULT 0
    ,[CreatedOn] DATETIME      NOT NULL DEFAULT GETDATE()
);

CREATE UNIQUE INDEX
    [UNQ_SampleTable_Code]
ON
    [SampleTable]([Code])
WHERE
    ([Deleted] = 0);

INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 1);
INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 1);
INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 1);
INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 1);
INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 0);
INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 0);

UPDATE [SampleTable] SET [Deleted] = 0 WHERE [Id] = 1;

SELECT * FROM [SampleTable];

The sixth (final) insert and the update both fail because of the filtered index.

I'll make good use of this in future. Thanks Aaron!

Best Answer

When you have a unique constraint that you want to apply to only a subset of rows, you can enforce this using a unique, filtered index. The index that seemed to work for you in this case is:

CREATE UNIQUE INDEX [UNQ_SampleTable_Code]
  ON dbo.[SampleTable]([Code])
  WHERE   ([Deleted] = 0);

This ensures that only one distinct value of Code can exist for rows where Deleted is 0, but duplicates can exist where Deleted is 1. Typically this will also help the performance of some queries, since you will often be interested in only the active rows (and not the soft deletes), but you may want to consider adding columns to the INCLUDE clause if this doesn't cover queries (SQL Server may choose to scan the clustered index, or a different index, if lookups are deemed too costly).