SQL Server Constraint – Enforce a Column as Unique Based on Another Column Value

constraint, sql server, sql-server-2008-r2, t-sql

I want to make a column unique, but only if a different column is a specific value.

Consider the following table:

CREATE TABLE [SampleTable]
(
     [Id] INTEGER NOT NULL IDENTITY(1,1)
    ,CONSTRAINT [PK_SampleTable]
        PRIMARY KEY ([Id])

    ,[Code]      NVARCHAR(255) NOT NULL
    ,[Deleted]   BIT           NOT NULL DEFAULT 0
    ,[CreatedOn] DATETIME      NOT NULL DEFAULT GETDATE()
);

The intention is that an item can be 'deleted' by setting the [Deleted] column to 1.

What I want is for the [Code] column to be enforced as unique, but only among non-deleted rows.

Enforcing data correctness at the database level is generally my strong preference. I've used this soft-delete pattern a lot in the past, but I was never sure whether this kind of constraint could be enforced at the database level. Deadline pressures being what they are, I never bothered to find out, so I've always just enforced it at the application level.

If there's a way to do it though, I'd really like to know what it is.

To be clear, I can't just use a combined unique constraint, because I need to be able to support the following data:

    [Id]   [Code]   [Deleted]   [CreatedOn]
    ====================================================
     1     'ABC'    1           Ages ago
     2     'ABC'    1           A while ago
     3     'ABC'    1           Quite recently, actually
     4     'ABC'    0           Just a moment ago!

A combined unique constraint won't work, because for my purposes three different 'deleted' entries with the same code are valid.

I ask this question because the 'enforcing this at the application level' policy has recently bitten me in the ass when it came to integrating data from a third-party application. It would have been nice if the database had rejected the bad integration data outright, because fixing the integration before it happened would have been a lot easier than cleansing the data after it happened incorrectly.

I'm using SQL Server 2008 R2, but I don't mind upgrading if I have to in order to get this functionality; I've been meaning to upgrade anyway.

Aaron answered this pretty much immediately. I needed to be using a filtered unique index.

The following code demonstrates the solution.

IF EXISTS (SELECT * FROM sys.tables WHERE [name] = 'SampleTable')
BEGIN
    PRINT 'Dropping Table [SampleTable]';
    DROP TABLE [SampleTable];
END;
GO

PRINT 'Creating Table [SampleTable]';
CREATE TABLE [SampleTable]
(
     [Id] INTEGER NOT NULL IDENTITY(1,1)
    ,CONSTRAINT [PK_SampleTable]
        PRIMARY KEY ([Id])

    ,[Code]      NVARCHAR(255) NOT NULL
    ,[Deleted]   BIT           NOT NULL DEFAULT 0
    ,[CreatedOn] DATETIME      NOT NULL DEFAULT GETDATE()
);

CREATE UNIQUE INDEX
    [UNQ_SampleTable_Code]
ON
    [SampleTable]([Code])
WHERE
    ([Deleted] = 0);

INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 1);
INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 1);
INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 1);
INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 1);
INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 0);
INSERT INTO [SampleTable] ([Code],[Deleted]) VALUES ('ABC', 0);

UPDATE [SampleTable] SET [Deleted] = 0 WHERE [Id] = 1;

SELECT * FROM [SampleTable];

The sixth (final) insert and the update both fail because of the filtered index.

I'll make good use of this in future. Thanks Aaron!
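
One practical note when adding such an index to a table that already holds data: creating a unique index fails if the rows matched by the filter already contain duplicates. The following is a minimal sketch, assuming the [SampleTable] definition above, of a query that lists the codes that would block the index so they can be cleaned up first. The [ActiveRows] alias is only an illustrative name.

```sql
-- Lists every Code that occurs more than once among non-deleted rows.
-- Any row returned here would make CREATE UNIQUE INDEX ... WHERE ([Deleted] = 0) fail.
SELECT
     [Code]
    ,COUNT(*) AS [ActiveRows]   -- illustrative alias, not part of the original schema
FROM
    [SampleTable]
WHERE
    [Deleted] = 0
GROUP BY
    [Code]
HAVING
    COUNT(*) > 1;
```

An empty result means the filtered unique index can be created without first changing any data.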

Best Answer

When you have a unique constraint that you want to apply to only a subset of rows, you can enforce this using a unique, filtered index. The index that seemed to work for you in this case is:

CREATE UNIQUE INDEX [UNQ_SampleTable_Code]
  ON dbo.[SampleTable]([Code])
  WHERE   ([Deleted] = 0);

This ensures that each Code value appears at most once among rows where Deleted is 0, while duplicates remain allowed where Deleted is 1. Typically this will also help the performance of some queries, since you will often be interested in only the active rows (and not the soft deletes). If the index doesn't cover those queries, consider adding columns to the INCLUDE clause; otherwise SQL Server may choose to scan the clustered index, or a different index, when lookups are deemed too costly.
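
As a rough illustration of the INCLUDE suggestion, here is a sketch that assumes the index above already exists and that queries against active rows commonly return [CreatedOn] as well. Which columns are worth including depends on the real workload, so treat the choice of [CreatedOn] as an assumption.

```sql
-- Rebuild the existing filtered unique index with an included column.
-- DROP_EXISTING = ON assumes [UNQ_SampleTable_Code] has already been created.
CREATE UNIQUE INDEX [UNQ_SampleTable_Code]
  ON dbo.[SampleTable]([Code])
  INCLUDE ([CreatedOn])            -- assumed to be a commonly selected column
  WHERE   ([Deleted] = 0)
  WITH    (DROP_EXISTING = ON);

-- A query of this shape can then be answered from the index alone,
-- because the filter matches and every referenced column is in the index.
SELECT [Code], [CreatedOn]
FROM dbo.[SampleTable]
WHERE [Deleted] = 0
  AND [Code] = N'ABC';
```

Included columns are stored only at the leaf level of the index, so they do not change which rows the uniqueness rule applies to.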

Related Solutions

Sql-server – Any way around unique index 16 column max

Add a persisted computed column that combines the 18 keys, then create a unique index on the computed column:

alter table t add all_keys as c1+c2+c3+...+c18 persisted;
create unique index i18 on t (all_keys);

See Creating Indexes on Computed Columns.

Another approach is to create an indexed view:

create view v 
with schemabinding
as select c1+c2+c3+...+c18 as all_keys
from dbo.t;

create unique clustered index c18 on v(all_keys);

See Creating Indexed Views.

Both approaches allow for a partial key aggregate: aggregate c1+c2+c3 as k1, c4+c5+c6 as k2, etc., then index (or create an indexed view on) (k1, k2, ...). This could be beneficial for range scans, since the index can be used for searches on c1+c2+c3.

Of course, every + operation in my example is string concatenation; the actual operator to use depends on the types of all those columns (i.e., you may have to use explicit casts).
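
To make the remark about explicit casts concrete, here is a small sketch using a hypothetical table with mixed column types. The table name, column types, index name, and the '|' delimiter are all assumptions for illustration. The delimiter matters once variable-length columns are involved: without it, ('ab', 'c') and ('a', 'bc') would concatenate to the same string and be rejected as duplicates.

```sql
-- Hypothetical table: names and types are assumptions, not from the original question.
CREATE TABLE dbo.t_mixed
(
    c1 INT         NOT NULL,
    c2 VARCHAR(10) NOT NULL,
    c3 CHAR(3)     NOT NULL,
    -- Cast the integer explicitly (an INT needs at most 11 characters)
    -- and separate the parts with a delimiter assumed not to occur in the data.
    all_keys AS
        CAST(c1 AS VARCHAR(11)) + '|' + c2 + '|' + c3
        PERSISTED
);

-- Uniqueness over (c1, c2, c3) is then enforced through the single computed key.
CREATE UNIQUE INDEX ux_t_mixed_all_keys ON dbo.t_mixed (all_keys);
```

The combined value still has to fit within the maximum index key size, so this works best when the individual columns are short.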

PS. As unique constraints are enforced by a unique index, any restriction on unique indexes will apply to unique constraints as well:

create table t (
    c1 char(3), c2 char(3), c3 char(3), c4 char(3),
    c5 char(3), c6 char(3), c7 char(3), c8 char(3),
    c9 char(3), c10 char(3), c11 char(3), c12 char(3),
    c13 char(3), c14 char(3), c15 char(3), c16 char(3),
    c17 char(3), c18 char(3), c19 char(3), c20 char(3),
    constraint unq unique
      (c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13,c14,c15,c16,c17,c18));
go  


Msg 1904, Level 16, State 1, Line 3
The index '' on table 't' has 18 column names in index key list. 
The maximum limit for index or statistics key column list is 16.
Msg 1750, Level 16, State 0, Line 3
Could not create constraint. See previous errors.

However, creating the constraint on a persisted computed column works:

create table t (
    c1 char(3), c2 char(3), c3 char(3), c4 char(3),
    c5 char(3), c6 char(3), c7 char(3), c8 char(3),
    c9 char(3), c10 char(3), c11 char(3), c12 char(3),
    c13 char(3), c14 char(3), c15 char(3), c16 char(3),
    c17 char(3), c18 char(3), c19 char(3), c20 char(3),
    all_c as 
        c1+c2+c3+c4+c5+c6+c7+c8+c9+c10+c11+
        c12+c13+c14+c15+c16+c17+c18 
        persisted
        constraint unq unique (all_c));
go

Obviously, the persisted column consumes space on disk, so the approach may be a poor fit for a very large table. The indexed view approach does not have this problem: it only consumes the space for the index, not the space for both the computed column and the index.

Sql-server – unique key violation occuring on a unique value combination

I cannot really go through all of this complex query, but it appears that these 2 rows would be updated with the same date from GETDATE(). Since all other columns in the UNIQUE key are identical, this conflicts with the UNIQUE KEY constraint.
