Should I mark a composite index as unique if it already contains the primary key?
Probably not. The optimizer can generally use information about the uniqueness of the contained key column anyway, so there's no real advantage.
There is also an important consequence of marking an index unique on update plans that modify keys of that index to consider:
Setup
CREATE TABLE dbo.Customers
(
CustomerID int NOT NULL PRIMARY KEY,
FirstName nvarchar(50),
LastName nvarchar(50),
[Address] nvarchar(200),
Email nvarchar(260)
);
CREATE NONCLUSTERED INDEX
IX_Customers_CustomerIDEmail
ON dbo.Customers
(
CustomerID,
Email
);
-- Pretend we have some rows
UPDATE STATISTICS dbo.Customers
WITH ROWCOUNT = 100000, PAGECOUNT = 20000;
Per-index update plan (non-unique index)
UPDATE dbo.Customers
SET Email = N'New', [Address] = 'New Address'
WHERE Email = N'Old'
OPTION (QUERYTRACEON 8790); -- Per-index update plan
Execution plan:
The optimizer often makes a cost-based decision between updating nonclustered indexes per-row (a 'narrow' plan) or per-index (a 'wide' plan). The default strategy (except for in-memory OLTP tables) is a wide plan.
Narrow plans (where nonclustered indexes are maintained at the same time as the heap/clustered index) are a performance optimization for small updates. This optimization is not implemented for all cases - using certain features (like indexed views) means that the associated index(es) will be maintained in a wide plan.
More information: Optimizing T-SQL Queries that Change Data
In this case, I have used undocumented trace flag 8790 to force a wide update plan: The plan therefore shows the clustered and nonclustered indexes being maintained separately.
The Split turns each update into a separate delete & insert pair; the Filter filters out any rows that would not result in a change to the index.
More information: (Non-updating updates) by the SQL Server QO Team.
Per-index update plan (unique index)
-- Same index, but unique
CREATE UNIQUE INDEX IX_Customers_CustomerIDEmail ON Customers
(
CustomerID,
Email
)
WITH (DROP_EXISTING = ON);
UPDATE dbo.Customers
SET Email = N'New', [Address] = 'New Address'
WHERE Email = N'Old'
OPTION (QUERYTRACEON 8790); -- Per-index update plan
Execution plan:
Notice the extra Sort and Collapse operators when the index is marked unique.
This Split-Sort-Collapse pattern is required when updating the keys of a unique index, to prevent intermediate unique key violations.
More information: Maintaining Unique Indexes by Craig Freedman
The Sort in particular can be a problem. Not only is it an unnecessary extra cost, it may spill to disk if estimates are inaccurate.
About nonclustered keys
Another factor to consider is that nonclustered index structures are always unique, at every level of the index, even if UNIQUE
is not specified. The clustering key(s) - and possibly a uniquifier if the clustered index is not marked unique - are added to a non-unique nonclustered index at all levels.
As a consequence, the following index definiton:
CREATE INDEX IX_Customers_CustomerIDEmail ON Customers
(
Email
)
WITH (DROP_EXISTING = ON);
...actually contains the keys (Email, CustomerID) at all levels. It is therefore 'seekable' on both columns:
SELECT *
FROM dbo.Customers AS C WITH (INDEX(IX_Customers_CustomerIDEmail))
WHERE C.Email = N'Email'
AND C.CustomerID = 1;
More information: More About Nonclustered Index Keys by Kalen Delaney
Best Answer
Since the
alias
column is nullable, if you add aunique
constraint on the composite(last_name, first_name, birth_date, alias)
, there will still be duplicates allowed, with the same values in the first 3 columns andNULL
in thealias
. The constraint is skipped / accepted when at least one value is null. MySQL documentation onCREATE TABLE
is not very clear but you can test the behaviour:What you could do is to define the
alias
asnot null
and add adefault
value, (say'NONE'
or'DEFAULT'
or the empty string''
). You (or the user) will not have to provide that value, it will be automatically saved in all rows. Once someone tries to add a new row with same last name, first name and birth date as an existing row, theunique
constraint will forbid it. I guess you could add some procedure at that point, that asks for a different value for thealias
and adds the new row with it.