Sql-server – Unique index for subquery with RANK() function

database-designsql serversql-server-2012

I have the following tables:

CREATE TABLE Revision (
    RevisionId INT PRIMARY KEY IDENTITY,
    UserName NVARCHAR(256) NOT NULL,
    DateTime DATETIME NOT NULL DEFAULT GETDATE()
)

CREATE TABLE MyEntity (
    MyEntityId INT NOT NULL,
    RevisionId INT NOT NULL FOREIGN KEY REFERENCES Revision (RevisionId),
    Deleted BIT NOT NULL DEFAULT 0,
    Name NVARCHAR(256) NOT NULL,
    Body NVARCHAR(MAX) NOT NULL,
    PRIMARY KEY (MyEntityId, RevisionId)
)

That is, the MyEntity table has all changes to an entity. When an entity is created, modified, or deleted, a new record is inserted into MyEntity and into Revision, so that the entire history is tracked.

I would like a view to have the latest version of each entity:

CREATE VIEW MyEntityLatest WITH SCHEMABINDING AS
SELECT
    Latest.MyEntityId,
    Latest.Name,
    Latest.Body
FROM dbo.MyEntity
INNER JOIN (
    SELECT
        MyEntityId,
        RevisionId,
        Name,
        Body,
        RANK() OVER (PARTITION BY MyEntityId ORDER BY RevisionId DESC) AS RevisionIdDesc
    FROM MyEntity
    WHERE Deleted = 0
) AS Latest
    ON dbo.MyEntity.MyEntityId = Latest.MyEntityId
    AND dbo.MyEntity.RevisionId = Latest.RevisionId
WHERE Latest.RevisionIdDesc = 1

However, I would like to create an unique index (and hence constraint) such that the Name is unique for the latest revision only. I cannot create an index on the view, because of the subquery in the view.

How can I accomplish this?

Best Answer

Relying on the order of RevisionId to determine the latest revision is incredibly dangerous, and points to a design issue here.

Since you're using SQL Server 2008+, what I would recommend is to add a bit flag IsLatest to the MyEntity table to identify the latest revision. Then, create a unique filtered index to constrain each entity to only have a single latest revision (technically, this isn't required given what you asked for, but it's a Really Good Idea):

CREATE UNIQUE NONCLUSTERED INDEX UC_MyEntity_IsLatest_MyEntityId
    ON [dbo].[MyEntity](IsLatest, MyEntityId)
        WHERE (IsLatest = 1);

Then use the same technique to enforce the constraint on the name:

CREATE UNIQUE NONCLUSTERED INDEX UC_MyEntity_IsLatest_Name
    ON [dbo].[MyEntity](IsLatest, Name)
        WHERE (IsLatest = 1);

You'll need to manage the flag by turning it off from the "previous latest" revision before inserting the "new latest" revision, which is not much extra work. I'll leave it up to you to determine how this should work with the Deleted flag involved.

This solution enforces the constraint by design, and will greatly simplify the view definition, which you may not even need to index separately. (Indexing the Body column which is nvarchar(MAX) is a bit suspect, but that's a separate issue.)

Note: I included the IsLatest column in the key of the indexes on purpose, as the optimizer isn't smart enough to implicitly include the value of that column from the filter predicate (since the predicate is arbitrary, I assume it doesn't even try). This is what will allow the index to be used when you specify WHERE IsLatest = 1 in a query.