Sql-server – Determinism inside an INSTEAD OF trigger

referential-integritysql servertrigger

In reference to my prior question about a preferred design to avoid circular or multiple update path referential integrity cascading updates and deletes via ON UPDATE CASCADE and ON DELETE CASCADE, I'm attempting to use a trigger as an alternative.

For this example, I've attempted to simplify the design as much as possible.

I have a Parent table and a Child table; Parent and Child share a column, ParentName. Whenever Parent.ParentName changes, I want to reflect that change in Child.ParentName.

Simple enough? If I update a single row in Parent, I can use the inserted and deleted tables to very easily know which row to update in both the Parent and Child tables. Something like this:

UPDATE Parent
SET Parent.ParentName = inserted.ParentName
FROM Parent
    INNER JOIN deleted ON Parent.ParentName = deleted.ParentName
    CROSS JOIN inserted;

UPDATE Child
SET Child.ParentName = inserted.ParentName
FROM Child
    INNER JOIN deleted ON Child.ParentName = deleted.ParentName
    CROSS JOIN inserted;

Pretty clearly, this won't work if more than a single row in Parent is modified in one UPDATE statement.

The tables:

IF OBJECT_ID(N'dbo.Parent', N'U') IS NOT NULL
DROP TABLE dbo.Parent;
CREATE TABLE dbo.Parent
(
    ParentName int NOT NULL
        PRIMARY KEY
    , IsActive bit NOT NULL
);

IF OBJECT_ID(N'dbo.Child', N'U') IS NOT NULL
DROP TABLE dbo.Child;
CREATE TABLE dbo.Child
(
    ChildName int NOT NULL
        PRIMARY KEY
    , ParentName int NOT NULL
    , IsActive bit NOT NULL
);
GO

INSERT INTO dbo.Parent (ParentName, IsActive)
VALUES (0, 1)
    , (1, 1)
    , (2, 1);

INSERT INTO dbo.Child (ChildName, ParentName, IsActive)
VALUES (7, 0, 1)
    , (8, 1, 1)
    , (9, 2, 1);

SELECT *
FROM dbo.Parent;
╔════════════╦══════════╗
║ ParentName ║ IsActive ║
╠════════════╬══════════╣
║ 0          ║ 1        ║
║ 1          ║ 1        ║
║ 2          ║ 1        ║
╚════════════╩══════════╝
SELECT *
FROM dbo.Child;
╔═══════════╦════════════╦══════════╗
║ ChildName ║ ParentName ║ IsActive ║
╠═══════════╬════════════╬══════════╣
║ 7         ║ 0          ║ 1        ║
║ 8         ║ 1          ║ 1        ║
║ 9         ║ 2          ║ 1        ║
╚═══════════╩════════════╩══════════╝

The trigger:

CREATE TRIGGER DRI
ON dbo.Parent
INSTEAD OF UPDATE
AS
BEGIN
    UPDATE Parent
    SET Parent.ParentName = inserted.ParentName
    FROM Parent
        INNER JOIN deleted ON Parent.ParentName = deleted.ParentName
        CROSS JOIN inserted;

    UPDATE Child
    SET Child.ParentName = inserted.ParentName
    FROM Child
        INNER JOIN deleted ON Child.ParentName = deleted.ParentName
        CROSS JOIN inserted;
END
GO

Testing it:

UPDATE dbo.Parent SET ParentName = 6 WHERE ParentName = 2;

The results:

SELECT *
FROM dbo.Parent;
╔════════════╦══════════╗
║ ParentName ║ IsActive ║
╠════════════╬══════════╣
║ 0          ║ 1        ║
║ 1          ║ 1        ║
║ 6          ║ 1        ║  <-- this row has changed
╚════════════╩══════════╝
SELECT *
FROM dbo.Child;
╔═══════════╦════════════╦══════════╗
║ ChildName ║ ParentName ║ IsActive ║
╠═══════════╬════════════╬══════════╣
║ 7         ║ 0          ║ 1        ║
║ 8         ║ 1          ║ 1        ║
║ 9         ║ 6          ║ 1        ║  <-- this row has changed
╚═══════════╩════════════╩══════════╝

If, however I update multiple rows, you get this:

UPDATE dbo.Parent SET ParentName = ParentName + 1;

SELECT *
FROM dbo.Parent;
╔════════════╦══════════╗
║ ParentName ║ IsActive ║
╠════════════╬══════════╣
║ 1          ║ 1        ║
║ 1          ║ 1        ║
║ 1          ║ 1        ║
╚════════════╩══════════╝
SELECT *
FROM dbo.Child;
╔═══════════╦════════════╦══════════╗
║ ChildName ║ ParentName ║ IsActive ║
╠═══════════╬════════════╬══════════╣
║ 7         ║ 1          ║ 1        ║
║ 8         ║ 1          ║ 1        ║
║ 9         ║ 1          ║ 1        ║
╚═══════════╩════════════╩══════════╝

I could resolve this by making the trigger refuse updates where there are more than a single row being updated, however that's not very scalable, and will promote RBAR.

CREATE TRIGGER DRI
ON dbo.Parent
INSTEAD OF UPDATE
AS
BEGIN
    DECLARE @msg nvarchar(1000);
    SET @msg = 'Only a single row can be updated per batch.';
    IF (SELECT COUNT(1) FROM inserted) > 1 
    BEGIN
        RAISERROR (@msg, 10, 1);
    END
    ELSE
    BEGIN
        UPDATE Parent
        SET Parent.ParentName = inserted.ParentName
        FROM Parent
            INNER JOIN deleted ON Parent.ParentName = deleted.ParentName
            CROSS JOIN inserted;

        UPDATE Child
        SET Child.ParentName = inserted.ParentName
        FROM Child
            INNER JOIN deleted ON Child.ParentName = deleted.ParentName
            CROSS JOIN inserted;
    END
END
GO

Is there something I can do to mitigate this? Obviously, I don't want to add an IDENTITY column to these tables, since that will negate the purpose of using natural keys.

I thought about using ROW_NUMBER() in the trigger as a sort of pseudo-column on the inserted and deleted tables; however I cannot use a "proper" ORDER BY in the OVER(...) clause, which makes the row numbers non-deterministic:

CREATE TRIGGER DRI
ON dbo.Parent
INSTEAD OF UPDATE
AS
BEGIN
    ;WITH i AS (
        SELECT *
            , rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) 
        FROM inserted
        )
    , d AS (
        SELECT *
            , rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) 
        FROM deleted
        )
    UPDATE dbo.Parent
    SET Parent.ParentName = i.ParentName
    FROM i
        INNER JOIN d ON i.rn = d.rn
    WHERE Parent.ParentName = d.ParentName

    ;WITH i AS (
        SELECT *
            , rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) 
        FROM inserted
        )
    , d AS (
        SELECT *
            , rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) 
        FROM deleted
        )
    UPDATE dbo.Child
    SET Child.ParentName = i.ParentName
    FROM i
        INNER JOIN d ON i.rn = d.rn
    WHERE Child.ParentName = d.ParentName
END
GO

The test:

UPDATE dbo.Parent SET ParentName = ParentName + 1;

SELECT *
FROM dbo.Parent;
╔════════════╦══════════╗
║ ParentName ║ IsActive ║
╠════════════╬══════════╣
║ 1          ║ 1        ║
║ 2          ║ 1        ║
║ 3          ║ 1        ║
╚════════════╩══════════╝
SELECT *
FROM dbo.Child;
╔═══════════╦════════════╦══════════╗
║ ChildName ║ ParentName ║ IsActive ║
╠═══════════╬════════════╬══════════╣
║ 7         ║ 1          ║ 1        ║
║ 8         ║ 2          ║ 1        ║
║ 9         ║ 3          ║ 1        ║
╚═══════════╩════════════╩══════════╝

This does appear to solve the problem, but I'm concerned it is not reliable.

Best Answer

I thought about using ROW_NUMBER() in the trigger as a sort of pseudo-column...

Stop thinking this. There is no guarantee of row order in the inserted or deleted pseudo-tables. You may observe it "working", but you will be relying on undocumented behaviour.

The standard workaround for the lack of an exposed unique row identifier to correlate updated rows (or an updated pseudo-table, or per-row triggers...etc.) is to add an immutable alternate key, such as a column with the IDENTITY property. I realize this goes against the design objective of the question, but that does not change the facts.

A rowversion column would work as well, but that is 8 bytes and rather a misuse.

Connect items: