Sql-server – How to delete only related records in a multi-key MERGE in SQL Server

mergesql serversql-server-2012

Suppose you have something like this:

Source table (variable):

Values (
  LeftId INT NOT NULL,
  RightId INT NOT NULL,
  CustomValue varchar(100) NULL
)

Target table:

Mapping (
  LeftId INT NOT NULL,
  RightId INT NOT NULL,
  CustomValue varchar(100) NULL
)

I want to merge Values into Target with the following rules:

  • Match on source.LeftId = target.LeftId AND source.RightId = target.RightId
    • when matched in target, update CustomValue
    • when not matched in target, insert
  • Delete any unmatched values in the target that do match a LeftId in the source, i.e. only delete records that related to the LefIds of what I'm merging.

(That last rule is hard to describe, sorry!)

For instance:

Source:

1, 10, foo
1, 11, foo

Target:

1, 10, bar
1, 12, foo
2, 20, car

Merge result:

Result Target:

1, 10, foo (updated)
1, 11, foo (inserted)
1, 12, foo (deleted)
2, 20, car (unchanged)

So…

Here's what I have so far, which takes care of update and insert:

MERGE Mapping AS target
USING (SELECT LeftId, RightId, CustomValue FROM @Values) 
  AS source (LeftId, RightId, CustomValue)
  ON target.LeftId = source.LeftId
    AND target.RightId = source.RightId
WHEN NOT MATCHED THEN
  INSERT (LeftId, RightId, CustomValue)
  VALUES (source.LeftId, source.RightId, source.CustomValue)
WHEN MATCHED THEN
  UPDATE SET
    CustomValue = source.CustomValue;

How do I do the delete part of my rule?

Best Answer

This is the separate DELETE operation I had in mind:

DELETE m
FROM dbo.Mapping AS m
WHERE EXISTS 
  (SELECT 1 FROM @Values WHERE LeftID = m.LeftID)
AND NOT EXISTS 
  (SELECT 1 FROM @Values WHERE LeftID = m.LeftID AND RightID = m.RightID);

As I outline here, for a left anti-semi join, the NOT EXISTS pattern will often outperform the LEFT JOIN / NULL pattern (but you should always test).

Not sure if your overall goal is clarity or performance, so only you can judge if this will work out better for your requirements than the NOT MATCHED BY source option. You'll have to look at the plans qualitatively, and the plans and/or runtime metrics quantitatively, to know for sure.

If you expect your MERGE command to protect you from race conditions that would happen with multiple independent statements, you better make sure that is true by changing it to:

MERGE dbo.Mapping WITH (HOLDLOCK) AS target

(From Dan Guzman's blog post.)

Personally, I would do all of this without MERGE, because there are unresolved bugs, among other reasons. And Paul White seems to recommend separate DML statements as well.

And here's why I added a schema prefix: you should always reference objects by schema, when creating, affecting, etc.