Consider the following table:
Id Hash
----------- ----------------------------------
1 0x31F777F0804D301936411E3ECD760859
2 0xD64A593F3E9ACC972158D522A4289EA0
(Id is an identity column)
In to that table I want to merge the following dataset:
Hash
----------------------------------
0x31F777F0804D301936411E3ECD760859
0x31F777F0804D301936411E3ECD760859
0x0C5A65264F92A543E7AAA06375349C06
(Id is NOT present in the dataset)
The rules for the merge are:
- If the hash does not exist in the table, insert it to the table;
- If the hash does not exist in the dataset, delete it from the table;
- If the hash does exist on both sides, and there are X instances in the table and Y instances in the source, then (Y-X) instances should be inserted in to the table.
The result of the merge should leave the table looking like this:
Id Hash
----------- ----------------------------------
1 0x31F777F0804D301936411E3ECD760859
3 0x31F777F0804D301936411E3ECD760859
4 0x0C5A65264F92A543E7AAA06375349C06
What is the most efficient way to write a query to achieve this operation? FYI, other columns are omitted for brevity.
Best Answer
Using the sample data:
You could write it as a
MERGE
:db<>fiddle
But for performance reasons (as well as some bugs), I would normally write it as two separate statements:
db<>fiddle
You should have a unique index on the target on
([Hash], [Id])
. Chances are you have this already, or equivalently an index on[Hash]
and a unique (possibly clustered) index on[Id]
.There may be other issues depending on what else was omitted from the question for brevity. In any case it should provide a couple of possible starting points for your own solution.