SQL Server Merge – How to Update and Insert Rows in SQL Server 2008 R2

mergesql serversql-server-2008-r2upsert

I have a table in the database that I need to update based on the records being passed to it. I am given a table-valued-parameter that has the same schema as the existing table (for all intensive purposes). The requirements are:

If the TVP (source) contains a record with existing ID in the existing table (target) update that record, if it is not marked as deleted in the existing table and its modified date is more recent.
- Conditions: source.Id = target.Id AND target.IsDeleted = 0 AND source.Modified > target.Modified
If the TVP contains a record for which the ID does not exist in the existing table, insert the record to the table.

I have the following set up so far.

EDITED: This setup works to give the desired end result. I, however, don't believe this is the best way of getting the intended result. There are two select statements that calculate the source table for the Merge statement. I would like to avoid having to filter out the source table in this manner.

My question is this: is there a way to achieve the same result as the query below but without having to use two select statements as is currently used in the Merge statement?

IF OBJECT_ID('tempdb..#target') IS NOT NULL DROP TABLE #target    
IF OBJECT_ID('tempdb..#source') IS NOT NULL DROP TABLE #source

CREATE TABLE #target  ( Id int, Modified int, IsDeleted bit, NoteText varchar(500) )    
INSERT #target VALUES (1, 1, 0, '') 
INSERT #target VALUES (2, 1, 1, '') 
INSERT #target VALUES (3, 3, 0, '') 

CREATE TABLE #source ( Id int, Modified int, NoteText varchar(500) )     
INSERT #source VALUES (1, 2, 'Modified') 
INSERT #source VALUES (2, 2, 'Modified')
INSERT #source VALUES (3, 2, 'Modified')
INSERT #source VALUES (4, 2, 'Modified') 

-- ** NEED HELP WITH OPTIMIZATION HERE **
merge #target as t 
using (select * from #source s where s.id not in (select id from #target where IsDeleted = 1 OR Modified > s.Modified)) as source
on t.id=source.id 
when matched then update set Modified=source.Modified, NoteText = source.NoteText 
when not matched then insert values (id, Modified, 0 ,NoteText) ;

SELECT * FROM #target; 

-- Expected records in #target after operations
-- 1,2,0,'Modified'  ** UPDATED **         
-- 2,1,1,''          ** NOT UPDATED BECAUSE ISDELETED = 1 **
-- 3,3,0,''          ** NOT UPDATED BECAUSE TARGET.MODIFIED > SOURCE.MODIFIED **
-- 4,2,0,'Modified'  ** INSERTED **

drop table #target;
drop table #source

Best Answer

A simpler way to write the MERGE is as follows:

MERGE #target /* WITH (SERIALIZABLE) */ AS T
USING #source AS S
    ON S.Id = T.Id
WHEN MATCHED
    AND T.IsDeleted = 0
    AND S.Modified > T.Modified
    THEN UPDATE SET
        Modified = S.Modified,
        NoteText = S.NoteText
WHEN NOT MATCHED THEN
    INSERT (Id, Modified, IsDeleted, NoteText)
    VALUES (S.Id, S.Modified, 0, S.NoteText);

The SERIALIZABLE hint would be necessary to avoid a potential race condition. This consideration obviously does not apply to a local temporary table target, but it would be remiss of me not to mention.

In broad terms, it is almost always wrong to perform any sort of filtering in the ON clause (or equivalently, in a CTE). The ON clause should only be used to specify how rows between the source and target are related.

See Inserting, Updating, and Deleting Data by Using Merge for more details.

The source and target tables should also have a unique index or key. Moreover, for best performance SQL Server should be able to guarantee that the MERGE statement will affect each target row a maximum of once. Without that, the execution plan will have to detect multiple DML operations on the same row in order to raise a runtime error if it happens. This often requires a Sort, and always requires a Segment, Sequence Project, and Assert. None of these will make your command execute any faster :)

Related Solutions

Sql-server – SQL Server Storage sql_variant

The first 2 answers are from the SQL Server Internals Book p.278

0x38 is 56 in decimal. This indicates int in sys.types (system_type_id column)
0x01 is the version number of the sql_variant format (always 1 in SQL Server 2008)
This is the way the literal 1 is always interpreted in SQL Server. e.g. SELECT 1 AS foo INTO NewTable will create a new column of integer datatype. Use an explicit cast if you want it to be treated as a different datatype.
This is part of the product source code. You cannot view the definition.

BTW: If you are looking at other datatypes as well you may encounter some additional bytes between the version number and the column value as follows.

numeric/decimal have 1 byte each for precision and scale.
[n][var]char have 2 bytes for max length and 4 bytes for collation id.
[var]binary have 2 bytes for max length.

Sql-server – How to delete only related records in a multi-key MERGE in SQL Server

This is the separate DELETE operation I had in mind:

DELETE m
FROM dbo.Mapping AS m
WHERE EXISTS 
  (SELECT 1 FROM @Values WHERE LeftID = m.LeftID)
AND NOT EXISTS 
  (SELECT 1 FROM @Values WHERE LeftID = m.LeftID AND RightID = m.RightID);

As I outline here, for a left anti-semi join, the NOT EXISTS pattern will often outperform the LEFT JOIN / NULL pattern (but you should always test).

Not sure if your overall goal is clarity or performance, so only you can judge if this will work out better for your requirements than the NOT MATCHED BY source option. You'll have to look at the plans qualitatively, and the plans and/or runtime metrics quantitatively, to know for sure.

If you expect your MERGE command to protect you from race conditions that would happen with multiple independent statements, you better make sure that is true by changing it to:

MERGE dbo.Mapping WITH (HOLDLOCK) AS target

(From Dan Guzman's blog post.)

Personally, I would do all of this without MERGE, because there are unresolved bugs, among other reasons. And Paul White seems to recommend separate DML statements as well.

And here's why I added a schema prefix: you should always reference objects by schema, when creating, affecting, etc.

Best Answer

Related Solutions

Sql-server – SQL Server Storage sql_variant

Sql-server – How to delete only related records in a multi-key MERGE in SQL Server

Related Question