I have a table in the database that I need to update based on the records being passed to it. I am given a table-valued-parameter that has the same schema as the existing table (for all intensive purposes). The requirements are:
-
If the TVP (source) contains a record with existing ID in the existing table (target) update that record, if it is not marked as deleted in the existing table and its modified date is more recent.
- Conditions:
source.Id = target.Id AND target.IsDeleted = 0
AND source.Modified > target.Modified
- Conditions:
-
If the TVP contains a record for which the ID does not exist in the existing table, insert the record to the table.
I have the following set up so far.
EDITED: This setup works to give the desired end result. I, however, don't believe this is the best way of getting the intended result. There are two select statements that calculate the source table for the Merge statement. I would like to avoid having to filter out the source table in this manner.
My question is this: is there a way to achieve the same result as the query below but without having to use two select statements as is currently used in the Merge statement?
IF OBJECT_ID('tempdb..#target') IS NOT NULL DROP TABLE #target
IF OBJECT_ID('tempdb..#source') IS NOT NULL DROP TABLE #source
CREATE TABLE #target ( Id int, Modified int, IsDeleted bit, NoteText varchar(500) )
INSERT #target VALUES (1, 1, 0, '')
INSERT #target VALUES (2, 1, 1, '')
INSERT #target VALUES (3, 3, 0, '')
CREATE TABLE #source ( Id int, Modified int, NoteText varchar(500) )
INSERT #source VALUES (1, 2, 'Modified')
INSERT #source VALUES (2, 2, 'Modified')
INSERT #source VALUES (3, 2, 'Modified')
INSERT #source VALUES (4, 2, 'Modified')
-- ** NEED HELP WITH OPTIMIZATION HERE **
merge #target as t
using (select * from #source s where s.id not in (select id from #target where IsDeleted = 1 OR Modified > s.Modified)) as source
on t.id=source.id
when matched then update set Modified=source.Modified, NoteText = source.NoteText
when not matched then insert values (id, Modified, 0 ,NoteText) ;
SELECT * FROM #target;
-- Expected records in #target after operations
-- 1,2,0,'Modified' ** UPDATED **
-- 2,1,1,'' ** NOT UPDATED BECAUSE ISDELETED = 1 **
-- 3,3,0,'' ** NOT UPDATED BECAUSE TARGET.MODIFIED > SOURCE.MODIFIED **
-- 4,2,0,'Modified' ** INSERTED **
drop table #target;
drop table #source
Best Answer
A simpler way to write the
MERGE
is as follows:The
SERIALIZABLE
hint would be necessary to avoid a potential race condition. This consideration obviously does not apply to a local temporary table target, but it would be remiss of me not to mention.In broad terms, it is almost always wrong to perform any sort of filtering in the
ON
clause (or equivalently, in aCTE
). TheON
clause should only be used to specify how rows between the source and target are related.See Inserting, Updating, and Deleting Data by Using Merge for more details.
The source and target tables should also have a unique index or key. Moreover, for best performance SQL Server should be able to guarantee that the
MERGE
statement will affect each target row a maximum of once. Without that, the execution plan will have to detect multiple DML operations on the same row in order to raise a runtime error if it happens. This often requires a Sort, and always requires a Segment, Sequence Project, and Assert. None of these will make your command execute any faster :)