Sql-server – Will a MERGE between two tables ever be faster than a DROP and SELECT INTO on the second table

data synchronizationmergeperformancequery-performancesql serversql-server-2016

Is there a scenario where it's faster for synchronizing the data from TableA to TableB by using a MERGE query as opposed to dropping TableB and doing a SELECT * INTO TableB FROM TableA?

Best Answer

Will a MERGE between two tables ever be faster than a DROP and SELECT INTO on the second table?

I think, in general, I'd say probably not. But it depends on a lot of factors.

Bulk Loading (minimal logging)

SELECT...INTO will benefit from bulk load optimizations. The INSERT portion of the MERGE query might benefit from these optimizations, but the UPDATE portion won't. So the MERGE statement will likely result in more transaction log writes, making it slower.

DROP vs DELETE

SELECT...INTO will also benefit from the fact that DROP TABLE is a metadata operation, and thus will "remove" all the rows very quickly. The DELETE portion of the MERGE query will be logged normally, resulting in more transaction log writes, making it slower.

Concurrency

SELECT...INTO will benefit from reduced locking / blocking / concurrency issues, because it will have exclusive access to the target table. The MERGE statement will potentially have to deal with lock escalation, blocking by other processes, etc, unless you provide TABLOCK hints.

Index Maintenance

The MERGE statement also has to deal with index maintenance (you mentioned there is a clustered index and a nonclustered index on the target table in this scenario), whereas the SELECT...INTO does not. To mitigate this, you could potentially drop the indexes and recreate them after the MERGE has run.

Additional thoughts

All of this somewhat depends on what the breakdown (in terms of # of rows) is between INSERTs, UPDATEs, and DELETEs in the MERGE statement, but in general I would expect the SELECT...INTO case to be quicker unless the MERGE statement has been designed very carefully.

Some related information that might be interesting to you and this situation:


Sidebar: in a comment, you mentioned:

I'm asking because in my case I have to synchronize about 300 million records, and with my testing of only 5 million records, the SELECT INTO took about 30 seconds, the MERGE took about 4 minutes. :(

To compare "apples to apples" make sure to include the time it takes to create the two indexes you mentioned for the SELECT...INTO case:

In my specific case, there's actually no indexes on TableB at the time of the SELECT INTO, but a single unique clustered (PK) and a single non-clustered index is created on it on after the data is inserted. If a MERGE query was used instead, then both tables would have unique clustered indexes and non-clustered indexes on them.