Sql-server – Efficient way to copy/sync a table – ideas on different approaches

sql servert-sql

We have an ISV database which we try to minimize contact with. It is an old 2008 R2 DB. We want to keep a copy of a number of tables in a different database. We are looking for a T-SQL solution to sync tables. This may need to take place when the system is being used.

I appreciate the answer will require further research and testing.

These tables are prone to locking from ISV app.
This is a low powered server and not really Enterprise.
Also we think the number of tables that we sync will increase
We want avoid anything like replication (the supplier of the ISV has habit of blaming any issues on us with no investigation whatsoever!)

These are my options (the ones I've thought of)

I've read bad things about MERGE in terms of performance, bearing in mind it's an older db, would this be fair?
tablediff.exe would be perfect but I'm concerned about locking the source table.
Is a straight truncate/insert with minimal locking/logging very inefficient – sounds stupid but I thought I'd include it.

Would just truncating and inserting a couple

Best Answer

You have many options, depending on what you want to achieve and the constraints you have on the source/copy database:

Merge Replication: can be bi-directional. Can be filtered and implement complex JOIN filters. The subscriber database is writable. Needs to have a rowguidcol on each table (uniqueidentifier UNIQUE NOT NULL ROWGUIDCOL column), so it can be a deal breaker if you can't touch the schema of the source database. Ships incremental changes quite efficiently.
Snapshot replication: uni-directional. Can be filtered. The subscriber database is writable. No schema changes needed. Doesn't ship incremental changes: it's all or nothing.
Transactional Replication: can be bi-directional (but updateable subscriptions is deprecated and needs an additional column on each updateable article. Peer-to-peer is enterprise-only, so I'm mentioning it so that you know it exists, but it won't work for you). Can have static filters. Every table needs to have a PRIMARY KEY defined, which is not typically the case in ugly, hairy, legacy third-party applications.
Mirroring: the database at the mirror server can be read from a snapshot, which is an Enterprise features, unless you use SQL Server 2016 SP1 or later, where snapshots have been included in all editions. Wouldn't work for HA, just for scale-out. Uni-directional, no filters, no schema changes needed. The snapshot is a static copy of the data at a given time, so you need to take snapshots at different times to see new data. This means you have to kill connections to older snapshots before you create the new one.
Log Shipping: can be used to obtain a read-only copy of the primary database by using RESTORE WITH STANDBY. Needs to kill all connections before restoring additional t-logs. No filters, no schema changes needed. Interferes with existing transaction log backup schedules.
Custom sync process: you could combine different techniques to build your own sync process, but I really advise against that. The complexities are countless and it's hard to outsmart existing scale-out or data distribution tecnhnologies built in the product.

Related Solutions

Sql-server – Approaches for Real time copy of database on same SQL Server instance

Transactional replication gets you close, but as you said, changing the subscriber schema is a recipe for disaster.

Transactional replication uses stored procedures to apply the data changes. You could, with every table change, change those procedures to deal with the underlying changes. That is a lot of additional manual work, but I don't think you will find a solution without that. Those procedures are created automatically when calling sp_addarticle (See value 0x02 for the @schema_option parameter and the @ins_cmd, @upd_cmd and @del_cmd parameters.)

However, you might be better of to take a copy at the beginning of the development cycle and then not apply new changes till the end of the cycle. That is a simple backup-restore with the added benefit that you do not have to constantly adjust testing scripts do deal with the changing data.

SQL Server – Alternative Approaches to Primary Record in Child Table

The other design option that came in my mind is to use CustomerID column in child table like Addresses and Contacts so that you will be making entry first in Customer table and that CustomerID will be referred in other table.

As suggested in comment by @a_horse_with_no_name, you can create Unique Index as

CREATE UNIQUE INDEX UniquePrimaryAddresss ON Addresses(CustomerID) WHERE IsPrimary = 1

Similar indexes on Contacts table as well

Best Answer

Related Solutions

Sql-server – Approaches for Real time copy of database on same SQL Server instance

SQL Server – Alternative Approaches to Primary Record in Child Table

Related Question