I need to sync a large table ~500 millions rows without a primary key between SQL Server and MySQL. The table has only a clustered composite non-unique index.
I do have a ODBC connection between the servers, but an import of ~8 million rows took around 45 minutes, so I believe a larger single import would be unreasonable as interruptions may occur at any point. I can't change the existing table structure, I can add other tables. After further reading, offset / fetch is not an option for large tables. "Select … where x between … and … " is not an option as I don't have an unique key.
How can I export the table in batches that are guaranteed to contain all rows?
My problem is that since the clustered key is not unique, ordering after it would not guarantee the physical rows have the same order between consecutive queries and ordering after all columns would take too long. And how would you recommend to migrate the batches, through ODBC or CSV files?
Best Answer
Assuming you don't have updates or deletes against source table you can try the following :
1. Make a copy of existing table using CTAS syntax (for SQLServer it's
SELECT * into source_table_copy FROM source_table
). Such operation is very fast even for huge tables.2. Add
after insert
trigger onsource_table
that copies new record[s] tosource_table_copy
.3. Now when all new records in
source_table
go tosource_table_copy
as well, and you can move data from copied table to Mysql in batches. For instance, if you have a link between 2 servers, everything can be done within the body of TSQL stored procedure.E.g. a piece of code that move up to 20 records to new server might look like
It's also possible to use CURSOR to read data and then delete with
where current of
clause.**Ideally you need to prevent applications from inserting data into
source_table
during step 1. If it's absolutely impossible , I'll go with anafter insert
trigger which is added right before step 1 and removed right after it's done which copies data to some other table I can later merge withsource_table_copy
.