Mysql – How to efficiently handle large amounts of insert statements

insert | MySQL | performance

I have a file containing a large number of insert statements, roughly 30 GB worth of them. The insert statements themselves are rather simple and contain 5 different values each. I now have the requirement to insert the data from these statements into a SQL database on Azure. Which tool would be best to insert the data easily and efficiently? This is a one-time thing, so the solution does not need to be elegant or repeatable. The key priority is getting it done with as little time spent on the solution as possible. Thanks for reading this.

Best Answer

If each line of the file is a well-formed SQL INSERT statement that lists the target columns and has a VALUES clause, you can simply submit it as input to the command line tool of your choice. It will be slow, but it will work. It would be better to split the one 30 GB file into many smaller files (perhaps about 100 MB each?) and have a script submit each in succession. Failures can then be recovered more easily, the server is less likely to be flooded, and the load can be halted and resumed as required; a rough sketch of such a script follows.
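As a minimal illustration, the Python sketch below reads the dump line by line, accumulates roughly 100 MB chunks, and pipes each chunk into the mysql command line client. The file name, server host, user, and database are placeholders, and credentials are assumed to come from an option file such as ~/.my.cnf so no password prompt appears for every chunk.

```python
# Sketch: split a large .sql file into ~100 MB chunks and feed each to the
# mysql CLI in sequence. Assumes one complete INSERT statement per line and
# placeholder connection details (host, user, database).
import subprocess

SOURCE = "inserts.sql"           # the 30 GB dump file (assumed path)
CHUNK_BYTES = 100 * 1024 * 1024  # ~100 MB per chunk

def run_chunk(lines, chunk_no):
    """Pipe one chunk of INSERT statements into the mysql client."""
    print(f"loading chunk {chunk_no} ({len(lines)} statements)")
    subprocess.run(
        ["mysql", "-h", "myserver.mysql.database.azure.com",
         "-u", "loaduser", "targetdb"],   # credentials via ~/.my.cnf (assumed)
        input="".join(lines).encode("utf-8"),
        check=True,                       # stop on the first failed chunk
    )

with open(SOURCE, encoding="utf-8") as f:
    buf, size, chunk_no = [], 0, 0
    for line in f:
        buf.append(line)
        size += len(line)
        if size >= CHUNK_BYTES:
            chunk_no += 1
            run_chunk(buf, chunk_no)
            buf, size = [], 0
    if buf:                               # flush the final partial chunk
        run_chunk(buf, chunk_no + 1)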

If the file can be converted to raw data, say CSV format, it can be loaded with whichever ETL or bulk load tool you have. Again, I'd split the source into multiple smaller files for easier management.
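If the statements really are one simple, single-row INSERT per line, a quick-and-dirty conversion to CSV might look like the sketch below. The regular expression assumes five plain scalar values per row and no commas inside string literals, so treat it as a starting point rather than a robust parser; the file names are placeholders.

```python
# Sketch: convert simple single-row "INSERT ... VALUES (...)" lines into CSV
# so a bulk loader or ETL tool can take over. Assumes one statement per line
# with plain scalar values; real data may need a proper SQL parser.
import csv
import re

VALUES_RE = re.compile(r"VALUES\s*\((.*)\)\s*;?\s*$", re.IGNORECASE)

with open("inserts.sql", encoding="utf-8") as src, \
     open("inserts.csv", "w", newline="", encoding="utf-8") as dst:
    writer = csv.writer(dst)
    for line in src:
        m = VALUES_RE.search(line)
        if not m:
            continue  # skip blank lines and comments
        # naive split on commas; quoted strings containing commas would break this
        fields = [f.strip().strip("'") for f in m.group(1).split(",")]
        writer.writerow(fields)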

Of course getting the source files as close as possible to the server will reduce run-time network latency.

Finally, disable all triggers, foreign keys, secondary indexes etc. on the target table. They can be rebuilt or rechecked after all the data is in place, which is faster overall than having them checked for each row as it is processed. Make sure the database has enough disk space for the new data and that the DBMS can use it. Figure out how you'll manage log growth as the data is loaded, and halt all other activity on the server while the load is in progress.
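For a MySQL target (as the question's tags suggest), the session-level switches that skip those per-row checks look roughly like this; the connection details are placeholders and the actual load loop is elided. Dropping and recreating triggers and secondary indexes still has to be done separately with the usual ALTER TABLE / DROP TRIGGER statements.

```python
# Sketch: session settings that skip per-row checks during the load, sent
# through a mysql-connector session (assumed driver; any client works).
import mysql.connector

conn = mysql.connector.connect(
    host="myserver.mysql.database.azure.com",  # placeholder host
    user="loaduser", password="...", database="targetdb",
)
cur = conn.cursor()
cur.execute("SET SESSION foreign_key_checks = 0")  # skip FK validation per row
cur.execute("SET SESSION unique_checks = 0")       # defer unique-index checks
cur.execute("SET SESSION autocommit = 0")          # commit per chunk, not per row

# ... run the INSERT chunks here, calling conn.commit() after each one ...

cur.execute("SET SESSION foreign_key_checks = 1")  # restore defaults afterwards
cur.execute("SET SESSION unique_checks = 1")
conn.commit()
cur.close()
conn.close()
```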