MySQL Remote Insert – How to Insert 3.5 Million Records


I need to remotely insert 3.5 million rows into my Amazon RDS instance. Currently my data is in an Access database, and I'm using Bullzip's Access To MySQL tool to create dump files. I have created my schema on the RDS instance and defined the data types before the insert (because when you auto-upload through the app, it defines most columns as varchar(255)).

So far I have used the following methods when uploading to a test server while building my application:

  1. Direct transfer using Bullzip's app. I left it to run overnight once, and 1.7 million rows had been inserted after about 10 hours.

  2. Splitting the file into numerous SQL dumps (300k rows each at most, split by date). This is also incredibly slow (over an hour so far for 200k rows).

  3. Exporting to CSV and uploading through MySQL Workbench (also taking a very long time).

I believe my issue is that Bullzip's application produces an SQL file that performs a separate INSERT statement for every row, rather than multi-row INSERT queries.

Is there a way to produce an SQL file in a multi-row INSERT format?

Or do I just have to put up with it because I'm exporting from Access?

I have also read elsewhere that you can only add 50k rows in a multi-row INSERT. So is what I'm asking even suitable for me?
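For reference, a multi-row dump can be generated straight from a CSV export with a short script. This is a minimal sketch, not Bullzip functionality; the file names, table name, and batch size are assumptions, and it quotes every value as a string rather than respecting column types:

```python
import csv

def csv_to_multirow_sql(csv_path, sql_path, table, batch_size=1000):
    """Turn a CSV export into multi-row INSERT statements.

    batch_size rows per INSERT keeps each statement comfortably under
    max_allowed_packet; MySQL's real limit is the packet size in bytes,
    not a fixed row count.
    """
    with open(csv_path, newline="") as src, open(sql_path, "w") as dst:
        reader = csv.reader(src)
        batch = []
        for row in reader:
            # NULL for empty fields, quoted string otherwise. A real
            # script should format values according to the column types.
            values = ", ".join(
                "NULL" if v == "" else "'" + v.replace("'", "''") + "'"
                for v in row
            )
            batch.append("(%s)" % values)
            if len(batch) == batch_size:
                dst.write("INSERT INTO `%s` VALUES\n%s;\n"
                          % (table, ",\n".join(batch)))
                batch = []
        if batch:
            dst.write("INSERT INTO `%s` VALUES\n%s;\n"
                      % (table, ",\n".join(batch)))
```

The resulting file can be piped into the mysql client as usual; tune batch_size so each statement stays under the server's max_allowed_packet.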

This is the SHOW CREATE TABLE:

CREATE TABLE `1_txns` (
  `txn_id` varchar(32) DEFAULT NULL,
  `nationality_id` int(3) DEFAULT NULL,
  `transaction date` int(8) DEFAULT NULL,
  `Yr` int(4) DEFAULT NULL,
  `mth` int(2) DEFAULT NULL,
  `dy` int(2) DEFAULT NULL,
  `hr` int(2) DEFAULT NULL,
  `Min` int(2) DEFAULT NULL,
  `Flight No` varchar(12) DEFAULT NULL,
  `sku` varchar(20) DEFAULT NULL,
  `sales` decimal(8,2) DEFAULT NULL,
  `units` int(8) DEFAULT NULL,
  KEY `txnid` (`txn_id`),
  KEY `skutxnid` (`sku`,`txn_id`),
  KEY `skunatid` (`sku`,`nationality_id`,`txn_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Best Answer

It would help to diagnose where the performance bottleneck is in this case. Can you monitor network throughput / CPU usage / disk usage to see which one spikes while uploading?

  • If it's a network bandwidth problem, make sure you're using compression in the server/client protocol; almost any client supports this, and MySQL Workbench certainly does.
  • Most probably it's not related to the fact that the data comes from Access.
  • Having your own test MySQL server inside your network would be useful for two things: importing the data there to see whether the network is the problem; and importing it there and then exporting it from there, forcing a multi-row INSERT export format (mysqldump produces extended inserts by default).
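Alternatively, the single-row dump can be post-processed into the multi-row format without a second server. A rough sketch, assuming Bullzip's output has one `INSERT INTO ... VALUES (...);` statement per line (the regex and batch size are assumptions to adapt to the actual dump):

```python
import re

# Matches one single-row statement: "INSERT INTO `t` VALUES (...);"
STMT = re.compile(r"^(INSERT INTO .*? VALUES )(\(.*\));\s*$")

def merge_inserts(lines, batch_size=1000):
    """Merge consecutive single-row INSERTs sharing the same
    "INSERT INTO ... VALUES " prefix into multi-row statements of at
    most batch_size rows each, passing other lines through unchanged."""
    prefix, rows = None, []
    for line in lines:
        m = STMT.match(line)
        if m and m.group(1) == prefix and len(rows) < batch_size:
            rows.append(m.group(2))
            continue
        if rows:
            yield prefix + ",\n".join(rows) + ";\n"
        if m:
            prefix, rows = m.group(1), [m.group(2)]
        else:
            prefix, rows = None, []
            yield line  # keep non-INSERT lines (SET statements, comments)
    if rows:
        yield prefix + ",\n".join(rows) + ";\n"
```

Streaming the dump file through this generator and writing the result out should cut millions of statements down to a few thousand, which is usually where most of the time goes.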