I need to remotely insert 3.5 million rows into my Amazon RDS MySQL instance. The data currently lives in an Access database, which I will export with Bullzip's Access To MySQL tool to create dump files. I have already created the schema on the RDS instance and defined the data types before the insert (because when you let the tool upload automatically it defines most columns as varchar(255)).
So far I have used the following methods when uploading to a test server, whilst building my application:
- Direct transfer using Bullzip's app. I left it to run overnight once, and 1.7 million rows had been inserted after about 10 hours.
- Split the file into numerous SQL dumps (300k rows each at most, split by date), which is also incredibly slow (over an hour so far for 200k rows).
- Export to CSV and upload through MySQL Workbench (also taking a hell of a long time).
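For reference, if the CSV route is the one I keep, I understand that a single LOAD DATA LOCAL INFILE statement is generally much faster than a row-by-row import from Workbench; a rough sketch against my table would be something like the following (the file path, delimiter, and header row are assumptions, and local_infile has to be enabled):

LOAD DATA LOCAL INFILE '/path/to/1_txns.csv'   -- hypothetical path to the CSV export
INTO TABLE `1_txns`
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES                                 -- skip the header row, if the export has one
(`txn_id`, `nationality_id`, `transaction date`, `Yr`, `mth`, `dy`, `hr`, `Min`,
 `Flight No`, `sku`, `sales`, `units`);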
I believe my issue is that Bullzip's application produces an SQL file that performs a separate INSERT statement for every row, rather than multi-row INSERT queries.
Is there a way to produce an SQL file in a multi-row insert format?
Or do I just have to put up with it because I'm exporting from Access?
I have also read elsewhere that you can only add 50k rows to a multi-value insert, so is what I'm asking even suitable for me?
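For context, the difference between the two formats for the table shown below would look something like this (the values and the column subset are made up for illustration):

-- Single-row format (what the dump apparently contains now):
INSERT INTO `1_txns` (`txn_id`, `sku`, `sales`, `units`) VALUES ('T0001', 'SKU123', 19.99, 2);
INSERT INTO `1_txns` (`txn_id`, `sku`, `sales`, `units`) VALUES ('T0002', 'SKU456', 5.50, 1);

-- Multi-row (extended insert) format, batching many rows per statement:
INSERT INTO `1_txns` (`txn_id`, `sku`, `sales`, `units`) VALUES
('T0001', 'SKU123', 19.99, 2),
('T0002', 'SKU456', 5.50, 1),
('T0003', 'SKU789', 42.00, 3);

(As far as I can tell, the practical limit on a multi-value insert is the max_allowed_packet size rather than a fixed number of rows.)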
This is the SHOW CREATE TABLE:
CREATE TABLE `1_txns` (
`txn_id` varchar(32) DEFAULT NULL,
`nationality_id` int(3) DEFAULT NULL,
`transaction date` int(8) DEFAULT NULL,
`Yr` int(4) DEFAULT NULL,
`mth` int(2) DEFAULT NULL,
`dy` int(2) DEFAULT NULL,
`hr` int(2) DEFAULT NULL,
`Min` int(2) DEFAULT NULL,
`Flight No` varchar(12) DEFAULT NULL,
`sku` varchar(20) DEFAULT NULL,
`sales` decimal(8,2) DEFAULT NULL,
`units` int(8) DEFAULT NULL,
KEY `txnid` (`txn_id`),
KEY `skutxnid` (`sku`,`txn_id`),
KEY `skunatid` (`sku`,`nationality_id`,`txn_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Best Answer
It would help to diagnose where the performance bottleneck is in this case. Can you monitor the network throughput, CPU usage, and disk usage to see which one is highest while uploading?
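Since RDS doesn't give shell access to the server itself, one rough way to watch the server side (besides CloudWatch for CPU and disk) is to sample a few status counters before and after a batch and compare the deltas; the particular counters here are only a suggestion:

-- Network traffic received by the server and InnoDB write activity so far
SHOW GLOBAL STATUS LIKE 'Bytes_received';
SHOW GLOBAL STATUS LIKE 'Innodb_data_written';
-- What the import connection is doing right now (e.g. waiting in "query end")
SHOW FULL PROCESSLIST;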