MySQL – Fastest way to do a bulk import in MySQL

Tags: import, MySQL, performance

I have a question about the fastest way to copy a large number of records (around 10 million) from one table to another in MySQL.

The target table has 1 primary key, 4 indexes and 3 foreign key constraints. The source table has the same primary key, 3 identical indexes, and no foreign keys.
Before copying I can truncate the target table, so no duplicate key checking is needed (the source table has the same primary key, after all).

On my local development machine I have experimented with insert into, insert ignore into, insert into … on duplicate key update, and replace into; the variants are sketched below. The fastest result so far has been around 75 minutes (replace into and insert ignore into performed equally). Is there a faster way to do this?
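For reference, the statements look like this (a minimal sketch; source, target, and the column name col are placeholders):

truncate table target;

insert into target select * from source;

insert ignore into target select * from source;  -- silently skips duplicate-key rows

replace into target select * from source;        -- deletes and re-inserts on key conflict

-- col stands in for whichever column(s) should be refreshed on conflict
insert into target select * from source
  on duplicate key update col = values(col);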

Best Answer

(Seems like I'm able to post a detailed answer here today... let me know if this has helped you.) Here is some test output for you...

mysql> truncate table t1;
Query OK, 0 rows affected (0.07 sec)

mysql> insert into t1 select * from t;
Query OK, 3842688 rows affected (36.19 sec)
Records: 3842688  Duplicates: 0  Warnings: 0

mysql>
mysql> \! rm /tmp/t.txt
mysql> select *  from t into outfile '/tmp/t.txt';
Query OK, 3842688 rows affected (7.24 sec)

mysql> load data infile '/tmp/t.txt' into table t1;
Query OK, 3842688 rows affected (25.20 sec)
Records: 3842688  Deleted: 0  Skipped: 0  Warnings: 0

With larger records and data size you might see an even bigger difference.

If those are MyISAM tables then you might want to review the following notes from the documentation:

To ignore foreign key constraints during the load operation, issue a SET foreign_key_checks = 0 statement before executing LOAD DATA.
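In practice that would wrap the load like this (a sketch around the same load as above; the checks are re-enabled afterwards):

set foreign_key_checks = 0;  -- skip foreign key validation for this session
load data infile '/tmp/t.txt' into table t1;
set foreign_key_checks = 1;  -- restore the default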

If you use LOAD DATA INFILE on an empty MyISAM table, all nonunique indexes are created in a separate batch (as for REPAIR TABLE). Normally, this makes LOAD DATA INFILE much faster when you have many indexes. In some extreme cases, you can create the indexes even faster by turning them off with ALTER TABLE ... DISABLE KEYS before loading the file into the table and using ALTER TABLE ... ENABLE KEYS to re-create the indexes after loading the file.
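For an empty MyISAM table that sequence would look roughly like this (note that DISABLE KEYS only affects nonunique indexes; unique indexes and the primary key are still maintained during the load):

alter table t1 disable keys;   -- stop maintaining nonunique indexes
load data infile '/tmp/t.txt' into table t1;
alter table t1 enable keys;    -- rebuild the indexes in one batch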

Also, loading into/from MEMORY tables is much faster, but based on the size you've mentioned I suspect you won't have sufficient memory (you might see table-full errors and issues due to that...)!
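If you still want to try it, the shape would be something like this (a sketch; t_mem is a placeholder name, and keep in mind the MEMORY engine does not support BLOB/TEXT columns):

set max_heap_table_size = 1024 * 1024 * 1024;  -- 1 GB for this session; too small gives "table is full"
create table t_mem engine=memory select * from t;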

Anyway, try the following commands and share the speed results:

select * from t into outfile '/tmp/t.txt';

load data infile '/tmp/t.txt' into table t1;

(I'd guess under 60 minutes, but no magic...)
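Putting it together for your case (truncate the target, dump, then load with foreign key checks off), the full run might look like this; a sketch with placeholder names source and target:

select * from source into outfile '/tmp/dump.txt';
-- note: depending on the server's secure_file_priv setting, outfiles may be restricted to a specific directory

set foreign_key_checks = 0;   -- your target has 3 FK constraints; skip checking them during the load
truncate table target;
load data infile '/tmp/dump.txt' into table target;
set foreign_key_checks = 1;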