Mysql – Help the database isn’t performing fast enough! 100M Merge with 6M need < 1 hour!

innodbmyisamMySQLmysql-5.5

I have a server right now receiving more raw data files in 1 hour then I can upsert (insert -> merge) in an hour.

I have a table with 100M (rounded up) rows. Table is currently MyISAM. The table has 1000 columns mostly boolean and a few varchar.

Currently the fastest way i've found to get the information into my DB until now was:

Process raw data into CSV files.
Load Data In File to rawData Table.
Insert rawData table into Table1. (on dupe key do my function)
Truncate rawData
Repeat.
Worked fine until im merging 6M+ Rows into 100M rows and expecting it to take under an hour.

I got 16G of ram so I set my Key_Buffer_Pool to 6G.
I have my query cache pool to 16M
I have my query cache limit to 10M
I would just replace the information however it has to be an Upsert, Update the fields that are true if exists and insert if it does not.

Things im looking into atm; – Possibly switching server table to InnoDB? |-> Not sure about the performance, as the insert into an empty table is fine, its the merge that's slow.

Maybe allowing more table cache? Or even Query Cache?
mysql sql mysqli innodb myisam

Merge Code:

b.3_InMarket = (b.3_InMarket OR r.3_InMarket),

To compare my 2 bool columns.

Update

  • Ok I set Raid0
  • Changed my query to Lock Write on tables when inserting
  • When importing csv im disabling keys then re-enabling them before upsert.
  • Changed concurrent_insert to 2

Best Answer

If your table has a table with more than 1000 columns, it cannot be converted to InnoDB. In that case, run this query

SELECT CEILING(SUM(index_length)/POWER(1024,2)) num
FROM information_schema.tables WHERE engine='MyISAM';

This will give you the correct size for key_buffer_size in MB.

Since you are doing an UPSERT, you should set concurrent_insert to 2 to make INSERTs go faster. You may want to consider changing the table's row format to Fixed. I wrote about why to do both in StackOverflow. In essence, if you make the table's row format Fixed, all table rows are the same size. Thus, INSERTs and UPDATEs would operate on the exact same length of data. Management of row access is far more reasonable.

Since MyISAM only caches indexes (in the key buffer), all data must be read from disk. anything you can do to getting better RAID performance (as asked by @TomTom) would help your cause as well.