MySQL Write performance bottleneck

bulk-insertinnodbMySQLmysql-5.5performance

We are trying to diagnose a MySQL Bottleneck, specifically for data writes. Basically, we have a single process that picks files from a folder and load into a MySQL table X using LOAD DATA CONCURRENT INFILE. Currently it appears that we are producing more files than that can be consumed by MySQL. Specifically, we are producing around a million rows a minute (all files combined) and our ingestion rate is at 15K rows per second (900,000 write)(based on Show Engine INNODB Status).

The table X has 31 columns of which 17 are Integers, 12 Floats and 1 Char (10), fairly straightforward. It also has an auto-increment integer column and a primary key on timestamp (epoch) and the auto-incr. column.

MySQL Instance Info (Physical host): 40 Core, 256 GB RAM, 8 SSD as a single logical drive (RAID 0) We are using both, InnoDb and MYIASM but the table we are trying to optimize writes for resides in Innodb.

Current resource utilization: 10% CPU, 50% system memory (allocated), Low Disk utilization

MySQL Config: table_cache = 1000
thread_cache = 60
max_heap_table_size = 8192M
join_buffer_size = 256K
sort_buffer_size = 256K
query_cache_size = 512M
query_cache_limit = 512M
key_buffer_size = 8000M 
bulk_insert_buffer_size = 2000M
myisam_sort_buffer_size = 8000M
tmp_table_size = 8192M 
myisam_repair_threads  = 5
innodb_open_files = 10000
open_files_limit = 10000
concurrent_insert = 2

#innodb
innodb_flush_log_at_trx_commit = 2
innodb_support_xa = 0
innodb_buffer_pool_size = 128G
innodb_buffer_pool_instances =8
innodb_flush_method=O_DIRECT
innodb_log_file_size=1G
innodb_log_buffer_size=256M
innodb_thread_concurrency=8
innodb_file_per_table innodb_stats_on_metadata = 0
innodb_io_capacity=10000

We turned on Performance schema and following are the culprit events sorted by total wait time. The first set has non-INNODB event and 2nd set is for INNODB events.

mysql> SELECT EVENT_NAME, SUM_TIMER_WAIT FROM events_waits_summary_global_by_event_name where event_name not like '%innodb%' ORDER BY SUM_TIMER_WAIT DESC LIMIT 20;
+-------------------------------------------------------+-----------------+
| EVENT_NAME                                            | SUM_TIMER_WAIT  |
+-------------------------------------------------------+-----------------+
| wait/synch/cond/sql/MDL_context::COND_wait_status     | 247003180269200 |
| wait/synch/mutex/mysys/KEY_CACHE::cache_lock          | 225136157662400 |
| wait/io/file/myisam/dfile                             | 112586106594800 |
| wait/synch/cond/mysys/my_thread_var::suspend          | 36942967073600  |
| wait/io/file/sql/load                                 | 28575680068400  |
| wait/synch/mutex/sql/HA_DATA_PARTITION::LOCK_auto_inc | 15671510821200  |
| wait/synch/rwlock/myisam/MYISAM_SHARE::key_root_lock  | 7580580832000   |
| wait/io/file/sql/global_ddl_log                       | 759675521600    |
| wait/io/file/myisam/kfile                             | 478606412800    |
| wait/synch/mutex/sql/LOCK_open                        | 317343598000    |
| wait/io/file/sql/FRM                                  | 281702406000    |
| wait/synch/mutex/myisam/MYISAM_SHARE::intern_lock     | 237524998400    |
| wait/synch/mutex/mysys/THR_LOCK::mutex                | 148067737600    |
| wait/synch/cond/myisam/MI_SORT_INFO::cond             | 53355489600     |
| wait/synch/mutex/sql/TABLE_SHARE::LOCK_ha_data        | 40690076800     |
| wait/io/file/sql/slow_log                             | 32342937200     |
| wait/io/file/sql/partition                            | 32000921600     |
| wait/synch/cond/mysys/IO_CACHE_SHARE::cond            | 20865078800     |
| wait/synch/mutex/mysys/THR_LOCK_open                  | 18582940800     |
| wait/synch/mutex/sql/THD::LOCK_thd_data               | 15040836800     |
+-------------------------------------------------------+-----------------+




mysql> SELECT EVENT_NAME, SUM_TIMER_WAIT FROM events_waits_summary_global_by_event_name where event_name like '%innodb%' ORDER BY SUM_TIMER_WAIT DESC LIMIT 20;
+-----------------------------------------------+-----------------+
| EVENT_NAME                                    | SUM_TIMER_WAIT  |
+-----------------------------------------------+-----------------+
| wait/io/file/innodb/innodb_data_file          | 161894926356400 |
| wait/synch/mutex/innodb/log_sys_mutex         | 78901379729200  |
| wait/io/file/innodb/innodb_log_file           | 54154967108000  |
| wait/synch/mutex/innodb/log_flush_order_mutex | 31350108798800  |
| wait/synch/rwlock/innodb/btr_search_latch     | 29658726116800  |
| wait/synch/mutex/innodb/buf_pool_mutex        | 26124486124000  |
| wait/synch/mutex/innodb/kernel_mutex          | 9542359283600   |
| wait/synch/mutex/innodb/trx_undo_mutex        | 4636908475600   |
| wait/synch/mutex/innodb/autoinc_mutex         | 2176569546400   |
| wait/synch/mutex/innodb/fil_system_mutex      | 1580851583600   |
| wait/synch/mutex/innodb/dict_sys_mutex        | 764196921200    |
| wait/synch/mutex/innodb/flush_list_mutex      | 493146643200    |
| wait/synch/mutex/innodb/ibuf_mutex            | 415302790000    |
| wait/synch/rwlock/innodb/dict_table_stats     | 346864396800    |
| wait/synch/mutex/innodb/mutex_list_mutex      | 341627459200    |
| wait/synch/rwlock/innodb/checkpoint_lock      | 263200847600    |
| wait/synch/mutex/innodb/rw_lock_list_mutex    | 247598360000    |
| wait/synch/mutex/innodb/trx_doublewrite_mutex | 84526368000     |
| wait/synch/mutex/innodb/rseg_mutex            | 76199049600     |
| wait/synch/mutex/innodb/ibuf_bitmap_mutex     | 12117140800     |
+-----------------------------------------------+-----------------+

I would really appreciate if someone could point us in the right direction.

Also, we have been tweaking the innodb variables such as innodb_concurrency_tickets, innodb_write_io_threads, innodb_flash_log_at_trx_commit, etc but are yet to observe a major shift in performance.

Best Answer

Step 1: Turn OFF the Query cache. Every write causes purging of all entries in the QC for that table !

query_cache_type = OFF
query_cache_size = 0

Please provide SHOW CREATE TABLE. Getting rid of unnecessary indexes will improve INSERT performance, perhaps significantly. It sounds like you have

PRIMARY KEY(ts, id)
UNIQUE(id)

It would be more efficient to have just

PRIMARY KEY(ts, id)
INDEX(id)

Yes, UNIQUE is not required for AUTO_INCREMENT (assuming you don't deliberately try to insert dup ids).

Are all the columns as small as practical? That is, are you blindly using INT when, for example, SMALLINT UNSIGNED would work? Smaller --> less I/O --> faster. (And you appear to be I/O-bound.)

If you are using FusionIO, you can safely turn off the double write buffer.

Have you considered using multiple connections to insert multiple datasets at the 'same' time?

Are you doing any processing (eg, normalization) of the data as part of the ingestion?