MySQL commits taking time

amazon-rdsMySQLmysql-5.7performanceslow-log

We are facing an issue, where at certain times there are number of commits that takes a more than a second. The time interval for this is not fixed. Our application sends thousands of commits but at some instance the commits with exact same timestamp gets slow.

This is extracted from MySQL slow query logs
enter image description here

There isn't anyother query in the slow query log at the same time. Any pointers on how it can be further investigated?

We are using MySQL 5.7.16 on Amazon RDS.

Best Answer

Since your commits are stalling, it's pretty safe to assume you're running with innodb_flush_log_at_trx_commit=1 (see #4 here), so one obvious change you could do is switch to innodb_flush_log_at_trx_commit=0, however that may not be the right choice for you - see the previous link.

In any case, COMMITs don't usualy stall due to other activity on the server (like queries), unless that other activity locks some kind of mutex for far too long.

If I had to guess without any further evidence, I would say that your writes are so intensive, that MySQL is unable to flush fast enough using adaptive checkpointing, so it exceeds 75% of the log capacity and starts sharp checkpointing (which blocks server activity for a short period).

There are few quick ways to make MySQL handle writes better (besides one already mentioned above):

  1. Increase innodb log files (innodb_log_file_size) - you can go up to 4GB or so for your logs (esimating based on zero information given on the amount of writes),
  2. Increase pIOPS capacity for the data drive
  3. Increase innodb_io_capacity (up to ~70-80% of pIOPS capacity)

Otherwise it would be really useful to see disk activity from AWS, amount of data server currently writes (you can calculate it using Log sequence number delta between two consecutive show engine innodb status\G runs)

There are other ways too, but for that, I'd need a lot more details on the actual workload.