MySQL cycling between periods of high I/O and high lock contention

innodblockingMySQL

Starting about 2 days ago, my CloudSQL DB started having this periodic cycle of high latency. The time between these latency bursts vary, from 30 minutes to 50 minutes.

I finally set up a PMM instance today to get a better look at things. I see that the periods of high latency coincides with high active thread count and a lot of lock activity. CloudSql metrics show that these high latency periods also correlate with a reduction in the buffer pool dirty page count

Our workfloads are pretty consistent, certainly we have nothing running that's cyclic with a varying period. Changing some MySQL configs (io threads and lru scan depth) seem to have some impact on this period.

I uploaded some PMM screenshots here, of I/O, locks, and flushing over the last couple of hours. The pattern is pretty apparent.

Data I/O
enter image description here

Locks
enter image description here

Flushing
enter image description here

What is going on? I assume there is something going on under the hood of MySQL operating normally while using up some resource, then going through this period of high latency while trying to free up more of that resource to be able to operate normally again, but I can't piece together exactly what this is.

For more context: this started happening when we added a couple columns (unindexed) to a table. Since then we've reversed that ALTER TABLE and restarted the DB a couple of times to apply config changes. I'm interested both in finding a root cause and in applying a short term band aid to decrease the effect of these spikes.

Thanks!

Best Answer

Point 1 : Locking Management

You must check which processes are staying in locking state & what you can do it for locking i.e creating index for faster retrieval ... You can check deadlock in show engine innodb status

Point 2: Redo Log Buffer & Files

Make sure the redo log buffer & files are big enough to capture many transaction.

Point 3: Flusing Method

Set innodb_flush_method=O_DIRECT for speedup with write operation.

Still this is not enough, you need to check Innodb buffer pool,OS level parameters(iostat,vmstat...)