We have an application (on Amazon RDS MySQL 5.7; db.m5.24xlarge; 96 cores) that ingests a lot of data through parallel processors and queues. Most of it goes to one table, and each primary key is handled by only one queue. Entries are batched in transactions.
When we increase the number of processors and queues, there's a tipping point where all wait time is spent on wait/synch/mutex/innodb/lock_mutex, but I can't find what that means. It even consumes CPU cycles, so are these spinlock mutexes?
Performance insights:
I already disabled the deprecated query cache and the adaptive hash index, because both are mutex-governed and don't help our workload, which is heavy on INSERT ON DUPLICATE KEY UPDATE.
So, what is the resource governed by wait/synch/mutex/innodb/lock_mutex, and do I have control over it, either with settings or perhaps by using a lower transaction isolation level?
Edit: SHOW ENGINE INNODB STATUS shows:
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 10591578
--Thread 47139002062592 has waited at lock0lock.cc line 6415 for 0 seconds the semaphore:
Mutex at 0x2b2be1000058, Mutex LOCK_SYS created lock0lock.cc:454, lock var 1
wait has ended
--Thread 47138574509824 has waited at lock0lock.cc line 6415 for 0 seconds the semaphore:
Mutex at 0x2b2be1000058, Mutex LOCK_SYS created lock0lock.cc:454, lock var 1
wait has ended
--Thread 47136822810368 has waited at lock0lock.cc line 6342 for 0 seconds the semaphore:
Mutex at 0x2b2be1000058, Mutex LOCK_SYS created lock0lock.cc:454, lock var 1
wait has ended
--Thread 47137720436480 has waited at lock0lock.cc line 6342 for 0 seconds the semaphore:
Mutex at 0x2b2be1000058, Mutex LOCK_SYS created lock0lock.cc:454, lock var 1
and many more
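With this many entries it can help to tally the waits per mutex and source line rather than read them one by one. The following is a minimal sketch, assuming the SEMAPHORES section has been saved as text in the format shown above; the function name `summarize_semaphores` and the two regular expressions are illustrative and only cover the two line shapes that appear in this output.

```python
import re
from collections import Counter

# Matches lines such as:
# --Thread 47139002062592 has waited at lock0lock.cc line 6415 for 0 seconds the semaphore:
WAIT_RE = re.compile(r"--Thread (\d+) has waited at (\S+) line (\d+) for")
# Matches lines such as:
# Mutex at 0x2b2be1000058, Mutex LOCK_SYS created lock0lock.cc:454, lock var 1
MUTEX_RE = re.compile(r"Mutex at (0x[0-9a-f]+), Mutex (\S+) created")


def summarize_semaphores(status_text):
    """Count waits per (mutex name, source file, line number)."""
    counts = Counter()
    pending = None  # wait site from the most recent "--Thread" line
    for line in status_text.splitlines():
        m = WAIT_RE.search(line)
        if m:
            pending = (m.group(2), int(m.group(3)))
            continue
        m = MUTEX_RE.search(line)
        if m and pending is not None:
            counts[(m.group(2), pending[0], pending[1])] += 1
            pending = None
    return counts


# Two entries copied from the output above.
SAMPLE = """\
--Thread 47139002062592 has waited at lock0lock.cc line 6415 for 0 seconds the semaphore:
Mutex at 0x2b2be1000058, Mutex LOCK_SYS created lock0lock.cc:454, lock var 1
wait has ended
--Thread 47136822810368 has waited at lock0lock.cc line 6342 for 0 seconds the semaphore:
Mutex at 0x2b2be1000058, Mutex LOCK_SYS created lock0lock.cc:454, lock var 1
"""

for (mutex, src, line_no), n in summarize_semaphores(SAMPLE).most_common():
    print(f"{mutex} waited at {src}:{line_no} x{n}")
```

If nearly every entry names LOCK_SYS at a handful of lock0lock.cc lines, as it does here, the contention is on one mutex rather than spread across several.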
Edit: Also interesting is that the MySQL 8 changelog says:
InnoDB: To improve concurrency for operations that require access to
lock queues for table and row resources, the lock system mutex
(lock_sys->mutex) was replaced by sharded latches, and lock queues
were grouped into table and page lock queue shards, with each shard
protected by a dedicated mutex. Previously, the single lock system
mutex protected all lock queues, which was a point of contention on
high-concurrency systems. The new sharded implementation permits more
granular access to lock queues.
So am I affected by that? I'll just have to perform a 'quick' upgrade…
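To make the difference between one global mutex and sharded latches concrete, here is a toy sketch of the idea in the changelog. It is not InnoDB's implementation: the class name `ShardedLockSys`, the default shard count of 16, and hashing by page number are all assumptions chosen for illustration. With `n_shards=1` it behaves like the old design, where every caller serializes on the same mutex.

```python
import threading


class ShardedLockSys:
    """Toy lock-queue table: each shard of queues has its own mutex."""

    def __init__(self, n_shards=16):  # shard count is an arbitrary assumption
        self._mutexes = [threading.Lock() for _ in range(n_shards)]
        self._queues = [{} for _ in range(n_shards)]

    def shard_of(self, page_no):
        # Hypothetical mapping from a page to the shard that guards its queue.
        return hash(page_no) % len(self._mutexes)

    def enqueue(self, page_no, trx_id):
        i = self.shard_of(page_no)
        with self._mutexes[i]:  # only callers hitting shard i wait here
            self._queues[i].setdefault(page_no, []).append(trx_id)

    def queue(self, page_no):
        i = self.shard_of(page_no)
        with self._mutexes[i]:
            return list(self._queues[i].get(page_no, []))


def worker(lock_sys, page_no, n_requests):
    for trx_id in range(n_requests):
        lock_sys.enqueue(page_no, trx_id)


sharded = ShardedLockSys(n_shards=16)
threads = [threading.Thread(target=worker, args=(sharded, page, 100))
           for page in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Pages 0..7 map to 8 different shards, so these writers never share a mutex.
print(sorted({sharded.shard_of(page) for page in range(8)}))
```

Writers that touch different rows, as in the workload described above, mostly land in different shards and stop contending with each other, which is the effect the changelog describes.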
Best Answer
The sharded lock-sys was introduced in this commit.
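The commit messages are good and explain what lock-sys is (which was my original question). As the changelog quoted above puts it, a single lock system mutex protected all table and row lock queues before the change.
So this isn't anything you have control over as a user on 5.7, aside from reducing concurrency.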
A test run of the system on MySQL 8 shows a marked improvement:
On a 96-core MySQL server, 64 queue processors (64 processes) operate on the same table, but all on different rows (mostly updates). The mutex wait state is gone; what remains is CPU and SQL handling, which is good.