Mysql – Stale table lock in AWS Aurora

auroraawsinnodbMySQLtransaction

We have an AWS Aurora (MySQL engine, r3.large) we are using to test a migration.

We have noticed a couple of queries that simply don't appear to run. When I run

show engine innodb status;

it reports that there is a single table locked:

2017-06-21 18:20:49 2aea8928a700 Transaction:
TRANSACTION 6093969, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
16 lock struct(s), heap size 376, 6164 row lock(s), undo log entries 1
MySQL thread id 12219, OS thread handle 0x2aea8928a700, query id 8842803 10.127.0.24 root updating
delete from <tablename> where id=<id>
Foreign key constraint fails for table `mydb`.`<otherTable>`:

This is a java application. I have stopped the tomcat instances, and verified that there are no active connections, but the lock does not clear up.

I looked at a couple of tables in information_schema (INNODB_LOCKS, INNODB_LOCK_WAITS, and INNODB_TRX) but there are 0 records.

Is this an aurora problem? Any solution other than AWS Support?

Best Answer

I can't tell if this involves Aurora, but here is something worth noting: There are occasions when InnoDB will leave lock lying around. How does this happen ?

About six years ago, I answer this question : Are InnoDB Deadlocks exclusive to INSERT/UPDATE/DELETE?. In that post I mentioned the following from the MySQL Documentation:

When InnoDB performs a complete rollback of a transaction, all locks set by the transaction are released. However, if just a single SQL statement is rolled back as a result of an error, some of the locks set by the statement may be preserved. This happens because InnoDB stores row locks in a format such that it cannot know afterward which lock was set by which statement.

This is still in the MySQL 5.7 Docs on Deadlock Detection and Rollback (paragraph 3)

You stated in the question

I looked at a couple of tables in information_schema (INNODB_LOCKS, INNODB_LOCK_WAITS, and INNODB_TRX) but there are 0 records.

In my answer to the 6-year-old question, I showed the query of this tabled used by a Percona Nagios Check plugin:

SELECT COALESCE(MAX(IF(p.command = 'Sleep', p.time, 0)), 0) AS idle_in_trx
FROM INFORMATION_SCHEMA.INNODB_LOCK_WAITS AS w
INNER JOIN INFORMATION_SCHEMA.INNODB_TRX        AS b ON  b.trx_id = w.blocking_trx_id
INNER JOIN INFORMATION_SCHEMA.INNODB_TRX        AS r ON  r.trx_id = w.requesting_trx_id
LEFT JOIN  INFORMATION_SCHEMA.PROCESSLIST       AS p ON  p.id     = b.trx_mysql_thread_id;

I tweaked that query to display the Process ID of the Blocking Transaction, the Process ID of the Requesting Transactions blocked and their respective values from INFORMATION_SCHEMA.PROCESSLIST:

SELECT COALESCE(MAX(IF(pb.command='Sleep',pb.time, 0)), 0) idle_in_trx,
pb.ID BLK_ID,pb.USER BLK_USER,pb.HOST BLK_HOST,pb.DB BLK_DB,
pb.COMMAND BLK_CMD,pb.TIME BLK_TIME,pb.STATE BLK_STATE,pb.INFO BLK_INFO,
pr.ID REQ_ID,pr.USER REQ_USER,pr.HOST REQ_HOST,pr.DB REQ_DB,
pr.COMMAND REQ_CMD,pr.TIME REQ_TIME,pr.STATE REQ_STATE,pr.INFO REQ_INFO,w.*
FROM       INFORMATION_SCHEMA.INNODB_LOCK_WAITS AS w
INNER JOIN INFORMATION_SCHEMA.INNODB_TRX        AS b  ON b.trx_id = w.blocking_trx_id
INNER JOIN INFORMATION_SCHEMA.INNODB_TRX        AS r  ON r.trx_id = w.requesting_trx_id
LEFT JOIN  INFORMATION_SCHEMA.PROCESSLIST       AS pb ON pb.id    = b.trx_mysql_thread_id
LEFT JOIN  INFORMATION_SCHEMA.PROCESSLIST       AS pr ON pr.id    = r.trx_mysql_thread_id
\G

If you run this, you will see one of two things:

  1. The Process ID causing this log jam of locks
  2. Everything coming up NULL (No transactions are blocked). This indicates that the lock is just stuck in LA-LA-LAND (Aurora is at fault). In that event, reboot RDS.