Mysql/innodb deadlock on simple delete query

deadlockinnodbMySQL

Mysql/innodb 8.0.16, read committed transactions, several statements in transaction raise deadlock on deletion of non intercepting rows. Trying to understand what is happening:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2019-05-14 21:57:44 0x7fe9546c6700
*** (1) TRANSACTION:
TRANSACTION 2852, ACTIVE 0 sec fetching rows
mysql tables in use 1, locked 1
LOCK WAIT 8 lock struct(s), heap size 1136, 14 row lock(s), undo log entries 25
MySQL thread id 146, OS thread handle 140640122267392, query id 1586 localhost 127.0.0.1 oc5z updating
DELETE FROM deal_product_rows_tmp WHERE batch_no=7533
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 11 page no 4 n bits 88 index PRIMARY of table `db1`.`deal_product_rows_tmp` trx id 2852 lock_mode X locks rec but not gap waiting
Record lock, heap no 14 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
 0: len 4; hex 8000650e; asc   e ;;
 1: len 6; hex 000000000b25; asc      %;;
 2: len 7; hex 81000001340110; asc     4  ;;
 3: len 4; hex 80001d6e; asc    n;;
 4: len 4; hex 8010a626; asc    &;;
 5: len 4; hex 8000253a; asc   %:;;
 6: len 4; hex 8000986e; asc    n;;
 7: len 4; hex 80000002; asc     ;;
 8: len 30; hex 415254455820d184d0bed180d0bcd18b20d0bfd180d18fd0bcd0bed183d0; asc WRX                         ; (total 81 bytes);
 9: len 8; hex 0000000000208c40; asc        @;;
 10: len 1; hex 4d; asc M;;

*** (2) TRANSACTION:
TRANSACTION 2853, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
8 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 12
MySQL thread id 147, OS thread handle 140640120497920, query id 1594 localhost 127.0.0.1 oc5z updating
DELETE FROM deal_product_rows_tmp WHERE batch_no=7534
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 11 page no 4 n bits 88 index PRIMARY of table `db1`.`deal_product_rows_tmp` trx id 2853 lock_mode X locks rec but not gap
Record lock, heap no 14 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
 0: len 4; hex 8000650e; asc   e ;;
 1: len 6; hex 000000000b25; asc      %;;
 2: len 7; hex 81000001340110; asc     4  ;;
 3: len 4; hex 80001d6e; asc    n;;
 4: len 4; hex 8010a626; asc    &;;
 5: len 4; hex 8000253a; asc   %:;;
 6: len 4; hex 8000986e; asc    n;;
 7: len 4; hex 80000002; asc     ;;
 8: len 30; hex 415254455820d184d0bed180d0bcd18b20d0bfd180d18fd0bcd0bed183d0; asc WRX                         ; (total 81 bytes);
 9: len 8; hex 0000000000208c40; asc        @;;
 10: len 1; hex 4d; asc M;;

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 11 page no 4 n bits 96 index PRIMARY of table `db1`.`deal_product_rows_tmp` trx id 2853 lock_mode X locks rec but not gap waiting
Record lock, heap no 2 PHYSICAL RECORD: n_fields 11; compact format; info bits 32
 0: len 4; hex 80006502; asc   e ;;
 1: len 6; hex 000000000b24; asc      $;;
 2: len 7; hex 0200000128012d; asc     ( -;;
 3: len 4; hex 80001d6d; asc    m;;
 4: len 4; hex 80109d40; asc    @;;
 5: len 4; hex 800023ba; asc   # ;;
 6: len 4; hex 800098e6; asc     ;;
 7: len 4; hex 80000007; asc     ;;
 8: len 9; hex 466f696c20676c7565; asc Foil glue;;
 9: len 8; hex 0000000000005940; asc       Y@;;
 10: len 1; hex 4d; asc M;;

*** WE ROLL BACK TRANSACTION (2)

Table structure:

    CREATE TABLE `deal_product_rows_tmp` (
    `batch_no` int(11) NOT NULL,
    `bitrix_id` int(11) NOT NULL,
    `deal_id` int(11) NOT NULL,
    `product_id` int(11) NOT NULL,
    `quantity` int(11) NOT NULL,
    `product_name` varchar(1000) NOT NULL,
    `price` double DEFAULT NULL,
    `status` varchar(50) NOT NULL,
    `tmp_id` int(11) NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (`tmp_id`),
    KEY `idx_deal_row_tmp_deal` (`deal_id`),
    KEY `idx_deal_row_tmp_batchno` (`batch_no`)
    ) ENGINE=InnoDB AUTO_INCREMENT=7500 DEFAULT CHARSET=utf8 

1) queries delete separated rows, how it is possible RECORD LOCKS on the same row?

2) even if the transactions applying lock on the same row, why deadlock? why first transaction can't wait until second release the lock ?

UPDATE1: App logic – spring boot app receiving rest call from crm system, fetching client order (deal) product rows and trying to sync the data in micro dwh. There are multiple statements in RC transaction, only one type of transaction implemented in this app. First statement REPLACE INTO deal table, this should prevent processing batches on the same deal_id in parallel (FK keys not used at the moment). Next the transaction inserting received product rows in "tmp" table and comparing on target "dwh" table. inserting missing deal products, updating status of "deleted" product rows. Next step deleting "tmp" data from deal_product_rows_tmp (point of deadlock) and final step – mark batch as completed (update batch set status=.. table where batch_no=? )

Best Answer

Normally people are confused about gap locking and the extra locking due to bad indexing on the whole table or more records than expected, but that is not your case:

  • You have the appropriate secondary index
  • The locks declare no gap locking is necessary (as expected)
  • The records locked/wanted are different
  • I cannot reproduce the issue unless I delete the idx_deal_row_tmp_batchno index
  • The records are on the same page, but they should not, in theory, affect each other

From the limited information, what I can guess is:

  • You are deleting several records in a batch (as one holds 3 row locks and the other 14)
  • You run into a race condition, where Transaction 1 is able to lock batch_no=7533 record, and Transaction 2 is able to lock batch_no=7534, but then they also want to update on separate queries (but the same transaction) the other records
  • a dependency is detected and InnoDB kills the most recent one to prevent an infinite loop (deadlock)

They cannot just wait, because otherwise it would be an infinite wait (as they depend on each other). If it was not a cycle, the second transaction would indeed wait up to innodb_lock_wait_timeout seconds.

There are several strategies you can do:

  • Monitor deadlocks, if they are not common, just do nothing (retry on the application). As long as errors don't happen frequently they won't be a huge penalty
  • Change your update strategy to prevent collisions (e.g. delete in a single, larger batch -although that may increase latency-, delete one row per transaction, make application coordinate for deletion, etc.
  • Use the new 8.0 features SKIP LOCKED and NOWAIT to ignore immediately locked rows, specially intersting for batch processes: https://mysqlserverteam.com/mysql-8-0-1-using-skip-locked-and-nowait-to-handle-hot-rows/