First Transaction
------------------------
LATEST DETECTED DEADLOCK
------------------------
130320 15:53:37
*** (1) TRANSACTION:
TRANSACTION 0 83395751, ACTIVE 35 sec, process no 26405, OS thread id 140507872417536 starting index read
mysql tables in use 3, locked 3
LOCK WAIT 3 lock struct(s), heap size 1216, 2 row lock(s)
MySQL thread id 1163191, query id 199629038 localhost sosci Updating
UPDATE `database`.`table` SET `invalidate`='2013-03-21 03:53:02' WHERE ((token='C7G8X3HABCDEFGH') AND (invalidate IS NULL)) AND (project=26118) LIMIT 1
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 65548 n bits 192 index `PRIMARY` of table `database`.`table` trx id 0 83395751 lock_mode X locks rec but not gap waiting
Record lock, heap no 4 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
0: len 4; hex 000c2591; asc % ;; 1: len 6; hex 000004e36ace; asc j ;; 2: len 7; hex 8000000a830110; asc ;; 3: len 4; hex 80000001; asc ;; 4: len 4; hex 80006606; asc f ;; 5: len 1; hex 07; asc ;; 6: len 16; hex 32455637363853485447444734584252; asc 2EV768SHTGDG4XBR;; 7: SQL NULL; 8: len 30; hex 3935363436362c656e672c616e6e612e63616d706f7265736940676d6169; asc 956466,eng,anna.camporesi@gmai;...(truncated); 9: SQL NULL; 10: len 8; hex 8000124ef477640e; asc N wd ;; 11: len 8; hex 8000124ef495e88e; asc N ;;
Second Transaction
*** (2) TRANSACTION:
TRANSACTION 0 83395676, ACTIVE 37 sec, process no 26405, OS thread id 140507856160512 fetching rows, thread declared inside InnoDB 451
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1216, 53 row lock(s), undo log entries 1
MySQL thread id 1163198, query id 199628885 localhost sosci updating
DELETE FROM `database`.`table` WHERE ((action="limit") AND (info='login') AND (creation < DATE_SUB(NOW(), INTERVAL 10 MINUTE)))
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 0 page no 65548 n bits 192 index `PRIMARY` of table `database`.`table` trx id 0 83395676 lock_mode X
Record lock, heap no 4 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
0: len 4; hex 000c2591; asc % ;; 1: len 6; hex 000004e36ace; asc j ;; 2: len 7; hex 8000000a830110; asc ;; 3: len 4; hex 80000001; asc ;; 4: len 4; hex 80006606; asc f ;; 5: len 1; hex 07; asc ;; 6: len 16; hex 32455637363853485447444734584252; asc 2EV768SHTGDG4XBR;; 7: SQL NULL; 8: len 30; hex 3935363436362c656e672c616e6e612e63616d706f7265736940676d6169; asc 956466,eng,anna.camporesi@gmai;...(truncated); 9: SQL NULL; 10: len 8; hex 8000124ef477640e; asc N wd ;; 11: len 8; hex 8000124ef495e88e; asc N ;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 0 page no 65548 n bits 192 index `PRIMARY` of table `database`.`table` trx id 0 83395676 lock_mode X waiting
Record lock, heap no 117 PHYSICAL RECORD: n_fields 12; compact format; info bits 0
0: len 4; hex 000c31d2; asc 1 ;; 1: len 6; hex 000004f884fc; asc ;; 2: len 7; hex 80000011040110; asc ;; 3: len 4; hex 80000001; asc ;; 4: SQL NULL; 5: len 1; hex 06; asc ;; 6: SQL NULL; 7: len 15; hex 3133322e3139392e3132312e313632; asc 132.199.121.162;; 8: len 5; hex 6c6f67696e; asc login;; 9: len 1; hex 81; asc ;; 10: len 8; hex 8000124ef49502aa; asc N ;; 11: SQL NULL;
*** WE ROLL BACK TRANSACTION (1)
What Do They Have in Common?
Each transaction is trying to get an exclusive lock on the same part of the PRIMARY KEY: page no 65548, n bits 192
RECORD LOCKS space id 0 page no 65548 n bits 192 index `PRIMARY` of table `database`.`table` trx id 0 83395751 lock_mode X locks rec but not gap waiting
RECORD LOCKS space id 0 page no 65548 n bits 192 index `PRIMARY` of table `database`.`table` trx id 0 83395676 lock_mode X
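To make the comparison of those two lock lines less error-prone, here is a minimal Python sketch that pulls the fields out of each line and checks whether both refer to the same space, page, and index. The regular expression and the field names (space, page, mode, and so on) are my own labels, written against the two lines quoted above; other MySQL versions may format these lines differently.

```python
import re

# Pattern written for the "RECORD LOCKS" lines quoted above (an assumption:
# other MySQL versions may print these lines in a different layout).
LOCK_RE = re.compile(
    r"RECORD LOCKS space id (?P<space>\d+) page no (?P<page>\d+) "
    r"n bits (?P<n_bits>\d+) index `(?P<index>[^`]+)` of table "
    r"(?P<table>\S+) trx id (?P<trx>\d+ \d+) lock_mode (?P<mode>.+)"
)


def parse_lock_line(line):
    """Return the fields of one RECORD LOCKS line as a dict."""
    match = LOCK_RE.search(line)
    if match is None:
        raise ValueError("not a RECORD LOCKS line: %r" % line)
    info = match.groupdict()
    # A trailing "waiting" means the lock was requested but not yet granted.
    info["waiting"] = info["mode"].endswith("waiting")
    return info


lines = [
    "RECORD LOCKS space id 0 page no 65548 n bits 192 index `PRIMARY` "
    "of table `database`.`table` trx id 0 83395751 "
    "lock_mode X locks rec but not gap waiting",
    "RECORD LOCKS space id 0 page no 65548 n bits 192 index `PRIMARY` "
    "of table `database`.`table` trx id 0 83395676 lock_mode X",
]
locks = [parse_lock_line(line) for line in lines]

# Both locks target the same page of the same index if this set has one entry.
same_page = len({(l["space"], l["page"], l["index"]) for l in locks}) == 1

for lock in locks:
    print(lock["trx"], lock["mode"], "(waiting)" if lock["waiting"] else "(held)")
print("same page:", same_page)
```

The first line is the lock transaction #1 is waiting for and the second is the lock transaction #2 holds, both on the same page of the PRIMARY index.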
With both transactions after the same lock, somebody had to roll back. The result: Transaction #1 was rolled back.
The query from Transaction #2 got to execute:
DELETE FROM `database`.`table` WHERE ((action="limit") AND (info='login') AND (creation < DATE_SUB(NOW(), INTERVAL 10 MINUTE)))
Read your error log. Somewhere in it should be the rollback message and an indication that transaction #1 was restarted.
You just asked
Why did this "situation" block the access to the table (or at least to some entries) for more than one minute? And why are the two transactions (dead-)locking each other - instead of running one after the other?
Look at the Locks Again. Each transaction is locking the PRIMARY KEY. The PRIMARY KEY lives in the Clustered Index (See my post Best of MyISAM and InnoDB for a description of the Clustered Index). Certain keys live together on the same space, page, and bit. Now look at the queries:
Transaction #1
UPDATE `database`.`table` SET `invalidate`='2013-03-21 03:53:02' WHERE ((token='C7G8X3HABCDEFGH') AND (invalidate IS NULL)) AND (project=26118) LIMIT 1
Transaction #2
DELETE FROM `database`.`table` WHERE ((action="limit") AND (info='login') AND (creation < DATE_SUB(NOW(), INTERVAL 10 MINUTE)))
It just so happens that the PRIMARY KEY entries for the associated rows are literally roommates. It is like asking two Olympic weightlifters to walk through the same door. What makes it worse is that the first query can be somewhat random in behavior. Why?
You are replacing invalidate (NULL at the moment you started to run the query) with an actual datetime value. If there are any secondary (non-unique) indexes that have invalidate as one of the columns, the corresponding PRIMARY KEY must be locked to stabilize the secondary key's updating. That takes place in ibdata1's insert buffer (See Insert Buffer in the Map of InnoDB).
Any change to a column that is a member of one or more indexes requires a lock of the PRIMARY KEY at the row level. Unfortunately, multiple row IDs coexist on the same page within the Clustered Index.
Transaction #1 had 2 locks. Transaction #2 had 53 locks. At a fateful moment, they wanted to lock the same row. Since InnoDB cannot perform "perfect world" row locking here, setting a longer timeout value for transactions is an option. Every DBMS has to resolve a deadlock by rolling back one of the participants. InnoDB tries to pick the smaller transaction as the victim, judged by the number of rows it has inserted, updated, or deleted, and here that was transaction #1. That's why transaction #2 was allowed to execute without full interruption.
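Because the victim of a deadlock is rolled back, the application normally has to be prepared to run that transaction again. The following is only a sketch of such a retry loop: DeadlockError is a hypothetical stand-in for whatever exception your MySQL driver raises for error 1213, and run_with_retry, max_attempts, and base_delay are names I made up for illustration.

```python
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver's deadlock error (MySQL error 1213)."""


def run_with_retry(txn, max_attempts=3, base_delay=0.05, sleep=time.sleep):
    """Call txn(); if it is rolled back as a deadlock victim, try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockError:
            if attempt == max_attempts:
                raise  # give up and let the caller see the deadlock
            # Wait a little longer each time, with jitter, so the two
            # competing transactions do not collide again in lockstep.
            sleep(base_delay * attempt + random.uniform(0, base_delay))


# Demo: a fake transaction that is chosen as the victim twice, then succeeds.
attempts = []


def flaky_update():
    attempts.append(1)
    if len(attempts) < 3:
        raise DeadlockError("Deadlock found when trying to get lock")
    return "committed"


result = run_with_retry(flaky_update, sleep=lambda seconds: None)
print(result, "after", len(attempts), "attempts")
```

In real code, txn would open the transaction, run the UPDATE, and commit, so that each retry starts from a clean state.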
Mystery Solved
You may do deletes in batches from some script:
1) mark account for deletion (seems you are doing it right now)
2) in the script - select one user of a marked account and delete it - since you expect many users per account, this deletes only a small portion of all messages and attachments, so it should be fast enough
3) repeat 2 with the remaining users of the given account, one small batch at a time, so other processes can get work done between the locks
4) delete the account - it has no users left, so this is fast
If deleting one user can still take too long, you can go one level down and delete batches of, say, 1000 messages, and then delete the account with all its users, which by then have no messages, so that is fast again.
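The batching idea can be sketched as follows. This is a self-contained illustration using Python's sqlite3 module so that it runs anywhere; the messages table, its account_id column, and the delete_in_batches helper are assumptions of mine, not your schema. In MySQL you would more likely use DELETE ... LIMIT directly instead of the subquery.

```python
import sqlite3


def delete_in_batches(conn, account_id, batch_size=1000):
    """Delete an account's messages in small, separately committed batches."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM messages WHERE id IN ("
            "  SELECT id FROM messages WHERE account_id = ? LIMIT ?)",
            (account_id, batch_size),
        )
        # Committing after every batch releases the locks, so other
        # processes can get work done between batches.
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total


# Demo with an in-memory database and a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, account_id INTEGER)")
conn.executemany(
    "INSERT INTO messages (account_id) VALUES (?)",
    [(1,)] * 2500 + [(2,)] * 10,
)
conn.commit()

deleted = delete_in_batches(conn, account_id=1)
remaining = conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print("deleted:", deleted, "remaining:", remaining)
```

Each batch is its own transaction, which is exactly what keeps the locks short and also what gives up the all-or-nothing behavior of a single large delete.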
The problem is that you lose ACID properties, because it all runs in multiple transactions: you cannot roll back after committing the first delete, and other queries running in between will see partial data for users (incomplete lists of messages or users, depending on the granularity of your deletes). But only you can say whether that is a real problem for you, or whether is_deleted=1 means that the account is not used anywhere anymore, so it is safe.
Best Answer
I don't know about MySQL but Postgres never locks the whole table when running DML statements - regardless of the number of affected rows.
Additionally writers never block readers, so even if you update all rows in a table, other transactions will still be able to SELECT from that table - obviously seeing the old values until your DML statement is committed.
Deleting rows from the table will also not block concurrent inserts (as long as there are no primary key conflicts). And inserting rows won't block deleting existing rows.
Again: the above is true for Postgres. I don't use MySQL, so I can't comment on that.