primary - percona mysql deadlock



Why am I getting deadlock in MySQL (1)

I'm getting deadlock in my MySQL table. Only a single table is involved, and I can consistently reproduce it. It only happens when I have multiple threads running the code.

Here is the table:

CREATE TABLE `users_roles` (
  `role_id` bigint(20) NOT NULL,
  `user_id` bigint(20) NOT NULL,
  `created` datetime NOT NULL,
  PRIMARY KEY (`user_id`,`role_id`),
  KEY `created` (`created`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Then, I run these 2 queries in each thread, with each thread having a different value for user_id.

BEGIN;
DELETE FROM `users_roles` WHERE user_id = X;
INSERT INTO `users_roles` VALUES (7, X, NOW()); -- DEADLOCK ON THIS QUERY
COMMIT;

It should be noted that user_id X never exists in the database when the DELETE statement is called. The bit of code that runs these queries is being used to create a new user. The function, however, allows me to modify which account a user is on, and as such, deletes the existing roles from the old user's team.

So, when enough of these queries are run in parallel, I start getting deadlocks. The deadlock portion of the InnoDB status shows this after every deadlock.

------------------------
LATEST DETECTED DEADLOCK
------------------------
2014-05-09 16:02:20 7fbc99e5f700

*** (1) TRANSACTION:
TRANSACTION 6241424274, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 6 lock struct(s), heap size 1248, 3 row lock(s), undo log entries 6
MySQL thread id 3772090, OS thread handle 0x7fbc1f451700, query id 4010665755 10.0.141.36 1403_users update

*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1331 page no 10665 n bits 192 index `PRIMARY` of table `users_data`.`users_roles` trx id 6241424274 lock_mode X insert intention waiting

*** (2) TRANSACTION:
TRANSACTION 6241424275, ACTIVE 0 sec inserting
mysql tables in use 1, locked 1
6 lock struct(s), heap size 1248, 3 row lock(s), undo log entries 6
MySQL thread id 3770297, OS thread handle 0x7fbc99e5f700, query id 4010665767 10.0.137.28 1403_users update
INSERT INTO users_roles(role_id, user_id, created) values(5, 102228093, NOW()) ON DUPLICATE KEY UPDATE user_id=user_id

*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 1331 page no 10665 n bits 192 index `PRIMARY` of table `users_data`.`users_roles` trx id 6241424275 lock_mode X

*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 1331 page no 10665 n bits 192 index `PRIMARY` of table `users_data`.`users_roles` trx id 6241424275 lock_mode X insert intention waiting

*** WE ROLL BACK TRANSACTION (2)

As for debugging, or experimenting trying to find where the issue actually is, I've been able to get rid of all the deadlocks by removing the DELETE statements from the code. While this does fix the issue, I'd like to understand it.

What I do understand is the way MySQL handles gap locks. I understand that they are at work in this issue, because the rows don't exist when I do a DELETE statement. What I don't understand is why both transactions in the innodb status are generated from the same code, but only one of them, transaction (2) has an exclusive lock. It's as if transaction (1) doesn't even attempt to get that exclusive lock (without insert intention).

Assuming the locks are correct, I can understand why deadlock ensues: transaction (2) gets the exclusive lock, transaction (1) requests insert intention, and then transaction (2) requests insert intention. This makes sense. What doesn't make sense is the absence of an exclusive lock (without insert intention) in transaction (1).


Edit:

I was able to reproduce this with a particular order of commands.

Here is the table:

CREATE TABLE `a` (
  `id` tinyint(3) unsigned NOT NULL,
  `b` int(11) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

Here are the queries. Open up 4 terminals to mysql and perform the queries in this order.

session 1: BEGIN;
session 2: BEGIN;
session 3: BEGIN;
session 4: BEGIN;
session 1: DELETE FROM `a` WHERE `id` = 5;
session 2: DELETE FROM `a` WHERE `id` = 10;
session 3: DELETE FROM `a` WHERE `id` = 7;
session 4: DELETE FROM `a` WHERE `id` = 12;
session 1: INSERT INTO `a` VALUES (5, 1);
session 2: INSERT INTO `a` VALUES (10, 1); -- deadlock here
session 3: INSERT INTO `a` VALUES (7, 1); -- deadlock here
session 4: INSERT INTO `a` VALUES (12, 1); -- deadlock here

Here is the InnoDB status immediately preceding any of the inserts.

------------
TRANSACTIONS
------------
Trx id counter 11396965
Purge done for trx's n:o < 11396913 undo n:o < 0 state: running but idle
History list length 1248
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 11396962, ACTIVE 9 sec
2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 3425, OS thread handle 0x7fcd14197700, query id 29686 localhost dev cleaning up
TABLE LOCK table `matthew`.`a` trx id 11396962 lock mode IX
RECORD LOCKS space id 7291 page no 3 n bits 80 index `PRIMARY` of table `matthew`.`a` trx id 11396962 lock_mode X
---TRANSACTION 11396961, ACTIVE 10 sec
2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 3426, OS thread handle 0x7fccda225700, query id 29673 localhost dev cleaning up
TABLE LOCK table `matthew`.`a` trx id 11396961 lock mode IX
RECORD LOCKS space id 7291 page no 3 n bits 80 index `PRIMARY` of table `matthew`.`a` trx id 11396961 lock_mode X
---TRANSACTION 11396960, ACTIVE 11 sec
2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 3391, OS thread handle 0x7fccd4d7f700, query id 29672 localhost dev cleaning up
TABLE LOCK table `matthew`.`a` trx id 11396960 lock mode IX
RECORD LOCKS space id 7291 page no 3 n bits 80 index `PRIMARY` of table `matthew`.`a` trx id 11396960 lock_mode X
---TRANSACTION 11396959, ACTIVE 13 sec
2 lock struct(s), heap size 376, 1 row lock(s)
MySQL thread id 3392, OS thread handle 0x7fccd4bf9700, query id 29671 localhost dev cleaning up
TABLE LOCK table `matthew`.`a` trx id 11396959 lock mode IX
RECORD LOCKS space id 7291 page no 3 n bits 80 index `PRIMARY` of table `matthew`.`a` trx id 11396959 lock_mode X

After the first insert has been called, the terminal for session 1 hangs on a lock. The InnoDB status shows:

---TRANSACTION 11396959, ACTIVE 841 sec inserting
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 376, 2 row lock(s)
MySQL thread id 3392, OS thread handle 0x7fccd4bf9700, query id 30234 localhost dev update
------- TRX HAS BEEN WAITING 32 SEC FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 7291 page no 3 n bits 80 index `PRIMARY` of table `matthew`.`a` trx id 11396959 lock_mode X insert intention waiting
------------------
TABLE LOCK table `matthew`.`a` trx id 11396959 lock mode IX
RECORD LOCKS space id 7291 page no 3 n bits 80 index `PRIMARY` of table `matthew`.`a` trx id 11396959 lock_mode X
RECORD LOCKS space id 7291 page no 3 n bits 80 index `PRIMARY` of table `matthew`.`a` trx id 11396959 lock_mode X insert intention waiting

I believe I have found the issue.

From http://dev.mysql.com/doc/refman/5.6/en/innodb-locks-set.html:

Prior to inserting the row, a type of gap lock called an insertion intention gap lock is set. This lock signals the intent to insert in such a way that multiple transactions inserting into the same index gap need not wait for each other if they are not inserting at the same position within the gap. Suppose that there are index records with values of 4 and 7. Separate transactions that attempt to insert values of 5 and 6 each lock the gap between 4 and 7 with insert intention locks prior to obtaining the exclusive lock on the inserted row, but do not block each other because the rows are nonconflicting.

If a duplicate-key error occurs, a shared lock on the duplicate index record is set. This use of a shared lock can result in deadlock should there be multiple sessions trying to insert the same row if another session already has an exclusive lock. This can occur if another session deletes the row.

Here's what that means for my case.

The deletes, because they affected no rows, all got a shared lock (mode IX) on the end-of-table gap. Once the insert was executed, the shared lock was still held by all threads, and the insert intention waited for the release of this shared lock.

The solution is to not do the following in parallel:

  1. Delete the rows you want to insert, when the rows aren't there.
  2. Insert the rows

So, the InnoDB engine status was just wrong. It failed to show that each of the transactions held the same lock. It failed to show that each lock was lock_mode IX, not X. It failed to show that each thread also had an insert intention lock waiting to be granted. Altogether, it was a pretty spectacular failure of the SHOW ENGINE INNODB STATUS;.





database-deadlocks