mysql死锁 DeadLock定位与解决

mysql死锁问题定位与解决java后台报错如下java.util.concurrent.ExecutionException: org.springframework.dao.DeadlockLoserDataAccessException:### Error updating database. Cause: com.mysql.jdbc.exceptions.jdbc4.MySQ...

深山猿

26209人浏览 · 2019-01-24 19:41:33

深山猿 · 2019-01-24 19:41:33 发布

很多死锁场景参考：https://blog.csdn.net/tr1912/article/details/81668423

mysql死锁问题定位与解决

java后台报错如下
java.util.concurrent.ExecutionException: org.springframework.dao.DeadlockLoserDataAccessException:
### Error updating database. Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
### The error may involve com.ruisitech.bi.mapper.bireport.RhMapper.updateRhShowInfo-Inline
### The error occurred while setting parameters
### SQL: update dau_result set num= ? where time=? and condition=?
### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction

从mysql中查看具体发生死锁的语句

mysql中执行语句：show engine innodb status;
关注的信息如下：
------------------------
LATEST DETECTED DEADLOCK
------------------------
2019-01-24 17:03:26 7efcb18aa700
*** (1) TRANSACTION:
TRANSACTION 1748980943, ACTIVE 0 sec fetching rows
mysql tables in use 3, locked 3
LOCK WAIT 10 lock struct(s), heap size 2936, 121 row lock(s)
MySQL thread id 12653229, OS thread handle 0x7efef2fd6700, query id 1708475432 172.16.27.230 boss_ddl Searching rows for update
update dau_result
       set num= 68
       where time='20181001' and condition='332'
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 8959 page no 21 n bits 352 index `PRIMARY` of table `boss_bi`.`dau_result` trx id 1748980943 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 1748980942, ACTIVE 0 sec fetching rows, thread declared inside InnoDB 4997
mysql tables in use 3, locked 3
12 lock struct(s), heap size 2936, 115 row lock(s)
MySQL thread id 12652735, OS thread handle 0x7efcb18aa700, query id 1708475433 172.16.27.230 boss_ddl Searching rows for update
update dau_result
       set num= 244618
       where time='20181002' and condition='333'
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 8959 page no 21 n bits 352 index `PRIMARY` of table `boss_bi`.`dau_result` trx id 1748980942 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 8959 page no 21 n bits 352 index `PRIMARY` of table `boss_bi`.`dau_result` trx id 1748980942 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (1)
------------

现在将问题抽象出来，假设建表语句如下：
CREATE TABLE `dau_result` (
`id` int(11) NOT NULL AUTO_INCREMENT COMMENT '自增主键',
`time` varchar(16) NOT NULL DEFAULT '' COMMENT '时间，格式如1001',
`condition` int(11) NOT NULL DEFAULT '-1' COMMENT '条件id',
`num` int(11) NOT NULL DEFAULT '0' COMMENT '的数量',
PRIMARY KEY (`id`) USING BTREE,
KEY `idx_time` (`time`) USING BTREE,
KEY `idx_condition` (`condition`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=24609 DEFAULT CHARSET=utf8mb4 COMMENT='日活量查询结果表';

表中的数据有如下6条（1,20181001,332,10）,（2,20181001,333,10）（3,20181002,332,10）,（4,20181002,333,10）,（5,20181003,332,10）,（6,20181003,333,10)

发生死锁的两个sql如下：
sqlA：update dau_result set num= 244618 where time='20181002' and condition='333'
sqlB：update dau_result set num= 68 where time='20181001' and condition='332'

首先确定的是两个sql最终的定位不是同一条记录，怎么发生死锁了呢？其实即使是定位到同一条记录也只是会阻塞而已，并不会导致死锁，下面分析导致死锁的原因。
explain update dau_result set num= 68 where time='20181002' and condition='333' ;
结果：
key:idx_query_condition,idx_time
extra:Using intersect(idx_condition,idx_time); Using where; Using temporary

单个sql会同时走idx_condition和idx_time，但是两个sql的time和condition都不同，但是上面explain的结果中，key同时使用了idx_condition,idx_time两个索引，extra中也包含了Using intersect(idx_condition,idx_time)，所以mysql做了优化），查询符合条件的记录时，实际使用了下面两个个sql并取交集：

sqlA1：select * from dau_result where time='20181002' for update;//此时对应上锁的id范围是id=3和id=4
sqlA2：select * from dau_result where condition='333' for update ;//此时对应上锁的id范围是id=2和id=4和id=6

注意虽然虽然sqlA1和sqlA2都对id=4上互斥锁，但是不会有阻塞问题，因为sqlA1和sqlA2在同一个事务中。取交集后定位id=4的进行修改，sql如下

sqlA3：update *** where id = 4 （time='20181002' and condition='333'语句对应的id=4）

说明：上面需要有mysql锁相关知识，可以参考：

https://blog.csdn.net/h2604396739/article/details/86518943和https://blog.csdn.net/h2604396739/article/details/81483051

如果对sqlB也做如上分析，则sqlB1上锁范围为：id=1和id=2

sqlB2上锁范围，id=1，id=3和id=5

那多线程中整体执行流程就可能如下图：

	session1（sqlA）	session2（sqlB）
t1	sqlA1：id=3和id=4上锁
t2		sqlB1：id=1和id=2上锁
t3	sqlA2：id=2和id=4和id=6上锁（id=2导致blocked）
t4		sqlB2：id=1，id=3和id=5（id=3导致blocked）
t5	deadlock	deadlock
t6	sqlA rollback，并且释放id=3的锁
t7		sqlB获取锁并执行

导致死锁原因：

被访问到的记录上的是互斥锁；并且因为分别走了两个索引导致上锁范围扩大。

解决方案
1修改为共享锁：定位实际修改记录过程中被访问记录使用共享锁，先按照条件查询对应的id（因为select上的是共享锁S，不会导致死锁），然后再用id来更新语句，因为id是唯一主键，两条sql会对应不同的主键，互斥锁也会加在不同的id上，不会导致死锁。
即update dau_result set num= 68, update_time=now() where time='20181001' and query_condition='332' 变更为下面两条sql:
select id from dau_result where time='20181001' and condition='332'';#此时仅仅有共享锁，不会死多
update dau_result set num= 68 where id=1 ;#只对一条记录上锁，一定不会死锁，至多阻塞

2减小上锁范围：强制使用索引idx_time或idx_condition,减小上锁的范围。以idx_time为例此时对应sql如下

update dau_result FORCE index(idx_time) set num= 68 where time='20181001' and query_condition='332' ;

update dau_result FORCE index(idx_time) set num= 244618 where time='20181002' and query_condition='333' ;

执行时序如下图，不会存在锁冲突的问题。

	session1（sqlA）	session2（sqlB）
t1	sqlA1：id=3和id=4上锁
t2		sqlB1：id=1和id=2上锁
t3	sqlA 执行	sqlB执行

3如果idx_condition索引是冗余的，则可以直接删除，这样就一定会使用索引idx_time，本质和方案二一致，也能解决问题

方案一修改起来比较费力，而且会成倍增加查询次数；方案三可能每个索引也确实有用，所以有时方案三不可行；方案二的问题是将指定索引（FORCE index(idx_time)）写在了代码(mapper.xml)里，如果索引idx_time被删除了，那么记得一定同步修改代码里面的force index 。这里推荐方案二。

--------------------------------------- add on 2020-05-26
异常信息：
org.springframework.dao.DeadlockLoserDataAccessException:
### Error updating database. Cause: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
### The error may exist in com/tal/coding/platform/mapper/StatsStuRankMapper.java (best guess)
### The error may involve com.tal.coding.platform.mapper.StatsStuRankMapper.insertBatch-Inline
### The error occurred while setting parameters
### SQL: REPLACE INTO stats_stu_rank (id, lesson_id, class_id, cuc_num, res_score_one, res_score_two, res_score_three, res_score_four ) VALUES (?, ?,?, ?, ?, ?, ?, ?) , (?, ?,?, ?, ?, ?, ?, ?) , (?, ?,?, , (?, ?,?, ?, ?, ?, ?, ?) , (?, ?,?, ?, ?, ?, ?, ?) , (?, ?,?, ?, ?, ?, ?, ?) , (?, ?,?, ?, ?, ?, ?, ?) , (?, ?,?, ?, ?, ?, ?, ?)
### Cause: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
; Deadlock found when trying to get lock; try restarting transaction; nested exception is com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:271)

认识：首先有了唯一索引，才可以使用replace into执行，其执行过程如下：

直接执行插入，如果成功则结束；如果唯一索引已经存在，会检测到DB_DUPLICATE_KEY，此时会回滚，删除和重新插入记录，并且对记录上排他锁。

此时如果再有别的线程操作改行记录，则会报上面的死锁异常。

replace into执行过程详解：

https://www.cnblogs.com/flyingeagle/articles/10145784.html

所以本质是操作唯一索引导致的死锁
abc三线程操作同一条记录（实际delete可能是）
a 获取锁，删除记录，未提交
b 等待行锁
a 提交事务
b 获取到行锁
c 获取到行 x waiting锁
b 发现数据被更新了，需要restart获取锁，尝试获取行x waiting锁
至此死锁
怎么定位到的是死锁？？？？？？
我们是怎么解决的死锁？
你觉得怎么解决？ redis分布式锁
    @Transactional(rollbackFor = Exception.class)
    public Pair<Boolean, String> batchUnPublishBundle(@NonNull BatchUnpublishBundleDto dto) {
        // 针对某个老师收题的行为加锁
        final String requestId = UUID.randomUUID().toString();
        final String lockName = "Locker:BATCH-UNPUBLISH-BUNDLE-" +
                dto.getCourseId() + "-" + dto.getLessonId() + "-" + dto.getBundleId();
        boolean success = LockUtil.tryLock(lockName, requestId, 10);
        log.debug("批量收题加锁[{}], {}", lockName, success ? "成功" : "失败");
        if (!success) {
            return Pair.of(Boolean.FALSE, "用户操作过于频繁！");
        }
        .......
    }

一门面向 Data 和 AI 的低代码、云原生的开源编程语言

无需安装部署，在线快速体验 Byzer

更多推荐

编译时权限控制

前言权限控制，对于 MLSQL 而言的重要程度可以说是生命线。 MLSQL 需要面对各式各样的资源访问，比如 MySQL, Oracle,HDFS，Hive，Kafka，Sorl，ElasticSearch，Redis，API，Web等等，不同用户对这些数据源（以及表，列）的权限是不一样的。传统模式是，每个用户都需要有个 proxy user，然后到每个数据源里面给这个 proxy user

Byzer 白泽

Byzer 支持 JDBC 聚合下推

聚合下推PR链接聚合下推我们知道 Byzer 支持多数据源和联邦查询，可以方便分析师在一个平台上快速的分析来自多种数据源的数据，从而进行灵活的探索式分析。使用场景和优势当前 Byzer 中加载数据的方式会拉取明细数据到 spark 中进行聚合计算，对于小数据量或者在分布式文件系统上的数据源来说是常规操作。但是对于 JDBC 数据源或者有分析能力的 OLAP 系统来说拉取明细数据可能就不是最

Byzer 白泽

Byzer 术语表

Byzer-lang Byzer ，又称为 Byzer-lang，一门面向 Data 和 AI 的低代码、云原生的开源编程语言。 Byzer 是一门结合了声明式编程和命令式编程的混合编程语言，其低代码且类 SQL 的编程逻辑配合内置算法及插件的加持，能帮助数据工作者们高效打通数据链路，完成数据的清洗转换，并快速地进行机器学习相关的训练及预测。 Byzer 希望能够提供一套语言、一个引擎，就能覆盖整