很多死锁场景参考:https://blog.csdn.net/tr1912/article/details/81668423

mysql死锁问题定位与解决

java后台报错如下
java.util.concurrent.ExecutionException: org.springframework.dao.DeadlockLoserDataAccessException: 
### Error updating database.  Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
### The error may involve com.ruisitech.bi.mapper.bireport.RhMapper.updateRhShowInfo-Inline
### The error occurred while setting parameters
### SQL: update dau_result   set  num= ?  where time=? and condition=? 
### Cause: com.mysql.jdbc.exceptions.jdbc4.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction

从mysql中查看具体发生死锁的语句

mysql中执行语句:show engine innodb status;
关注的信息如下:
------------------------
LATEST DETECTED DEADLOCK
------------------------
2019-01-24 17:03:26 7efcb18aa700
*** (1) TRANSACTION:
TRANSACTION 1748980943, ACTIVE 0 sec fetching rows
mysql tables in use 3, locked 3
LOCK WAIT 10 lock struct(s), heap size 2936, 121 row lock(s)
MySQL thread id 12653229, OS thread handle 0x7efef2fd6700, query id 1708475432 172.16.27.230 boss_ddl Searching rows for update
update dau_result
        set  num= 68
        where time='20181001' and condition='332' 
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 8959 page no 21 n bits 352 index `PRIMARY` of table `boss_bi`.`dau_result` trx id 1748980943 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 1748980942, ACTIVE 0 sec fetching rows, thread declared inside InnoDB 4997
mysql tables in use 3, locked 3
12 lock struct(s), heap size 2936, 115 row lock(s)
MySQL thread id 12652735, OS thread handle 0x7efcb18aa700, query id 1708475433 172.16.27.230 boss_ddl Searching rows for update
update dau_result
        set  num= 244618
        where time='20181002' and condition='333'
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 8959 page no 21 n bits 352 index `PRIMARY` of table `boss_bi`.`dau_result` trx id 1748980942 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 8959 page no 21 n bits 352 index `PRIMARY` of table `boss_bi`.`dau_result` trx id 1748980942 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (1)
------------

现在将问题抽象出来,假设建表语句如下:
CREATE TABLE `dau_result` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT '自增主键',
  `time` varchar(16) NOT NULL DEFAULT '' COMMENT '时间,格式如1001',
  `condition` int(11) NOT NULL DEFAULT '-1' COMMENT '条件id',
  `num` int(11) NOT NULL DEFAULT '0' COMMENT '的数量',
  PRIMARY KEY (`id`) USING BTREE,
  KEY `idx_time` (`time`) USING BTREE,
  KEY `idx_condition` (`condition`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=24609 DEFAULT CHARSET=utf8mb4 COMMENT='日活量查询结果表';

表中的数据有如下6条(1,20181001,332,10),(2,20181001,333,10)(3,20181002,332,10),(4,20181002,333,10),(5,20181003,332,10),(6,20181003,333,10)

发生死锁的两个sql如下:
sqlA:update dau_result set  num= 244618 where time='20181002' and condition='333' 
sqlB:update dau_result set  num= 68 where time='20181001' and condition='332'

首先确定的是两个sql最终的定位不是同一条记录,怎么发生死锁了呢?其实即使是定位到同一条记录也只是会阻塞而已,并不会导致死锁,下面分析导致死锁的原因。
explain update dau_result set  num= 68 where time='20181002' and condition='333' ;
结果:
key:idx_query_condition,idx_time
extra:Using intersect(idx_condition,idx_time); Using where; Using temporary

单个sql会同时走idx_condition和idx_time,但是两个sql的time和condition都不同,但是上面explain的结果中,key同时使用了idx_condition,idx_time两个索引,extra中也包含了Using intersect(idx_condition,idx_time),所以mysql做了优化),查询符合条件的记录时,实际使用了下面两个个sql并取交集:

sqlA1:select * from  dau_result where  time='20181002'  for update;//此时对应上锁的id范围是id=3和id=4
sqlA2:select * from  dau_result where  condition='333'  for update ;//此时对应上锁的id范围是id=2和id=4和id=6

注意虽然虽然sqlA1和sqlA2都对id=4上互斥锁,但是不会有阻塞问题,因为sqlA1和sqlA2在同一个事务中。取交集后定位id=4的进行修改,sql如下

sqlA3:update *** where id = 4 (time='20181002' and condition='333'语句对应的id=4)

说明:上面需要有mysql锁相关知识,可以参考:

https://blog.csdn.net/h2604396739/article/details/86518943https://blog.csdn.net/h2604396739/article/details/81483051

如果对sqlB也做如上分析,则sqlB1上锁范围为:id=1和id=2

sqlB2上锁范围,id=1,id=3和id=5

那多线程中整体执行流程就可能如下图:

session1(sqlA)session2(sqlB)
t1sqlA1:id=3和id=4上锁
t2sqlB1:id=1和id=2上锁
t3sqlA2:id=2和id=4和id=6上锁(id=2导致blocked)
t4sqlB2:id=1,id=3和id=5(id=3导致blocked)
t5deadlockdeadlock
t6sqlA rollback,并且释放id=3的锁
t7sqlB获取锁并执行

导致死锁原因:

被访问到的记录上的是互斥锁;并且因为分别走了两个索引导致上锁范围扩大。

解决方案
1修改为共享锁:定位实际修改记录过程中被访问记录使用共享锁,先按照条件查询对应的id(因为select上的是共享锁S,不会导致死锁),然后再用id来更新语句,因为id是唯一主键,两条sql会对应不同的主键,互斥锁也会加在不同的id上,不会导致死锁。
即update dau_result set  num= 68, update_time=now() where time='20181001' and query_condition='332' 变更为下面两条sql:
select id from dau_result where time='20181001' and condition='332'';#此时仅仅有共享锁,不会死多
update dau_result set  num= 68 where id=1 ;#只对一条记录上锁,一定不会死锁,至多阻塞

2减小上锁范围:强制使用索引idx_time或idx_condition,减小上锁的范围。以idx_time为例此时对应sql如下

update dau_result FORCE index(idx_time) set  num= 68 where time='20181001' and query_condition='332' ;

update dau_result FORCE index(idx_time) set  num= 244618 where time='20181002' and query_condition='333' ;

执行时序如下图,不会存在锁冲突的问题。

session1(sqlA)session2(sqlB)
t1sqlA1:id=3和id=4上锁
t2sqlB1:id=1和id=2上锁
t3sqlA 执行sqlB执行


3如果idx_condition索引是冗余的,则可以直接删除,这样就一定会使用索引idx_time,本质和方案二一致,也能解决问题

方案一修改起来比较费力,而且会成倍增加查询次数;方案三可能每个索引也确实有用,所以有时方案三不可行;方案二的问题是将指定索引(FORCE index(idx_time))写在了代码(mapper.xml)里,如果索引idx_time被删除了,那么记得一定同步修改代码里面的force index 。这里推荐方案二。

--------------------------------------- add on 2020-05-26
异常信息:
org.springframework.dao.DeadlockLoserDataAccessException: 
### Error updating database.  Cause: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
### The error may exist in com/tal/coding/platform/mapper/StatsStuRankMapper.java (best guess)
### The error may involve com.tal.coding.platform.mapper.StatsStuRankMapper.insertBatch-Inline
### The error occurred while setting parameters
### SQL: REPLACE INTO stats_stu_rank (id, lesson_id, class_id, cuc_num,     res_score_one, res_score_two, res_score_three, res_score_four ) VALUES                (?, ?,?, ?,          ?, ?, ?, ?)      ,          (?, ?,?, ?,          ?, ?, ?, ?)      ,          (?, ?,?,      ,          (?, ?,?, ?,          ?, ?, ?, ?)      ,          (?, ?,?, ?,          ?, ?, ?, ?)      ,          (?, ?,?, ?,          ?, ?, ?, ?)      ,          (?, ?,?, ?,          ?, ?, ?, ?)      ,          (?, ?,?, ?,          ?, ?, ?, ?)
### Cause: com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
; Deadlock found when trying to get lock; try restarting transaction; nested exception is com.mysql.cj.jdbc.exceptions.MySQLTransactionRollbackException: Deadlock found when trying to get lock; try restarting transaction
        at org.springframework.jdbc.support.SQLErrorCodeSQLExceptionTranslator.doTranslate(SQLErrorCodeSQLExceptionTranslator.java:271)

认识:首先有了唯一索引,才可以使用replace into执行,其执行过程如下:

直接执行插入,如果成功则结束;如果唯一索引已经存在,会检测到DB_DUPLICATE_KEY,此时会回滚,删除和重新插入记录,并且对记录上排他锁。

此时如果再有别的线程操作改行记录,则会报上面的死锁异常。

replace into执行过程详解:

https://www.cnblogs.com/flyingeagle/articles/10145784.html

所以本质是操作唯一索引导致的死锁
abc三线程操作同一条记录(实际delete可能是)
a 获取锁,删除记录,未提交
b 等待行锁
a 提交事务
b 获取到行锁
c 获取到行 x waiting锁
b 发现数据被更新了,需要restart获取锁,尝试获取行x waiting锁
至此死锁
怎么定位到的是死锁??????
我们是怎么解决的死锁?
你觉得怎么解决? redis分布式锁
    @Transactional(rollbackFor = Exception.class)
    public Pair<Boolean, String> batchUnPublishBundle(@NonNull BatchUnpublishBundleDto dto) {
        // 针对某个老师收题的行为加锁
        final String requestId = UUID.randomUUID().toString();
        final String lockName = "Locker:BATCH-UNPUBLISH-BUNDLE-" +
                dto.getCourseId() + "-" + dto.getLessonId() + "-" + dto.getBundleId();
        boolean success = LockUtil.tryLock(lockName, requestId, 10);
        log.debug("批量收题加锁[{}], {}", lockName, success ? "成功" : "失败");
        if (!success) {
            return Pair.of(Boolean.FALSE, "用户操作过于频繁!");
        }
        .......
    }

Logo

更多推荐