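One of our services uses the BoneCP connection pool and ran into the following situation: 200 Dubbo threads were all blocked in the com.jolbox.bonecp.BoneCPDataSource.maybeInit method, all waiting on the same lock: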
"DubboServerHandler-thread-203" daemon prio=10 tid=0x00007fc2400e7800 nid=0x658c waiting on condition [0x00007fc233dbb000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x00000000c071f518> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:731)
at com.jolbox.bonecp.BoneCPDataSource.maybeInit(BoneCPDataSource.java:133)
at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:112)
at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:111)
at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:77)
at org.mybatis.spring.SqlSessionUtils.getSqlSession(SqlSessionUtils.java:116)
at org.mybatis.spring.SqlSessionTemplate$SqlSessionInterceptor.invoke(SqlSessionTemplate.java:333)
at com.sun.proxy.$Proxy11.selectList(Unknown Source)
at org.mybatis.spring.SqlSessionTemplate.selectList(SqlSessionTemplate.java:189)
at org.apache.ibatis.binding.MapperMethod.executeForList(MapperMethod.java:100)
at org.apache.ibatis.binding.MapperMethod.execute(MapperMethod.java:70)
at org.apache.ibatis.binding.MapperProxy.invoke(MapperProxy.java:38)
Reading the BoneCP code does not reveal an obvious problem; it is simply the logic for acquiring a ReentrantReadWriteLock:
    private void maybeInit() throws SQLException {
        this.rwl.readLock().lock(); // -------> line 133: every thread is blocked here
        if (this.pool == null){
            this.rwl.readLock().unlock();
            this.rwl.writeLock().lock();
            if (this.pool == null){ //read might have passed, write might not
                try {
                    if (this.getDriverClass() != null){
                        loadClass(this.getDriverClass());
                    }
                }
                catch (ClassNotFoundException e) {
                    throw new SQLException(PoolUtil.stringifyException(e));
                }
                logger.debug(this.toString());
                this.pool = new BoneCP(this);
            }
            this.rwl.writeLock().unlock(); // Unlock write
        } else {
            this.rwl.readLock().unlock(); // Unlock read
        }
    }
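One detail in the method above is worth noting: the write lock is released by a plain unlock() call rather than in a finally block, so an exception thrown while it is held would skip the release. Whether that happened in this incident cannot be told from the thread dump; the following is only a minimal sketch of the general effect. The class LeakedWriteLockSketch and the methods initUnsafe, initSafe, failingInit and readerCanProceed are hypothetical names made up for illustration, not BoneCP code, and the lambdas assume Java 8 or later.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class LeakedWriteLockSketch {

    // Hypothetical initializer that releases the write lock without a finally block.
    static void initUnsafe(ReentrantReadWriteLock rwl) {
        rwl.writeLock().lock();
        failingInit();                 // throws, so the unlock below never runs
        rwl.writeLock().unlock();
    }

    // Same initializer, but the release is guaranteed by finally.
    static void initSafe(ReentrantReadWriteLock rwl) {
        rwl.writeLock().lock();
        try {
            failingInit();
        } finally {
            rwl.writeLock().unlock();  // runs even when failingInit() throws
        }
    }

    // Stand-in for an initialization step that fails (assumption for the sketch).
    static void failingInit() {
        throw new IllegalStateException("simulated init failure");
    }

    /** Runs the initializer on one thread, then reports whether another thread can take the read lock. */
    public static boolean readerCanProceed(boolean useFinally) {
        final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
        final boolean[] acquired = new boolean[1];

        Thread initializer = new Thread(() -> {
            try {
                if (useFinally) initSafe(rwl); else initUnsafe(rwl);
            } catch (IllegalStateException expected) {
                // Swallowed, like a caller that handles the failure and carries on.
            }
        });
        Thread reader = new Thread(() -> {
            try {
                // A timed tryLock is used so the sketch terminates; lock() would park forever.
                if (rwl.readLock().tryLock(200, TimeUnit.MILLISECONDS)) {
                    acquired[0] = true;
                    rwl.readLock().unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        try {
            initializer.start();
            initializer.join();        // the initializer thread has finished before the reader starts
            reader.start();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        System.out.println("without finally: " + readerCanProceed(false)); // false
        System.out.println("with finally:    " + readerCanProceed(true));  // true
    }
}
```

In the first case the write lock stays held even though the thread that took it has already ended, so every later readLock().lock() call would park on the same lock, which resembles the symptom in the dump. Again, this is one possible scenario under stated assumptions, not a confirmed diagnosis.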
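In a typical deadlock, thread A acquires lock X and then tries to acquire lock Y, only to find that Y is held by thread B, while thread B in turn is waiting for lock X. Here, however, every thread is waiting on the same lock, and they stay in that state indefinitely, making the service unavailable. I ran jstack -F to see whether it could detect a deadlock, but the output was "No deadlocks found."
(This may be because the read lock of a ReentrantReadWriteLock has no ownership, so the JVM cannot track it and detect the deadlock; see this article and Doug Lea's reply in this bug report.) I also suspect it may be related to the JDK and OS versions, possibly a bug in that particular version (or something else I am not aware of).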
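The point about read-lock ownership can be illustrated with a small experiment. The sketch below builds a genuine two-thread cycle and then asks the JVM's own detector (ThreadMXBean.findDeadlockedThreads) what it sees. The class ReadLockDeadlockSketch, the method reportedCount and the thread names are hypothetical, and the scenario is deliberately simpler than the BoneCP incident: it only shows that a cycle passing through a read lock is invisible to the detector, not what actually happened in production. It assumes Java 8 or later and a JVM that supports monitoring of ownable synchronizers, as HotSpot does.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ReadLockDeadlockSketch {

    /**
     * Deadlocks two daemon threads and returns how many of them the JVM reports.
     * Thread A holds either the read lock or the write lock and wants the mutex;
     * thread B holds the mutex and wants the write lock.
     */
    public static int reportedCount(boolean aTakesReadLock) {
        final ReentrantReadWriteLock rwl = new ReentrantReadWriteLock();
        final ReentrantLock mutex = new ReentrantLock();
        final Lock first = aTakesReadLock ? rwl.readLock() : rwl.writeLock();
        final CountDownLatch bothHold = new CountDownLatch(2);

        Thread a = new Thread(() -> {
            first.lock();
            bothHold.countDown();
            awaitQuietly(bothHold);
            mutex.lock();               // parks forever: B owns the mutex
        }, "sketch-A");
        Thread b = new Thread(() -> {
            mutex.lock();
            bothHold.countDown();
            awaitQuietly(bothHold);
            rwl.writeLock().lock();     // parks forever: A still holds its lock
        }, "sketch-B");
        a.setDaemon(true);              // daemon threads, so the JVM can still exit
        b.setDaemon(true);
        a.start();
        b.start();

        // Wait until both threads are queued on the lock they can never get.
        while (!mutex.hasQueuedThread(a) || !rwl.hasQueuedThread(b)) {
            sleepQuietly(10);
        }
        sleepQuietly(200);              // give them time to actually park

        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        int count = 0;
        if (ids != null) {
            for (long id : ids) {
                if (id == a.getId() || id == b.getId()) count++;
            }
        }
        return count;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try { latch.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    private static void sleepQuietly(long millis) {
        try { Thread.sleep(millis); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    public static void main(String[] args) {
        System.out.println("cycle through read lock, reported threads:  " + reportedCount(true));
        System.out.println("cycle through write lock, reported threads: " + reportedCount(false));
    }
}
```

With the read lock in the cycle the detector should report none of the two threads, while the otherwise identical cycle through the write lock should report both. This matches the java.lang.management documentation, which lists ReentrantLock and the write lock of ReentrantReadWriteLock, but not its read lock, as ownable synchronizers. So "No deadlocks found." does not by itself rule out a deadlock when a read lock is involved.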