先做一個簡單事務鎖等待:
視窗1:
![](https://img.laitimes.com/img/_0nNw4CM6IyYiwiM6ICdiwiIyVGduV2QvwVe0lmdhJ3ZvwFM38CXlZHbvN3cpR2Lc1TPB10QGtWUCpEMJ9CXsxWam9CXwADNvwVZ6l2c052bm9CXUJDT1wkNhVzLcRnbvZ2LcZXUYpVd1kmYr50MZV3YyI2cKJDT29GRjBjUIF2LcRHelR3LcJzLctmch1mclRXY39DN1MjM1QTM0EzNwYDM2EDMy8CX0Vmbu4GZzNmLn9Gbi1yZtl2Lc9CX6MHc0RHaiojIsJye.jpg)
視窗2:修改同樣的資料
這時候我們會發現視窗2 的語句被卡住了。下面做個oradebug:
視窗3:
根據提示,打開所生成的trace檔案,尋找有用的資訊。
Chain 1:
-------------------------------------------------------------------------------
Oracle session identified by:
{
instance: 1 (orcl.orcl)
os id: 4768
process id: 21, ORACLE.EXE (SHAD)
session id: 155
session serial #: 28
}
is waiting for 'enq: TX - row lock contention' with wait info:
{
p1: 'name|mode'=0x54580006
p2: 'usn<<16 | slot'=0x7000b
p3: 'sequence'=0x9fd
time in wait: 2 min 2 sec
timeout after: never
wait id: 24
blocking: 0 sessions
current sql: update test set id=1111 where id=2
short stack: <-_ksedsts()+286<-_ksdxfstk()+14<-_ksdxcb()+1780<-_ssthreadsrgruncallback()+432<[email protected]()+825<-7C80B726<-00000000<-7C92E514<-7C80253D<-_skgpwwait()+128<-_ksliwat()+1440<-_kslwaitctx()+146<-_ksqcmi()+10381<-_ksqgtlctx()+1980<-_ksqgelctx()+502<-_ktcwit1()+316<-_kdddgb()+2378<-_kdusru()+407<-_updrowFastPath()+2375<-__VInfreq__qerupFetch()+1247<-_updaul()+1314<-_updThreePhaseExe()+272<-_updexe()+373<-_opiexe()+14214<-_kpoal8()+2292<-_opiodr()+1299<-_ttcpip()+2790<-_opitsk()+1278<-_opiino()+1067<-
wait history:
1. event: 'SQL*Net message from client'
wait id: 23 p1: 'driver id'=0x42455100
time waited: 15.706029 sec p2: '#bytes'=0x1
2. event: 'SQL*Net message to client'
wait id: 22 p1: 'driver id'=0x42455100
time waited: 0.000002 sec p2: '#bytes'=0x1
3. event: 'SQL*Net message from client'
wait id: 21 p1: 'driver id'=0x42455100
time waited: 0.000026 sec p2: '#bytes'=0x1
}
and is blocked by
=> Oracle session identified by:
{
instance: 1 (orcl.orcl)
os id: 4812
process id: 18, ORACLE.EXE (SHAD)
session id: 170
session serial #: 9
}
which is waiting for 'SQL*Net message from client' with wait info:
{
p1: 'driver id'=0x42455100
p2: '#bytes'=0x1
time in wait: 2 min 29 sec
timeout after: never
wait id: 38
blocking: 1 session
current sql: <none>
short stack: <-_ksedsts()+286<-_ksdxfstk()+14<-_ksdxcb()+1780<-_ssthreadsrgruncallback()+432<[email protected]()+825<-7C80B726<-00000000<-7C92E514<-07F9535B<-07F93160<-06A31A0C<-06A321E1<-0699FF8F<-__PGOSF35__opikndf2()+989<-_opitsk()+584<-_opiino()+1067<-_opiodr()+1299<-_opidrv()+1319<-_sou2o()+45<-_opimai_real()+130<-_opimai()+92<[email protected]()+792<-7C80B726<-00000000
wait history:
1. event: 'SQL*Net message to client'
wait id: 37 p1: 'driver id'=0x42455100
time waited: 0.000005 sec p2: '#bytes'=0x1
2. event: 'SQL*Net message from client'
wait id: 36 p1: 'driver id'=0x42455100
time waited: 6.980115 sec p2: '#bytes'=0x1
3. event: 'SQL*Net message to client'
wait id: 35 p1: 'driver id'=0x42455100
time waited: 0.000003 sec p2: '#bytes'=0x1
}
在trace檔案裡面可以清晰的看到
sid 155的會話在等待:
enq: TX - row lock contention
該會話被sid 170阻塞,sid 170的會話在等待
SQL*Net message from client
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
其實在實際生産環境中,不像我實驗的那麼簡單,多是資料庫内部發生了死鎖(争奪核心資源級别,比如pins和latch),而不是我實驗中這種簡單的dml鎖。
其中hanganalyze也是具有級别的:
1-2:隻有hanganalyze輸出,不dump任何程序
3:Level2+Dump出在IN_HANG狀态的程序
4:Level3+Dump出在等待鍊裡面的blockers(狀态為LEAF/LEAF_NW/IGN_DMP)
5:Level4+Dump出所有在等待鍊中的程序(狀态為NLEAF)
oracle官方建議不超過level 3,超過level 3會對系統帶來額外的負擔,一般level 3即可解決問題,定位到阻塞源頭。
對于叢集範圍的debug,要使用如下指令:
和單執行個體的差別就是多了兩條指令:
oradebug setinst all
oradebug -g def hanganalyze 3;
由于我是單執行個體環境,報錯了。。。。。
---------------------------------------------------------------------------------------------------------------------------------------------------------------
将hanganalyze知識簡單整理下,以備後用。