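Yesterday, while reviewing an AWR report, I noticed ITL waits. This wait event rarely shows up in the top 5. The cause turned out to be simple: a very small table, occupying just one block, that the developers were using as a sequence (I really don't know what the programmers were thinking). The fix is equally simple: increase pctfree, move the table, then rebuild the index.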
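The fix described above can be sketched roughly as follows. The names seq_tab and seq_tab_pk are hypothetical placeholders (the post does not name the production table or its index), and the PCTFREE value is only an illustrative assumption, not the value actually used.

```sql
-- Hypothetical names: seq_tab is the single-block table used as a sequence,
-- seq_tab_pk is its index. PCTFREE 50 is an illustrative value only.
ALTER TABLE seq_tab PCTFREE 50;

-- MOVE rebuilds the segment so existing blocks pick up the new PCTFREE,
-- leaving free space in each block for additional ITL slots.
ALTER TABLE seq_tab MOVE;

-- MOVE changes the rowids, which leaves the index UNUSABLE until rebuilt.
ALTER INDEX seq_tab_pk REBUILD;
```

Raising INITRANS before the move is another commonly used way to reserve ITL slots up front; that is a general option, not something the original fix relied on.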
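I looked closely at the statements involved: only one transaction was a little slow and all the others were fast. Normally ITL waits should not rank this high, which struck me as odd, so I ran a test:
1. Set up the test environment: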
SQL> select * from v$version ;
BANNER
--------------------------------------------------------------------------------
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
PL/SQL Release 11.2.0.1.0 - Production
CORE 11.2.0.1.0 Production
TNS for Linux: Version 11.2.0.1.0 - Production
NLSRTL Version 11.2.0.1.0 - Production
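2. Create the test data:
SQL> create table t pctfree 0 as select rownum id ,'test' name from dual connect by level
--(the row limit after "level" is truncated in the source; the results below show the table holds at least 581 rows)
SQL> create unique index i_t_id on t(id);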
SQL> exec dbms_stats.gather_table_stats(ownname=>USER, tabname=>'T');
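--I created the table with pctfree=0, so there is almost no free space left in the block. The block has only two ITL slots, and there is no room to allocate more.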
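As a quick sanity check of the storage settings, the data dictionary can be queried as sketched below. Note that ini_trans shows the configured setting, not the number of ITL slots physically present in a given block; confirming the actual slot count would need a block dump, which the original test did not show.

```sql
-- Show the transaction-slot and free-space settings for the test table T.
-- BLOCKS is populated because statistics were gathered above.
SELECT table_name, pct_free, ini_trans, max_trans, blocks
  FROM user_tables
 WHERE table_name = 'T';
```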
SQL> select rowid from t where id=1;
ROWID
------------------
AAAUSUAAEAAAAO7AAA
SQL> select min(id),max(id) from t where rowid between 'AAAUSUAAEAAAAO7AAA' and 'AAAUSUAAEAAAAO7DDD';
MIN(ID) MAX(ID)
---------- ----------
1 581
--This confirms that ids 1 through 581 all sit in the same block.
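An alternative way to see which ids share a block, without hand-building a rowid range, is to group by the block number extracted from each rowid. This is a sketch using the standard DBMS_ROWID package against the test table T; it was not part of the original test.

```sql
-- Count rows per (file, block) and show the id range stored in each block.
SELECT dbms_rowid.rowid_relative_fno(rowid) AS file_no,
       dbms_rowid.rowid_block_number(rowid) AS block_no,
       COUNT(*)                             AS row_cnt,
       MIN(id)                              AS min_id,
       MAX(id)                              AS max_id
  FROM t
 GROUP BY dbms_rowid.rowid_relative_fno(rowid),
          dbms_rowid.rowid_block_number(rowid)
 ORDER BY min_id;
```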
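3. Run the test:
--Update rows without committing.
Open session 1:
update t set name='TEST' where id=1;
Open session 2:
update t set name='TEST' where id=2;
Open session 3:
update t set name='TEST' where id=3;
Open session 4:
update t set name='TEST' where id=4;
There was probably a little free space left in the block, enough for a third ITL slot, so it is the fourth session that hangs.
Open one more session (a fifth one, used here only for monitoring) and run the query below; the enq: TX - allocate ITL entry wait event shows up: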
SQL> SELECT SID,SEQ#,EVENT FROM V$SESSION_WAIT WHERE event NOT IN (SELECT NAME FROM v$event_name WHERE wait_class = 'Idle');
SID SEQ# EVENT
---------- ---------- ----------------------------------------
138 148 enq: TX - allocate ITL entry
71 124 asynch descriptor resize
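To see who is waiting on whom, the TX enqueues can be joined to the session view, as in the sketch below. It relies only on standard v$session and v$lock columns. The remark about the request mode is a general observation about ITL waits (they are typically reported with REQUEST = 4), not something measured in this test.

```sql
-- List TX enqueues per session: holders have LMODE > 0,
-- waiters have REQUEST > 0 (typically 4 for an ITL wait).
SELECT s.sid,
       s.event,
       s.blocking_session,
       s.seconds_in_wait,
       l.id1,
       l.id2,
       l.lmode,
       l.request
  FROM v$session s
  JOIN v$lock    l ON l.sid = s.sid
 WHERE l.type = 'TX'
 ORDER BY l.id1, l.id2, l.request;
```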
Trace the hung process with strace:
$ strace -ttT -p 14378
Process 14378 attached - interrupt to quit
14:53:50.649628 gettimeofday({1330066430, 649778}, NULL) = 0
14:53:50.649912 gettimeofday({1330066430, 649983}, NULL) = 0
14:53:50.650084 gettimeofday({1330066430, 650115}, NULL) = 0
14:53:50.650185 semtimedop(5210112, 0x7fbfff5748, 548682028960, NULL) = -1 EAGAIN (Resource temporarily unavailable)
14:53:53.651455 gettimeofday({1330066433, 651528}, NULL) = 0
14:53:53.651624 gettimeofday({1330066433, 651702}, NULL) = 0
14:53:53.651784 getrusage(RUSAGE_SELF, {ru_utime={0, 25996}, ru_stime={0, 22996}, ...}) = 0
14:53:53.651972 gettimeofday({1330066433, 652035}, NULL) = 0
14:53:53.652109 getrusage(RUSAGE_SELF, {ru_utime={0, 25996}, ru_stime={0, 22996}, ...}) = 0
14:53:53.652259 gettimeofday({1330066433, 652300}, NULL) = 0
14:53:53.652363 semtimedop(5210112, 0x7fbfff5748, 548682028960, NULL) = -1 EAGAIN (Resource temporarily unavailable)
14:53:56.653983 gettimeofday({1330066436, 654065}, NULL) = 0
14:53:56.654174 gettimeofday({1330066436, 654235}, NULL) = 0
14:53:56.654341 getrusage(RUSAGE_SELF, {ru_utime={0, 25996}, ru_stime={0, 22996}, ...}) = 0
14:53:56.654522 gettimeofday({1330066436, 654590}, NULL) = 0
14:53:56.654678 getrusage(RUSAGE_SELF, {ru_utime={0, 25996}, ru_stime={0, 22996}, ...}) = 0
14:53:56.654873 gettimeofday({1330066436, 654951}, NULL) = 0
14:53:56.655037 semtimedop(5210112, 0x7fbfff5748, 548682028960, NULL) = -1 EAGAIN (Resource temporarily unavailable)
14:53:59.656491 gettimeofday({1330066439, 656568}, NULL) = 0
14:53:59.656674 gettimeofday({1330066439, 656735}, NULL) = 0
14:53:59.656839 getrusage(RUSAGE_SELF, {ru_utime={0, 25996}, ru_stime={0, 22996}, ...}) = 0
14:53:59.657013 gettimeofday({1330066439, 657090}, NULL) = 0
14:53:59.657178 getrusage(RUSAGE_SELF, {ru_utime={0, 25996}, ru_stime={0, 22996}, ...}) = 0
14:53:59.657372 gettimeofday({1330066439, 657450}, NULL) = 0
14:53:59.657538 semtimedop(5210112, 0x7fbfff5748, 548682028960, NULL
Process 14378 detached
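--The trace shows the process looping on the semtimedop system call, a Linux semaphore operation with a timeout. Each call returns EAGAIN when the timeout expires, so the process sleeps for 3 seconds and then checks again.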
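The OS process id passed to strace can be looked up from the waiting session, for example as sketched below. It assumes a dedicated server connection and that SID 138 (the session shown waiting on enq: TX - allocate ITL entry above) is the hung session. The post does not state how PID 14378 was actually obtained.

```sql
-- Map a database session to its server process; SPID is the OS pid.
-- SID 138 is taken from the wait output above (assumed to be the hung session).
SELECT s.sid, s.serial#, p.spid
  FROM v$session s
  JOIN v$process p ON p.addr = s.paddr
 WHERE s.sid = 138;
```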
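4. Start rolling back:
rollback ;
--Check session 4 (after a rollback in a session other than session 3): it is still hung!
--Check session 4 (after session 3 rolls back): it proceeds!
--So session 4 has to wait for session 3 to end its transaction before it can continue.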
5. Test with five sessions.
Open session 5:
update t set name='TEST' where id=5;
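Both the 4th and the 5th session hang.
--Check sessions 4 and 5: session 4 is still hung, while session 5 proceeds!
--Check session 4: surprisingly, session 4 proceeds as well!
--After repeated testing: the earlier behavior (session 4 having to wait for session 3) fails to appear only when there are multiple ITL waiters!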