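6. enq: TC - contention
Some manually triggered checkpoint operations must acquire the TC lock (thread checkpoint lock or tablespace checkpoint lock). If contention arises while acquiring the TC lock, the session waits on the enq: TC - contention event. In practice, acquiring the TC lock is a slightly more involved process:
1) The server process first acquires the TC lock in X mode.
2) The server process converts the TC lock it holds to SSX mode. At the same time, the CKPT process acquires the lock in SS mode. Once CKPT holds the lock, it performs the checkpoint.
3) The server process tries to reacquire the TC lock in X mode and has to wait for CKPT to release it. The wait event at this point is enq: TC - contention.
4) When the checkpoint finishes, CKPT releases the TC lock and the server process acquires it, which is how the server process learns that the checkpoint has completed.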
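The four steps can be sketched as a toy model. This is only an illustration of the ordering described above, not Oracle's internal implementation: the function name `run_checkpoint_handshake` and the two `threading.Event` objects are hypothetical stand-ins for the lock mode conversions.

```python
import threading


def run_checkpoint_handshake():
    """Toy model of the four TC-lock steps; returns the ordered event log."""
    log = []
    # Hypothetical stand-ins for lock state changes (not Oracle structures).
    ckpt_may_start = threading.Event()  # set when the server converts X -> SSX
    ckpt_released = threading.Event()   # set when CKPT releases its SS hold

    def ckpt():
        ckpt_may_start.wait()
        log.append("ckpt: holds TC in SS mode, performs checkpoint")
        log.append("ckpt: releases TC")
        ckpt_released.set()

    worker = threading.Thread(target=ckpt)
    worker.start()

    log.append("server: acquires TC in X mode")        # step 1
    log.append("server: converts TC to SSX mode")      # step 2
    ckpt_may_start.set()
    # Step 3: the server blocks here until CKPT is done. In Oracle this
    # blocking time is what gets reported as enq: TC - contention.
    ckpt_released.wait()
    log.append("server: reacquires TC in X mode, checkpoint is done")  # step 4
    worker.join()
    return log


if __name__ == "__main__":
    for line in run_checkpoint_handshake():
        print(line)
```

Note that the model contains only one server thread: the wait in step 3 happens without any competing session, because the server is simply waiting for CKPT to finish its work.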
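The enq: TC - contention wait can occur even when no contention between multiple processes exists, which sets it apart from waits caused by contention on other locks. The point to understand is that some waits arise only from contention, while others occur without any contention at all, simply because a session is waiting for a piece of work to finish.
Checkpoints happen in many situations, but not all of them involve a wait on the TC lock. The wait occurs only while synchronizing a checkpoint that was initiated by a server process.
Typical cases in which enq: TC - contention waits occur are parallel query and tablespace hot backup.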
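1. Parallel query
PQ triggers a checkpoint because of the direct path read performed by the slave sessions. A direct path read reads the data files directly without going through the buffer cache. Oracle reads with direct path read (also called physical read direct) in the following three cases.
(1) When a sort cannot be completed in memory, direct path write and direct path read occur while data is stored in and read back from temporary segments. These waits can be observed through the direct path read temp and direct path write temp events.
(2) When slave sessions scan data files directly, they use direct path read. This wait is observed through the direct path read event.
(3) If Oracle judges that degraded I/O system performance prevents it from reading fast enough, it uses direct path read as a temporary measure.
The target of a slave session's direct path read is the data file. Because data is read straight from the data file without passing through the SGA, the version of a block currently in the SGA may differ from the version in the data file. To prevent this, Oracle must perform a checkpoint before running direct path read against a data file. Before starting the slave sessions, the coordinator session requests a segment-level checkpoint for the objects that will be read by direct path read, and it stays in the enq: TC - contention wait until the checkpoint completes. The coordinator session therefore shows enq: TC - contention waits, while the slave sessions show direct path read waits.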
2. Tablespace hot backup
After alter tablespace ... begin backup is executed, all dirty buffers in the buffer cache that belong to the tablespace are written to disk. This step incurs the enq: TC - contention wait.
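7. enq: CI - contention and enq: RO - contention
The "Cross Instance call Enqueue" is an enqueue used when background process actions are invoked across one or more instances. Such actions include checkpoint, logfile switch, instance shutdown, loading data file headers, and so on. Note that this enqueue is not used only in RAC; a single-instance database uses it as well. The number of CI locks depends on the total number of processes executing cross-instance calls in parallel.
When the system issues a large number of these cross-instance background process calls, contention on the CI enqueue appears.
Suppose that in a RAC environment many sessions start TRUNCATE operations on different tables at the same time. A precondition of TRUNCATE is an object-level checkpoint on all instances, because the dirty buffers of the table may be spread across several instances. When the checkpoint occurs, CKPT tells DBWR to write out the dirty blocks that belong to the specified table. DBWR has to scan the buffer cache to find those dirty blocks, and if the buffer cache is large the scan takes a long time. Throughout this time the foreground process holds the local CI enqueue exclusively, which leads to severe CI lock contention.
To reduce CI enqueue contention, the first step is to identify the actual type of cross-instance call. As an aside, before 10g neither v$session_wait nor statspack reported the specific enqueue type for an enqueue wait, so the identity of the enqueue generally had to be derived from the p1/p2/p3 columns. Take "WAIT #1: nam='enqueue' ela= 910796 p1=1128857606 p2=1 p3=4" as an example: p1 is 1128857606, which is 43490006 in hexadecimal, and the high-order '4349' converts to the ASCII characters 'CI'. p2/p3 correspond to ID1/ID2 in V$LOCK: ID1=1 means "Reuse (checkpoint and invalidate) block range" and ID2=4 means "Mounted excl, use to allocate mechanism".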
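The p1 decoding described above can be reproduced with a few lines of Python. This is a minimal sketch: the function name `decode_enqueue_p1` is hypothetical, and reading the low-order two bytes of p1 as the lock mode (using the V$LOCK mode numbering) is the usual convention for enqueue waits rather than something stated in the trace line itself.

```python
# Lock mode numbers as used by V$LOCK.LMODE / V$LOCK.REQUEST.
LOCK_MODES = {0: "none", 1: "null", 2: "SS", 3: "SX", 4: "S", 5: "SSX", 6: "X"}


def decode_enqueue_p1(p1):
    """Split an enqueue wait's p1 into (two-letter lock type, lock mode name)."""
    # The two high-order bytes are the ASCII codes of the lock type.
    lock_type = chr((p1 >> 24) & 0xFF) + chr((p1 >> 16) & 0xFF)
    # The two low-order bytes are assumed to hold the lock mode.
    mode = p1 & 0xFFFF
    return lock_type, LOCK_MODES.get(mode, "unknown")


if __name__ == "__main__":
    p1 = 1128857606
    print(hex(p1))                # 0x43490006
    print(decode_enqueue_p1(p1))  # ('CI', 'X')
```

The same function applies to any enqueue wait, for example a p1 whose high-order bytes are 0x54 and 0x43 decodes to the TC lock.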
The exact meaning of ID1/ID2 varies between versions; see the following table:
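The enq: RO - fast object reuse wait event
I looked into this wait. When it is high, something abnormal is usually going on, typically in one of these situations:
1. Truncating a table or a partitioned table
2. Gathering statistics with degree > 1
This event means the session is waiting for DBWR to clean the cache.
The symptom when the anomaly occurs: the CKPT background process is the one holding the needed RO enqueue although it is actually doing nothing.
Bug: 7385253.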
Tuning this problem means weighing several options together, for example reducing the cache size, adding DBWR processes, or lowering MTTR.
Is it still locked?
For some reason the truncate is still waiting for CKPT process. When you truncate or drop a table, CKPT does a range flush of the db_cache_size, which seems to be completed according to your alert_log.
That was an issue in 9i-10g. This looks like bug 4201369 which is supposed to be fixed in 10.1.0.5. I will suggest you open a tar on this! They will make you do a hanganalysis which should clarify the issue.
This wait event is mostly bug-related.
Bug 7385253 - Slow Truncate / DBWR uses high CPU / CKPT blocks on RO enqueue [ID 7385253.8]
Product (Component)
Oracle Server (Rdbms)
Range of versions believed to be affected
Versions >= 10 but BELOW 11.2
Versions confirmed as being affected
11.1.0.7
10.2.0.4
10.2.0.3
10.2.0.2
Platforms affected
Generic (all / most platforms affected)
This issue is fixed in
11.2.0.1 (Base Release)
11.1.0.7.3 (Patch Set Update)
10.2.0.5 (Server Patch Set)
10.2.0.4.1 (Patch Set Update)
11.1.0.7 Patch 25 on Windows Platforms
10.2.0.4 Patch 14 on Windows Platforms
10.2.0.4 RAC Recommended Patch Bundle #3
10.2.0.4 Generic Recommended Patch Bundle #3
Three symptoms of this bug:
DBWR may use a lot of CPU and seem to spin in or around kcbo_write_q due to large number of free buffers on the object reuse queue or checkpoint queue.
In some cases the CKPT holds the RO enqueue for very long blocking other operations with wait event "enq: RO - fast object reuse".
Operations so far reported being affected are:
- Apply Processes in StandBy databases
- Gather stats
- Truncates
- drop/shrink/alter tablespace
Note: This fix was previously incorrectly listed as not affecting 11g.
The bug itself is present in 11g but it is unlikely to show any significant symptom due to other 11g changes meaning that free buffers are no longer kept on the object queue.
Workarounds attempted for this bug:
setting _db_fast_obj_truncate=FALSE <-- did not fix the issue
enabling async I/O <-- customer refused to implement to avoid corruption risk
applying 7287289 <-- did not fix the issue
'enq: RO - fast object reuse' contention when gathering schema/table statistics in parallel [ID 762085.1]
Symptoms:
(1) Database has been recently upgraded from 10.2.0.1 to 10.2.0.4.
(2) There is 'enq: RO - fast object reuse' contention when gathering schema/table statistics in parallel using DBMS_STATS package (with DEGREE>1).
Workarounds:
1) Flushing the buffer cache.
OR
2) Setting "_db_fast_obj_truncate" = FALSE. This reverts back to the 9i way of invalidating buffers in the buffer cache.
Kindly note that both workarounds could have an impact on the database performance. Instead, it is recommended applying the corresponding patch.
-- Both workarounds have a significant impact on database performance, so applying the appropriate patch is recommended.
Bug 8544896 - Waits for "enq: RO - fast object reuse" with high DBWR CPU [ID 8544896.8]
Versions >= 10.2.0.4 but BELOW 10.2.0.5
Regression introduced in 10.2.0.4
10.2.0.4.3 (Patch Set Update)
10.2.0.4 Patch 27 on Windows Platforms
This problem is introduced in 10.2.0.4.
Sessions can wait on "enq: RO - fast object reuse" while DBWR consumes lots of CPU when performing truncate type operations.
Workaround:
(1) Flush the buffer cache before truncating
OR
(2) set _db_fast_obj_truncate = FALSE.
In my case, both of these wait events were related to TRUNCATE operations.