天天看点

0120ORACLE的Dead Connection Detection

[20170120]ORACLE的Dead Connection Detection浅析.txt

1.环境:

SCOTT@book> @ &r/ver1

PORT_STRING                    VERSION        BANNER

------------------------------ -------------- --------------------------------------------------------------------------------

x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

$ echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f

/proc/sys/net/ipv4/tcp_keepalive_intvl: 75

/proc/sys/net/ipv4/tcp_keepalive_probes: 9

/proc/sys/net/ipv4/tcp_keepalive_time: 7200

--我的测试当然不能这么长,我减少到300.

# sysctl  -w net/ipv4/tcp_keepalive_time=300

net.ipv4.tcp_keepalive_time = 300

# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f

/proc/sys/net/ipv4/tcp_keepalive_time: 300

/proc/sys/net/ipv4/tcp_keepalive_time    当keepalive起用的时候,TCP发送keepalive消息的频度。默认是2小时。

/proc/sys/net/ipv4/tcp_keepalive_intvl   当探测没有确认时,keepalive探测包的发送间隔。缺省是75秒。

/proc/sys/net/ipv4/tcp_keepalive_probes  如果对方不予应答,keepalive探测包的发送次数。缺省值是9。

2.建立相关表:

CREATE TABLE SESSION_WAIT_RECORD

AS

   SELECT sid

         ,seconds_in_wait

         ,event

         ,SYSDATE AS curr_datetime

     FROM v$session_wait

    WHERE 1 = 0;

CREATE TABLE LOCK_OBJECT_RECORD

   SELECT B.username

         ,B.sid

         ,B.serial#

         ,logon_time

         ,A.object_id

     FROM v$locked_object A, v$session B

    WHERE A.session_id = B.sid AND 1 = 0;

--建立测试表deptx,注意要通过网络连接,并且看看sqlnet.ora文件中不存在SQLNET.EXPIRE_TIME = NN.

--为了测试准确,我重启数据库以及监听.

SCOTT@book> create table deptx as select * from dept ;

--//session 1:(在windows下我的工作机器.)

SCOTT@book> @ &r/spid

       SID    SERIAL# SPID       PID  P_SERIAL# C50

---------- ---------- ------ ------- ---------- --------------------------------------------------

        46         15 62463       28          8 alter system kill session '46,15' immediate;

SCOTT@book>  update deptx set loc='aaaa' where deptno=10;

1 row updated.

--//session 2:

SCOTT@78> @ &r/spid

        35         23 62453       27          8 alter system kill session '35,23' immediate;

SCOTT@78> update deptx set loc='aaaa' where deptno=10;

--//挂起.

4.打开另外session执行:

--//session 3:

$ cat ff.sql

DECLARE

    v_index NUMBER := 1;

BEGIN

    WHILE v_index != 0 LOOP

        INSERT INTO SESSION_WAIT_RECORD

        SELECT sid,

               seconds_in_wait,

               event ,

         sysdate

        FROM   v$session_wait

        WHERE  sid = &&1;

        INSERT INTO LOCK_OBJECT_RECORD

        SELECT B.username,

                   B.sid,

               B.serial#,

               logon_time,

               A.object_id ,

               sysdate

        FROM   v$locked_object A,

               v$session B

        WHERE  A.session_id = B.sid

                AND A.session_id=&&1 AND B.serial#=&&2

        ORDER  BY B.logon_time;

        commit;

        dbms_lock.Sleep(1);

        SELECT Count(*)

        INTO   v_index

    END LOOP;

END;

/

SCOTT@book> @ ff 46 15

--//然后断开session 1,我采用很粗暴的方法,拔掉网线.注意session 2,session 3在tmux下执行的,即使断开网络,也一直在主机上执行.

SCOTT@book> select max(SECONDS_IN_WAIT) from SESSION_WAIT_RECORD;

MAX(SECONDS_IN_WAIT)

--------------------

                 978

--// 300+75*9=975 ,很相近.

--//如果修改

# sysctl  -w net/ipv4/tcp_keepalive_time=7500

net.ipv4.tcp_keepalive_time = 7500

--//按照前面执行相同的测试方法,测试结果如下.

SCOTT@book> select max(SECONDS_IN_WAIT)  from SESSION_WAIT_RECORD;

                8177

--// 7500+75*9=8175 ,基本相近.

<a href="http://www.cnblogs.com/kerrycode/p/6292557.html">http://www.cnblogs.com/kerrycode/p/6292557.html</a>

那么在Oracle没有启用DCD时,系统和数据库如何判断一个连接是否异常,需要关闭呢?这个时间是这样计算的,首先它等待了

tcp_keepalive_time=7200,然后每隔75秒发送探测包,一共发送了9次后(7200+ 75*9 = 7875 ),都没有收到客户端应答,那么它就判断

这个连接死掉了,可以关闭了。所以这个值是一个固定值, 具体为7875, 当然不同的操作系统可能有所不同,取决于上面三个

tcp_keepalive参数,过了7875秒后, 这个时候PMON进程就会回收与它相关的所有资源(例如回滚事务,释放lock、latch、memory)。这

个值与我测试的时间非常接近了(考虑我们是采集的等待时间,以及测试脚本里面有休眠时间,这样采集的数据有些许偏差)。