天天看點

0120ORACLE的Dead Connection Detection

[20170120]ORACLE的Dead Connection Detection淺析.txt

1.環境:

SCOTT@book> @ &r/ver1

PORT_STRING                    VERSION        BANNER

------------------------------ -------------- --------------------------------------------------------------------------------

x86_64/Linux 2.4.xx            11.2.0.4.0     Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

$ echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f

/proc/sys/net/ipv4/tcp_keepalive_intvl: 75

/proc/sys/net/ipv4/tcp_keepalive_probes: 9

/proc/sys/net/ipv4/tcp_keepalive_time: 7200

--我的測試當然不能這麼長,我減少到300.

# sysctl  -w net/ipv4/tcp_keepalive_time=300

net.ipv4.tcp_keepalive_time = 300

# echo /proc/sys/net/ipv4/tcp_keepalive* | xargs   -n 1  strings -1 -f

/proc/sys/net/ipv4/tcp_keepalive_time: 300

/proc/sys/net/ipv4/tcp_keepalive_time    當keepalive起用的時候,TCP發送keepalive消息的頻度。預設是2小時。

/proc/sys/net/ipv4/tcp_keepalive_intvl   當探測沒有确認時,keepalive探測包的發送間隔。預設是75秒。

/proc/sys/net/ipv4/tcp_keepalive_probes  如果對方不予應答,keepalive探測包的發送次數。預設值是9。

2.建立相關表:

CREATE TABLE SESSION_WAIT_RECORD

AS

   SELECT sid

         ,seconds_in_wait

         ,event

         ,SYSDATE AS curr_datetime

     FROM v$session_wait

    WHERE 1 = 0;

CREATE TABLE LOCK_OBJECT_RECORD

   SELECT B.username

         ,B.sid

         ,B.serial#

         ,logon_time

         ,A.object_id

     FROM v$locked_object A, v$session B

    WHERE A.session_id = B.sid AND 1 = 0;

--建立測試表deptx,注意要通過網絡連接配接,并且看看sqlnet.ora檔案中不存在SQLNET.EXPIRE_TIME = NN.

--為了測試準确,我重新開機資料庫以及監聽.

SCOTT@book> create table deptx as select * from dept ;

--//session 1:(在windows下我的工作機器.)

SCOTT@book> @ &r/spid

       SID    SERIAL# SPID       PID  P_SERIAL# C50

---------- ---------- ------ ------- ---------- --------------------------------------------------

        46         15 62463       28          8 alter system kill session '46,15' immediate;

SCOTT@book>  update deptx set loc='aaaa' where deptno=10;

1 row updated.

--//session 2:

SCOTT@78> @ &r/spid

        35         23 62453       27          8 alter system kill session '35,23' immediate;

SCOTT@78> update deptx set loc='aaaa' where deptno=10;

--//挂起.

4.打開另外session執行:

--//session 3:

$ cat ff.sql

DECLARE

    v_index NUMBER := 1;

BEGIN

    WHILE v_index != 0 LOOP

        INSERT INTO SESSION_WAIT_RECORD

        SELECT sid,

               seconds_in_wait,

               event ,

         sysdate

        FROM   v$session_wait

        WHERE  sid = &&1;

        INSERT INTO LOCK_OBJECT_RECORD

        SELECT B.username,

                   B.sid,

               B.serial#,

               logon_time,

               A.object_id ,

               sysdate

        FROM   v$locked_object A,

               v$session B

        WHERE  A.session_id = B.sid

                AND A.session_id=&&1 AND B.serial#=&&2

        ORDER  BY B.logon_time;

        commit;

        dbms_lock.Sleep(1);

        SELECT Count(*)

        INTO   v_index

    END LOOP;

END;

/

SCOTT@book> @ ff 46 15

--//然後斷開session 1,我采用很粗暴的方法,拔掉網線.注意session 2,session 3在tmux下執行的,即使斷開網絡,也一直在主機上執行.

SCOTT@book> select max(SECONDS_IN_WAIT) from SESSION_WAIT_RECORD;

MAX(SECONDS_IN_WAIT)

--------------------

                 978

--// 300+75*9=975 ,很相近.

--//如果修改

# sysctl  -w net/ipv4/tcp_keepalive_time=7500

net.ipv4.tcp_keepalive_time = 7500

--//按照前面執行相同的測試方法,測試結果如下.

SCOTT@book> select max(SECONDS_IN_WAIT)  from SESSION_WAIT_RECORD;

                8177

--// 7500+75*9=8175 ,基本相近.

<a href="http://www.cnblogs.com/kerrycode/p/6292557.html">http://www.cnblogs.com/kerrycode/p/6292557.html</a>

那麼在Oracle沒有啟用DCD時,系統和資料庫如何判斷一個連接配接是否異常,需要關閉呢?這個時間是這樣計算的,首先它等待了

tcp_keepalive_time=7200,然後每隔75秒發送探測包,一共發送了9次後(7200+ 75*9 = 7875 ),都沒有收到用戶端應答,那麼它就判斷

這個連接配接死掉了,可以關閉了。是以這個值是一個固定值, 具體為7875, 當然不同的作業系統可能有所不同,取決于上面三個

tcp_keepalive參數,過了7875秒後, 這個時候PMON程序就會回收與它相關的所有資源(例如復原事務,釋放lock、latch、memory)。這

個值與我測試的時間非常接近了(考慮我們是采集的等待時間,以及測試腳本裡面有休眠時間,這樣采集的資料有些許偏差)。