天天看點

ORA-609 錯誤分析及解決方法 (轉載)

某個客戶資料庫在巡檢的時候發現alert日志裡不定期會出現ORA-609錯誤,大緻内容如下:

***********************************************************************

Fatal NI connect error 12537, connecting to:

 (LOCAL=NO)

  VERSION INFORMATION:

TNS for HPUX: Version 11.2.0.3.0 - Production

Oracle Bequeath NT Protocol Adapter for HPUX: Version 11.2.0.3.0 - Production

TCP/IP NT Protocol Adapter for HPUX: Version 11.2.0.3.0 - Production

  Time: 19-OCT-2014 20:24:16

  Tracing not turned on.

  Tns error struct:

    ns main err code: 12537

TNS-12537: TNS:connection closed

    ns secondary err code: 12560

    nt main err code: 0

    nt secondary err code: 0

    nt OS err code: 0

opiodr aborting process unknown ospid (2734) as a result of ORA-609

Sun Oct 19 21:27:24 2014

由于ORA-609的緣故,ospid(xxxx)程序被aborting了,同時還伴随着TNS-12537的錯誤,連接配接關閉

去MOS搜了一圈,正好有篇文檔是針對這個錯誤的,下面是描述:

适用于:

Oracle Net Services - Version 11.2.0.1 to 11.2.0.3 [Release 11.2]

Information in this document applies to any platform.

症狀:

alert日志出現以上類似的内容(略)

變化:

Changes in database server load, client connect descriptor, changes in network infrastructure (firewall configuration).

原因:

首先,這個“opiodr aborting process unknown ospid (2734) as a result of ORA-609”消息僅僅是說明了由于ORA-609,使Oracle資料庫專用程序被關閉了

來看一段描述:

ORA-609 means  "could not attach to incoming connection" so the database process was 'aborted' (closed) because it couldn't attach to the incoming connection passed to it by the listener.

ORA-609意味着不能通過監聽把它附加到即将到來的連接配接上,是以伺服器程序被終止(關閉)

The reason for this is found in the sqlnet error stack, in our case is:

   TNS-12537: TNS:connection closed.

Basically the dedicated process didn't have a client connection anymore to work with.

用戶端連接配接有6個步驟:

Client initiates a connection to the database so it connects to the listenerListener starts (fork) a dedicated database process that will receive this connection (session)After this dedicated process is started, the listener passes the connection from the client to this processThe server process takes the connection from the listener to continue the handshake with the clientServer process and client exchange information required for establishing a session (ASO, Two Task Common, User logon)Session is opened

In the case of the above error the connection from the client was closed somewhere between 3. and 4. So when the dedicated process tries to communicate with the client it finds that connection closed.

鑒于以上的錯誤,在第3步與第4步之間時,用戶端連接配接就關閉了,此時當專有程序嘗試與用戶端連接配接時,發現連接配接已經關閉了

To determine the client which hit this problem we can try to match the timestamp of the error from alert log with an entry in listener.log, but this might be difficult in case of a loaded listener with many incoming connections per second.

Server sqlnet trace will not provide any information about the client.

去确定碰到問題的client,我們可以嘗試去比對alert日志中錯誤發生的時間戳并且在監聽日志中也有相應的條目,但當加載的監聽每秒有許多連接配接的時候是非常困難去判斷的,伺服器sqlnet的trace不會提供任何該用戶端的資訊

We can enable sqlnet server trace to catch the error (the match is done based on the ospid found in sqlnet server trace file name and the line with ORA-609 error):

還可以啟用sqlnet server的trace中抓取到ORA-609錯誤,比對成功基于sqlnet server trace檔案名和ORA-609錯誤資訊中的ospid

nscon: doing connect handshake...    nscon: recving a packet    nsprecv: entry    nsprecv: reading from transport...    nttrd: entry    nttrd: exit    ntt2err: entry    ntt2err: Read unexpected EOF ERROR on 15    <<<<<<< error    ntt2err: exit    nsprecv: error exit    nserror: entry    nserror: nsres: id=0, op=68, ns=12537, ns2=12560; nt[0]=507, nt[1]=0, nt[2]=0; ora[0]=0, ora[1]=0, ora[2]=0    nscon: error exit    nsdo: nsctxrnk=0    nsdo: error exit    nsinh_hoff: error recving request

可能引起問原因:

Several possible situations can cause this to happen:

    client changed its mind and closed the connection immediately after initiating it    client crashed    firewall kills the connection    some oracle timeout set on client解決方案:

Because the entry from listener.log contains only CONNECT_DATA and CID related information we need to check the client configuration for any sqlnet  timeouts:

possible timeouts in sqlnet.ora in client oracle home:    sqlnet.outbound_connect_time

    sqlnet.recv_timeout

    sqlnet.send_timeout

    tcp_connect_timeout

檢查用戶端目錄中sqlnet.ora的逾時設定,通常是這個引起的

possible timeout in client connect descriptor (hardcoded in client application or in client tnsnames.ora):    connect_timeout