
Oracle RAC errors encountered; INS-35423: the installer cannot obtain the cluster nodes when installing the database

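Over the past two days I installed RAC in a production environment. The installation succeeded in the end, although I ran into quite a few small problems along the way. Today a colleague rebooted the servers, and after that CRS would not start.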

Database: Oracle 11.2.0.1. Operating system: Oracle Linux 6.3.

Below is my record of the error and how I handled it.

If anyone knows a better way to handle this, please let me know!

Symptom: after the servers were rebooted, CRS would not start.

    [~]$ crs_stat -t -v
    CRS-0184: Cannot communicate with the CRS daemon.
    [~]$ crsctl start crs
    CRS-4563: Insufficient user privileges.
    CRS-4000: Command Start failed, or completed with errors.
    [~]$ crsctl check crs
    CRS-4639: Could not contact Oracle High Availability Services
    CRS-4000: Command Start failed, or completed with errors.

According to what I found online, this is a bug in database version 11.2.0.1.

    SQL> select * from v$version;
    BANNER
    --------------------------------------------------------------------------------
    Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
    PL/SQL Release 11.2.0.1.0 - Production
    CORE    11.2.0.1.0      Production
    TNS for Linux: Version 11.2.0.1.0 - Production
    NLSRTL Version 11.2.0.1.0 - Production

My fix: run the following as root on both nodes at the same time.

    [bin]# dd if=/var/tmp/.oracle/npohasd of=/dev/null bs=1024 count=1

Check the ohasd log:

    [log]$ tail -f /u01/11.2.0/grid/log/rac1/ohasd/ohasd.log
    2013-08-14 17:08:18.621: [UiServer][3376371456] S(0x7f1f7c000b70): Accepted client connection: saddr =(ADDRESS=(PROTOCOL=ipc)(DEV=29)(KEY=OHASD_UI_SOCKET))daddr = (ADDRESS=(PROTOCOL=ipc)(KEY=OHASD_UI_SOCKET))
    2013-08-14 17:08:18.632: [UiServer][3378472704] processMessage called
    2013-08-14 17:08:18.632: [UiServer][3378472704] Sending message to PE. ctx= 0x7f1f7c002f20
    2013-08-14 17:08:18.632: [UiServer][3378472704] Sending command to PE: 43
    2013-08-14 17:08:18.632: [   CRSPE][3382675200] Processing PE command id=144. Description: [Stat Resource : 0x7f1f8407d710]
    2013-08-14 17:08:18.633: [   CRSPE][3382675200] Expression Filter : (((NAME == ora.crsd) OR (NAME == ora.cssd)) OR (NAME == ora.evmd))
    2013-08-14 17:08:18.636: [   CRSPE][3382675200] PE Command [ Stat Resource : 0x7f1f8407d710 ] has completed
    2013-08-14 17:08:18.636: [   CRSPE][3382675200] UI Command [Stat Resource : 0x7f1f8407d710] is replying to sender.
    2013-08-14 17:08:18.636: [UiServer][3378472704] Done for ctx=0x7f1f7c002f20
    2013-08-14 17:08:18.642: [UiServer][3376371456] Closed: remote end failed/disc.
    2013-08-14 18:41:19.320: [ default][3993569056] OHASD Daemon Starting. Command string :reboot
    2013-08-14 18:41:19.320: [ default][3600992032] OHASD Daemon Starting. Command string :reboot

Then restart the CRS services:

    [log]$ crs_start -all
    CRS-5702: Resource 'ora.DATA.dg' is already running on 'rac1'
    CRS-5702: Resource 'ora.LISTENER.lsnr' is already running on 'rac1'
    CRS-5702: Resource 'ora.LISTENER_SCAN1.lsnr' is already running on 'rac1'
    CRS-5702: Resource 'ora.OCR.dg' is already running on 'rac1'
    CRS-5702: Resource 'ora.RCY.dg' is already running on 'rac1'
    CRS-5702: Resource 'ora.VOTE.dg' is already running on 'rac1'
    CRS-5702: Resource 'ora.asm' is already running on 'rac1'
    CRS-5702: Resource 'ora.eons' is already running on 'rac1'
    CRS-5702: Resource 'ora.gsd' is already running on 'rac1'
    CRS-5702: Resource 'ora.net1.network' is already running on 'rac1'
    CRS-5702: Resource 'ora.oc4j' is already running on 'rac1'
    CRS-5702: Resource 'ora.ons' is already running on 'rac1'
    CRS-5702: Resource 'ora.asm' is already running on 'rac1'
    CRS-5702: Resource 'ora.LISTENER.lsnr' is already runnin
    ......
    [log]$ crsctl check crs
    CRS-4638: Oracle High Availability Services is online
    CRS-4537: Cluster Ready Services is online
    CRS-4529: Cluster Synchronization Services is online
    CRS-4533: Event Manager is online

All services are back to normal:

    [log]$ crs_stat -t
    Name           Type           Target    State     Host
    ------------------------------------------------------------
    ora.DATA.dg    ora....up.type ONLINE    ONLINE    rac1
    ora....ER.lsnr ora....er.type ONLINE    ONLINE    rac1
    ora....N1.lsnr ora....er.type ONLINE    ONLINE    rac1
    ora.OCR.dg     ora....up.type ONLINE    ONLINE    rac1
    ora.RCY.dg     ora....up.type ONLINE    ONLINE    rac1
    ora.VOTE.dg    ora....up.type ONLINE    ONLINE    rac1
    ora.asm        ora.asm.type   ONLINE    ONLINE    rac1
    ora.eons       ora.eons.type  ONLINE    ONLINE    rac1
    ora.gsd        ora.gsd.type   ONLINE    ONLINE    rac1
    ora....network ora....rk.type ONLINE    ONLINE    rac1
    ora.oc4j       ora.oc4j.type  ONLINE    ONLINE    rac1
    ora.ons        ora.ons.type   ONLINE    ONLINE    rac1
    ora.orcl.db    ora....se.type ONLINE    ONLINE    rac1
    ora....SM1.asm application    ONLINE    ONLINE    rac1
    ora....C1.lsnr application    ONLINE    ONLINE    rac1
    ora.rac1.gsd   application    ONLINE    ONLINE    rac1
    ora.rac1.ons   application    ONLINE    ONLINE    rac1
    ora.rac1.vip   ora....t1.type ONLINE    ONLINE    rac1
    ora....SM2.asm application    ONLINE    ONLINE    rac2
    ora....C2.lsnr application    ONLINE    ONLINE    rac2
    ora.rac2.gsd   application    ONLINE    ONLINE    rac2
    ora.rac2.ons   application    ONLINE    ONLINE    rac2
    ora.rac2.vip   ora....t1.type ONLINE    ONLINE    rac2
    ora.scan1.vip  ora....ip.type ONLINE    ONLINE    rac1
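The before-and-after comparison of `crsctl check crs` above can also be done by a script. The following is only a minimal sketch in Python: the function name `summarize_crs_check` and the rule "a line counts as healthy only if it ends with `is online`" are my own assumptions, based solely on the two outputs shown in this post, not on any official Oracle specification of the message format.

```python
import re

# One line of `crsctl check crs` output, for example:
# "CRS-4638: Oracle High Availability Services is online"
CRS_LINE = re.compile(r"^(CRS-\d+):\s*(.+)$")


def summarize_crs_check(output: str) -> dict:
    """Split `crsctl check crs` output into online and problem messages.

    Assumption (hypothetical rule): a message is healthy only when it
    ends with "is online"; every other CRS-nnnn line is a problem.
    """
    online, problems = [], []
    for raw in output.splitlines():
        match = CRS_LINE.match(raw.strip())
        if not match:
            continue  # ignore shell prompts and blank lines
        code, message = match.groups()
        if message.rstrip(".").endswith("is online"):
            online.append((code, message))
        else:
            problems.append((code, message))
    # Healthy means: at least one online line and no problem lines.
    healthy = bool(online) and not problems
    return {"online": online, "problems": problems, "healthy": healthy}


# Sample text copied from the failing check earlier in this post.
SAMPLE_FAILED = (
    "CRS-4639: Could not contact Oracle High Availability Services\n"
    "CRS-4000: Command Start failed, or completed with errors.\n"
)

if __name__ == "__main__":
    result = summarize_crs_check(SAMPLE_FAILED)
    for code, message in result["problems"]:
        print(code, message)
```

In practice the text would come from running `crsctl check crs` on each node and capturing its output; the sketch deliberately does not run the command itself.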

INS-35423: the installer cannot obtain the cluster nodes when installing the database

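The operating system is Red Hat Linux 6.4, with Oracle Grid 11.2.0.4.0 already installed. I then started installing the matching version of the database software as the oracle user. On the Grid Installation Options screen (Oracle Database 11g Release 2 Installer, Step 4 of 10), the cluster node list was empty and error INS-35423 was reported, as shown in the figure below: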

After checking the configuration and finding nothing wrong, I searched online and found the following solution on an overseas forum:

The cluster nodes could not be obtained because inventory.xml, an XML file in the inventory directory under the grid user's ORACLE_BASE, was incorrect. Here is the file:

    [ContentsXML]# cat inventory.xml
    <?xml version="1.0" standalone="yes" ?>
    <!-- Copyright (c) 1999, 2013, Oracle and/or its affiliates.
    All rights reserved. -->
    <!-- Do not modify the contents of this file by hand. -->
    <INVENTORY>
    <VERSION_INFO>
       <SAVED_WITH>11.2.0.4.0</SAVED_WITH>
       <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
    </VERSION_INFO>
    <HOME_LIST>
    <HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0/grid" TYPE="O" IDX="1">   ------ CRS="true" is missing here
       <NODE_LIST>
          <NODE NAME="qjdb1"/>
          <NODE NAME="qjdb2"/>
       </NODE_LIST>
    </HOME>
    </HOME_LIST>
    <COMPOSITEHOME_LIST>
    </COMPOSITEHOME_LIST>
    </INVENTORY>
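Spotting the missing attribute by eye is easy to get wrong, so the check can be scripted. This is only a sketch under my own assumptions: the function name `find_crs_homes` is hypothetical, and the only rule it applies is the one described in this post, namely that the Grid home's `HOME` element should carry `CRS="true"`. It reads the file and never modifies it.

```python
import xml.etree.ElementTree as ET


def find_crs_homes(inventory_xml: str) -> list:
    """Return (NAME, LOC) for every HOME element marked CRS="true".

    An empty result matches the situation in this post: no home in
    the inventory is flagged as the cluster (Grid) home.
    """
    root = ET.fromstring(inventory_xml)
    return [
        (home.get("NAME"), home.get("LOC"))
        for home in root.iter("HOME")
        if home.get("CRS", "").lower() == "true"
    ]


# Trimmed copy of the inventory.xml shown above (CRS attribute missing).
SAMPLE_INVENTORY = """<?xml version="1.0" standalone="yes" ?>
<INVENTORY>
<HOME_LIST>
<HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0/grid" TYPE="O" IDX="1">
   <NODE_LIST>
      <NODE NAME="qjdb1"/>
      <NODE NAME="qjdb2"/>
   </NODE_LIST>
</HOME>
</HOME_LIST>
</INVENTORY>
"""

if __name__ == "__main__":
    homes = find_crs_homes(SAMPLE_INVENTORY)
    if not homes:
        print('No HOME entry is marked CRS="true"')
```

To check a real system, the text would be read from inventory.xml in the ContentsXML directory of the inventory (in this post, under /u01/app/oraInventory) and passed to the function.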

Fix it by running the following command:

    [~]$ /u01/app/11.2.0/grid/oui/bin/runInstaller -updateNodeList ORACLE_HOME="/u01/app/11.2.0/grid" CRS=true
    Starting Oracle Universal Installer...

    Checking swap space: must be greater than 500 MB.   Actual 32000 MB    Passed
    The inventory pointer is located at /etc/oraInst.loc
    The inventory is located at /u01/app/oraInventory
    'UpdateNodeList' was successful.

After the command completes successfully, the contents of inventory.xml have changed to what we expect:

    [ContentsXML]# cat inventory.xml
    <?xml version="1.0" standalone="yes" ?>
    <!-- Copyright (c) 1999, 2013, Oracle and/or its affiliates.
    All rights reserved. -->
    <!-- Do not modify the contents of this file by hand. -->
    <INVENTORY>
    <VERSION_INFO>
       <SAVED_WITH>11.2.0.4.0</SAVED_WITH>
       <MINIMUM_VER>2.1.0.6.0</MINIMUM_VER>
    </VERSION_INFO>
    <HOME_LIST>
    <HOME NAME="Ora11g_gridinfrahome1" LOC="/u01/app/11.2.0/grid" TYPE="O" IDX="1" CRS="true">
       <NODE_LIST>
          <NODE NAME="qjdb1"/>
          <NODE NAME="qjdb2"/>
       </NODE_LIST>
    </HOME>
    </HOME_LIST>
    <COMPOSITEHOME_LIST>
    </COMPOSITEHOME_LIST>
    </INVENTORY>

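Note that inventory.xml should only be updated through the command. If you edit inventory.xml by hand, you may find that even the database installer window will not open, for example because the runInstaller command can no longer run.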
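Since the file should only be changed through the installer, a script that automates the fix should call the same command rather than rewrite the XML. The sketch below only assembles the argument list. The function name `build_update_node_list_command` is hypothetical, and the arguments are exactly those used in this post (`-updateNodeList`, `ORACLE_HOME=...`, `CRS=true`); I have not verified them against other Oracle versions.

```python
import posixpath
import shlex


def build_update_node_list_command(grid_home: str) -> list:
    """Build the runInstaller argument list used in this post.

    Only the Grid home path varies; every other argument is copied
    from the command shown above.
    """
    installer = posixpath.join(grid_home, "oui", "bin", "runInstaller")
    return [
        installer,
        "-updateNodeList",
        f"ORACLE_HOME={grid_home}",
        "CRS=true",
    ]


if __name__ == "__main__":
    command = build_update_node_list_command("/u01/app/11.2.0/grid")
    # Print the command for review instead of executing it.
    print(shlex.join(command))
```

Printing the command instead of running it is deliberate: the change affects the inventory, so it seems safer to review the command first and then run it by hand as the Grid software owner.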