[Original] Using the check_oracle_health plugin with Nagios

Environment: 192.168.1.1 (monitoring server)

             192.168.1.2 (monitored host, running the Oracle database)

1. Check whether Perl is installed on the monitored host, then install DBI on it.

Run perl -v; output like the following means Perl is installed:

This is perl, v5.8.8 built for x86_64-linux-thread-multi

Copyright 1987-2006, Larry Wall

Perl may be copied only under the terms of either the Artistic License or the

GNU General Public License, which may be found in the Perl 5 source kit.

Complete documentation for Perl, including FAQ lists, should be found on

this system using "man perl" or "perldoc perl".  If you have access to the

Download DBI, then unpack, build, and install it:

tar zxvf DBI-1.609.tar.gz

cd DBI-1.609

perl Makefile.PL

make all

make install
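
Before moving on, it may be worth confirming that Perl can actually load the freshly installed module. The following is a minimal sketch; the helper name perl_has_module is made up for illustration and is not part of DBI or Nagios.

```shell
#!/bin/sh
# Hypothetical helper (not part of DBI): succeeds only if Perl can load the named module.
perl_has_module() {
    # "perl -M<module> -e 1" exits non-zero when the module cannot be loaded
    perl -M"$1" -e 1 >/dev/null 2>&1
}

if perl_has_module DBI; then
    echo "DBI is installed"
else
    echo "DBI is missing or failed to load"
fi
```

The same helper can be reused after the next step by passing DBD::Oracle instead of DBI.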

2. If there were no errors, move on to installing DBD-Oracle.

tar zxvf DBD-Oracle-1.52.tar.gz

cd DBD-Oracle-1.52

perl Makefile.PL

If ORACLE_HOME is not set in your shell, perl Makefile.PL will fail with an error like this:

Using DBI 1.605 (for perl 5.008005 on i386-linux-thread-multi) installed in /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi/auto/DBI/

Configuring DBD::Oracle for perl 5.008005 on linux (i386-linux-thread-multi)

Remember to actually *READ* the README file! Especially if you have any problems.

Trying to find an ORACLE_HOME

Your LD_LIBRARY_PATH env var is set to ''

      The ORACLE_HOME environment variable is not set and I couldn't guess it.

      It must be set to hold the path to an Oracle installation directory

      on this machine (or a machine with a compatible architecture).

      See the appropriate README file for your OS for more information.

      ABORTED!

You then need to set ORACLE_HOME for the current shell session. Look up the value in your oracle user's environment and paste in a line like this:

export ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1

Re-run perl Makefile.PL, which should now succeed, then build and install the module:

make

make install
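
The ABORTED! error above can be caught before running perl Makefile.PL. The following is a minimal pre-flight sketch; the function name oracle_home_ok is hypothetical, and it only checks that the variable is set and points at a directory, not that the directory is a complete Oracle installation.

```shell
#!/bin/sh
# Hypothetical pre-flight check: is ORACLE_HOME set, and does it point at a directory?
oracle_home_ok() {
    if [ -z "${ORACLE_HOME:-}" ]; then
        echo "ORACLE_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "$ORACLE_HOME" ]; then
        echo "ORACLE_HOME=$ORACLE_HOME is not a directory" >&2
        return 1
    fi
    return 0
}

# Report the result; the build itself is left to the commands in the text.
if oracle_home_ok; then
    echo "ORACLE_HOME looks usable: $ORACLE_HOME"
fi
```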

3. The last step on the monitored host is installing the star of the show, check_oracle_health.

tar zxvf check_oracle_health-1.6.3.tar.gz

cd check_oracle_health-1.6.3

./configure --prefix=/usr/local/nagios --with-nagios-user=nagios --with-nagios-group=nagios --with-mymodules-dir=/usr/local/nagios/libexec --with-mymodules-dyndir=/usr/local/nagios/libexec

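In the configure command above, be sure to use your own Nagios installation path. Then build and install:

make

make install

Check that the check_oracle_health plugin now exists in /usr/local/nagios/libexec on the monitored host.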

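4. Switch to the oracle user and give the plugin a trial run:

/usr/local/nagios/libexec/check_oracle_health --connect=YOUR_ORACLE_SID --user=ORACLE_USER --password=ORACLE_PASSWORD --mode=tnsping

Output like the following means everything is working:

OK - connection established to YOUR_ORACLE_SID.

Alternatively, replace --mode=tnsping with --mode=tablespace-usage and check that the plugin can report on all tablespaces.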

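5. The plugin runs fine as the oracle user, but in this setup the check is run as root, so all of the oracle user's environment variables must also be set in root's environment. Repeat step 4 as root: if it succeeds, you are done; if it fails, the environment variables have not been set correctly.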
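
Carrying the oracle user's environment over to root can be sketched as a small function. This is only an illustration: the text says to copy all of the oracle user's variables, and the four set here (ORACLE_HOME, ORACLE_SID, LD_LIBRARY_PATH, PATH) are an assumption about which ones usually matter. The function name export_oracle_env is hypothetical; the path is the one used earlier and orcl is the SID that appears in the error output further down.

```shell
#!/bin/sh
# Hypothetical helper: export a typical set of Oracle client variables.
# Which variables your installation really needs is an assumption; compare
# with the output of "env" run as the oracle user.
export_oracle_env() {
    ORACLE_HOME=$1
    ORACLE_SID=$2
    # Prepend the Oracle library and binary directories, keeping any existing values
    LD_LIBRARY_PATH="$ORACLE_HOME/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    PATH="$ORACLE_HOME/bin:$PATH"
    export ORACLE_HOME ORACLE_SID LD_LIBRARY_PATH PATH
}

export_oracle_env /u01/app/oracle/product/10.2.0/db_1 orcl
echo "ORACLE_HOME=$ORACLE_HOME ORACLE_SID=$ORACLE_SID"
```

After running this in root's shell, step 4 can be repeated in the same session.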

6. The monitored host can now run the check locally. To let the monitoring server call the script, add the following to nrpe.cfg on the monitored host:

vi /usr/local/nagios/etc/nrpe.cfg

command[check_oracle_health]=/usr/local/nagios/libexec/check_oracle_health --connect=YOUR_ORACLE_SID --user=ORACLE_USER --password=ORACLE_PASSWORD --mode=tablespace-usage

Save and exit, then restart the nrpe service on the monitored host:

killall nrpe

/usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
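
If the remote check later fails, a quick look at nrpe.cfg can rule out a typo in the command definition. The following is a minimal sketch; nrpe_has_command is a hypothetical helper, and the default path is simply the one used in this article.

```shell
#!/bin/sh
# Hypothetical helper: does the given nrpe.cfg define command[<name>]=... ?
nrpe_has_command() {
    # $1 = path to nrpe.cfg, $2 = command name
    grep -q "^command\[$2]=" "$1"
}

NRPE_CFG=${NRPE_CFG:-/usr/local/nagios/etc/nrpe.cfg}
if [ -r "$NRPE_CFG" ] && nrpe_has_command "$NRPE_CFG" check_oracle_health; then
    echo "check_oracle_health is defined in $NRPE_CFG"
else
    echo "check_oracle_health not found in $NRPE_CFG"
fi
```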

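7. Next, test the plugin from the monitoring server:

/usr/local/nagios/libexec/check_nrpe -H MONITORED_HOST_IP -c check_oracle_health

If everything is working, this prints the usage of all your tablespaces; I won't list mine here.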
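
Besides the text it prints, a Nagios plugin reports its result through its exit code, following the standard plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The small helper below, whose name nagios_state_name is made up for this sketch, turns that code into a readable label when testing by hand.

```shell
#!/bin/sh
# Hypothetical helper: map a Nagios plugin exit code to its state name.
nagios_state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        3) echo UNKNOWN ;;
        *) echo "UNEXPECTED($1)" ;;
    esac
}

# Example with a fixed code; in practice pass "$?" right after the check_nrpe call.
nagios_state_name 2
```

Running the check_nrpe command from this step and then nagios_state_name $? shows which state Nagios would record for the service.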

8. On the monitoring server, edit /usr/local/nagios/etc/objects/services.cfg and add the following:

define service{

          host_name              DATABASE_IP_ADDRESS

          service_description    check-oracle-tablespace

          check_command          check_nrpe!check_oracle_health

          max_check_attempts     5

          normal_check_interval  3

          retry_check_interval   2

          check_period           24x7

          notification_interval  10

          notification_period    24x7

          notification_options   w,u,c,r

          contact_groups         sagroup

          }
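
Before restarting, it can help to let Nagios verify the configuration with its -v option, so that a typo in the new service definition does not stop the daemon from coming back up. The sketch below assumes the binary and main config file sit under the /usr/local/nagios prefix used throughout this article; the function name nagios_config_ok is hypothetical.

```shell
#!/bin/sh
# Assumed default locations under the /usr/local/nagios prefix; adjust to your install.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/local/nagios/bin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/usr/local/nagios/etc/nagios.cfg}

# Hypothetical helper: succeeds only if "nagios -v" accepts the configuration.
nagios_config_ok() {
    "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null 2>&1
}

if nagios_config_ok; then
    echo "configuration is valid"
else
    echo "verification failed; run $NAGIOS_BIN -v $NAGIOS_CFG to see the errors"
fi
```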

Save the file and restart Nagios. The monitored host's page in the web interface should now look like the screenshot below.

[Screenshot: the new service in the Nagios web interface]

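That is basically it!

The screenshot shows only one tablespace, but if you click through you will see the usage of every tablespace.

Problems encountered:

While helping a friend install this plugin, running it as the oracle user (step 4 above) produced the following: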

CRITICAL - cannot connect to orcl. install_driver(Oracle) failed: Can't load '/usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi/auto/DBD/Oracle/Oracle.so' for module DBD::Oracle: /u01/app/oracle/product/10.2.0/db_1/lib/libnnz10.so: cannot restore segment prot after reloc: Permission denied at /usr/lib/perl5/5.8.8/i386-linux-thread-multi/DynaLoader.pm line 230.

 at (eval 13) line 3

Compilation failed in require at (eval 13) line 3.

Perhaps a required shared library or dll isn't installed where expected

 at ./check_oracle_health line 4098

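After a lot of digging, the cause turned out to be that SELinux was enabled.

To switch SELinux off (permissive mode, until the next reboot), run: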

setenforce 0

After that, everything worked!
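
Note that setenforce 0 only changes the running mode and does not survive a reboot; the persistent setting lives in /etc/selinux/config, and getenforce prints the current mode. The sketch below reads the configured mode so you can tell whether the problem will return after a restart. The helper name selinux_config_mode is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the SELINUX= line from an SELinux config file.
selinux_config_mode() {
    # Defaults to the standard location when no file is given
    sed -n 's/^SELINUX=//p' "${1:-/etc/selinux/config}"
}

if [ -r /etc/selinux/config ]; then
    echo "mode configured for next boot: $(selinux_config_mode)"
fi
```

If this prints enforcing, the error above is likely to come back after the next reboot unless the config file is changed as well.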