天天看点

PostgreSQL的监控三(zabbix)

前面介绍的PostgreSQL监控工具都偏向于性能分析,没有告警功能。而且它们只是针对PostgreSQL的监视,有时需要监控整个业务相关的系统,这时候就要考虑通用的监控工具了。Linux下比较适合监控数据库的常用的工具有Nagios和Zabbix。Zabbix更容易使用,现在看上去也更被多数人看好,所以本文只介绍Zabbix监控PostgreSQL的方法。

PostgreSQL的监控三(zabbix)

Zabbix是一个all in one高度集成的企业级监控解决方案。由一个中心的Zabbix Server和若干可能安装有Zabbix Agent被监控设备构成,主要特性可概括为以下几点

数据采集

 支持agent和agent less(SNMP, IPMI, HTTP,FTP...)

 支持基于JMX对java应用的监视

 可灵活定制agent

数据存储

 数据库为PostgreSQL,Mysql,Oracle,SQLite或DB2

 可配置历史和趋势数据的保存时间

 内建旧清理程序防止数据膨胀

报警

 可定制报警阈值

 灵活设置报警方式,邮件,SMS,脚本

 支持报警升级

 报警消息可使用宏变量定制

可视化

 可定制的数据图形

 仪表盘

 地图

 所有配置都通过GUI编辑

大规模部署

 支持模板

 自动发现主机和监控项目

 通过Zabbix Proxy实现分布式部署

其他

 Zabbix API

 认证和访问控制

 IT资产收集

zabbix要想监视PostgreSQL这种应用型的对象,一般使用zabbix agent。zabbix agent有2种工作方式。

方式1:被动代理

由Zabbix Server(或Proxy)主动查询数据(如CPU负载),作为响应Zabbix Agent返回查询结果。这也是最简单最常用的方式。

方式2:主动代理

Zabbix Agent先从Zabbix Server获取需要主动报告的监控项目一览,然后定期发送新值到Zabbix Server。主动代理可以用于处理时间比较长的监控项,比如log 。

也可以使用Zabbix Trapper

方式3:Trapper

由Zabbix Agent主动报告数据。被监控端可调用zabbix_send命令或直接利用Socket发送数据到Zabbix Server。通过Trapper可以只在状态变更时进行报告。

另外还有把PostgreSQL状态通过SNMP代理发布的方案(http://pgsnmpd.projects.pgfoundry.org/),估计用的不多,本文不涉及。

详细参考:

https://www.zabbix.com/documentation/2.4/manual/concepts/agent

https://www.zabbix.com/documentation/2.4/manual/appendix/items/activepassive

https://www.zabbix.com/documentation/2.4/manual/config/items/itemtypes/trapper

Zabbix没有内置对PostgreSQL的监控项,所以如果要监控PostgreSQL需要做一些监控项的配置或定制,下面会介绍几种方法。

自己修改zabbix_agentd.conf的配置文件,在Zabbix agent上增加PostgreSQL相关的监控项,使用psql发SQL的方式获取PostgreSQL的性能数据。

zabbix_agentd.conf

#Get the PostgreSQL version

UserParameter=psql.version,psql --version|head -n1

#Get the total number of Server Processes that are active

UserParameter=psql.server_processes,psql -t -c "select sum(numbackends) from pg_stat_database"

#Get the total number of commited transactions

UserParameter=psql.tx_commited,psql -t -c "select sum(xact_commit) from pg_stat_database"

#Get the total number of rolled back transactions

UserParameter=psql.tx_rolledback,psql -t -c "select sum(xact_rollback) from pg_stat_database"

参照

https://www.zabbix.com/wiki/howto/monitor/db/postgresql

比如参照下面的例子

https://www.zabbix.com/forum/showthread.php?t=8009

zabbix_agentd.conf:

UserParameter=postgresql[*],/opt/zabbix/bin/zapost $1 $2

zapost:

#

# Name: zapost

# Checks PostgreSQL activity.

# Author: bashman

# Version: 1.0

zapostver="1.0"

rval=0

sql=""

case $1 in

#'summary')

# sql="select a.datname, pg_size_pretty(pg_database_size(a.datid)) as size, cast(blks_hit/(blks_read+blks_hit+0.000001)*100.0 as numeric(5,2)) as cache, cast(xact_commit/(xact_rollback+xact_commit+0.000001)*100.0 as numeric(5,2)) as success from pg_stat_database a order by a.datname"

# ;;

#'size')

        #comprobar aqui los parametros

# shift

# sql="select pg_database_size('$1') as size"

#'version')

# sql='select version()'

'totalsize')

        sql="select sum(pg_database_size(datid)) as total_size from pg_stat_database"

        ;;

'db_cache')

        # comprueba los parametros

        if [ ! -z $2 ]; then

        shift

            sql="select cast(blks_hit/(blks_read+blks_hit+0.000001)*100.0 as numeric(5,2)) as cache from pg_stat_database where datname = '$1'"

    fi

'db_success')

    if [ ! -z $2 ]; then

               sql="select cast(xact_commit/(xact_rollback+xact_commit+0.000001)*100.0 as numeric(5,2)) as success from pg_stat_database where datname = '$1'"

    ;;

'server_processes')

    sql="select sum(numbackends) from pg_stat_database"

'tx_commited')

    sql="select sum(xact_commit) from pg_stat_database"

'tx_rolledback')

    sql="select sum(xact_rollback) from pg_stat_database"

'db_size')

    # comprueba los parametros

        sql="select pg_database_size('$1')" #as size"

'db_connections')

            sql="select numbackends from pg_stat_database where datname = '$1'"

'db_returned')

        sql="select tup_returned from pg_stat_database where datname = '$1'"

'db_fetched')

            sql="select tup_fetched from pg_stat_database where datname = '$1'"

'db_inserted')

            shift

            sql="select tup_inserted from pg_stat_database where datname = '$1'"

'db_updated')

            sql="select tup_updated from pg_stat_database where datname = '$1'"

'db_deleted')

            sql="select tup_deleted from pg_stat_database where datname = '$1'"

'db_commited')

        sql="select xact_commit from pg_stat_database where datname = '$1'"

'db_rolled')

        sql="select xact_rollback from pg_stat_database where datname = '$1'"

'version')

    sql="version"

'zapostver')

        echo "$zapostver"

    exit $rval

*)

        echo "zapost version: $zapostver"

        echo "usage:"

    echo " $0 totalsize -- Check the total databases size."

    echo " $0 db_cache dbname> -- Check the database cache hit ratio (percentage)."

    echo " $0 db_success dbname> -- Check the database success rate (percentage)."

    echo " $0 server_processes -- Check the total number of Server Processes that are active."

    echo " $0 tx_commited -- Check the total number of commited transactions."

    echo " $0 tx_rolledback -- Check the total number of rolled back transactions."

    echo " $0 db_size dbname> -- Check the size of a Database (in bytes)."

    echo " $0 db_connections dbname> -- Check the number of active connections for a specified database."    

    echo " $0 db_returned dbname> -- Check the number of tuples returned for a specified database."

    echo " $0 db_fetched dbname> -- Check the number of tuples fetched for a specified database."

    echo " $0 db_inserted dbname> -- Check the number of tuples inserted for a specified database."

    echo " $0 db_updated dbname> -- Check the number of tuples updated for a specified database."

    echo " $0 db_deleted dbname> -- Check the number of tuples deleted for a specified database."

    echo " $0 db_commited dbname> -- Check the number of commited back transactions for a specified database."

    echo " $0 db_rolled dbname> -- Check the number of rolled back transactions for a specified database."

    echo " $0 version -- The PostgreSQL version."

    echo " $0 zapostver -- Version of this script."

        exit $rval

esac

if [ "$sql" != "" ]; then

    if [ "$sql" == "version" ]; then

        psql --version|head -n1

        rval=$?

    else

        psql -t -c "$sql"

fi

if [ "$rval" -ne 0 ]; then

      echo "ZBX_NOTSUPPORTED

和前面提到的2种方法相比,Postbix插件的功能更全面,它包含了PostgreSQL相关的监控项和图形的Zabbix模板。Postbix以一个的后台java deamon运行,这个deamon通过jdbc查询远端被监控数据库的状态然后以trap agent的方式发送到Zabbix Server。

Postbix来自http://www.smartmarmot.com/,除了Postbix该公司还有Orabbix,MySQLBix,这些工具的内部架构和使用方法基本相同,唯一的区别就是支持的被监控数据库不同。所以smartmarmot又推出了整合这几种数据库监控能力的DBforBIX。DBforBIX的内部结构和使用方法和Postbix基本相同,下面介绍一下DBforBIX的简单的使用例子。

1)下载dbforbix

http://www.smartmarmot.com/product/dbforbix/dbforbix-download/

2)在Zabbix Server上安装dbforbix

[root@zabbix ~]# mkdir /opt/dbforbix

[root@zabbix ~]# cd /opt/dbforbix

[root@zabbix dbforbix]# unzip /root/dbforbix-0.6.1.zip

[root@zabbix dbforbix]# cp /opt/dbforbix/init.d/dbforbix /etc/init.d/dbforbix

[root@zabbix dbforbix]# chmod +x /etc/init.d/dbforbix

[root@zabbix dbforbix]# chmod +x /opt/dbforbix/run.sh

[root@zabbix dbforbix]# chkconfig dbforbix on

3)导入dbforix的模板到Zabbix服务器

点击Zabbix GUI画面的"Configuration->Templates->Import"把下面的模板文件导入Zabbix服务器。

/opt/dbforbix/template/template_postgresql.xml

PostgreSQL的监控三(zabbix)

4)在被监控PostgreSQL实例上创建DBforBIX使用的账号并赋予权限

  CREATE USER zabbix WITH PASSWORD 'passw0rd';

  GRANT SELECT ON pg_stat_activity to zabbix;

  GRANT SELECT ON pg_database to zabbix;

  GRANT SELECT ON pg_authid to zabbix;

  GRANT SELECT ON pg_stat_bgwriter to zabbix;

  GRANT SELECT ON pg_locks to zabbix;

  GRANT SELECT ON pg_stat_database to zabbix

5)修改config.props

通过拷贝config.props.sample生成config.props, 然后修改config.props设置Zabbix Server的IP和端口号,设置被监控PostgreSQL数据库的访问账号

[root@zabbix dbforbix]# cp /opt/dbforbix/conf/config.props.sample /opt/dbforbix/conf/config.props

[root@zabbix dbforbix]# vi /opt/dbforbix/conf/config.props

ZabbixServerList=ZabbixServer

ZabbixServer.Address=IP_ADDRESS_OF_ZABBIX_SERVER

ZabbixServer.Port=PORT_OF_ZABBIX_SERVER

...

DBforBIX.PidFile=./logs/dbforbix.pid

DatabaseList=PGSQLDB2

PGSQLDB2.Url=jdbc:postgresql://host:port/database

PGSQLDB2.User=zabbix

PGSQLDB2.Password=passw0rd

注意:/opt/dbforbix/init.d/dbforbix有个Bug,本来应该从config.props中读取pid文件名的,结果写死了是dbforbix.pid,config.props.sample中的默认值又是orabix.pid。

6)修改pgsqlquery.props

通过拷贝pgsqlquery.props.sample生成pgsqlquery.props。pgsqlquery.props中定义了监控项目及对应的查询SQL,可以编辑QueryList对监控项做筛选。

[root@zabbix dbforbix]# cp /opt/dbforbix/conf/pgsqlquery.props.sample /opt/dbforbix/conf/pgsqlquery.props

看一下pgsqlquery.props.sample包含的内容

pgsqlquery.props.sample:

QueryList=activeconn,tupfetched,tupinserted,tupupdated,tupdeleted,xactcommit,xactrollback,exclusivelock,accessexclusivelock,accesssharelock,rowsharelock,rowexclusivelock,shareupdateexclusivelock,sharerowexclusivelock,checkpoints_timed,checkpoints_req,buffers_checkpoint,buffers_clean,maxwritten_clean,buffers_backend,buffers_alloc    

#statistic of database

activeconn.Query=select sum(numbackends) from pg_stat_database

tupreturned.Query=select sum(tup_returned) from pg_stat_database

tupfetched.Query=select sum(tup_fetched) from pg_stat_database

tupinserted.Query=select sum(tup_inserted) from pg_stat_database

tupupdated.Query=select sum(tup_updated) from pg_stat_database

tupdeleted.Query=select sum(tup_deleted) from pg_stat_database

xactcommit.Query=SELECT sum(xact_commit) FROM pg_stat_database

xactrollback.Query=SELECT sum(xact_rollback) FROM pg_stat_database

#locks

exclusivelock.Query=SELECT count(*) FROM pg_locks where mode='ExclusiveLock'

accessexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='AccessExclusiveLock'

accesssharelock.Query=SELECT count(*) FROM pg_locks where mode='AccessShareLock'

rowsharelock.Query=SELECT count(*) FROM pg_locks where mode='RowShareLock'

rowexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='RowExclusiveLock'

shareupdateexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='ShareUpdateExclusiveLock'

sharerowexclusivelock.Query=SELECT count(*) FROM pg_locks where mode='ShareRowExclusiveLock'

checkpoints_timed.Query=select checkpoints_timed from pg_stat_bgwriter

checkpoints_req.Query=select checkpoints_req from pg_stat_bgwriter

buffers_checkpoint.Query=select buffers_checkpoint from pg_stat_bgwriter

buffers_clean.Query=select buffers_clean from pg_stat_bgwriter

maxwritten_clean.Query=select maxwritten_clean from pg_stat_bgwriter

buffers_backend.Query=select buffers_backend from pg_stat_bgwriter

buffers_alloc.Query=select buffers_alloc from pg_stat_bgwriter

7)启动dbforbix deamon

[root@zabbix dbforbix]# /etc/init.d/dbforbix start

8) 在Zabbix Server上创建Host

点击Zabbix GUI画面的"Configuration->Hosts->Create Host"为被监控数据库创建一个专门的Host。"Host name"设置为“PGSQLDB2”( 必须和config.props的DatabaseList中的名称一致,这里是“PGSQLDB2”)。并且把Host“PGSQLDB2”链接到前面导入的模板"Template_PostgeSQL"。

PostgreSQL的监控三(zabbix)
PostgreSQL的监控三(zabbix)

9) 检查数据是否已被收集

点击Zabbix GUI画面的"Monitoring->Last data"检查数据是否已被收集。

PostgreSQL的监控三(zabbix)

参考

http://www.smartmarmot.com/wiki/index.php/DBforBIX

pg_monz是一套可以监控PostgreSQL的zabbix模板,通过定制的agent UserParameter监控PostgreSQL数据库,并且利用Zabbix的发现机制可以自动发现和监视数据库和表。

pg_monz由下面几个文件组成

文件

说明

pg_monz_template.xml

模版定义文件

userparameter_pgsql.conf

提供PostgreSQL监控项目的用户参数定义

find_dbname.sh

PG数据库的自动发现脚本

find_dbname_table.sh

PG数据表的自动发现脚本

要了解pg_monz支持哪些监控项目,看一下userparameter_pgsql.conf就可以了

userparameter_pgsql.conf:

点击(此处)折叠或打开

# PostgreSQL user parameter

# Server specific examples

# Get the total number of commited transactions

UserParameter=psql.tx_commited[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select sum(xact_commit) from pg_stat_database"

# Get the total number of rolled back transactions

UserParameter=psql.tx_rolledback[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select sum(xact_rollback) from pg_stat_database"

# Max Connections

UserParameter=psql.server_maxcon[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "show max_connections"

# PostgreSQL is running

UserParameter=psql.running[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select 1" > /dev/null 2>&1 ; echo $?

# Added by SRA OSS

# Get number of checkpoint count (by checkpoint_timeout)

UserParameter=psql.checkpoints_timed[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select checkpoints_timed from pg_stat_bgwriter"

# Get number of checkpoint count (by checkpoint_segments)

UserParameter=psql.checkpoints_req[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select checkpoints_req from pg_stat_bgwriter"

# Get the total number of connections

UserParameter=psql.server_connections[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(*) from pg_stat_activity;"

# Get the total number of active (on processing SQL) connections

UserParameter=psql.active_connections[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(state) from pg_stat_activity where state = 'active'"

# Get the total number of idle connections

UserParameter=psql.idle_connections[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(state) from pg_stat_activity where state = 'idle'"

# Get the total number of idle in transaction connections

UserParameter=psql.idle_tx_connections[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(state) from pg_stat_activity where state = 'idle in transaction'"

# Get the total number of lock-waiting connections

UserParameter=psql.locks_waiting[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(*) from pg_stat_activity where waiting = 't'"

# Get buffer information

UserParameter=psql.buffers_checkpoint[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select buffers_checkpoint from pg_stat_bgwriter"

UserParameter=psql.buffers_clean[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select buffers_clean from pg_stat_bgwriter"

UserParameter=psql.maxwritten_clean[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select maxwritten_clean from pg_stat_bgwriter"

UserParameter=psql.buffers_backend[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select buffers_backend from pg_stat_bgwriter"

UserParameter=psql.buffers_backend_fsync[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select buffers_backend_fsync from pg_stat_bgwriter"

UserParameter=psql.buffers_alloc[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select buffers_alloc from pg_stat_bgwriter"

# Get number of slow queries

UserParameter=psql.slow_queries[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(*) from pg_stat_activity where state = 'active' and now() - query_start > '$5 sec'::interval"

UserParameter=psql.slow_select_queries[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(*) from pg_stat_activity where state = 'active' and now() - query_start > '$5 sec'::interval and query ilike 'select%'"

UserParameter=psql.slow_dml_queries[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select count(*) from pg_stat_activity where state = 'active' and now() - query_start > '$5 sec'::interval and query ~* '^(insert|update|delete)'"

# Database specific examples

# Get the size of a Database (in bytes)

UserParameter=psql.db_size[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select pg_database_size('$5')"

# Get number of active connections for a specified database

UserParameter=psql.db_connections[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select numbackends from pg_stat_database where datname = '$5'"

# Get number of tuples returned for a specified database

UserParameter=psql.db_returned[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select tup_returned from pg_stat_database where datname = '$5'"

# Get number of tuples fetched for a specified database

UserParameter=psql.db_fetched[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select tup_fetched from pg_stat_database where datname = '$5'"

# Get number of tuples inserted for a specified database

UserParameter=psql.db_inserted[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select tup_inserted from pg_stat_database where datname = '$5'"

# Get number of tuples updated for a specified database

UserParameter=psql.db_updated[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select tup_updated from pg_stat_database where datname = '$5'"

# Get number of tuples deleted for a specified database

UserParameter=psql.db_deleted[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select tup_deleted from pg_stat_database where datname = '$5'"

# Get number of commited/rolled back transactions for a specified database

UserParameter=psql.db_tx_commited[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select xact_commit from pg_stat_database where datname = '$5'"

UserParameter=psql.db_tx_rolledback[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select xact_rollback from pg_stat_database where datname = '$5'"

# Cache Hit Ratio

UserParameter=psql.cachehit_ratio[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "SELECT round(blks_hit*100/(blks_hit+blks_read), 2) AS cache_hit_ratio FROM pg_stat_database WHERE datname = '$5' and blks_read > 0 union all select 0.00 AS cache_hit_ratio order by cache_hit_ratio desc limit 1"

# Get number of temp files

UserParameter=psql.db_temp_files[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select temp_files from pg_stat_database where datname = '$5'"

# Get temp file size (in bytes)

UserParameter=psql.db_temp_bytes[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select temp_bytes from pg_stat_database where datname = '$5'"

# Get percentage of dead tuples of all tables for a specified database

UserParameter=psql.db_dead_tup_ratio[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select round(sum(n_dead_tup)*100/sum(n_live_tup+n_dead_tup), 2) as dead_tup_ratio from pg_stat_all_tables where n_live_tup > 0"

# Get number of deadlocks for a specified database (9.2 or later)

UserParameter=psql.db_deadlocks[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select deadlocks from pg_stat_database where datname = '$5'"

# Table specific examples

# Get table cache hit ratio of a specific table

UserParameter=psql.table_cachehit_ratio[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select round(heap_blks_hit*100/(heap_blks_hit+heap_blks_read), 2) as cache_hit_ratio from pg_statio_user_tables where schemaname = '$5' and relname = '$6' and heap_blks_read > 0 union all select 0.00 as cache_hit_ratio order by cache_hit_ratio desc limit 1"

# Get number of sequencial scan of a specific table

UserParameter=psql.table_seq_scan[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select seq_scan from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

# Get number of index scan of a specific table

UserParameter=psql.table_idx_scan[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select coalesce(idx_scan,0) from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

# Get number of vacuum count of a specific table

UserParameter=psql.table_vacuum_count[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select vacuum_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

# Get number of analyze count of a specific table

UserParameter=psql.table_analyze_count[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select analyze_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

# Get number of autovacuum count of a specific table

UserParameter=psql.table_autovacuum_count[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select autovacuum_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

# Get number of autoanalyze count of a specific table

UserParameter=psql.table_autoanalyze_count[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select autoanalyze_count from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

# Get number of tuples of a specific table

UserParameter=psql.table_n_tup_ins[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_tup_ins from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

UserParameter=psql.table_n_tup_upd[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_tup_upd from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

UserParameter=psql.table_n_tup_del[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_tup_del from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

UserParameter=psql.table_seq_tup_read[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select seq_tup_read from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

UserParameter=psql.table_idx_tup_fetch[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select coalesce(idx_tup_fetch,0) from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

UserParameter=psql.table_n_tup_hot_upd[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_tup_hot_upd from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

UserParameter=psql.table_n_live_tup[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_live_tup from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

UserParameter=psql.table_n_dead_tup[*],psql -h $1 -p $2 -U $3 -d $4 -t -c "select n_dead_tup from pg_stat_user_tables where schemaname = '$5' and relname = '$6'"

# Discovery Rule

# Database Discovery

UserParameter=db.list.discovery[*],$5/find_dbname.sh $1 $2 $3 $4

UserParameter=db_table.list.discovery[*],$5/find_dbname_table.sh $1 $2 $3 $4

使用例:

1)下载pg_monz

https://github.com/pg-monz/pg_monz/releases

2)安装pg_monz

[root@zabbix ~]# tar xfz pg_monz-1.0.tar.gz

[root@zabbix ~]# cd pg_monz-1.0/pg_monz

[root@zabbix pg_monz]# cp find_dbname.sh find_dbname_table.sh /usr/local/bin/

[root@zabbix pg_monz]# cp userparameter_pgsql.conf /etc/zabbix/zabbix_agentd.d/

[root@zabbix pg_monz]# chmod +x /usr/local/bin/find_dbname.sh

[root@zabbix pg_monz]# chmod +x /usr/local/bin/find_dbname_table.sh

[root@zabbix pg_monz]# /etc/init.d/zabbix-agent restart

3)导入模板到Zabbix服务器

点击Zabbix GUI画面的"Configuration->Templates->Import"把模板文件pg_monz_template.xml导入Zabbix服务器。

PostgreSQL的监控三(zabbix)

4)设置模板中的宏

点击Zabbix GUI画面的"Configuration->Templates",再点击其中的"PostgreSQL Check"模板,然后点击"Marcos" Tab设置必要宏参数(尤其是连接相关的参数)。

PostgreSQL的监控三(zabbix)

5)在Zabbix Server上创建Host

点击Zabbix GUI画面的"Configuration->Hosts->Create Host"为被监控数据库所在主机创建一个Host,如果该主机的Host已存在也可使用已有Host。这个Host要设置Zabbix Agent,并且把该Host链接到前面导入的模板"PostgeSQL Check"。

PostgreSQL的监控三(zabbix)
PostgreSQL的监控三(zabbix)

6) 检查数据是否已被收集

PostgreSQL的监控三(zabbix)

http://pg-monz.github.io/pg_monz/index-en.html

以上的方法1和方法2都需要自己再进行定制,而DBforBIX和pg_monz已经比较成熟了。pg_monz和DBforBIX相比更简单,可监控的PostgreSQL项目也更多,还可以自动发现库和表;DBforBIX的优势则在于支持监控多种常用的数据库以及可以使用jdbc连接池。综合而言如果不需要监控多种数据库个人倾向于pg_monz。