天天看点

用pg_statsinfo监控PostgreSQL v8.3,v8.4,v9.0

pg_statsinfo的架构如下: 分为三个组件: 1. pg_statsinfo 部署在被监控的数据库端,用于采集数据库瞬间状态,过滤数据库csv日志,需要与repository DB通信. 2. pg_reporter 部署在HTML报告服务器上,需要与repository DB通信,与被监控的数据库通信(可选). 3. repository DB 用于存放pg_statsinfo发送过来的snapshot报告。被pg_reporter调用,生产HTML报告。 另外,非常强的一点是可以自己编写模板。 架构如图:

报告分为两类: 第一类是pg_statsinfo,需要在repo数据库安装pg_statsinfo支持. 第二类是schema,需要有连接到被监控数据库的配置。 下面是statsinfo报告的介绍:

1. Summary

name 5480307906522906617
hostname db-172-16-3-33.sky-mobi.com.hz
port 1921
pg_version 9.0beta2
snapshot begin 2010-06-08 18:04:52
snapshot end 2010-06-09 13:30:00
snapshot duration 19:25:09
total database size 5073 kB
total commits 18698
total rollbacks 2

2.Database Statistics

ID database MB +MB commit/s rollback/s hit% gets/s reads/s rows/s
1 postgres 4 0.267 0.000 99.900 17.772 0.016 95.099
2 test 26 26 0.047 0.000 99.800 23.219 0.043 82.867
用pg_statsinfo监控PostgreSQL v8.3,v8.4,v9.0
用pg_statsinfo监控PostgreSQL v8.3,v8.4,v9.0
用pg_statsinfo监控PostgreSQL v8.3,v8.4,v9.0
用pg_statsinfo监控PostgreSQL v8.3,v8.4,v9.0

Disk Usage

Disk Usage per Tablespace

ID tablespace location device used (MB) avail (MB) remain%
1 <pg_xlog> /database/pgdata/tbs2/pg_xlog 104:33 187 137594 99.864
2 pg_default /database/pgdata/tbs1/pg_root 104:17 74 137707 99.946
3 pg_global /database/pgdata/tbs1/pg_root 104:17 74 137707 99.946
4 tbs_test /database/pgdata/tbs4/tbs_test 104:65 86 137695 99.937

Long Transactions

ID pid client address when to start duration (sec) query

Notable Tables

Heavily Updated Tables

ID database schema table INSERT UPDATE DELETE total HOT%
1 test test tbl_test 620075 620075
2 test pg_toast pg_toast_2619 12 6 18
3 test pg_catalog pg_attribute 7 7
4 test pg_catalog pg_shdepend 4 4
5 postgres pg_catalog pg_shdepend 4 4
6 test pg_catalog pg_depend 3 3
7 test pg_catalog pg_statistic 1 2 3 50.000
8 test pg_catalog pg_type 2 2
9 test pg_catalog pg_namespace 1 1
10 postgres pg_catalog pg_tablespace 1 1
11 postgres pg_catalog pg_database 1 1
12 test pg_catalog pg_authid 1 1
13 test pg_catalog pg_database 1 1
14 postgres pg_catalog pg_authid 1 1
15 test pg_catalog pg_tablespace 1 1
16 test pg_catalog pg_class 1 1
17 postgres pg_catalog pg_shdescription
18 postgres pg_catalog pg_foreign_data_wrapper
19 postgres pg_catalog pg_proc
20 postgres pg_catalog pg_user_mapping

Heavily Accessed Tables

ID database schema table seq_scan seq_tup_read tup_per_seq hit%

Low Density Tables

ID database schema table rows dead rows pages rows per page
1 test test tbl_test 620075 2744 225.975

Fragmented Tables

ID database schema table column correlation

Checkpoint Activity

total checkpoints 66
checkpoints by time 64
checkpoints by xlog
avg written buffers 42.500
max written buffers 2336.000
avg duration (sec) 3.721
max duration (sec) 149.940

Autovacuum Activity

ID database schema table count avg index scans avg removed rows avg remain rows avg duration (sec) max duration (sec)

Query Activity

Functions

ID funcid name name funcname calls total time (ms) self time (ms) time/call (ms)

Statements

ID user database query calls total time (sec) time/call (sec)
10 postgres postgres SELECT statsinfo.sample() 14034 0.316 0.000
16 postgres postgres SELECT * FROM statsinfo.tablespaces 283 0.082 0.000
19 postgres postgres SELECT * FROM statsinfo.activity() 283 0.019 0.000
1 postgres postgres SELECT d.oid AS dbid, d.datname, pg_database_size(d.oid), age(d.datfrozenxid), pg_stat_get_db_xact_commit(d.oid) AS xact_commit, pg_stat_get_db_xact_rollback(d.oid) AS xact_rollback, pg_stat_get_db_blocks_fetched(d.oid) - pg_stat_get_db_blocks_hit(d.oid) AS blks_read, pg_stat_get_db_blocks_hit(d.oid) AS blks_hit, pg_stat_get_db_tuples_returned(d.oid) AS tup_returned, pg_stat_get_db_tuples_fetched(d.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(d.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(d.oid) AS tup_updated, pg_stat_get_db_tuples_deleted(d.oid) AS tup_deleted FROM pg_database d WHERE datallowconn AND datname <> ALL (('{' || $1 || '}')::text[]) ORDER BY 1 254 2.688 0.011

Setting Parameters

ID name setting source
1 TimeZone PRC command line
2 checkpoint_segments 32 configuration file
3 default_statistics_target 1000 configuration file
4 default_text_search_config pg_catalog.english configuration file
5 effective_cache_size 1024000 configuration file
6 lc_messages C configuration file
7 listen_addresses * configuration file
8 log_autovacuum_min_duration 60000 configuration file
9 log_checkpoints on configuration file
10 log_destination csvlog override
11 log_directory /var/applog/pg_log configuration file
12 log_lock_waits on configuration file
13 log_statement ddl configuration file
14 log_timezone PRC command line
15 log_truncate_on_rotation on configuration file
16 logging_collector on override
17 max_connections 1500 configuration file
18 max_stack_depth 8192 configuration file
19 pg_statsinfo.excluded_dbnames template0,template1 configuration file
20 pg_statsinfo.repository_server hostaddr=172.16.3.39 port=1921 database=repo user=statsrepo → dbname=repo host=172.16.3.39 port=1921 user=statsrepo configuration file
21 random_page_cost 2 configuration file
22 server_encoding UTF8 override
23 shared_buffers 192000 configuration file
24 shared_preload_libraries pg_statsinfo,pg_stat_statements configuration file
25 stats_temp_directory /database/pgdata/tbs3/pg_stat_tmp configuration file
26 superuser_reserved_connections 13 configuration file
27 timezone_abbreviations Default command line
28 track_functions pl configuration file
29 wal_buffers 256 configuration file
30 wal_sync_method open_sync configuration file

Schema Information

Tables

ID database schema table columns row width MB +MB table scans index scans
1 test test tbl_test 1 4 21 21

Indexes

ID database schema index table MB +MB scans rows/scan reads hits keys

配置非常简单,下面简单的介绍一下配置时的注意事项: 安装需求: PostgreSQL 版本 PostgreSQL 8.3, 8.4, 9.0 操作系统 RHEL 5.3, CentOS 5.3, Windows XP 连接消耗 1 每个被监控的机器需要消耗1个repo DB连接. 限制: 1. 被监控系统的encoding and lc_messages必须相同 2. 被监控系统的pg_statsrepo.textlog_filename名字必须固定,建议所有监控系统一致. 3. log_timezone 参数必须设置为 unknown, gmt, or utc 4. 错误日志记录 fast或immediate关闭时,错误日志不被pg_statsinfo解析. 5. 不能采集到shutdown的checkpoint 如果repoDB与被监控的数据库是同一个集群,可能采集不到.

被监控数据库维护 1.-- 删除服务端日志pg_log 2.-- 手工生成snapshot psql -d postgres -U postgres -c "SELECT statsinfo.snapshot('comment')" 3.-- 回旋日志文件 psql -d postgres -U postgres -c "SELECT pg_rotate_logfile()" 4.-- 重启异常进程(会造成僵死进程) psql -d postgres -U postgres -c "SELECT statsinfo.restart()"

repo数据库维护 1. Delete Snapshots psql -d <repository> -c "SELECT statsrepo.del_snapshot('2010-02-01 07:00:00');"