pg_statsinfo的架构如下: 分为三个组件: 1. pg_statsinfo 部署在被监控的数据库端,用于采集数据库瞬间状态,过滤数据库csv日志,需要与repository DB通信. 2. pg_reporter 部署在HTML报告服务器上,需要与repository DB通信,与被监控的数据库通信(可选). 3. repository DB 用于存放pg_statsinfo发送过来的snapshot报告。被pg_reporter调用,生产HTML报告。 另外,非常强的一点是可以自己编写模板。 架构如图:
报告分为两类: 第一类是pg_statsinfo,需要在repo数据库安装pg_statsinfo支持. 第二类是schema,需要有连接到被监控数据库的配置。 下面是statsinfo报告的介绍:
1. Summary name 5480307906522906617 hostname db-172-16-3-33.sky-mobi.com.hz port 1921 pg_version 9.0beta2 snapshot begin 2010-06-08 18:04:52 snapshot end 2010-06-09 13:30:00 snapshot duration 19:25:09 total database size 5073 kB total commits 18698 total rollbacks 2
2.Database Statistics ID database MB +MB commit/s rollback/s hit% gets/s reads/s rows/s 1 postgres 4 0.267 0.000 99.900 17.772 0.016 95.099 2 test 26 26 0.047 0.000 99.800 23.219 0.043 82.867
用pg_statsinfo监控PostgreSQL v8.3,v8.4,v9.0 用pg_statsinfo监控PostgreSQL v8.3,v8.4,v9.0 用pg_statsinfo监控PostgreSQL v8.3,v8.4,v9.0 用pg_statsinfo监控PostgreSQL v8.3,v8.4,v9.0 Disk Usage Disk Usage per Tablespace ID tablespace location device used (MB) avail (MB) remain% 1 <pg_xlog> /database/pgdata/tbs2/pg_xlog 104:33 187 137594 99.864 2 pg_default /database/pgdata/tbs1/pg_root 104:17 74 137707 99.946 3 pg_global /database/pgdata/tbs1/pg_root 104:17 74 137707 99.946 4 tbs_test /database/pgdata/tbs4/tbs_test 104:65 86 137695 99.937
Long Transactions ID pid client address when to start duration (sec) query
Notable Tables Heavily Updated Tables ID database schema table INSERT UPDATE DELETE total HOT% 1 test test tbl_test 620075 620075 2 test pg_toast pg_toast_2619 12 6 18 3 test pg_catalog pg_attribute 7 7 4 test pg_catalog pg_shdepend 4 4 5 postgres pg_catalog pg_shdepend 4 4 6 test pg_catalog pg_depend 3 3 7 test pg_catalog pg_statistic 1 2 3 50.000 8 test pg_catalog pg_type 2 2 9 test pg_catalog pg_namespace 1 1 10 postgres pg_catalog pg_tablespace 1 1 11 postgres pg_catalog pg_database 1 1 12 test pg_catalog pg_authid 1 1 13 test pg_catalog pg_database 1 1 14 postgres pg_catalog pg_authid 1 1 15 test pg_catalog pg_tablespace 1 1 16 test pg_catalog pg_class 1 1 17 postgres pg_catalog pg_shdescription 18 postgres pg_catalog pg_foreign_data_wrapper 19 postgres pg_catalog pg_proc 20 postgres pg_catalog pg_user_mapping
Heavily Accessed Tables ID database schema table seq_scan seq_tup_read tup_per_seq hit%
Low Density Tables ID database schema table rows dead rows pages rows per page 1 test test tbl_test 620075 2744 225.975
Fragmented Tables ID database schema table column correlation
Checkpoint Activity total checkpoints 66 checkpoints by time 64 checkpoints by xlog avg written buffers 42.500 max written buffers 2336.000 avg duration (sec) 3.721 max duration (sec) 149.940
Autovacuum Activity ID database schema table count avg index scans avg removed rows avg remain rows avg duration (sec) max duration (sec)
Query Activity Functions ID funcid name name funcname calls total time (ms) self time (ms) time/call (ms)
Statements ID user database query calls total time (sec) time/call (sec) 10 postgres postgres SELECT statsinfo.sample() 14034 0.316 0.000 16 postgres postgres SELECT * FROM statsinfo.tablespaces 283 0.082 0.000 19 postgres postgres SELECT * FROM statsinfo.activity() 283 0.019 0.000 1 postgres postgres SELECT d.oid AS dbid, d.datname, pg_database_size(d.oid), age(d.datfrozenxid), pg_stat_get_db_xact_commit(d.oid) AS xact_commit, pg_stat_get_db_xact_rollback(d.oid) AS xact_rollback, pg_stat_get_db_blocks_fetched(d.oid) - pg_stat_get_db_blocks_hit(d.oid) AS blks_read, pg_stat_get_db_blocks_hit(d.oid) AS blks_hit, pg_stat_get_db_tuples_returned(d.oid) AS tup_returned, pg_stat_get_db_tuples_fetched(d.oid) AS tup_fetched, pg_stat_get_db_tuples_inserted(d.oid) AS tup_inserted, pg_stat_get_db_tuples_updated(d.oid) AS tup_updated, pg_stat_get_db_tuples_deleted(d.oid) AS tup_deleted FROM pg_database d WHERE datallowconn AND datname <> ALL (('{' || $1 || '}')::text[]) ORDER BY 1 254 2.688 0.011
Setting Parameters ID name setting source 1 TimeZone PRC command line 2 checkpoint_segments 32 configuration file 3 default_statistics_target 1000 configuration file 4 default_text_search_config pg_catalog.english configuration file 5 effective_cache_size 1024000 configuration file 6 lc_messages C configuration file 7 listen_addresses * configuration file 8 log_autovacuum_min_duration 60000 configuration file 9 log_checkpoints on configuration file 10 log_destination csvlog override 11 log_directory /var/applog/pg_log configuration file 12 log_lock_waits on configuration file 13 log_statement ddl configuration file 14 log_timezone PRC command line 15 log_truncate_on_rotation on configuration file 16 logging_collector on override 17 max_connections 1500 configuration file 18 max_stack_depth 8192 configuration file 19 pg_statsinfo.excluded_dbnames template0,template1 configuration file 20 pg_statsinfo.repository_server hostaddr=172.16.3.39 port=1921 database=repo user=statsrepo → dbname=repo host=172.16.3.39 port=1921 user=statsrepo configuration file 21 random_page_cost 2 configuration file 22 server_encoding UTF8 override 23 shared_buffers 192000 configuration file 24 shared_preload_libraries pg_statsinfo,pg_stat_statements configuration file 25 stats_temp_directory /database/pgdata/tbs3/pg_stat_tmp configuration file 26 superuser_reserved_connections 13 configuration file 27 timezone_abbreviations Default command line 28 track_functions pl configuration file 29 wal_buffers 256 configuration file 30 wal_sync_method open_sync configuration file
Schema Information Tables ID database schema table columns row width MB +MB table scans index scans 1 test test tbl_test 1 4 21 21
Indexes ID database schema index table MB +MB scans rows/scan reads hits keys
配置非常简单,下面简单的介绍一下配置时的注意事项: 安装需求: PostgreSQL 版本 PostgreSQL 8.3, 8.4, 9.0 操作系统 RHEL 5.3, CentOS 5.3, Windows XP 连接消耗 1 每个被监控的机器需要消耗1个repo DB连接. 限制: 1. 被监控系统的encoding and lc_messages必须相同 2. 被监控系统的pg_statsrepo.textlog_filename名字必须固定,建议所有监控系统一致. 3. log_timezone 参数必须设置为 unknown, gmt, or utc 4. 错误日志记录 fast或immediate关闭时,错误日志不被pg_statsinfo解析. 5. 不能采集到shutdown的checkpoint 如果repoDB与被监控的数据库是同一个集群,可能采集不到.
被监控数据库维护 1.-- 删除服务端日志pg_log 2.-- 手工生成snapshot psql -d postgres -U postgres -c "SELECT statsinfo.snapshot('comment')" 3.-- 回旋日志文件 psql -d postgres -U postgres -c "SELECT pg_rotate_logfile()" 4.-- 重启异常进程(会造成僵死进程) psql -d postgres -U postgres -c "SELECT statsinfo.restart()"
repo数据库维护 1. Delete Snapshots psql -d <repository> -c "SELECT statsrepo.del_snapshot('2010-02-01 07:00:00');"