
Building a PostgreSQL Cluster with Pgpool-II: Background and Steps

OS: CentOS 7

PostgreSQL version: 10.10

Pgpool-II version: 4.0.6

Ports

  9999: client-facing service

  9000: watchdog

  9898: PCP

  9694: watchdog heartbeat

Note: in the steps below, anything wrapped in 【】 must be replaced with values from your own environment.

Background

  The project requires database high availability. PostgreSQL itself supports data replication and manual failover only: in a two-node hot-standby PostgreSQL cluster, when the primary goes down the cluster can still serve reads but not writes, and normal service is only restored after manually running the promote command to promote the standby to primary. From the PostgreSQL documentation:

PostgreSQL does not provide the system software required to identify a failure on the primary and notify the standby database server.

  Pgpool-II achieves automatic failover through its watchdog's virtual IP and health-check features. This article builds the cluster in master-slave mode rather than replication_mode, because online recovery is restricted in replication_mode. From the Pgpool-II documentation:

There is a restriction in the online recovery in replication_mode. If Pgpool-II itself is installed on multiple hosts, online recovery does not work correctly, because Pgpool-II has to stop all the clients during the 2nd stage of online recovery. If there are several Pgpool-II hosts, only one of them will have received the online recovery command and will block the connections from clients.

  For partition tolerance, this article uses three nodes, each running both Pgpool-II and a PostgreSQL instance; the databases form one primary with two standbys.

  Note: Pgpool-II and PostgreSQL use different terms for the lead node. Pgpool-II calls its lead node the master, while PostgreSQL calls its lead node the primary; both use standby for the remaining nodes.

Steps

1 Install and configure PostgreSQL

  Install PostgreSQL on all nodes, but configure and start it only on node 1; the databases on nodes 2 and 3 are initialized and started later through pcp_recovery_node. Do not enable PostgreSQL to start at boot. See the PostgreSQL installation guide.

2 Configure /etc/hosts

  Add name-resolution entries for recovery_1st_stage to use

【IP of node 1】 【hostname of node 1】
【IP of node 2】 【hostname of node 2】
【IP of node 3】 【hostname of node 3】
           

3 Configure passwordless SSH

  The failover.sh and follow_master.sh scripts used by Pgpool's failover configuration, and the recovery_1st_stage and pgpool_remote_start scripts used by online recovery, all perform remote operations over SSH.

  Passwordless SSH is needed one-way from the root user to the postgres users (including the local postgres user), and both ways between the postgres users of all nodes.

3.1 Set a password for the postgres user

su
passwd postgres
           

3.2 Generate a key pair on the current node

  Do not set a passphrase.

su 【current user】
ssh-keygen -t rsa
           

3.3 Copy the public key to the other nodes

ssh-copy-id -i ~/.ssh/id_rsa.pub 【username】@【IP】
           

3.4 Test SSH

ssh 【username】@【IP】
           

4 Adjust the sshd configuration (optional; the default of 10 sessions is usually enough)

vi /etc/ssh/sshd_config
           

Change

MaxSessions 100
           

Restart sshd

systemctl restart sshd
           

5 Configure system logging

  Create the log file. It should be created under /var/log, otherwise syslog cannot write to it (unless SELinux is disabled or configured accordingly).

mkdir /var/log/pgpool
touch /var/log/pgpool/pgpool_run.log
           

Edit the rsyslog configuration

vi /etc/rsyslog.conf
           

Change

*.info;mail.none;authpriv.none;cron.none;LOCAL1.none  /var/log/messages
LOCAL1.*                                        /var/log/pgpool/pgpool_run.log
           

Restart rsyslog

systemctl restart rsyslog
           

6 Configure .pgpass

  So that the database's postgres user can perform streaming replication and online recovery without supplying a password, create a .pgpass file in the home directory of the operating-system postgres user on every node, with permissions 600.

su postgres
vi ~/.pgpass
           

Edit

# hostname:port:database:username:password
# In a standby server, a database field of replication matches streaming replication connections made to the master server.
【IP of node 1】:5432:postgres:postgres:123456
【IP of node 2】:5432:postgres:postgres:123456
【IP of node 3】:5432:postgres:postgres:123456
           

Set permissions

chmod 600 ~/.pgpass
           
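libpq refuses to use a .pgpass that is readable by group or others (it warns and skips the file), so a wrong mode shows up only as unexplained password prompts later. A quick self-check, assuming GNU coreutils stat:

```shell
# Warn when .pgpass is too permissive; libpq will ignore such a file.
PGPASS="$HOME/.pgpass"
if [ -f "$PGPASS" ]; then
    perms=$(stat -c %a "$PGPASS")
    [ "$perms" = "600" ] || echo "warning: $PGPASS has mode $perms, expected 600" >&2
fi
```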

7 Install Pgpool-II

  Run as root; install and configure Pgpool-II on all nodes. Online recovery needs the pgpool_recovery, pgpool_remote_start, and pgpool_switch_xlog functions, so the pgpool_recovery extension must be built and installed as well.

  Note: building pgpool_recovery requires postgresql-devel to be installed first.

mkdir /usr/local/pgpool
cd /usr/local/pgpool
tar -zxvf pgpool-II-4.0.6.tar.gz
cd /usr/local/pgpool/pgpool-II-4.0.6
./configure --with-pgsql=/usr/local/pgsql
make && make install
cd /usr/local/pgpool/pgpool-II-4.0.6/src/sql/pgpool-recovery
chown -R postgres /usr/local/pgsql
make && make install
cp /usr/local/etc/pgpool.conf.sample-stream /usr/local/etc/pgpool.conf
           

8 Install the pgpool_recovery extension on node 1's PostgreSQL

  PostgreSQL creates a new database by copying an existing one; by default it copies the standard system database named template1, so installing the extension there makes it available to databases created later as well.

psql template1 -c "CREATE EXTENSION pgpool_recovery"
psql postgres -c "CREATE EXTENSION pgpool_recovery"
           

9 Write the failover script

Create the script on all nodes and make it executable

vi /usr/local/pgpool/failover_stream.sh
           

Edit the script (note how its arguments line up with the failover_command parameter in pgpool.conf): if the primary fails, promote a standby; if a standby fails, only log it.

#!/bin/bash
# This script is run by failover_command.

set -o xtrace
logger -i -p local1.info failover_command begin

# Special values:
#   %d = node id
#   %h = host name
#   %p = port number
#   %D = database cluster path
#   %m = new master node id
#   %H = hostname of the new master node
#   %M = old master node id
#   %P = old primary node id
#   %r = new master port number
#   %R = new master database cluster path
#   %% = '%' character

FAILED_NODE_ID="$1"
FAILED_NODE_HOST="$2"
FAILED_NODE_PORT="$3"
FAILED_NODE_PGDATA="$4"
NEW_MASTER_NODE_ID="$5"
NEW_MASTER_NODE_HOST="$6"
OLD_MASTER_NODE_ID="$7"
OLD_PRIMARY_NODE_ID="$8"
NEW_MASTER_NODE_PORT="$9"
NEW_MASTER_NODE_PGDATA="${10}"

PGHOME=/usr/local/pgsql

logger -i -p local1.info failover.sh: start: failed_node_id=${FAILED_NODE_ID} old_primary_node_id=${OLD_PRIMARY_NODE_ID} \
    failed_host=${FAILED_NODE_HOST} new_master_host=${NEW_MASTER_NODE_HOST}

## Test passwordless SSH
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@${NEW_MASTER_NODE_HOST} -i ~/.ssh/id_rsa ls /tmp > /dev/null

if [ $? -ne 0 ]; then
    logger -i -p local1.error failover.sh: passwordless SSH to postgres@${NEW_MASTER_NODE_HOST} failed. Please setup passwordless SSH.
    exit 1
fi

# If standby node is down, skip failover.
if [ ${FAILED_NODE_ID} -ne ${OLD_PRIMARY_NODE_ID} ]; then
    logger -i -p local1.info failover.sh: Standby node is down. Skipping failover.
    exit 0
fi

# Promote standby node.
logger -i -p local1.info failover.sh: Primary node is down, promote standby node postgres@${NEW_MASTER_NODE_HOST}.

ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
postgres@${NEW_MASTER_NODE_HOST} -i ~/.ssh/id_rsa ${PGHOME}/bin/pg_ctl -D ${NEW_MASTER_NODE_PGDATA} -w promote

if [ $? -ne 0 ]; then
    logger -i -p local1.error failover.sh: new_master_host=${NEW_MASTER_NODE_HOST} promote failed
    exit 1
fi

logger -i -p local1.info failover.sh: end: new_master_node_id=$NEW_MASTER_NODE_ID started as the primary node
exit 0
           

Set permissions

chmod +x /usr/local/pgpool/failover_stream.sh
           
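The promote-or-skip decision inside failover_stream.sh boils down to one comparison between the failed node id and the old primary node id. Extracted as a standalone function, it can be sanity-checked without a cluster (the node ids below are made up):

```shell
# failover_stream.sh promotes only when the failed node *was* the primary;
# the failure of a standby is just logged. The check, isolated:
should_promote() {
    local failed_node_id="$1" old_primary_node_id="$2"
    [ "$failed_node_id" -eq "$old_primary_node_id" ]
}

should_promote 0 0 && echo "primary failed: promote"   # prints "primary failed: promote"
should_promote 2 0 || echo "standby failed: skip"      # prints "standby failed: skip"
```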

10 Write the follow_master script

  This script makes the still-running standbys resynchronize from the new primary; it is required when the cluster has three or more nodes.

vi /usr/local/pgpool/follow_master.sh
           

Edit the script

#!/bin/bash
# This script is run after failover_command to synchronize the Standby with the new Primary.

set -o xtrace
logger -i -p local1.info follow_master begin

# special values:  %d = node id
#                  %h = host name
#                  %p = port number
#                  %D = database cluster path
#                  %m = new master node id
#                  %M = old master node id
#                  %H = new master node host name
#                  %P = old primary node id
#                  %R = new master database cluster path
#                  %r = new master port number
#                  %% = '%' character
FAILED_NODE_ID="$1"
FAILED_NODE_HOST="$2"
FAILED_NODE_PORT="$3"
FAILED_NODE_PGDATA="$4"
NEW_MASTER_NODE_ID="$5"
NEW_MASTER_NODE_HOST="$6"
OLD_MASTER_NODE_ID="$7"
OLD_PRIMARY_NODE_ID="$8"
NEW_MASTER_NODE_PORT="$9"
NEW_MASTER_NODE_PGDATA="${10}"

PGHOME=/usr/local/pgsql
ARCHIVEDIR=/home/pg/data/archivedir
REPL_USER=postgres
PCP_USER=postgres
PGPOOL_PATH=/usr/local/bin
PCP_PORT=9898


# Recover the standby from the new primary
logger -i -p local1.info follow_master.sh: start: synchronize the Standby node postgres@${FAILED_NODE_HOST} with the new Primary node postgres@${NEW_MASTER_NODE_HOST}

## Test passwordless SSH
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@${NEW_MASTER_NODE_HOST} -i ~/.ssh/id_rsa ls /tmp > /dev/null

if [ $? -ne 0 ]; then
    logger -i -p local1.error follow_master.sh: passwordless SSH to postgres@${NEW_MASTER_NODE_HOST} failed. Please setup passwordless SSH.
    exit 1
fi

## Get PostgreSQL major version
PGVERSION=`${PGHOME}/bin/initdb -V | awk '{print $3}' | sed 's/\..*//' | sed 's/\([0-9]*\)[a-zA-Z].*/\1/'`

if [ ${PGVERSION} -ge 12 ]; then
    RECOVERYCONF=${FAILED_NODE_PGDATA}/myrecovery.conf
else
    RECOVERYCONF=${FAILED_NODE_PGDATA}/recovery.conf
fi

# Check the status of standby
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
    postgres@${FAILED_NODE_HOST} -i ~/.ssh/id_rsa ${PGHOME}/bin/pg_ctl -w -D ${FAILED_NODE_PGDATA} status

## If Standby is running, run pg_basebackup.
if [ $? -eq 0 ]; then

    # Execute pg_basebackup
    ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@${FAILED_NODE_HOST} -i ~/.ssh/id_rsa "

        set -o errexit
        ${PGHOME}/bin/pg_ctl -w -m f -D ${FAILED_NODE_PGDATA} stop

        rm -rf ${FAILED_NODE_PGDATA}
        rm -rf ${ARCHIVEDIR}/*

        ${PGHOME}/bin/pg_basebackup -h ${NEW_MASTER_NODE_HOST} -U ${REPL_USER} -p ${NEW_MASTER_NODE_PORT} -D ${FAILED_NODE_PGDATA} -X stream

        if [ ${PGVERSION} -ge 12 ]; then
            sed -i -e \"\\\$ainclude_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'\" \
                   -e \"/^include_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'/d\" ${FAILED_NODE_PGDATA}/postgresql.conf
        fi
      
        cat > ${RECOVERYCONF} << EOT
primary_conninfo = 'host=${NEW_MASTER_NODE_HOST} port=${NEW_MASTER_NODE_PORT} user=${REPL_USER} passfile=''/home/postgres/.pgpass'''
recovery_target_timeline = 'latest'
restore_command = 'scp ${NEW_MASTER_NODE_HOST}:${ARCHIVEDIR}/%f %p'
EOT

        if [ ${PGVERSION} -ge 12 ]; then
            touch ${FAILED_NODE_PGDATA}/standby.signal
        else
            echo \"standby_mode = 'on'\" >> ${RECOVERYCONF}
        fi
    "

    if [ $? -ne 0 ]; then
        logger -i -p local1.error follow_master.sh: end: pg_basebackup failed
        exit 1
    fi

    # start Standby node on ${FAILED_NODE_HOST}
    ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null \
            postgres@${FAILED_NODE_HOST} -i ~/.ssh/id_rsa $PGHOME/bin/pg_ctl -l /dev/null -w -D ${FAILED_NODE_PGDATA} start

    # If start Standby successfully, attach this node
    if [ $? -eq 0 ]; then

        # Run pcp_attach_node to attach Standby node to Pgpool-II.
        ${PGPOOL_PATH}/pcp_attach_node -w -h localhost -U $PCP_USER -p ${PCP_PORT} -n ${FAILED_NODE_ID}

        if [ $? -ne 0 ]; then
            logger -i -p local1.error follow_master.sh: end: pcp_attach_node failed
            exit 1
        fi

    # If start Standby failed, drop replication slot "${FAILED_NODE_HOST}"
    else
        logger -i -p local1.error follow_master.sh: end: follow master command failed
        exit 1
    fi

else
    logger -i -p local1.info follow_master.sh: failed_node_id=${FAILED_NODE_ID} is not running. skipping follow master command
    exit 0
fi

logger -i -p local1.info follow_master.sh: end: follow master command complete
exit 0
           

Set permissions

chmod +x /usr/local/pgpool/follow_master.sh
           
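follow_master.sh (and recovery_1st_stage below) derive the PostgreSQL major version from `initdb -V` with the same awk/sed pipeline. Isolated into a function, the pipeline can be exercised offline against sample version strings (the inputs below are illustrative):

```shell
# Extract the major version exactly the way the scripts do:
# take the 3rd word of "initdb -V" output, strip everything after the first
# dot, then strip any trailing alpha suffix (e.g. "12beta2" -> 12).
major_of() {
    echo "$1" | awk '{print $3}' | sed 's/\..*//' | sed 's/\([0-9]*\)[a-zA-Z].*/\1/'
}

major_of "initdb (PostgreSQL) 10.10"    # 10
major_of "initdb (PostgreSQL) 12beta2"  # 12
```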

11 Write the recovery_1st_stage script

  pcp_recovery_node invokes recovery_1st_stage and then pgpool_remote_start.

  recovery_1st_stage uses pg_basebackup to copy data from the current primary, initializing the standby's data directory and writing its recovery configuration.

vi /home/pg/data/recovery_1st_stage.sh
           

Edit

#!/bin/bash
# This script is executed by "recovery_1st_stage" to recover a Standby node.

set -o xtrace
logger -i -p local1.info recovery_1st_stage begin

PRIMARY_NODE_PGDATA="$1"
DEST_NODE_HOST="$2"
DEST_NODE_PGDATA="$3"
PRIMARY_NODE_PORT="$4"
DEST_NODE_PORT=5432

# $(hostname) resolves to this node's hostname, which is why the /etc/hosts entries are required
PRIMARY_NODE_HOST=$(hostname)
PGHOME=/usr/local/pgsql
ARCHIVEDIR=/home/pg/data/archivedir
REPL_USER=postgres

logger -i -p local1.info recovery_1st_stage: start: pg_basebackup for Standby node postgres@${DEST_NODE_HOST}

## Test passwordless SSH
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@${DEST_NODE_HOST} -i ~/.ssh/id_rsa ls /tmp > /dev/null

if [ $? -ne 0 ]; then
    logger -i -p local1.error recovery_1st_stage: passwordless SSH to postgres@${DEST_NODE_HOST} failed. Please setup passwordless SSH.
    exit 1
fi

## Get PostgreSQL major version
PGVERSION=`${PGHOME}/bin/initdb -V | awk '{print $3}' | sed 's/\..*//' | sed 's/\([0-9]*\)[a-zA-Z].*/\1/'`
if [ $PGVERSION -ge 12 ]; then
    RECOVERYCONF=${DEST_NODE_PGDATA}/myrecovery.conf
else
    RECOVERYCONF=${DEST_NODE_PGDATA}/recovery.conf
fi

## Execute pg_basebackup to recovery Standby node
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@$DEST_NODE_HOST -i ~/.ssh/id_rsa "

    set -o errexit

    rm -rf $DEST_NODE_PGDATA
    rm -rf $ARCHIVEDIR/*

    ${PGHOME}/bin/pg_basebackup -h ${PRIMARY_NODE_HOST} -U ${REPL_USER} -p ${PRIMARY_NODE_PORT} -D ${DEST_NODE_PGDATA} -X stream

    if [ ${PGVERSION} -ge 12 ]; then
        sed -i -e \"\\\$ainclude_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'\" \
               -e \"/^include_if_exists = '$(echo ${RECOVERYCONF} | sed -e 's/\//\\\//g')'/d\" ${DEST_NODE_PGDATA}/postgresql.conf
    fi

    cat > ${RECOVERYCONF} << EOT
primary_conninfo = 'host=${PRIMARY_NODE_HOST} port=${PRIMARY_NODE_PORT} user=${REPL_USER} passfile=''/home/postgres/.pgpass'''
recovery_target_timeline = 'latest'
restore_command = 'scp ${PRIMARY_NODE_HOST}:${ARCHIVEDIR}/%f %p'
EOT

    if [ ${PGVERSION} -ge 12 ]; then
        touch ${DEST_NODE_PGDATA}/standby.signal
    else
        echo \"standby_mode = 'on'\" >> ${RECOVERYCONF}
    fi

    sed -i \"s/#*port = .*/port = ${DEST_NODE_PORT}/\" ${DEST_NODE_PGDATA}/postgresql.conf
"

if [ $? -ne 0 ]; then
    logger -i -p local1.error recovery_1st_stage: end: pg_basebackup failed. online recovery failed
    exit 1
fi

logger -i -p local1.info recovery_1st_stage: end: recovery_1st_stage complete
exit 0
           

Set permissions

chmod +x /home/pg/data/recovery_1st_stage.sh
           
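The trickiest lines in these scripts are the doubly escaped sed calls that maintain the include_if_exists entry in postgresql.conf on PostgreSQL 12+. Unwrapped from the quoted ssh payload and run against a scratch file, the same sed deletes any stale include_if_exists line and appends a fresh one after the last line ($a). A self-contained demonstration, assuming GNU sed:

```shell
# Scratch file standing in for ${DEST_NODE_PGDATA}/postgresql.conf
conf=$(mktemp)
echo "max_connections = 100" > "$conf"

RECOVERYCONF=/home/pg/data/myrecovery.conf
# Escape the slashes so the path is usable inside a /.../ regex
esc=$(echo "$RECOVERYCONF" | sed -e 's/\//\\\//g')

# Delete any stale include_if_exists line, then append the current one at the
# end of the file. Running this twice still leaves exactly one such line.
sed -i -e "\$ainclude_if_exists = '$esc'" \
       -e "/^include_if_exists = '$esc'/d" "$conf"

cat "$conf"
# max_connections = 100
# include_if_exists = '/home/pg/data/myrecovery.conf'
rm -f "$conf"
```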

12 Write the pgpool_remote_start script (note: no .sh extension)

pgpool_remote_start starts the remote database instance

vi /home/pg/data/pgpool_remote_start
           

Edit

#!/bin/bash
# This script is run after recovery_1st_stage to start Standby node.

set -o xtrace
logger -i -p local1.info pgpool_remote_start begin

PGHOME=/usr/local/pgsql
DEST_NODE_HOST="$1"
DEST_NODE_PGDATA="$2"


logger -i -p local1.info pgpool_remote_start: start: remote start Standby node postgres@$DEST_NODE_HOST

## Test passwordless SSH
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@${DEST_NODE_HOST} -i ~/.ssh/id_rsa ls /tmp > /dev/null

if [ $? -ne 0 ]; then
    logger -i -p local1.error pgpool_remote_start: passwordless SSH to postgres@${DEST_NODE_HOST} failed. Please setup passwordless SSH.
    exit 1
fi

## Start Standby node
ssh -T -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null postgres@$DEST_NODE_HOST -i ~/.ssh/id_rsa "
$PGHOME/bin/pg_ctl -l /dev/null -w -D $DEST_NODE_PGDATA start
"

if [ $? -ne 0 ]; then
    logger -i -p local1.error pgpool_remote_start: postgres@$DEST_NODE_HOST start failed.
    exit 1
fi

logger -i -p local1.info pgpool_remote_start: end: postgres@$DEST_NODE_HOST started successfully.
exit 0
           

Set permissions

chmod +x /home/pg/data/pgpool_remote_start
           

13 Configure pgpool.conf

Copy the Pgpool-II sample configuration

cp /usr/local/etc/pgpool.conf.sample-stream /usr/local/etc/pgpool.conf
vi /usr/local/etc/pgpool.conf
           

Change

#CONNECTIONS
listen_addresses = '*'
port = 9999
pcp_listen_addresses = '*'
pcp_port = 9898

#- Backend Connection Settings -

backend_hostname0 = '【IP of the 1st node; configure identically on every node】'
backend_port0 = 5432
backend_weight0 = 1
backend_data_directory0 = '/home/pg/data'
backend_flag0 = 'ALLOW_TO_FAILOVER'

backend_hostname1 = '【IP of the 2nd node; configure identically on every node】'
backend_port1 = 5432
backend_weight1 = 1
backend_data_directory1 = '/home/pg/data'
backend_flag1 = 'ALLOW_TO_FAILOVER'

backend_hostname2 = '【IP of the 3rd node; configure identically on every node】'
backend_port2 = 5432
backend_weight2 = 1
backend_data_directory2 = '/home/pg/data'
backend_flag2 = 'ALLOW_TO_FAILOVER'

#- Authentication -
enable_pool_hba = on
pool_passwd ='pool_passwd'

#------------------------------------------------------------------------------
# POOLS
#------------------------------------------------------------------------------

# - Concurrent session and pool size -

num_init_children = 400
max_pool = 4

#------------------------------------------------------------------------------
# LOGS
#------------------------------------------------------------------------------
log_destination = 'syslog'
syslog_facility = 'LOCAL1'

#------------------------------------------------------------------------------
# FILE LOCATIONS
#------------------------------------------------------------------------------

pid_file_name = '/usr/local/pgpool/pgpool.pid'

#------------------------------------------------------------------------------
# REPLICATION MODE
#------------------------------------------------------------------------------
replication_mode = off
#------------------------------------------------------------------------------
# LOAD BALANCING MODE
#------------------------------------------------------------------------------
load_balance_mode = on
#------------------------------------------------------------------------------
# MASTER/SLAVE MODE
#------------------------------------------------------------------------------
master_slave_mode = on
master_slave_sub_mode = 'stream'
# - Streaming -
sr_check_period = 5
sr_check_user = 'postgres'
sr_check_password ='123456'
sr_check_database = 'postgres'
follow_master_command = '/usr/local/pgpool/follow_master.sh %d %h %p %D %m %H %M %P %r %R'
#------------------------------------------------------------------------------
#HEALTH CHECK
#------------------------------------------------------------------------------

health_check_period = 10
health_check_timeout = 20
health_check_user = 'postgres'
health_check_password ='123456' 
health_check_database = 'postgres'

#------------------------------------------------------------------------------
#FAILOVER AND FAILBACK
#------------------------------------------------------------------------------
# The arguments after failover_stream.sh must match the positional parameters the script reads
failover_command = '/usr/local/pgpool/failover_stream.sh %d %h %p %D %m %H %M %P %r %R'

#------------------------------------------------------------------------------
#WATCHDOG
#------------------------------------------------------------------------------
use_watchdog = on
wd_hostname = '【IP of this node】'
wd_port = 9000
delegate_IP = '【virtual IP】'
if_cmd_path = '/sbin'
if_up_cmd = 'ip addr add $_IP_$/24 dev 【NIC name (see ip addr)】 label 【NIC name (see ip addr)】:0'
if_down_cmd = 'ip addr del $_IP_$/24 dev 【NIC name (see ip addr)】'
arping_path = '/usr/sbin'
arping_cmd = 'arping -U $_IP_$ -w 1'
wd_heartbeat_port = 9694
wd_heartbeat_keepalive = 2
wd_heartbeat_deadtime = 30
heartbeat_destination0 = '【IP of other node 1】'
heartbeat_destination_port0 = 9694
heartbeat_device0 = '【NIC name (see ip addr)】'
heartbeat_destination1 = '【IP of other node 2】'
heartbeat_destination_port1 = 9694
heartbeat_device1 = '【NIC name (see ip addr)】'

#------------------------------------------------------------------------------
# ONLINE RECOVERY
#------------------------------------------------------------------------------

recovery_user = 'postgres'
recovery_password = '123456'
recovery_1st_stage_command = 'recovery_1st_stage.sh'
#- Other pgpool Connection Settings -

other_pgpool_hostname0 = '【IP of other node 1】'
other_pgpool_port0 = 9999
other_wd_port0 = 9000
other_pgpool_hostname1 = '【IP of other node 2】'
other_pgpool_port1 = 9999
other_wd_port1 = 9000
           
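A constraint worth checking when sizing the pools above: per the Pgpool-II documentation, up to num_init_children * max_pool connections can be opened against each backend, so that product must stay within max_connections minus superuser_reserved_connections on the PostgreSQL side (double the requirement if query cancellation is needed). A quick arithmetic check; the max_connections value here is an assumed example, substitute your postgresql.conf setting:

```shell
num_init_children=400
max_pool=4
max_connections=2000                 # assumed postgresql.conf value
superuser_reserved_connections=3     # PostgreSQL default
needed=$((num_init_children * max_pool))
avail=$((max_connections - superuser_reserved_connections))
if [ "$needed" -le "$avail" ]; then
    echo "ok: $needed <= $avail"
else
    echo "raise max_connections: pgpool may open $needed connections per backend" >&2
fi
```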

14 Generate pool_passwd

/usr/local/bin/pg_md5 -p --md5auth --username=postgres pool_passwd
chown postgres:postgres /usr/local/etc/pool_passwd
           

After this, a pool_passwd file appears under /usr/local/etc; this is the file referenced by the pool_passwd option in pgpool.conf.
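pool_passwd stores md5-auth entries of the form username:md5<hash>, where the hash is MD5 over the password concatenated with the username, the same scheme PostgreSQL uses for md5 passwords. Assuming that format, the entry pg_md5 writes can be reproduced with coreutils; 123456/postgres are the example credentials used throughout this article:

```shell
# md5(password || username), as used by PostgreSQL md5 auth and pool_passwd.
password=123456
username=postgres
hash=$(echo -n "${password}${username}" | md5sum | awk '{print $1}')
echo "${username}:md5${hash}"
```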

15 Configure pool_hba.conf

  This is the client-authentication configuration file; when pgpool.conf sets "enable_pool_hba = on", client connections are authenticated against it.

Run as root

cp /usr/local/etc/pool_hba.conf.sample /usr/local/etc/pool_hba.conf
chown postgres:postgres /usr/local/etc/pool_hba.conf
vi /usr/local/etc/pool_hba.conf
           

Change

local all all md5
host all all 0.0.0.0/0 md5
host all all 0/0 md5
           

16 Configure pcp.conf

  Pgpool-II provides an interface for administrative operations such as reading Pgpool-II status or terminating Pgpool-II processes remotely. pcp.conf is the username/password file this interface uses for authentication, and it must be set up in every operation mode. The database user that PCP uses for health checks and replication-lag checks must have its username and password declared in pcp.conf.

16.1 Generate the password digest

/usr/local/bin/pg_md5 【password】
           
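Unlike pool_passwd, the digest in pcp.conf is a plain MD5 of the password alone, with no username mixed in, so pg_md5 and coreutils md5sum agree (123456 is the example password used in this article):

```shell
# Plain md5 of the password: the right-hand side of a pcp.conf entry.
echo -n '123456' | md5sum | awk '{print $1}'
# e10adc3949ba59abbe56e057f20f883e
```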

16.2 Create and edit the file

cp /usr/local/etc/pcp.conf.sample /usr/local/etc/pcp.conf
vi /usr/local/etc/pcp.conf
           

Edit

# Example:
postgres:【digest generated in the previous step】
           

17 Set up the PCP password file

  Because the follow_master_command script must run PCP commands without entering a password, create .pcppass in the home directory of the user that starts Pgpool-II.

echo 'localhost:9898:【database username】:【database password】' > ~/.pcppass
chmod 600 ~/.pcppass
           

18 Allow the postgres user to run ip and arping with root privileges

Run as root

chmod u+s /sbin/ip
chmod u+s /sbin/arping
           

19 Configure start at boot

vi /etc/systemd/system/pgpool.service
           

Edit

[Unit]
Description=Pgpool-II
After=syslog.target network.target

[Service]
Type=forking

User=root

ExecStart=/usr/local/bin/pgpool -C -D
ExecStop=/usr/local/bin/pgpool -m fast stop
ExecReload=/usr/local/bin/pgpool reload
LimitNOFILE=65536
KillMode=process
KillSignal=SIGINT
Restart=on-abnormal
RestartSec=30s
TimeoutSec=0

[Install]
WantedBy=multi-user.target
           

Enable at boot

systemctl enable pgpool
           

20 Start Pgpool-II on all nodes

systemctl start pgpool
           

21 Initialize the standby databases with pcp_recovery_node

  pcp_recovery_node first runs SELECT pgpool_recovery('recovery_1st_stage', '【IP of the standby to start】', '【data directory】', '5432', 【node id】) on the primary node (check the PostgreSQL log for error details), then executes recovery_1st_stage.sh followed by pgpool_remote_start

pcp_recovery_node -h 【IP of the primary】 -p 9898 -U 【PCP (database) username】 -n 【node id】
           

22 Check node status

22.1 Pgpool node status

pcp_watchdog_info -h 【virtual IP】 -p 9898 -U 【Pgpool user】
           

22.2 Database node status

Open a PostgreSQL command line through port 9999 on the virtual IP and run show pool_nodes

Note: if you use Navicat's terminal feature, you may need to close and reopen the terminal (starting a new session) to see the correct node status

23 High-availability tests

23.1 Pgpool HA test

  • Stop the pgpool service on the master: one of the standbys takes over as master
  • Restart pgpool on the stopped former master: its status becomes standby
  • Stop pgpool on a standby: its status becomes shutdown
  • Restart that standby: its status returns to standby

23.2 Database HA test

  • Stop the database service on the primary: one standby is promoted to primary, and the other standbys run follow_master to resynchronize from the new primary
  • Run pcp_recovery_node to bring the stopped former primary back
  • Stop the database service on a standby: the other nodes are unaffected
  • Run pcp_recovery_node to bring the stopped standby back

24 Load-balancing test

  Use pgbench to test load balancing

24.1 Initialize test data

pgbench -i -h 【IP of the primary】 -p 5432 postgres
           

24.2 Run the test

  The -S flag runs pgbench's built-in SELECT-only script, i.e. a read-only test, so the load can be balanced across nodes.

pgbench -c 【number of clients】 -T 【duration in seconds】 -h 【virtual IP】 -p 9999 -U postgres -S postgres
           

24.3 Run show pool_nodes to see how query load is distributed across the nodes

25 Recovering after an unclean Pgpool-II shutdown

  • After an unclean pgpool shutdown, remove the stale socket files:
rm /tmp/.s.PGSQL.9999
rm /tmp/.s.PGPOOLWD_CMD.9000
rm /tmp/.s.PGSQL.9898
           
  • If the first node comes up with standby status, restart pgpool and re-attach the node with pcp_attach_node
