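Before starting, look at the basic architecture of pg_auto_failover and be clear about the role of each node:
1. The monitor node only stores metadata. It monitors the health of the primary and secondary and drives failover when something goes wrong.
2. The monitor does not store user data; the primary and secondary nodes do.
3. The monitor is a single node and therefore a single point of failure, which is the main weakness of pg_auto_failover. A monitor failure does not, however, affect the running primary and secondary.
4. The primary and secondary nodes store the user database. The first node registered with the monitor becomes the primary, and the node registered afterwards becomes the secondary. Once both are registered, streaming replication is set up between them, and the two roles can be swapped.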
monitor: 192.168.152.121 ubuntu11
primary: 192.168.152.122 ubuntu12
secondary: 192.168.152.123 ubuntu13
-- Download the pg_auto_failover source package
cd /usr/local/pg_auto_failover
wget https://github.com/hapostgres/pg_auto_failover/archive/refs/tags/v2.2.zip
apt install unzip
unzip v2.2.zip
-- Directory contents after unpacking
drwxr-xr-x 3 root root 4096 Jun 4 16:27 ./
drwxr-xr-x 14 root root 4096 Jun 4 16:26 ../
drwxr-xr-x 8 root root 4096 Apr 3 20:05 pg_auto_failover-2.2/
-rw-r--r-- 1 root root 1364027 Jun 4 16:26 v2.2.zip
-- The install step copies the compiled pg_auto_failover files into the path set by the postgres environment variables configured earlier
cd pg_auto_failover-2.2/
make
make install
-- After installing, give ownership of the PGHOME directories back to the postgres user; otherwise pg_autoctl later fails with "command not found"
chown -R postgres:postgres /usr/local/pgsql16/
After the build and install, the pg_auto_failover files end up in the bin and lib directories under the PGHOME set in those environment variables, PGHOME=/usr/local/pgsql16/server.
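Because the binaries land under PGHOME, a shell that does not have $PGHOME/bin on its PATH cannot find pg_autoctl. The following is a minimal sketch for checking and fixing that in the current shell. The helper name path_prepend is made up for this example, and PGHOME is assumed to be the prefix named above.

```shell
# Assumption: PGHOME matches the install prefix named in the text.
PGHOME=/usr/local/pgsql16/server

# Prepend a directory to PATH only if it is not already present.
# (path_prepend is a hypothetical helper, not part of pg_auto_failover.)
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already on PATH, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
}

path_prepend "$PGHOME/bin"
export PATH PGHOME

# Print the resolved location, or a hint if the binary is still missing.
command -v pg_autoctl || echo "pg_autoctl not found under $PGHOME/bin" >&2
```

To make this permanent, the same lines could go into the postgres user's profile.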
postgres@ubuntu11:/usr/local/pg_install_packgae$ pg_autoctl create monitor --pgdata /usr/local/pgsql16/pg9300/data/ --auth trust --ssl-self-signed --hostname ubuntu11 --pgport 9300
pg_autoctl: command not found
postgres@ubuntu11:/usr/local/pg_install_packgae$ /usr/local/pgsql16/server/bin/pg_autoctl create monitor --pgdata /usr/local/pgsql16/pg9300/data/ --auth trust --ssl-self-signed --hostname 127.0.0.1 --pgport 9300
10:47:45 2821 ERROR Failed to create state directory "/run/user/0/pg_autoctl": Permission denied
10:47:45 2821 ERROR Failed to build pg_autoctl pid file pathname, see above.
10:47:45 2821 FATAL Failed to set pid filename from PGDATA "/usr/local/pgsql16/pg9300/data/", see above for details.
postgres@ubuntu11:/usr/local/pg_install_packgae$ exit
exit
root@ubuntu11:/usr/local/pg_install_packgae# sudo chmod -R 777 /run/user/0
root@ubuntu11:/usr/local/pg_install_packgae# sudo chown -R postgres:postgres /run/user/0
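The "Permission denied" on /run/user/0/pg_autoctl points at root's runtime directory (uid 0). A likely cause, though the transcript does not show it, is that the postgres shell inherited XDG_RUNTIME_DIR from root through a plain su. Under that assumption, a gentler alternative to chmod 777 is to check the variable before running pg_autoctl and unset it when it is unusable. The helper name runtime_dir_usable is made up for this sketch.

```shell
# Return success if the current user can use XDG_RUNTIME_DIR.
# Assumption: pg_autoctl places its pid file under $XDG_RUNTIME_DIR when set.
runtime_dir_usable() {
    # Unset or empty is treated as usable here (assumption: pg_autoctl
    # then falls back to a default location).
    [ -z "${XDG_RUNTIME_DIR:-}" ] && return 0
    [ -d "$XDG_RUNTIME_DIR" ] && [ -w "$XDG_RUNTIME_DIR" ]
}

if ! runtime_dir_usable; then
    echo "XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR is not writable by $(id -un); unsetting it" >&2
    unset XDG_RUNTIME_DIR
fi
```

A login shell started with su - postgres begins with a fresh environment, which may be why the same command works after switching that way.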
root@ubuntu11:/usr/local/pg_install_packgae# su - postgres    # switch to the postgres user and run from there
postgres@ubuntu11:~$ pg_autoctl create monitor --pgdata /usr/local/pgsql16/pg9300/data/ --auth trust --ssl-self-signed --hostname ubuntu11 --pgport 9300 --run
05:04:27 2216 INFO Using default --ssl-mode "require"
05:04:27 2216 INFO Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
05:04:27 2216 WARN Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
05:04:27 2216 WARN See https://www.postgresql.org/docs/current/libpq-ssl.html for details
05:04:27 2216 INFO Initialising a PostgreSQL cluster at "/usr/local/pgsql16/pg9300/data"
05:04:27 2216 INFO /usr/local/pgsql16/server/bin/pg_ctl initdb -s -D /usr/local/pgsql16/pg9300/data --option '--auth=trust'
05:04:29 2216 INFO /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /usr/local/pgsql16/pg9300/data/server.crt -keyout /usr/local/pgsql16/pg9300/data/server.key -subj "/CN=ubuntu11"
05:04:29 2216 INFO Started pg_autoctl postgres service with pid 2238
05:04:29 2216 INFO Started pg_autoctl listener service with pid 2239
05:04:29 2238 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data/ -v
05:04:29 2243 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
05:04:29 2238 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2243
05:04:29 2239 WARN NOTICE: installing required extension "btree_gist"
05:04:29 2239 INFO Granting connection privileges on 192.168.152.0/24
05:04:29 2239 WARN Skipping HBA edits (per --skip-pg-hba) for rule: hostssl "pg_auto_failover" "autoctl_node" 192.168.152.0/24 trust
05:04:29 2239 INFO Your pg_auto_failover monitor instance is now ready on port 9300.
05:04:29 2239 INFO Monitor has been successfully initialized.
05:04:29 2239 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service listener --pgdata /usr/local/pgsql16/pg9300/data/ -v
05:04:29 2239 INFO Managing the monitor at postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require
05:04:29 2239 INFO Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg"
05:04:29 2239 INFO Reloading Postgres configuration and HBA rules
05:04:30 2239 INFO The version of extension "pgautofailover" is "2.2" on the monitor
05:04:30 2239 INFO Contacting the monitor to LISTEN to its events.
postgres@ubuntu11:~$ exit    # switch back to root and run from there
logout
root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show systemd
05:06:26 2333 INFO HINT: to complete a systemd integration, run the following commands (as root):
05:06:26 2333 INFO pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
05:06:26 2333 INFO systemctl daemon-reload
05:06:26 2333 INFO systemctl enable pgautofailover
05:06:26 2333 INFO systemctl start pgautofailover
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload

[Install]
WantedBy = multi-user.target
root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload

[Install]
WantedBy = multi-user.target
root@ubuntu11:/usr/local/pg_install_packgae# systemctl daemon-reload
root@ubuntu11:/usr/local/pg_install_packgae# systemctl enable pgautofailover
Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.
root@ubuntu11:/usr/local/pg_install_packgae# systemctl start pgautofailover
root@ubuntu11:/usr/local/pg_install_packgae# systemctl status pgautofailover
● pgautofailover.service - pg_auto_failover
     Loaded: loaded (/etc/systemd/system/pgautofailover.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2025-10-09 05:06:43 UTC; 5s ago
   Main PID: 2421 (pg_autoctl)
      Tasks: 14 (limit: 4550)
     Memory: 29.2M
     CGroup: /system.slice/pgautofailover.service
             ├─2421 /usr/local/pgsql16/server/bin/pg_autoctl run
             ├─2444 pg_autoctl: start/stop postgres
             ├─2445 pg_autoctl: monitor listener
             ├─2454 /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
             ├─2455 postgres: pg_auto_failover monitor: logger
             ├─2456 postgres: pg_auto_failover monitor: checkpointer
             ├─2457 postgres: pg_auto_failover monitor: background writer
             ├─2459 postgres: pg_auto_failover monitor: walwriter
             ├─2460 postgres: pg_auto_failover monitor: autovacuum launcher
             ├─2461 postgres: pg_auto_failover monitor: pg_auto_failover monitor
             ├─2462 postgres: pg_auto_failover monitor: logical replication launcher
             ├─2463 postgres: pg_auto_failover monitor: pg_auto_failover monitor healthcheck worker postgres
             ├─2464 postgres: pg_auto_failover monitor: pg_auto_failover monitor healthcheck worker pg_auto_failover
             └─2466 postgres: pg_auto_failover monitor: autoctl_node pg_auto_failover [local] idle

Oct 09 05:06:43 ubuntu11 pg_autoctl[2421]: 05:06:43 2421 INFO Started pg_autoctl postgres service with pid 2444
Oct 09 05:06:43 ubuntu11 pg_autoctl[2421]: 05:06:43 2421 INFO Started pg_autoctl listener service with pid 2445
Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service listener --pgdata /usr/local/pgsql16/pg9300/data -v
Oct 09 05:06:43 ubuntu11 pg_autoctl[2444]: 05:06:43 2444 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data -v
Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO Managing the monitor at postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require
Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg"
Oct 09 05:06:43 ubuntu11 pg_autoctl[2454]: 05:06:43 2454 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
Oct 09 05:06:43 ubuntu11 pg_autoctl[2444]: 05:06:43 2444 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2454
Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO The version of extension "pgautofailover" is "2.2" on the monitor
Oct 09 05:06:43 ubuntu11 pg_autoctl[2445]: 05:06:43 2445 INFO Contacting the monitor to LISTEN to its events.
root@ubuntu11:/usr/local/pg_install_packgae# psql -h ubuntu11 -p 9300 postgres postgres
psql (16.4)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

# 1. The pg_auto_failover database was created automatically
postgres=# \l
                                                       List of databases
       Name       |  Owner   | Encoding | Locale Provider |  Collate   |   Ctype    | ICU Locale | ICU Rules |   Access privileges
------------------+----------+----------+-----------------+------------+------------+------------+-----------+-----------------------
 pg_auto_failover | autoctl  | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           |
 postgres         | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           |
 template0        | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           | =c/postgres          +
                  |          |          |                 |            |            |            |           | postgres=CTc/postgres
 template1        | postgres | UTF8     | libc            | en_US.utf8 | en_US.utf8 |            |           | =c/postgres          +
                  |          |          |                 |            |            |            |           | postgres=CTc/postgres
(4 rows)

# 2. The roles autoctl and autoctl_node were created automatically
postgres=# \du
                             List of roles
  Role name   |                         Attributes
--------------+------------------------------------------------------------
 autoctl      |
 autoctl_node |
 postgres     | Superuser, Create role, Create DB, Replication, Bypass RLS

# 3. The following two extensions were installed automatically
postgres=# \c pg_auto_failover
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
You are now connected to database "pg_auto_failover" as user "postgres".
pg_auto_failover=# \dx
                         List of installed extensions
      Name      | Version |   Schema   |                  Description
----------------+---------+------------+-----------------------------------------------
 btree_gist     | 1.7     | public     | support for indexing common datatypes in GiST
 pgautofailover | 2.2     | public     | pg_auto_failover
 plpgsql        | 1.0     | pg_catalog | PL/pgSQL procedural language
(3 rows)

pg_auto_failover=#
root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show uri
        Type |    Name | Connection String
-------------+---------+-------------------------------
     monitor | monitor | postgres://autoctl_node@127.0.0.1:9300/pg_auto_failover?sslmode=require    # This confused me at first: why does the connection string contain 127.0.0.1?
   formation | default |
root@ubuntu11:/usr/local/pg_install_packgae# su - postgres
postgres@ubuntu11:~$ source /etc/profile
postgres@ubuntu11:~$ pg_autoctl show uri
        Type |    Name | Connection String
-------------+---------+-------------------------------
     monitor | monitor | postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require    # After switching to the postgres user, it shows the hostname instead, which is odd
   formation | default |
postgres@ubuntu11:~$
postgres@ubuntu12:/root$ pg_autoctl create postgres --hostname ubuntu12 --name ubuntu12 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'
03:09:00 58370 ERROR Failed to create state directory "/run/user/0/pg_autoctl": Permission denied
03:09:00 58370 ERROR Failed to build pg_autoctl pid file pathname, see above.
03:09:00 58370 FATAL Failed to set pid filename from PGDATA "/usr/local/pgsql16/pg9300/data/", see above for details.
postgres@ubuntu12:/root$ exit
exit
root@ubuntu12:~# sudo chmod -R 777 /run/user/0
root@ubuntu12:~# sudo chown -R postgres:postgres /run/user/0
root@ubuntu11:/usr/local/pg_install_packgae# psql -h 127.0.0.1 -p 9300 postgres postgres
psql (16.4)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

postgres=# \c pg_auto_failover
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
You are now connected to database "pg_auto_failover" as user "postgres".
pg_auto_failover=# SELECT nodeid, nodename, nodehost, nodeport, goalstate, reportedstate FROM pgautofailover.node;
 nodeid | nodename | nodehost | nodeport | goalstate | reportedstate
--------+----------+----------+----------+-----------+---------------
      1 | ubuntu12 | ubuntu12 |     9300 | single    | single
(1 row)

pg_auto_failover=# delete from pgautofailover.node where nodeid = 1;
DELETE 1
pg_auto_failover=#
root@ubuntu11:/usr/local/pg_install_packgae# su - postgres
postgres@ubuntu11:~$ source /etc/profile
postgres@ubuntu11:~$ pg_autoctl show uri    # The monitor is back to its initial state; fix the error, then register the primary/secondary again
        Type |    Name | Connection String
-------------+---------+-------------------------------
     monitor | monitor | postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require
   formation | default |
postgres@ubuntu11:~$
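Clean the local keeper state, then register again.
Locally, pg_autoctl maintains a state file (usually at ~/.local/share/pg_autoctl/.../pg_autoctl.state). If you clean up only pgdata and leave this file in place, registering again will conflict.
Fix:
rm -rf /home/postgres/.local/share/pg_autoctl
Then run pg_autoctl create again, so that it treats the node as brand new.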
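The logs in this walkthrough show that the keeper files live under a directory named after the absolute PGDATA path, so the cleanup can be narrowed to one instance instead of removing the whole pg_autoctl directory. A small sketch follows. The two helper names are made up for illustration, and default locations are assumed (no XDG_* overrides).

```shell
# Print the per-PGDATA directories pg_autoctl keeps under the user's home.
# Assumption: default locations, matching the paths seen in the logs.
# (keeper_state_dir and keeper_config_dir are hypothetical helper names.)
keeper_state_dir() {
    pgdata=${1%/}    # strip one trailing slash so both spellings match
    printf '%s\n' "$HOME/.local/share/pg_autoctl$pgdata"
}

keeper_config_dir() {
    pgdata=${1%/}
    printf '%s\n' "$HOME/.config/pg_autoctl$pgdata"
}
```

Run as the postgres user, rm -rf "$(keeper_state_dir /usr/local/pgsql16/pg9300/data/)" would then remove only the state of that instance. Whether the configuration directory also needs clearing depends on the situation, so it is printed separately rather than removed.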
root@ubuntu12:~# su - postgres
postgres@ubuntu12:~$ pg_autoctl create postgres --hostname ubuntu12 --name ubuntu12 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'
05:17:27 2094 INFO Using default --ssl-mode "require"
05:17:27 2094 INFO Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
05:17:27 2094 WARN Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
05:17:27 2094 WARN See https://www.postgresql.org/docs/current/libpq-ssl.html for details
05:17:27 2094 INFO Started pg_autoctl postgres service with pid 2096
05:17:27 2094 INFO Started pg_autoctl node-init service with pid 2097
05:17:27 2096 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data/ -v
05:17:27 2097 INFO Registered node 52 "ubuntu12" (ubuntu12:9300) in formation "default", group 0, state "single"
05:17:27 2097 INFO Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.state"
05:17:27 2097 INFO Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.init"
05:17:27 2097 INFO Successfully registered as "single" to the monitor.
05:17:27 2097 INFO FSM transition from "init" to "single": Start as a single node
05:17:27 2097 INFO Initialising postgres as a primary
05:17:27 2097 INFO Initialising a PostgreSQL cluster at "/usr/local/pgsql16/pg9300/data"
05:17:27 2097 INFO /usr/local/pgsql16/server/bin/pg_ctl initdb -s -D /usr/local/pgsql16/pg9300/data --option '--auth=trust'
05:17:27 2097 INFO /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /usr/local/pgsql16/pg9300/data/server.crt -keyout /usr/local/pgsql16/pg9300/data/server.key -subj "/CN=ubuntu12"
05:17:28 2122 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
05:17:28 2096 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2122
05:17:28 2097 INFO The user "postgres" already exists, skipping.
05:17:28 2097 INFO CREATE USER postgres
05:17:28 2097 INFO CREATE DATABASE postgres;
05:17:28 2097 INFO The database "postgres" already exists, skipping.
05:17:28 2097 INFO CREATE EXTENSION pg_stat_statements;
05:17:28 2097 INFO Disabling synchronous replication
05:17:28 2097 INFO Reloading Postgres configuration and HBA rules
05:17:28 2097 WARN Failed to resolve hostname "Ubuntu11" to an IP address that resolves back to the hostname on a reverse DNS lookup.
05:17:28 2097 WARN Postgres might deny connection attempts from "Ubuntu11", even with the new HBA rules.
05:17:28 2097 WARN Hint: correct setup of HBA with host names requires proper reverse DNS setup. You might want to use IP addresses.
05:17:28 2097 WARN Using IP address "192.168.152.121" in HBA file instead of hostname "Ubuntu11"
05:17:28 2097 INFO Reloading Postgres configuration and HBA rules
05:17:28 2097 INFO Transition complete: current state is now "single"
05:17:28 2097 INFO keeper has been successfully initialized.
05:17:28 2094 WARN pg_autoctl service node-init exited with exit status 0
05:17:28 2096 INFO Postgres controller service received signal SIGTERM, terminating
05:17:28 2096 INFO Stopping pg_autoctl postgres service
05:17:28 2096 INFO /usr/local/pgsql16/server/bin/pg_ctl --pgdata /usr/local/pgsql16/pg9300/data --wait stop --mode fast
05:17:28 2094 INFO Stop pg_autoctl
postgres@ubuntu12:~$
postgres@ubuntu12:~$ exit
logout
root@ubuntu12:~# pg_autoctl show systemd
05:18:06 2160 INFO HINT: to complete a systemd integration, run the following commands (as root):
05:18:06 2160 INFO pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
05:18:06 2160 INFO systemctl daemon-reload
05:18:06 2160 INFO systemctl enable pgautofailover
05:18:06 2160 INFO systemctl start pgautofailover
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload

[Install]
WantedBy = multi-user.target
root@ubuntu12:~# pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload

[Install]
WantedBy = multi-user.target
root@ubuntu12:~# systemctl daemon-reload
root@ubuntu12:~# systemctl enable pgautofailover
Created symlink /etc/systemd/system/multi-user.target.wants/pgautofailover.service → /etc/systemd/system/pgautofailover.service.
root@ubuntu12:~# systemctl start pgautofailover
root@ubuntu12:~# systemctl status pgautofailover
● pgautofailover.service - pg_auto_failover
     Loaded: loaded (/etc/systemd/system/pgautofailover.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2025-10-09 05:18:32 UTC; 10s ago
   Main PID: 2250 (pg_autoctl)
      Tasks: 11 (limit: 4550)
     Memory: 26.8M
     CGroup: /system.slice/pgautofailover.service
             ├─2250 /usr/local/pgsql16/server/bin/pg_autoctl run
             ├─2267 pg_autoctl: start/stop postgres
             ├─2268 pg_autoctl: node active
             ├─2278 /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
             ├─2279 postgres: logger
             ├─2280 postgres: checkpointer
             ├─2281 postgres: background writer
             ├─2283 postgres: walwriter
             ├─2284 postgres: autovacuum launcher
             ├─2285 postgres: logical replication launcher
             └─2313 postgres: postgres postgres [local] startup

Oct 09 05:18:32 ubuntu12 pg_autoctl[2250]: 05:18:32 2250 INFO Started pg_autoctl postgres service with pid 2267
Oct 09 05:18:32 ubuntu12 pg_autoctl[2250]: 05:18:32 2250 INFO Started pg_autoctl node-active service with pid 2268
Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service node-active --pgdata /usr/local/pgsql16/pg9300/data -v
Oct 09 05:18:32 ubuntu12 pg_autoctl[2267]: 05:18:32 2267 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data -v
Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 INFO Reloaded the new configuration from "/home/postgres/.config/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.cfg"
Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 INFO pg_autoctl service is running, current state is "single"
Oct 09 05:18:32 ubuntu12 pg_autoctl[2278]: 05:18:32 2278 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
Oct 09 05:18:32 ubuntu12 pg_autoctl[2268]: 05:18:32 2268 WARN PostgreSQL was not running, restarted with pid 2278
Oct 09 05:18:33 ubuntu12 pg_autoctl[2267]: 05:18:33 2267 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 2278
Oct 09 05:18:33 ubuntu12 pg_autoctl[2268]: 05:18:33 2268 INFO New state for this node (node 52, "ubuntu12") (ubuntu12:9300): single ➜ single
root@ubuntu12:~#
root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show uri
        Type |    Name | Connection String
-------------+---------+-------------------------------
     monitor | monitor | postgres://autoctl_node@127.0.0.1:9300/pg_auto_failover?sslmode=require
   formation | default | postgres://ubuntu12:9300/postgres?target_session_attrs=read-write&sslmode=require
root@ubuntu11:/usr/local/pg_install_packgae# pg_autoctl show state
    Name |  Node |     Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu12 |     8 | ubuntu12:9300 |   1: 0/15596F8 |   read-write |              single |              single
root@ubuntu11:/usr/local/pg_install_packgae#
root@ubuntu13:/usr/local/pg_install_package# systemctl stop postgresql9300
root@ubuntu13:/usr/local/pg_install_package# systemctl disable postgresql9300
root@ubuntu13:/usr/local/pg_install_package# rm -rf /usr/local/pgsql16/pg9300
postgres@ubuntu13:~$ pg_autoctl create postgres --hostname ubuntu13 --name ubuntu13 --auth trust --ssl-self-signed --pgdata /usr/local/pgsql16/pg9300/data/ --pgport 9300 --monitor 'postgres://autoctl_node@Ubuntu11:9300/pg_auto_failover?sslmode=require'
05:46:07 11100 INFO Using default --ssl-mode "require"
05:46:07 11100 INFO Using --ssl-self-signed: pg_autoctl will create self-signed certificates, allowing for encrypted network traffic
05:46:07 11100 WARN Self-signed certificates provide protection against eavesdropping; this setup does NOT protect against Man-In-The-Middle attacks nor Impersonation attacks.
05:46:07 11100 WARN See https://www.postgresql.org/docs/current/libpq-ssl.html for details
05:46:07 11100 INFO Started pg_autoctl postgres service with pid 11102
05:46:07 11100 INFO Started pg_autoctl node-init service with pid 11103
05:46:07 11102 INFO /usr/local/pgsql16/server/bin/pg_autoctl do service postgres --pgdata /usr/local/pgsql16/pg9300/data/ -v
05:46:07 11103 INFO Registered node 60 "ubuntu13" (ubuntu13:9300) in formation "default", group 0, state "wait_standby"
05:46:07 11103 INFO Writing keeper state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.state"
05:46:07 11103 INFO Writing keeper init state file at "/home/postgres/.local/share/pg_autoctl/usr/local/pgsql16/pg9300/data/pg_autoctl.init"
05:46:07 11103 INFO Successfully registered as "wait_standby" to the monitor.
05:46:07 11103 INFO FSM transition from "init" to "wait_standby": Start following a primary
05:46:07 11103 INFO Transition complete: current state is now "wait_standby"
05:46:07 11103 INFO New state for node 52 "ubuntu12" (ubuntu12:9300): single ➜ wait_primary
05:46:07 11103 INFO New state for node 52 "ubuntu12" (ubuntu12:9300): wait_primary ➜ wait_primary
05:46:07 11103 INFO FSM transition from "wait_standby" to "catchingup": The primary is now ready to accept a standby
05:46:07 11103 INFO Initialising PostgreSQL as a hot standby
05:46:07 11103 INFO /usr/local/pgsql16/server/bin/pg_basebackup -w -d 'application_name=pgautofailover_standby_60 host=ubuntu12 port=9300 user=pgautofailover_replicator sslmode=require' --pgdata /usr/local/pgsql16/pg9300/backup/node_60 -U pgautofailover_replicator --verbose --progress --max-rate 100M --wal-method=stream --slot pgautofailover_standby_60
05:46:07 11103 INFO pg_basebackup:
05:46:07 11103 INFO
05:46:07 11103 INFO initiating base backup, waiting for checkpoint to complete
05:46:07 11103 INFO pg_basebackup:
05:46:07 11103 INFO
05:46:07 11103 INFO checkpoint completed
05:46:07 11103 INFO pg_basebackup:
05:46:07 11103 INFO
05:46:07 11103 INFO write-ahead log start point: 0/2000028 on timeline 1
05:46:07 11103 INFO pg_basebackup:
05:46:07 11103 INFO
05:46:07 11103 INFO starting background WAL receiver
05:46:07 11103 INFO 22591/22591 kB (100%), 0/1 tablespace (...backup/node_60/global/pg_control)
05:46:08 11103 INFO 22591/22591 kB (100%), 1/1 tablespace
05:46:08 11103 INFO pg_basebackup: write-ahead log end point: 0/2000138
05:46:08 11103 INFO pg_basebackup: waiting for background process to finish streaming ...
05:46:08 11103 INFO pg_basebackup: syncing data to disk ...
05:46:08 11103 INFO pg_basebackup: renaming backup_manifest.tmp to backup_manifest
05:46:08 11103 INFO pg_basebackup: base backup completed
05:46:08 11103 INFO Creating the standby signal file at "/usr/local/pgsql16/pg9300/data/standby.signal", and replication setup at "/usr/local/pgsql16/pg9300/data/postgresql-auto-failover-standby.conf"
05:46:08 11103 INFO /usr/bin/openssl req -new -x509 -days 365 -nodes -text -out /usr/local/pgsql16/pg9300/data/server.crt -keyout /usr/local/pgsql16/pg9300/data/server.key -subj "/CN=ubuntu13"
05:46:08 11116 INFO /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
05:46:08 11102 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 11116
05:46:08 11103 INFO PostgreSQL started on port 9300
05:46:08 11103 INFO Fetched current list of 1 other nodes from the monitor to update HBA rules, including 1 changes.
05:46:08 11103 INFO Ensuring HBA rules for node 52 "ubuntu12" (ubuntu12:9300)
05:46:08 11103 INFO Adding HBA rule: hostssl replication "pgautofailover_replicator" ubuntu12 trust
05:46:08 11103 INFO Adding HBA rule: hostssl "postgres" "pgautofailover_replicator" ubuntu12 trust
05:46:08 11103 INFO Writing new HBA rules in "/usr/local/pgsql16/pg9300/data/pg_hba.conf"
05:46:08 11103 INFO Reloading Postgres configuration and HBA rules
05:46:08 11103 INFO Transition complete: current state is now "catchingup"
05:46:08 11103 INFO keeper has been successfully initialized.
05:46:08 11100 WARN pg_autoctl service node-init exited with exit status 0
05:46:08 11102 INFO Postgres controller service received signal SIGTERM, terminating
05:46:08 11102 INFO Stopping pg_autoctl postgres service
05:46:08 11102 INFO /usr/local/pgsql16/server/bin/pg_ctl --pgdata /usr/local/pgsql16/pg9300/data --wait stop --mode fast
05:46:08 11100 INFO Stop pg_autoctl
postgres@ubuntu13:~$
root@ubuntu13:/usr/local/pg_install_package# pg_autoctl show systemd
05:48:11 12172 INFO HINT: to complete a systemd integration, run the following commands (as root):
05:48:11 12172 INFO pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
05:48:11 12172 INFO systemctl daemon-reload
05:48:11 12172 INFO systemctl enable pgautofailover
05:48:11 12172 INFO systemctl start pgautofailover
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload

[Install]
WantedBy = multi-user.target
root@ubuntu13:/usr/local/pg_install_package# pg_autoctl -q show systemd --pgdata "/usr/local/pgsql16/pg9300/data" | tee /etc/systemd/system/pgautofailover.service
[Unit]
Description = pg_auto_failover

[Service]
WorkingDirectory = /home/postgres
Environment = 'PGDATA=/usr/local/pgsql16/pg9300/data'
User = postgres
ExecStart = /usr/local/pgsql16/server/bin/pg_autoctl run
Restart = always
StartLimitBurst = 0
ExecReload = /usr/local/pgsql16/server/bin/pg_autoctl reload

[Install]
WantedBy = multi-user.target
root@ubuntu13:/usr/local/pg_install_package# systemctl daemon-reload
root@ubuntu13:/usr/local/pg_install_package# systemctl enable pgautofailover
root@ubuntu13:/usr/local/pg_install_package# systemctl start pgautofailover
root@ubuntu13:/usr/local/pg_install_package# systemctl status pgautofailover
● pgautofailover.service - pg_auto_failover
     Loaded: loaded (/etc/systemd/system/pgautofailover.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2025-10-09 05:48:26 UTC; 5s ago
   Main PID: 12381 (pg_autoctl)
      Tasks: 9 (limit: 4550)
     Memory: 42.0M
     CGroup: /system.slice/pgautofailover.service
             ├─12381 /usr/local/pgsql16/server/bin/pg_autoctl run
             ├─12391 pg_autoctl: start/stop postgres
             ├─12392 pg_autoctl: node active
             ├─12402 /usr/local/pgsql16/server/bin/postgres -D /usr/local/pgsql16/pg9300/data -p 9300 -h *
             ├─12403 postgres: logger
             ├─12404 postgres: checkpointer
             ├─12405 postgres: background writer
             ├─12406 postgres: startup recovering 000000010000000000000003
             └─12407 postgres: walreceiver streaming 0/3000110

Oct 09 05:48:27 ubuntu13 pg_autoctl[12391]: 05:48:27 12391 INFO Postgres is now serving PGDATA "/usr/local/pgsql16/pg9300/data" on port 9300 with pid 12402
Oct 09 05:48:27 ubuntu13 pg_autoctl[12392]: 05:48:27 12392 WARN PostgreSQL was not running, restarted with pid 12402
Oct 09 05:48:28 ubuntu13 pg_autoctl[12392]: 05:48:28 12392 INFO Updated the keeper's state from the local PostgreSQL instance, which is running
Oct 09 05:48:28 ubuntu13 pg_autoctl[12392]: 05:48:28 12392 INFO pg_autoctl managed to ensure current state "catchingup": PostgreSQL is running
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO Monitor assigned new state "secondary"
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO FSM transition from "catchingup" to "secondary": Convinced the monitor that I'm up and running, and eligible for promotion again
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO Reached timeline 1, same as upstream node 52 "ubuntu12" (ubuntu12:9300)
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO Creating replication slot "pgautofailover_standby_52"
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO Transition complete: current state is now "secondary"
Oct 09 05:48:29 ubuntu13 pg_autoctl[12392]: 05:48:29 12392 INFO New state for node 52 "ubuntu12" (ubuntu12:9300): primary ➜ primary
root@ubuntu13:/usr/local/pg_install_package#
Back on the monitor node, the output now shows ubuntu11 as the monitor, ubuntu12 as the primary, and ubuntu13 as the secondary.
root@ubuntu11:~# pg_autoctl show uri;
        Type |    Name | Connection String
-------------+---------+-------------------------------
     monitor | monitor | postgres://autoctl_node@ubuntu11:9300/pg_auto_failover?sslmode=require
   formation | default | postgres://ubuntu12:9300,ubuntu13:9300/postgres?target_session_attrs=read-write&sslmode=require
root@ubuntu11:~# pg_autoctl show state
    Name |  Node |     Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
---------+-------+---------------+----------------+--------------+---------------------+--------------------
ubuntu12 |    52 | ubuntu12:9300 |   1: 0/3000148 |   read-write |             primary |             primary
ubuntu13 |    60 | ubuntu13:9300 |   1: 0/3000148 |    read-only |           secondary |           secondary
root@ubuntu11:~#
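For scripting, the table printed by pg_autoctl show state can be reduced to the name of the current primary. The following is a rough sketch that reads the text table on standard input. The function name primary_name is made up here, and it assumes the column layout shown above, with Reported State as the sixth pipe-separated column.

```shell
# Print the Name of every node whose Reported State is "primary".
# Reads the text table of `pg_autoctl show state` on stdin.
# (primary_name is a hypothetical helper; the column positions are an
# assumption taken from the output shown in this post.)
primary_name() {
    awk -F'|' '
        NF >= 7 {
            name = $1
            state = $6
            gsub(/^[ \t]+|[ \t]+$/, "", name)     # trim padding
            gsub(/^[ \t]+|[ \t]+$/, "", state)
            if (state == "primary") print name
        }'
}
```

On the monitor it could be used as pg_autoctl show state | primary_name. Client applications do not need it: the formation URI with target_session_attrs=read-write already routes connections to the writable node.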
Log in to the primary and set a password for the postgres user:
ALTER USER postgres WITH PASSWORD 'a-strong-password';
1. On the secondary ubuntu13, edit pg_hba.conf and add the following access rule:
hostssl postgres all 192.168.0.0/16 md5
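2. Restart the primary ubuntu12. This triggers a failover: the secondary ubuntu13 is promoted to primary and, as the new primary, its HBA rules overwrite those on ubuntu12, which is now the secondary.
The pg_hba.conf overwrite rule in pg_auto_failover is "the new primary overwrites the old primary". This is a little counterintuitive; if you are interested, test and verify it yourself.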
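Step 1 above can be scripted so that the rule is appended only once, however often the script runs. This is a minimal bash sketch: `add_hba_rule` is a hypothetical helper, and the pg_hba.conf path in the usage note is an assumption based on the data directory used throughout this article.

```shell
# Hypothetical helper: append an HBA rule to a file unless the exact line
# is already present. Arguments: path to pg_hba.conf, rule text.
add_hba_rule() {
  local hba_file=$1 rule=$2
  # -q quiet, -x whole-line match, -F fixed string (no regex interpretation)
  if ! grep -qxF -- "$rule" "$hba_file"; then
    printf '%s\n' "$rule" >> "$hba_file"
  fi
}

# Usage (path is an assumption, adjust to your PGDATA):
#   add_hba_rule /usr/local/pgsql16/pg9300/data/pg_hba.conf \
#     'hostssl postgres all 192.168.0.0/16 md5'
#   psql -p 9300 -d postgres -c 'SELECT pg_reload_conf();'
```

PostgreSQL evaluates pg_hba.conf top to bottom and uses the first matching line, so a rule appended at the end only takes effect if no earlier line matches the same connection.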
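Next, connect from a client to the primary of the pg_auto_failover cluster and check the replication status. It looks exactly like manually configured streaming replication; pg_auto_failover simply hides the work of building the PostgreSQL cluster.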
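Clients do not need to know which node is currently the primary. A multi-host URI with `target_session_attrs=read-write`, like the formation URI printed by `pg_autoctl show uri`, lets libpq try each host and keep the one that accepts writes. The bash sketch below builds such a URI; `formation_uri` is a hypothetical helper, and the host names are the ones used in this article.

```shell
# Hypothetical helper: build a multi-host libpq URI.
# Arguments: database name, then one or more host:port pairs.
formation_uri() {
  local db=$1; shift
  local hosts
  # Join the remaining arguments with commas (IFS change stays in the subshell).
  hosts=$(IFS=,; printf '%s' "$*")
  printf 'postgres://%s/%s?target_session_attrs=read-write&sslmode=require\n' "$hosts" "$db"
}

# Usage: connect as postgres and confirm the session landed on the primary.
#   PGUSER=postgres psql "$(formation_uri postgres ubuntu12:9300 ubuntu13:9300)" \
#     -c 'SELECT inet_server_addr(), pg_is_in_recovery();'
```

On the primary, `pg_is_in_recovery()` returns false, so the query doubles as a check that the connection was routed to the writable node.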
SELECT * FROM pg_replication_slots;
slot_name                |plugin|slot_type|datoid|database|temporary|active|active_pid|xmin|catalog_xmin|restart_lsn|confirmed_flush_lsn|wal_status|safe_wal_size|two_phase|conflicting|
-------------------------+------+---------+------+--------+---------+------+----------+----+------------+-----------+-------------------+----------+-------------+---------+-----------+
pgautofailover_standby_60|      |physical |      |        |false    |true  |     11783|747 |            |0/5020868  |                   |reserved  |             |false    |           |

select * from pg_stat_replication;
pid  |usesysid|usename                  |application_name         |client_addr    |client_hostname|client_port|backend_start                |backend_xmin|state    |sent_lsn |write_lsn|flush_lsn|replay_lsn|write_lag      |flush_lag      |replay_lag     |sync_priority|sync_state|reply_time                   |
-----+--------+-------------------------+-------------------------+---------------+---------------+-----------+-----------------------------+------------+---------+---------+---------+---------+----------+---------------+---------------+---------------+-------------+----------+-----------------------------+
11783|   16416|pgautofailover_replicator|pgautofailover_standby_60|192.168.152.123|ubuntu13       |      60426|2025-10-09 13:23:00.095 +0800|            |streaming|0/5020868|0/5020868|0/5020868|0/5020868 |00:00:00.001064|00:00:00.001766|00:00:00.001773|            1|quorum    |2025-10-09 13:51:42.416 +0800|
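The sync_state of quorum above, together with the settings below, shows that pg_auto_failover uses synchronous replication in a one-primary, one-secondary setup.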
select * from pg_settings where name like '%synchronous_commit%';
name              |setting|unit|category                  |short_desc                                           |extra_desc|context|vartype|source            |min_val|max_val|enumvals                                |boot_val|reset_val|sourcefile                                                  |sourceline|pending_restart|
------------------+-------+----+--------------------------+-----------------------------------------------------+----------+-------+-------+------------------+-------+-------+----------------------------------------+--------+---------+------------------------------------------------------------+----------+---------------+
synchronous_commit|on     |    |Write-Ahead Log / Settings|Sets the current transaction's synchronization level.|          |user   |enum   |configuration file|       |       |{local,remote_write,remote_apply,on,off}|on      |on       |/usr/local/pgsql16/pg9300/data/postgresql-auto-failover.conf|        12|false          |

select * from pg_settings where name like '%synchronous_standby_names%' ;
name                     |setting                          |unit|category                    |short_desc                                                                     |extra_desc|context|vartype|source            |min_val|max_val|enumvals|boot_val|reset_val                        |sourcefile                                         |sourceline|pending_restart|
-------------------------+---------------------------------+----+----------------------------+-------------------------------------------------------------------------------+----------+-------+-------+------------------+-------+-------+--------+--------+---------------------------------+---------------------------------------------------+----------+---------------+
synchronous_standby_names|ANY 1 (pgautofailover_standby_60)|    |Replication / Primary Server|Number of synchronous standbys and list of names of potential synchronous ones.|          |sighup |string |configuration file|       |       |NULL    |        |ANY 1 (pgautofailover_standby_60)|/usr/local/pgsql16/pg9300/data/postgresql.auto.conf|         4|false          |
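The value `ANY 1 (pgautofailover_standby_60)` means a commit waits for acknowledgement from any one of the listed standbys, which is why sync_state is reported as quorum. The bash sketch below splits such a value into its parts so a script can check it. `parse_sync_names` is a hypothetical helper, and it only handles the explicit `ANY n (...)` and `FIRST n (...)` forms; the other syntaxes PostgreSQL accepts are reported as unrecognized.

```shell
# Hypothetical helper: split a synchronous_standby_names value into
# method, required acknowledgements, and standby list.
# Handles only the explicit "ANY n (...)" and "FIRST n (...)" forms.
parse_sync_names() {
  local value=$1
  local re='^(ANY|FIRST)[[:space:]]+([0-9]+)[[:space:]]*\((.*)\)$'
  if [[ $value =~ $re ]]; then
    printf 'method=%s required=%s standbys=%s\n' \
      "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
  else
    printf 'unrecognized: %s\n' "$value"
    return 1
  fi
}

# Usage on the primary (-A unaligned, -t tuples only, -c run one command):
#   parse_sync_names "$(psql -p 9300 -d postgres -Atc 'SHOW synchronous_standby_names')"
```

An empty value falls into the unrecognized branch, which is a convenient signal that the primary is not waiting for any standby.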