PG_AUTO_FAILOVER(1)              pg_auto_failover             PG_AUTO_FAILOVER(1)
2
3
4
6 pg_auto_failover - pg_auto_failover Documentation
7
   pg_auto_failover is an extension for PostgreSQL that monitors and
   manages failover for Postgres clusters. It is optimized for simplicity
   and correctness.
12
13 Single Standby Architecture
   [image: pg_auto_failover Architecture with a primary and a standby
   node] [image] pg_auto_failover architecture with a primary and a
   standby node.
17
18 pg_auto_failover implements Business Continuity for your PostgreSQL
19 services. pg_auto_failover implements a single PostgreSQL service us‐
20 ing multiple nodes with automated failover, and automates PostgreSQL
21 maintenance operations in a way that guarantees availability of the
22 service to its users and applications.
23
24 To that end, pg_auto_failover uses three nodes (machines, servers)
25 per PostgreSQL service:
26
27 • a PostgreSQL primary node,
28
29 • a PostgreSQL secondary node, using Synchronous Hot Standby,
30
31 • a pg_auto_failover Monitor node that acts both as a witness and an
32 orchestrator.
33
34 The pg_auto_failover Monitor implements a state machine and relies on
   in-core PostgreSQL facilities to deliver HA. For example, when the
   secondary node is detected to be unavailable, or when its lag is
   reported above a defined threshold (the default is 1 WAL file, or 16MB,
   see the pgautofailover.promote_wal_log_threshold GUC on the
   pg_auto_failover monitor), then the Monitor removes it from the
   synchronous_standby_names setting on the primary node. Until the
   secondary is seen by the monitor as healthy again, failover and
   switchover operations are not allowed, preventing data loss.
43
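   This threshold is a regular Postgres GUC defined on the monitor node;
   it can be inspected (and, if needed, changed with ALTER SYSTEM followed
   by pg_reload_conf()) like any other setting. A minimal check, assuming
   a local connection to the monitor's pg_auto_failover database:

       $ psql -d pg_auto_failover \
              -c 'SHOW pgautofailover.promote_wal_log_threshold;'
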
44 Multiple Standby Architecture
   [image: pg_auto_failover Architecture for a standalone PostgreSQL
   service] [image] pg_auto_failover architecture with a primary and two
   standby nodes.
48
   In the pictured architecture, pg_auto_failover provides Business
   Continuity and data availability by implementing a single PostgreSQL
   service using multiple nodes with automated failover and data redundancy.
52 Even after losing any Postgres node in a production system, this ar‐
53 chitecture maintains two copies of the data on two different nodes.
54
55 When using more than one standby, different architectures can be
56 achieved with pg_auto_failover, depending on the objectives and
57 trade-offs needed for your production setup.
58
59 Multiple Standbys Architecture with 3 standby nodes, one async
   [image: pg_auto_failover architecture with a primary and three standby
   nodes] [image] pg_auto_failover architecture with a primary and three
   standby nodes.
63
   When setting the three replication parameters detailed later in this
   documentation (number of sync standbys, replication quorum, and
   candidate priority), it's possible to design very different Postgres
   architectures for your production needs.
66
   In this case, the system is set up with two standby nodes participating
   in the replication quorum, allowing for number_sync_standbys = 1. The
   system always maintains a minimum of two copies of the data set: one on
   the primary, another one on either node B or node D. Whenever we lose
   one of those nodes, we still hold to this guarantee of two copies of
   the data set.
73
   Additionally, we have the standby server C, which has been set up to
   not participate in the replication quorum. Node C will not be found in
   the synchronous_standby_names list of nodes. Also, node C is set up to
   never be a candidate for failover, with candidate-priority = 0.
79
80 This architecture would fit a situation where nodes A, B, and D are
81 deployed in the same data center or availability zone, and node C in
82 another. Those three nodes are set up to support the main production
83 traffic and implement high availability of both the Postgres service
84 and the data set.
85
   Node C might be set up for Business Continuity in case the first data
   center is lost, or maybe for reporting needs on another application
   domain.
89
   pg_auto_failover includes the command line tool pg_autoctl, which
   implements many commands to manage your Postgres nodes. To implement
   the Postgres architectures described in this documentation, and more,
   you generally need only a few of the many pg_autoctl commands.
95
96 This section of the documentation is a short introduction to the main
97 commands that are useful when getting started with pg_auto_failover.
98 More commands are available and help deal with a variety of situations,
99 see the manual for the whole list.
100
   To understand which replication settings to use in your case, see the
   architecture_basics section and then the multi_node_architecture
   section.
103
104 To follow a step by step guide that you can reproduce on your own Azure
105 subscription and create a production Postgres setup from VMs, see the
106 tutorial section.
107
   To understand how to set up pg_auto_failover in a way that is compliant
   with your internal security guidelines, read the security section.
110
111 Command line environment, configuration files, etc
112 As a command line tool pg_autoctl depends on some environment vari‐
113 ables. Mostly, the tool re-uses the Postgres environment variables
114 that you might already know.
115
116 To manage a Postgres node pg_auto_failover needs to know its data di‐
117 rectory location on-disk. For that, some users will find it easier to
118 export the PGDATA variable in their environment. The alternative con‐
119 sists of always using the --pgdata option that is available to all the
120 pg_autoctl commands.
121
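   For instance, the two invocations below are equivalent; the data
   directory path is purely illustrative:

       # set PGDATA once in the environment...
       $ export PGDATA=/var/lib/postgresql/ha
       $ pg_autoctl show state

       # ...or pass --pgdata explicitly to each command
       $ pg_autoctl show state --pgdata /var/lib/postgresql/ha
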
122 Creating Postgres Nodes
123 To get started with the simplest Postgres failover setup, 3 nodes are
124 needed: the pg_auto_failover monitor, and 2 Postgres nodes that will
125 get assigned roles by the monitor. One Postgres node will be assigned
126 the primary role, the other one will get assigned the secondary role.
127
128 To create the monitor use the command:
129
130 $ pg_autoctl create monitor
131
   To create the Postgres nodes, use the following command on each node
   you want to create:
134
135 $ pg_autoctl create postgres
136
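   In practice both create commands take a number of options. A minimal
   sketch, with an illustrative host name and data directory (the tutorial
   below shows complete, working invocations):

       $ pg_autoctl create postgres \
            --pgdata /var/lib/postgresql/ha \
            --auth scram-sha-256 \
            --ssl-self-signed \
            --monitor 'postgres://autoctl_node@monitor.example.com/pg_auto_failover?sslmode=require'
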
   While those create commands initialize your nodes, you still have to
   actually run the Postgres services that are expected to be running. For
   that you can manually run the following command on every node:
140
141 $ pg_autoctl run
142
143 It is also possible (and recommended) to integrate the pg_auto_failover
144 service in your usual service management facility. When using systemd
145 the following commands can be used to produce the unit file configura‐
146 tion required:
147
148 $ pg_autoctl show systemd
149 INFO HINT: to complete a systemd integration, run the following commands:
150 INFO pg_autoctl -q show systemd --pgdata "/tmp/pgaf/m" | sudo tee /etc/systemd/system/pgautofailover.service
151 INFO sudo systemctl daemon-reload
152 INFO sudo systemctl enable pgautofailover
153 INFO sudo systemctl start pgautofailover
154 [Unit]
155 ...
156
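   Once the unit file is installed and the service started, the usual
   systemd tooling applies to the pgautofailover unit shown above, for
   instance:

       $ sudo systemctl status pgautofailover
       $ sudo journalctl -u pgautofailover -f
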
157 While it is expected that for a production deployment each node actu‐
158 ally is a separate machine (virtual or physical, or even a container),
159 it is also possible to run several Postgres nodes all on the same ma‐
160 chine for testing or development purposes.
161
162 TIP:
163 When running several pg_autoctl nodes on the same machine for test‐
164 ing or contributing to pg_auto_failover, each Postgres instance
165 needs to run on its own port, and with its own data directory. It
     can make things easier to then set the environment variables PGDATA
167 and PGPORT in each terminal, shell, or tab where each instance is
168 started.
169
170 Inspecting nodes
171 Once your Postgres nodes have been created, and once each pg_autoctl
172 service is running, it is possible to inspect the current state of the
173 formation with the following command:
174
175 $ pg_autoctl show state
176
   The pg_autoctl show state command outputs the current state of the
   system only once. Sometimes it would be nice to have an auto-updated
   display like the one provided by common tools such as watch(1) or
   top(1). For that, the following commands are available (see also
   pg_autoctl_watch):
182
183 $ pg_autoctl watch
184 $ pg_autoctl show state --watch
185
186 To analyze what's been happening to get to the current state, it is
187 possible to review the past events generated by the pg_auto_failover
188 monitor with the following command:
189
190 $ pg_autoctl show events
191
192 HINT:
     The pg_autoctl show commands can be run from any node in your
     system. Those commands need to connect to the monitor and print the
     current state or the currently known list of events as per the
     monitor's view of the system.
197
198 Use pg_autoctl show state --local to have a view of the local state
199 of a given node without connecting to the monitor Postgres instance.
200
201 The option --json is available in most pg_autoctl commands and
202 switches the output format from a human readable table form to a
203 program friendly JSON pretty-printed output.
204
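     For example, the two variants below combine those options; the exact
     output layout depends on your pg_auto_failover version:

       # local view of the node, without contacting the monitor
       $ pg_autoctl show state --local

       # machine-readable output, convenient for scripting
       $ pg_autoctl show state --json
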
205 Inspecting and Editing Replication Settings
206 When creating a node it is possible to use the --candidate-priority and
207 the --replication-quorum options to set the replication properties as
208 required by your choice of Postgres architecture.
209
210 To review the current replication settings of a formation, use one of
211 the two following commands, which are convenient aliases (the same com‐
212 mand with two ways to invoke it):
213
214 $ pg_autoctl show settings
215 $ pg_autoctl get formation settings
216
217 It is also possible to edit those replication settings at any time
218 while your nodes are in production: you can change your mind or adjust
219 to new elements without having to re-deploy everything. Just use the
220 following commands to adjust the replication settings on the fly:
221
222 $ pg_autoctl set formation number-sync-standbys
223 $ pg_autoctl set node replication-quorum
224 $ pg_autoctl set node candidate-priority
225
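   For instance, to require one synchronous standby in the formation and
   to take one node out of the replication quorum, a sketch could look
   like the following; the node name is illustrative and the exact options
   accepted may vary with your pg_auto_failover version:

       $ pg_autoctl set formation number-sync-standbys 1
       $ pg_autoctl set node replication-quorum false --name node_C
       $ pg_autoctl set node candidate-priority 0 --name node_C
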
226 IMPORTANT:
227 The pg_autoctl get and pg_autoctl set commands always connect to the
228 monitor Postgres instance.
229
230 The pg_autoctl set command then changes the replication settings on
     the node's registration on the monitor. Then the monitor assigns the
232 APPLY_SETTINGS state to the current primary node in the system for
233 it to apply the new replication settings to its Postgres streaming
234 replication setup.
235
     As a result, the pg_autoctl set commands require a stable state in
237 the system to be allowed to proceed. Namely, the current primary
238 node in the system must have both its Current State and its Assigned
239 State set to primary, as per the pg_autoctl show state output.
240
241 Implementing Maintenance Operations
   When a Postgres node must be taken offline for a maintenance operation,
   such as a kernel security upgrade or a minor Postgres update, it is
   best to let the pg_auto_failover monitor know about it.
245
246 • For one thing, a node that is known to be in maintenance does not
247 participate in failovers. If you are running with two Postgres
248 nodes, then failover operations are entirely prevented while the
249 standby node is in maintenance.
250
   • Moreover, depending on your replication settings, enabling
     maintenance on your standby ensures that the primary node switches to
     async replication before Postgres is shut down on the secondary, so
     that write queries are not blocked.
255
256 To implement maintenance operations, use the following commands:
257
258 $ pg_autoctl enable maintenance
259 $ pg_autoctl disable maintenance
260
261 The main pg_autoctl run service that is expected to be running in the
262 background should continue to run during the whole maintenance opera‐
263 tion. When a node is in the maintenance state, the pg_autoctl service
264 is not controlling the Postgres service anymore.
265
266 Note that it is possible to enable maintenance on a primary Postgres
267 node, and that operation then requires a failover to happen first. It
268 is possible to have pg_auto_failover orchestrate that for you when us‐
269 ing the command:
270
271 $ pg_autoctl enable maintenance --allow-failover
272
273 IMPORTANT:
     The pg_autoctl enable and pg_autoctl disable commands require a
275 stable state in the system to be allowed to proceed. Namely, the
276 current primary node in the system must have both its Current State
277 and its Assigned State set to primary, as per the pg_autoctl show
278 state output.
279
280 Manual failover, switchover, and promotions
   In cases where a failover is needed without an actual node failure,
   the pg_auto_failover monitor can be used to orchestrate the
283 operation. Use one of the following commands, which are synonyms in the
284 pg_auto_failover design:
285
286 $ pg_autoctl perform failover
287 $ pg_autoctl perform switchover
288
289 Finally, it is also possible to “elect” a new primary node in your for‐
290 mation with the command:
291
292 $ pg_autoctl perform promotion
293
294 IMPORTANT:
     The pg_autoctl perform commands require a stable state in the sys‐
296 tem to be allowed to proceed. Namely, the current primary node in
297 the system must have both its Current State and its Assigned State
298 set to primary, as per the pg_autoctl show state output.
299
300 What's next?
301 This section of the documentation is meant to help users get started by
   focusing on the main commands of the pg_autoctl tool. Each command has
   many options, some with a very small impact and some with a large
   impact in terms of security or architecture. Read the rest of the
   manual to understand how to best use the many pg_autoctl options to
   implement your specific Postgres production architecture.
307
309 In this guide we’ll create a primary and secondary Postgres node and
310 set up pg_auto_failover to replicate data between them. We’ll simulate
311 failure in the primary node and see how the system smoothly switches
312 (fails over) to the secondary.
313
314 For illustration, we'll run our databases on virtual machines in the
315 Azure platform, but the techniques here are relevant to any cloud
316 provider or on-premise network. We'll use four virtual machines: a pri‐
317 mary database, a secondary database, a monitor, and an "application."
318 The monitor watches the other nodes’ health, manages global state, and
319 assigns nodes their roles.
320
321 Create virtual network
322 Our database machines need to talk to each other and to the monitor
323 node, so let's create a virtual network.
324
325 az group create \
326 --name ha-demo \
327 --location eastus
328
329 az network vnet create \
330 --resource-group ha-demo \
331 --name ha-demo-net \
332 --address-prefix 10.0.0.0/16
333
334 We need to open ports 5432 (Postgres) and 22 (SSH) between the ma‐
335 chines, and also give ourselves access from our remote IP. We'll do
336 this with a network security group and a subnet.
337
338 az network nsg create \
339 --resource-group ha-demo \
340 --name ha-demo-nsg
341
342 az network nsg rule create \
343 --resource-group ha-demo \
344 --nsg-name ha-demo-nsg \
345 --name ha-demo-ssh-and-pg \
346 --access allow \
347 --protocol Tcp \
348 --direction Inbound \
349 --priority 100 \
350 --source-address-prefixes `curl ifconfig.me` 10.0.1.0/24 \
351 --source-port-range "*" \
352 --destination-address-prefix "*" \
353 --destination-port-ranges 22 5432
354
355 az network vnet subnet create \
356 --resource-group ha-demo \
357 --vnet-name ha-demo-net \
358 --name ha-demo-subnet \
359 --address-prefixes 10.0.1.0/24 \
360 --network-security-group ha-demo-nsg
361
362 Finally add four virtual machines (ha-demo-a, ha-demo-b, ha-demo-moni‐
363 tor, and ha-demo-app). For speed we background the az vm create pro‐
364 cesses and run them in parallel:
365
366 # create VMs in parallel
367 for node in monitor a b app
368 do
369 az vm create \
370 --resource-group ha-demo \
371 --name ha-demo-${node} \
372 --vnet-name ha-demo-net \
373 --subnet ha-demo-subnet \
374 --nsg ha-demo-nsg \
375 --public-ip-address ha-demo-${node}-ip \
376 --image debian \
377 --admin-username ha-admin \
378 --generate-ssh-keys &
379 done
380 wait
381
382 To make it easier to SSH into these VMs in future steps, let's make a
383 shell function to retrieve their IP addresses:
384
385 # run this in your local shell as well
386
387 vm_ip () {
388 az vm list-ip-addresses -g ha-demo -n ha-demo-$1 -o tsv \
389 --query '[] [] .virtualMachine.network.publicIpAddresses[0].ipAddress'
390 }
391
392 # for convenience with ssh
393
394 for node in monitor a b app
395 do
396 ssh-keyscan -H `vm_ip $node` >> ~/.ssh/known_hosts
397 done
398
399 Let's review what we created so far.
400
401 az resource list --output table --query \
402 "[?resourceGroup=='ha-demo'].{ name: name, flavor: kind, resourceType: type, region: location }"
403
404 This shows the following resources:
405
406 Name ResourceType Region
407 ------------------------------- ----------------------------------------------------- --------
408 ha-demo-a Microsoft.Compute/virtualMachines eastus
409 ha-demo-app Microsoft.Compute/virtualMachines eastus
410 ha-demo-b Microsoft.Compute/virtualMachines eastus
411 ha-demo-monitor Microsoft.Compute/virtualMachines eastus
412 ha-demo-appVMNic Microsoft.Network/networkInterfaces eastus
413 ha-demo-aVMNic Microsoft.Network/networkInterfaces eastus
414 ha-demo-bVMNic Microsoft.Network/networkInterfaces eastus
415 ha-demo-monitorVMNic Microsoft.Network/networkInterfaces eastus
416 ha-demo-nsg Microsoft.Network/networkSecurityGroups eastus
417 ha-demo-a-ip Microsoft.Network/publicIPAddresses eastus
418 ha-demo-app-ip Microsoft.Network/publicIPAddresses eastus
419 ha-demo-b-ip Microsoft.Network/publicIPAddresses eastus
420 ha-demo-monitor-ip Microsoft.Network/publicIPAddresses eastus
421 ha-demo-net Microsoft.Network/virtualNetworks eastus
422
423 Install the pg_autoctl executable
424 This guide uses Debian Linux, but similar steps will work on other dis‐
425 tributions. All that differs are the packages and paths. See install.
426
427 The pg_auto_failover system is distributed as a single pg_autoctl bi‐
428 nary with subcommands to initialize and manage a replicated PostgreSQL
429 service. We’ll install the binary with the operating system package
430 manager on all nodes. It will help us run and observe PostgreSQL.
431
432 for node in monitor a b app
433 do
434 az vm run-command invoke \
435 --resource-group ha-demo \
436 --name ha-demo-${node} \
437 --command-id RunShellScript \
438 --scripts \
439 "sudo touch /home/ha-admin/.hushlogin" \
440 "curl https://install.citusdata.com/community/deb.sh | sudo bash" \
441 "sudo DEBIAN_FRONTEND=noninteractive apt-get install -q -y postgresql-common" \
442 "echo 'create_main_cluster = false' | sudo tee -a /etc/postgresql-common/createcluster.conf" \
443 "sudo DEBIAN_FRONTEND=noninteractive apt-get install -q -y postgresql-11-auto-failover-1.4" \
444 "sudo usermod -a -G postgres ha-admin" &
445 done
446 wait
447
448 Run a monitor
449 The pg_auto_failover monitor is the first component to run. It periodi‐
450 cally attempts to contact the other nodes and watches their health. It
451 also maintains global state that “keepers” on each node consult to de‐
452 termine their own roles in the system.
453
454 # on the monitor virtual machine
455
456 ssh -l ha-admin `vm_ip monitor` -- \
457 pg_autoctl create monitor \
458 --auth trust \
459 --ssl-self-signed \
460 --pgdata monitor \
461 --pgctl /usr/lib/postgresql/11/bin/pg_ctl
462
   This command initializes a PostgreSQL cluster at the location pointed
   to by the --pgdata option. When --pgdata is omitted, pg_autoctl
   attempts to use the PGDATA environment variable. If a PostgreSQL
   instance already existed in the destination directory, this command
   would have configured it to serve as a monitor.

   The command also creates a database named pg_auto_failover, installs
   the pgautofailover Postgres extension, and grants access to a new
   autoctl_node user.
471
472 In the Quick Start we use --auth trust to avoid complex security set‐
473 tings. The Postgres trust authentication method is not considered a
474 reasonable choice for production environments. Consider either using
475 the --skip-pg-hba option or --auth scram-sha-256 and then setting up
476 passwords yourself.
477
478 At this point the monitor is created. Now we'll install it as a service
479 with systemd so that it will resume if the VM restarts.
480
481 ssh -T -l ha-admin `vm_ip monitor` << CMD
482 pg_autoctl -q show systemd --pgdata ~ha-admin/monitor > pgautofailover.service
483 sudo mv pgautofailover.service /etc/systemd/system
484 sudo systemctl daemon-reload
485 sudo systemctl enable pgautofailover
486 sudo systemctl start pgautofailover
487 CMD
488
489 Bring up the nodes
490 We’ll create the primary database using the pg_autoctl create subcom‐
491 mand.
492
493 ssh -l ha-admin `vm_ip a` -- \
494 pg_autoctl create postgres \
495 --pgdata ha \
496 --auth trust \
497 --ssl-self-signed \
498 --username ha-admin \
499 --dbname appdb \
500 --hostname ha-demo-a.internal.cloudapp.net \
501 --pgctl /usr/lib/postgresql/11/bin/pg_ctl \
502 --monitor 'postgres://autoctl_node@ha-demo-monitor.internal.cloudapp.net/pg_auto_failover?sslmode=require'
503
   Notice the user and database name in the monitor connection string --
   these are what the monitor initialization created. We also give it the
   path to pg_ctl so that the keeper will use the correct version of
   pg_ctl in the future, even if other versions of postgres are installed
   on the system.
508
509 In the example above, the keeper creates a primary database. It chooses
510 to set up node A as primary because the monitor reports there are no
511 other nodes in the system yet. This is one example of how the keeper is
512 state-based: it makes observations and then adjusts its state, in this
513 case from "init" to "single."
514
515 Also add a setting to trust connections from our "application" VM:
516
517 ssh -T -l ha-admin `vm_ip a` << CMD
518 echo 'hostssl "appdb" "ha-admin" ha-demo-app.internal.cloudapp.net trust' \
519 >> ~ha-admin/ha/pg_hba.conf
520 CMD
521
522 At this point the monitor and primary node are created and running.
523 Next we need to run the keeper. It’s an independent process so that it
   can continue operating even if the PostgreSQL process terminates on
   the node. We'll install it as a service with systemd so that it will
526 resume if the VM restarts.
527
528 ssh -T -l ha-admin `vm_ip a` << CMD
529 pg_autoctl -q show systemd --pgdata ~ha-admin/ha > pgautofailover.service
530 sudo mv pgautofailover.service /etc/systemd/system
531 sudo systemctl daemon-reload
532 sudo systemctl enable pgautofailover
533 sudo systemctl start pgautofailover
534 CMD
535
536 Next connect to node B and do the same process. We'll do both steps at
537 once:
538
539 ssh -l ha-admin `vm_ip b` -- \
540 pg_autoctl create postgres \
541 --pgdata ha \
542 --auth trust \
543 --ssl-self-signed \
544 --username ha-admin \
545 --dbname appdb \
546 --hostname ha-demo-b.internal.cloudapp.net \
547 --pgctl /usr/lib/postgresql/11/bin/pg_ctl \
548 --monitor 'postgres://autoctl_node@ha-demo-monitor.internal.cloudapp.net/pg_auto_failover?sslmode=require'
549
550 ssh -T -l ha-admin `vm_ip b` << CMD
551 pg_autoctl -q show systemd --pgdata ~ha-admin/ha > pgautofailover.service
552 sudo mv pgautofailover.service /etc/systemd/system
553 sudo systemctl daemon-reload
554 sudo systemctl enable pgautofailover
555 sudo systemctl start pgautofailover
556 CMD
557
   This time, pg_autoctl discovers from the monitor that a primary already
   exists, switches its own state to be a hot standby, and begins
   streaming WAL contents from the primary.
561
562 Node communication
563 For convenience, pg_autoctl modifies each node's pg_hba.conf file to
564 allow the nodes to connect to one another. For instance, pg_autoctl
565 added the following lines to node A:
566
567 # automatically added to node A
568
569 hostssl "appdb" "ha-admin" ha-demo-a.internal.cloudapp.net trust
570 hostssl replication "pgautofailover_replicator" ha-demo-b.internal.cloudapp.net trust
571 hostssl "appdb" "pgautofailover_replicator" ha-demo-b.internal.cloudapp.net trust
572
573 For pg_hba.conf on the monitor node pg_autoctl inspects the local net‐
574 work and makes its best guess about the subnet to allow. In our case it
575 guessed correctly:
576
577 # automatically added to the monitor
578
579 hostssl "pg_auto_failover" "autoctl_node" 10.0.1.0/24 trust
580
581 If worker nodes have more ad-hoc addresses and are not in the same sub‐
582 net, it's better to disable pg_autoctl's automatic modification of
583 pg_hba using the --skip-pg-hba command line option during creation. You
584 will then need to edit the hba file by hand. Another reason for manual
585 edits would be to use special authentication methods.
586
587 Watch the replication
588 First let’s verify that the monitor knows about our nodes, and see what
589 states it has assigned them:
590
591 ssh -l ha-admin `vm_ip monitor` pg_autoctl show state --pgdata monitor
592
593 Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
594 -------+-------+--------------------------------------+-----------+-----------+---------------------+--------------------
595 node_1 | 1 | ha-demo-a.internal.cloudapp.net:5432 | 0/3000060 | yes | primary | primary
596 node_2 | 2 | ha-demo-b.internal.cloudapp.net:5432 | 0/3000060 | yes | secondary | secondary
597
598 This looks good. We can add data to the primary, and later see it ap‐
599 pear in the secondary. We'll connect to the database from inside our
600 "app" virtual machine, using a connection string obtained from the mon‐
601 itor.
602
603 ssh -l ha-admin `vm_ip monitor` pg_autoctl show uri --pgdata monitor
604
605 Type | Name | Connection String
606 -----------+---------+-------------------------------
607 monitor | monitor | postgres://autoctl_node@ha-demo-monitor.internal.cloudapp.net:5432/pg_auto_failover?sslmode=require
608 formation | default | postgres://ha-demo-b.internal.cloudapp.net:5432,ha-demo-a.internal.cloudapp.net:5432/appdb?target_session_attrs=read-write&sslmode=require
609
610 Now we'll get the connection string and store it in a local environment
611 variable:
612
613 APP_DB_URI=$( \
614 ssh -l ha-admin `vm_ip monitor` \
615 pg_autoctl show uri --formation default --pgdata monitor \
616 )
617
618 The connection string contains both our nodes, comma separated, and in‐
619 cludes the url parameter ?target_session_attrs=read-write telling psql
620 that we want to connect to whichever of these servers supports reads
621 and writes. That will be the primary server.
622
623 # connect to database via psql on the app vm and
624 # create a table with a million rows
625 ssh -l ha-admin -t `vm_ip app` -- \
626 psql "'$APP_DB_URI'" \
627 -c "'CREATE TABLE foo AS SELECT generate_series(1,1000000) bar;'"
628
629 Cause a failover
630 Now that we've added data to node A, let's switch which is considered
631 the primary and which the secondary. After the switch we'll connect
632 again and query the data, this time from node B.
633
634 # initiate failover to node B
635 ssh -l ha-admin -t `vm_ip monitor` \
636 pg_autoctl perform switchover --pgdata monitor
637
638 Once node B is marked "primary" (or "wait_primary") we can connect and
639 verify that the data is still present:
640
641 # connect to database via psql on the app vm
642 ssh -l ha-admin -t `vm_ip app` -- \
643 psql "'$APP_DB_URI'" \
644 -c "'SELECT count(*) FROM foo;'"
645
646 It shows
647
648 count
649 ---------
650 1000000
651
652 Cause a node failure
   This plot is too boring; time to introduce a problem. We’ll turn off
   the VM for node B (currently the primary after our previous failover) and
655 watch node A get promoted.
656
657 In one terminal let’s keep an eye on events:
658
659 ssh -t -l ha-admin `vm_ip monitor` -- \
660 watch -n 1 -d pg_autoctl show state --pgdata monitor
661
662 In another terminal we’ll turn off the virtual server.
663
664 az vm stop \
665 --resource-group ha-demo \
666 --name ha-demo-b
667
668 After a number of failed attempts to talk to node B, the monitor deter‐
669 mines the node is unhealthy and puts it into the "demoted" state. The
670 monitor promotes node A to be the new primary.
671
672 Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
673 -------+-------+--------------------------------------+-----------+-----------+---------------------+--------------------
674 node_1 | 1 | ha-demo-a.internal.cloudapp.net:5432 | 0/6D4E068 | yes | wait_primary | wait_primary
675 node_2 | 2 | ha-demo-b.internal.cloudapp.net:5432 | 0/6D4E000 | yes | demoted | catchingup
676
677 Node A cannot be considered in full "primary" state since there is no
678 secondary present, but it can still serve client requests. It is marked
679 as "wait_primary" until a secondary appears, to indicate that it's run‐
680 ning without a backup.
681
682 Let's add some data while B is offline.
683
684 # notice how $APP_DB_URI continues to work no matter which node
685 # is serving as primary
686 ssh -l ha-admin -t `vm_ip app` -- \
687 psql "'$APP_DB_URI'" \
688 -c "'INSERT INTO foo SELECT generate_series(1000001, 2000000);'"
689
690 Resurrect node B
691 Run this command to bring node B back online:
692
693 az vm start \
694 --resource-group ha-demo \
695 --name ha-demo-b
696
697 Now the next time the keeper retries its health check, it brings the
698 node back. Node B goes through the state "catchingup" while it updates
699 its data to match A. Once that's done, B becomes a secondary, and A is
700 now a full primary again.
701
702 Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
703 -------+-------+--------------------------------------+------------+-----------+---------------------+--------------------
704 node_1 | 1 | ha-demo-a.internal.cloudapp.net:5432 | 0/12000738 | yes | primary | primary
705 node_2 | 2 | ha-demo-b.internal.cloudapp.net:5432 | 0/12000738 | yes | secondary | secondary
706
707 What's more, if we connect directly to the database again, all two mil‐
708 lion rows are still present.
709
710 ssh -l ha-admin -t `vm_ip app` -- \
711 psql "'$APP_DB_URI'" \
712 -c "'SELECT count(*) FROM foo;'"
713
714 It shows
715
716 count
717 ---------
718 2000000
719
   pg_auto_failover is designed as a simple and robust way to manage
   automated Postgres failover in production. On top of robust operations,
   the pg_auto_failover setup is flexible and allows either Business
   Continuity or High Availability configurations. The pg_auto_failover
   design also allows configuration changes in a live system without
   downtime.
726
727 pg_auto_failover is designed to be able to handle a single PostgreSQL
728 service using three nodes. In this setting, the system is resilient to
729 losing any one of three nodes.
   [image: pg_auto_failover Architecture for a standalone PostgreSQL
   service] [image] pg_auto_failover Architecture for a standalone
   PostgreSQL service.
733
734 It is important to understand that when using only two Postgres nodes
735 then pg_auto_failover is optimized for Business Continuity. In the
736 event of losing a single node, pg_auto_failover is capable of contin‐
737 uing the PostgreSQL service, and prevents any data loss when doing
738 so, thanks to PostgreSQL Synchronous Replication.
739
740 That said, there is a trade-off involved in this architecture. The
741 business continuity bias relaxes replication guarantees for asynchro‐
742 nous replication in the event of a standby node failure. This allows
743 the PostgreSQL service to accept writes when there's a single server
744 available, and opens the service for potential data loss if the pri‐
745 mary server were also to fail.
746
747 The pg_auto_failover Monitor
748 Each PostgreSQL node in pg_auto_failover runs a Keeper process which
749 informs a central Monitor node about notable local changes. Some
750 changes require the Monitor to orchestrate a correction across the
751 cluster:
752
753 • New nodes
754
755 At initialization time, it's necessary to prepare the configura‐
756 tion of each node for PostgreSQL streaming replication, and get
757 the cluster to converge to the nominal state with both a primary
         and a secondary node in each group. The monitor determines each
         new node's role.
760
761 • Node failure
762
763 The monitor orchestrates a failover when it detects an unhealthy
764 node. The design of pg_auto_failover allows the monitor to shut
765 down service to a previously designated primary node without caus‐
766 ing a "split-brain" situation.
767
768 The monitor is the authoritative node that manages global state and
769 makes changes in the cluster by issuing commands to the nodes' keeper
770 processes. A pg_auto_failover monitor node failure has limited impact
771 on the system. While it prevents reacting to other nodes' failures, it
772 does not affect replication. The PostgreSQL streaming replication
773 setup installed by pg_auto_failover does not depend on having the moni‐
774 tor up and running.
775
776 pg_auto_failover Glossary
777 pg_auto_failover handles a single PostgreSQL service with the following
778 concepts:
779
780 Monitor
781 The pg_auto_failover monitor is a service that keeps track of one or
782 several formations containing groups of nodes.
783
784 The monitor is implemented as a PostgreSQL extension, so when you run
785 the command pg_autoctl create monitor a PostgreSQL instance is initial‐
786 ized, configured with the extension, and started. The monitor service
787 embeds a PostgreSQL instance.
788
789 Formation
790 A formation is a logical set of PostgreSQL services that are managed
791 together.
792
793 It is possible to operate many formations with a single monitor in‐
794 stance. Each formation has a group of Postgres nodes and the FSM or‐
795 chestration implemented by the monitor applies separately to each
796 group.
797
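   Most pg_autoctl commands accept a --formation option to select the
   formation to operate on; when the option is omitted, the formation
   named default is used. A sketch (the options accepted may vary slightly
   between pg_auto_failover versions):

       $ pg_autoctl show state    --formation default
       $ pg_autoctl show settings --formation default
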
798 Group
   A group of two PostgreSQL nodes works together to provide a single
   PostgreSQL service in a Highly Available fashion. A group consists of a
   PostgreSQL primary server and a secondary server set up with Hot
   Standby synchronous replication. Note that pg_auto_failover can orchestrate the
803 whole setting-up of the replication for you.
804
805 In pg_auto_failover versions up to 1.3, a single Postgres group can
806 contain only two Postgres nodes. Starting with pg_auto_failover 1.4,
807 there's no limit to the number of Postgres nodes in a single group.
808 Note that each Postgres instance that belongs to the same group serves
809 the same dataset in its data directory (PGDATA).
810
811 NOTE:
812 The notion of a formation that contains multiple groups in
813 pg_auto_failover is useful when setting up and managing a whole Ci‐
814 tus formation, where the coordinator nodes belong to group zero of
815 the formation, and each Citus worker node becomes its own group and
816 may have Postgres standby nodes.
817
818 Keeper
819 The pg_auto_failover keeper is an agent that must be running on the
820 same server where your PostgreSQL nodes are running. The keeper con‐
821 trols the local PostgreSQL instance (using both the pg_ctl command-line
822 tool and SQL queries), and communicates with the monitor:
823
824 • it sends updated data about the local node, such as the WAL delta
825 in between servers, measured via PostgreSQL statistics views.
826
827 • it receives state assignments from the monitor.
828
829 Also the keeper maintains local state that includes the most recent
830 communication established with the monitor and the other PostgreSQL
831 node of its group, enabling it to detect network_partitions.
832
833 NOTE:
834 In pg_auto_failover versions up to and including 1.3, the keeper
835 process started with pg_autoctl run manages a separate Postgres in‐
836 stance, running as its own process tree.
837
838 Starting in pg_auto_failover version 1.4, the keeper process
839 (started with pg_autoctl run) runs the Postgres instance as a
840 sub-process of the main pg_autoctl process, allowing tighter control
841 over the Postgres execution. Running the sub-process also makes the
842 solution work better both in container environments (because it's
843 now a single process tree) and with systemd, because it uses a spe‐
844 cific cgroup per service unit.
845
846 Node
847 A node is a server (virtual or physical) that runs PostgreSQL instances
848 and a keeper service. At any given time, any node might be a primary or
849 a secondary Postgres instance. The whole point of pg_auto_failover is
850 to decide this state.
851
852 As a result, refrain from naming your nodes with the role you intend
853 for them. Their roles can change. If they didn't, your system wouldn't
854 need pg_auto_failover!
855
856 State
857 A state is the representation of the per-instance and per-group situa‐
858 tion. The monitor and the keeper implement a Finite State Machine to
859 drive operations in the PostgreSQL groups; allowing pg_auto_failover to
860 implement High Availability with the goal of zero data loss.
861
862 The keeper main loop enforces the current expected state of the local
863 PostgreSQL instance, and reports the current state and some more infor‐
864 mation to the monitor. The monitor uses this set of information and its
865 own health-check information to drive the State Machine and assign a
866 goal state to the keeper.
867
868 The keeper implements the transitions between a current state and a
869 monitor-assigned goal state.
870
871 Client-side HA
   Client-side High Availability is implemented in PostgreSQL's driver
   libpq from version 10 onward. Using this driver, it is possible to
   specify multiple host names or IP addresses in the same connection
   string:
876
877 $ psql -d "postgresql://host1,host2/dbname?target_session_attrs=read-write"
878 $ psql -d "postgresql://host1:port2,host2:port2/dbname?target_session_attrs=read-write"
879 $ psql -d "host=host1,host2 port=port1,port2 target_session_attrs=read-write"
880
   When using any of the syntaxes above, the psql application attempts to
   connect to host1 and, when successfully connected, checks the
   target_session_attrs parameter as per the PostgreSQL documentation:
884 If this parameter is set to read-write, only a connection in which
885 read-write transactions are accepted by default is considered ac‐
886 ceptable. The query SHOW transaction_read_only will be sent upon
887 any successful connection; if it returns on, the connection will be
888 closed. If multiple hosts were specified in the connection string,
889 any remaining servers will be tried just as if the connection at‐
890 tempt had failed. The default value of this parameter, any, regards
891 all connections as acceptable.
892
893 When the connection attempt to host1 fails, or when the target_ses‐
894 sion_attrs can not be verified, then the psql application attempts to
895 connect to host2.
896
897 The behavior is implemented in the connection library libpq, so any ap‐
898 plication using it can benefit from this implementation, not just psql.
899
900 When using pg_auto_failover, configure your application connection
901 string to use the primary and the secondary server host names, and set
902 target_session_attrs=read-write too, so that your application automati‐
903 cally connects to the current primary, even after a failover occurred.
904
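   As a sketch, reusing the host and database names from the tutorial
   above, such an application connection string looks like the formation
   URI that pg_autoctl show uri prints:

       postgres://ha-demo-a.internal.cloudapp.net:5432,ha-demo-b.internal.cloudapp.net:5432/appdb?target_session_attrs=read-write&sslmode=require
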
905 Monitoring protocol
   The monitor interacts with the data nodes in two ways:
907
908 • Data nodes periodically connect and run SELECT pgauto‐
909 failover.node_active(...) to communicate their current state and
910 obtain their goal state.
911
912 • The monitor periodically connects to all the data nodes to see if
913 they are healthy, doing the equivalent of pg_isready.
914
   When a data node calls node_active, the state of the node is stored in
   the pgautofailover.node table and the state machines of both nodes are
   advanced. The state machines are described later in this
   documentation. The monitor typically only moves one state forward and
   waits for the node(s) to converge, except in failure states.
920
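   For troubleshooting, that table can be inspected directly on the
   monitor; a minimal sketch, assuming a local connection to the monitor's
   pg_auto_failover database (pg_autoctl show state remains the supported
   way to view the same information):

       $ psql -d pg_auto_failover -c 'TABLE pgautofailover.node;'
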
   If a node is not communicating with the monitor, this will either cause
   a failover (if the node is a primary), disable synchronous replication
   (if the node is a secondary), or cause the state machine to pause until
   the node comes back (in other cases). In most cases, the latter is
   harmless, though in some cases it may cause downtime to last longer,
   e.g. if a standby goes down during a failover.
927
928 To simplify operations, a node is only considered unhealthy if the mon‐
929 itor cannot connect and it hasn't reported its state through node_ac‐
930 tive for a while. This allows, for example, PostgreSQL to be restarted
931 without causing a health check failure.
932
933 Synchronous vs. asynchronous replication
934 By default, pg_auto_failover uses synchronous replication, which means
935 all writes block until at least one standby node has reported receiving
936 them. To handle cases in which the standby fails, the primary switches
937 between two states called wait_primary and primary based on the health
   of standby nodes, and based on the replication setting
   number_sync_standbys.
940
941 When in the wait_primary state, synchronous replication is disabled by
942 automatically setting synchronous_standby_names = '' to allow writes to
943 proceed. However doing so also disables failover, since the standby
944 might get arbitrarily far behind. If the standby is responding to
945 health checks and within 1 WAL segment of the primary (by default),
946 synchronous replication is enabled again on the primary by setting syn‐
947 chronous_standby_names = '*' which may cause a short latency spike
948 since writes will then block until the standby has caught up.
949
   When using several standby nodes with replication quorum enabled, the
   actual setting for synchronous_standby_names is set to a list of those
   standby nodes that are set to participate in the replication quorum.
953
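   The value currently maintained by pg_auto_failover can be checked
   directly on the primary; a minimal check, using the appdb database from
   the tutorial as an illustration:

       $ psql -d appdb -c 'SHOW synchronous_standby_names;'
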
954 If you wish to disable synchronous replication, you need to add the
955 following to postgresql.conf:
956
957 synchronous_commit = 'local'
958
959 This ensures that writes return as soon as they are committed on the
960 primary -- under all circumstances. In that case, failover might lead
961 to some data loss, but failover is not initiated if the secondary is
   more than 10 WAL segments (by default) behind the primary. During a
963 manual failover, the standby will continue accepting writes from the
964 old primary. The standby will stop accepting writes only if it's fully
965 caught up (most common), the primary fails, or it does not receive
966 writes for 2 minutes.
967
968 A note about performance
   In some cases, the write latency impact of synchronous replication
   prevents the application from delivering its expected performance. If
   testing or production feedback shows this to be the case, it is
   beneficial to switch to using asynchronous replication.
973
974 The way to use asynchronous replication in pg_auto_failover is to
975 change the synchronous_commit setting. This setting can be set per
976 transaction, per session, or per user. It does not have to be set glob‐
977 ally on your Postgres instance.
978
979 One way to benefit from that would be:
980
981 alter role fast_and_loose set synchronous_commit to local;
982
983 That way performance-critical parts of the application don't have to
984 wait for the standby nodes. Only use this when you can also lower your
985 data durability guarantees.
986
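   The same relaxation can be applied to a single transaction instead of a
   whole role; a minimal sketch, with a purely illustrative table name:

       BEGIN;
       SET LOCAL synchronous_commit TO local;
       -- this write does not wait for the standby nodes to acknowledge it
       INSERT INTO audit_log (message) VALUES ('non-critical event');
       COMMIT;
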
987 Node recovery
988 When bringing a node back after a failover, the keeper (pg_autoctl run)
989 can simply be restarted. It will also restart postgres if needed and
990 obtain its goal state from the monitor. If the failed node was a pri‐
991 mary and was demoted, it will learn this from the monitor. Once the
992 node reports, it is allowed to come back as a standby by running
993 pg_rewind. If it is too far behind, the node performs a new pg_base‐
994 backup.
995
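   In practice, bringing a failed node back thus often amounts to starting
   the keeper again; a sketch, assuming the systemd integration shown
   earlier or a directly started keeper with an illustrative data
   directory:

       # via the systemd unit installed earlier
       $ sudo systemctl start pgautofailover

       # or directly, in a terminal
       $ pg_autoctl run --pgdata /var/lib/postgresql/ha
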
997 Pg_auto_failover allows you to have more than one standby node, and of‐
998 fers advanced control over your production architecture characteris‐
999 tics.
1000
1001 Architectures with two standby nodes
1002 When adding your second standby node with default settings, you get the
1003 following architecture:
   [image: pg_auto_failover architecture with two standby nodes] [image]
   pg_auto_failover architecture with two standby nodes.
1006
1007 In this case, three nodes get set up with the same characteristics,
1008 achieving HA for both the Postgres service and the production
1009 dataset. An important setting for this architecture is num‐
1010 ber_sync_standbys.
1011
1012 The replication setting number_sync_standbys sets how many standby
1013 nodes the primary should wait for when committing a transaction. In
1014 order to have a good availability in your system, pg_auto_failover
1015 requires number_sync_standbys + 1 standby nodes participating in the
1016 replication quorum: this allows any standby node to fail without im‐
1017 pact on the system's ability to respect the replication quorum.
1018
1019 When only two nodes are registered in a group on the monitor we have
1020 a primary and a single secondary node. Then number_sync_standbys can
1021 only be set to zero. When adding a second standby node to a
1022 pg_auto_failover group, then the monitor automatically increments
1023 number_sync_standbys to one, as we see in the diagram above.
1024
1025 When number_sync_standbys is set to zero then pg_auto_failover imple‐
1026 ments the Business Continuity setup as seen in architecture_basics:
1027 synchronous replication is then used as a way to guarantee that
1028 failover can be implemented without data loss.
1029
1030 In more details:
1031
1032 1. With number_sync_standbys set to one, this architecture always
1033 maintains two copies of the dataset: one on the current primary
1034 node (node A in the previous diagram), and one on the standby
1035 that acknowledges the transaction first (either node B or node C
1036 in the diagram).
1037
1038 When one of the standby nodes is unavailable, the second copy of
1039 the dataset can still be maintained thanks to the remaining
1040 standby.
1041
1042 When both the standby nodes are unavailable, then it's no longer
1043 possible to guarantee the replication quorum, and thus writes on
1044 the primary are blocked. The Postgres primary node waits until at
1045 least one standby node acknowledges the transactions locally com‐
1046 mitted, thus degrading your Postgres service to read-only.
1047
      2. It is possible to manually set number_sync_standbys to zero after
         having registered two standby nodes to the monitor, overriding
         the default behavior.
1051
1052 In that case, when the second standby node becomes unhealthy at
1053 the same time as the first standby node, the primary node is as‐
1054 signed the state wait_primary. In that state, synchronous repli‐
1055 cation is disabled on the primary by setting synchro‐
1056 nous_standby_names to an empty string. Writes are allowed on the
1057 primary, even though there's no extra copy of the production
1058 dataset available at this time.
1059
1060 Setting number_sync_standbys to zero allows data to be written
1061 even when both standby nodes are down. In this case, a single
1062 copy of the production data set is kept and, if the primary was
1063 then to fail, some data will be lost. How much depends on your
1064 backup and recovery mechanisms.
1065
1066 Replication Settings and Postgres Architectures
1067 The entire flexibility of pg_auto_failover can be leveraged with the
1068 following three replication settings:
1069
   • Number of sync standbys
1071
1072 • Replication quorum
1073
1074 • Candidate priority
1075
1076 Number Sync Standbys
   This parameter is used by Postgres in the synchronous_standby_names
   parameter: number_sync_standbys is the number of synchronous standbys
   for whose replies transactions must wait.
1080
1081 This parameter can be set at the formation level in pg_auto_failover,
1082 meaning that it applies to the current primary, and "follows" a
1083 failover to apply to any new primary that might replace the current
1084 one.
1085
1086 To set this parameter to the value <n>, use the following command:
1087
1088 pg_autoctl set formation number-sync-standbys <n>
1089
1090 The default value in pg_auto_failover is zero. When set to zero, the
1091 Postgres parameter synchronous_standby_names can be set to either '*'
1092 or to '':
1093
1094 • synchronous_standby_names = '*' means that any standby may partici‐
1095 pate in the replication quorum for transactions with synchronous_com‐
1096 mit set to on or higher values.
1097
     pg_auto_failover uses synchronous_standby_names = '*' when there's at
1099 least one standby that is known to be healthy.
1100
   • synchronous_standby_names = '' (empty string) disables synchronous
     commit and makes all your commits asynchronous, meaning that transaction
1103 commits will not wait for replication. In other words, a single copy
1104 of your production data is maintained when synchronous_standby_names
1105 is set that way.
1106
     pg_auto_failover uses synchronous_standby_names = '' only when num‐
1108 ber_sync_standbys is set to zero and there's no standby node known
1109 healthy by the monitor.
1110
1111 In order to set number_sync_standbys to a non-zero value,
1112 pg_auto_failover requires that at least number_sync_standbys + 1
1113 standby nodes be registered in the system.
1114
1115 When the first standby node is added to the pg_auto_failover monitor,
1116 the only acceptable value for number_sync_standbys is zero. When a sec‐
1117 ond standby is added that participates in the replication quorum, then
1118 number_sync_standbys is automatically set to one.
1119
1120 The command pg_autoctl set formation number-sync-standbys can be used
1121 to change the value of this parameter in a formation, even when all the
1122 nodes are already running in production. The pg_auto_failover monitor
1123 then sets a transition for the primary to update its local value of
1124 synchronous_standby_names.
1125
1126 Replication Quorum
1127 The replication quorum setting is a boolean and defaults to true, and
1128 can be set per-node. Pg_auto_failover includes a given node in synchro‐
1129 nous_standby_names only when the replication quorum parameter has been
1130 set to true. This means that asynchronous replication will be used for
1131 nodes where replication-quorum is set to false.
1132
   It is possible to force asynchronous replication globally by setting
   replication quorum to false on all the nodes in a formation. Remember
   that failovers will happen, so set your replication settings on the
   current primary node too: it is going to be a standby later.
1138
1139 To set this parameter to either true or false, use one of the following
1140 commands:
1141
1142 pg_autoctl set node replication-quorum true
1143 pg_autoctl set node replication-quorum false
1144
1145 Candidate Priority
1146 The candidate priority setting is an integer that can be set to any
1147 value between 0 (zero) and 100 (one hundred). The default value is 50.
1148 When the pg_auto_failover monitor decides to orchestrate a failover, it
1149 uses each node's candidate priority to pick the new primary node.
1150
   When setting the candidate priority of a node down to zero, this node
   will never be selected to be promoted as the new primary when a
   failover is orchestrated by the monitor. The monitor will instead wait
   until another registered node is healthy and in a position to be
   promoted.
1156
1157 To set this parameter to the value <n>, use the following command:
1158
1159 pg_autoctl set node candidate-priority <n>
1160
1161 When nodes have the same candidate priority, the monitor then picks the
1162 standby with the most advanced LSN position published to the monitor.
1163 When more than one node has published the same LSN position, a random
1164 one is chosen.
1165
1166 When the candidate for failover has not published the most advanced LSN
1167 position in the WAL, pg_auto_failover orchestrates an intermediate step
1168 in the failover mechanism. The candidate fetches the missing WAL bytes
   from one of the standbys with the most advanced LSN position prior to
1170 being promoted. Postgres allows this operation thanks to cascading
1171 replication: any standby can be the upstream node for another standby.
1172
1173 It is required at all times that at least two nodes have a non-zero
1174 candidate priority in any pg_auto_failover formation. Otherwise no
1175 failover is possible.
1176
1177 Auditing replication settings
1178 The command pg_autoctl get formation settings (also known as pg_autoctl
1179 show settings) can be used to obtain a summary of all the replication
1180 settings currently in effect in a formation. Still using the first dia‐
1181 gram on this page, we get the following summary:
1182
1183 $ pg_autoctl get formation settings
1184 Context | Name | Setting | Value
1185 ----------+---------+---------------------------+-------------------------------------------------------------
1186 formation | default | number_sync_standbys | 1
1187 primary | node_A | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_3, pgautofailover_standby_2)'
1188 node | node_A | replication quorum | true
1189 node | node_B | replication quorum | true
1190 node | node_C | replication quorum | true
1191 node | node_A | candidate priority | 50
1192 node | node_B | candidate priority | 50
1193 node | node_C | candidate priority | 50
1194
1195 We can see that the number_sync_standbys has been used to compute the
1196 current value of the synchronous_standby_names setting on the primary.
1197
1198 Because all the nodes in that example have the same default candidate
1199 priority (50), then pg_auto_failover is using the form ANY 1 with the
1200 list of standby nodes that are currently participating in the replica‐
1201 tion quorum.
1202
1203 The entries in the synchronous_standby_names list are meant to match
1204 the application_name connection setting used in the primary_conninfo,
1205 and the format used by pg_auto_failover there is the format string
1206 "pgautofailover_standby_%d" where %d is replaced by the node id. This
1207 allows keeping the same connection string to the primary when the node
1208 name is changed (using the command pg_autoctl set metadata --name).
1209
1210 Here we can see the node id of each registered Postgres node with the
1211 following command:
1212
1213 $ pg_autoctl show state
1214 Name | Node | Host:Port | LSN | Reachable | Current State | Assigned State
1215 -------+-------+----------------+-----------+-----------+---------------------+--------------------
1216 node_A | 1 | localhost:5001 | 0/7002310 | yes | primary | primary
1217 node_B | 2 | localhost:5002 | 0/7002310 | yes | secondary | secondary
1218 node_C | 3 | localhost:5003 | 0/7002310 | yes | secondary | secondary
1219
   When setting up pg_auto_failover with the per-formation
   number_sync_standbys setting and the per-node replication quorum and
   candidate priority settings, those properties are then used to compute
   the synchronous_standby_names value on the primary node. This value is
   automatically maintained on the primary by pg_auto_failover, and is
   updated either when replication settings are changed or when a failover
   happens.
1226
   The other situation in which the pg_auto_failover replication settings
   are used is a candidate election when a failover happens and there are
   more than two nodes registered in a group. Then the node with the
   highest candidate priority is selected, as detailed above in the
   Candidate Priority section.
1232
1233 Sample architectures with three standby nodes
1234 When setting the three parameters above, it's possible to design very
1235 different Postgres architectures for your production needs.
   [image: pg_auto_failover architecture with three standby nodes]
   [image] pg_auto_failover architecture with three standby nodes.
1238
1239 In this case, the system is set up with three standby nodes all set
1240 the same way, with default parameters. The default parameters support
1241 setting number_sync_standbys = 2. This means that Postgres will main‐
1242 tain three copies of the production data set at all times.
1243
1244 On the other hand, if two standby nodes were to fail at the same
1245 time, despite the fact that two copies of the data are still main‐
1246 tained, the Postgres service would be degraded to read-only.
1247
1248 With this architecture diagram, here's the summary that we obtain:
1249
1250 $ pg_autoctl show settings
1251 Context | Name | Setting | Value
1252 ----------+---------+---------------------------+---------------------------------------------------------------------------------------
1253 formation | default | number_sync_standbys | 2
1254 primary | node_A | synchronous_standby_names | 'ANY 2 (pgautofailover_standby_2, pgautofailover_standby_4, pgautofailover_standby_3)'
1255 node | node_A | replication quorum | true
1256 node | node_B | replication quorum | true
1257 node | node_C | replication quorum | true
1258 node | node_D | replication quorum | true
1259 node | node_A | candidate priority | 50
1260 node | node_B | candidate priority | 50
1261 node | node_C | candidate priority | 50
1262 node | node_D | candidate priority | 50
1263
1264 Sample architecture with three standby nodes, one async
1265 [image: pg_auto_failover architecture with three standby nodes, one async]
1268
1269 In this case, the system is set up with two standby nodes participat‐
1270 ing in the replication quorum, allowing for number_sync_standbys = 1.
1271 The system always maintains at least two copies of the data set, one
1272 on the primary, another on either node B or node D. Whenever we lose
1273 one of those nodes, we can hold to the guarantee of having two copies
1274 of the data set.
1275
1276 Additionally, we have the standby server C which has been set up to
1277 not participate in the replication quorum. Node C will not be found
1278 in the synchronous_standby_names list of nodes. Also, node C is set
1279 up to never be a candidate for failover, with candidate-priority = 0.
1280
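       As a sketch, a node such as node C can be given those replication
       settings with the pg_autoctl set node commands documented later in
       this manual page, run from that node with the usual connection
       options elided:

          $ pg_autoctl set node replication-quorum false
          $ pg_autoctl set node candidate-priority 0
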
1281 This architecture would fit a situation where nodes A, B, and D are
1282 deployed in the same data center or availability zone and node C in
1283 another one. Those three nodes are set up to support the main pro‐
1284 duction traffic and implement high availability of both the Postgres
1285 service and the data set.
1286
1287 Node C might be set up for Business Continuity in case the first data
1288 center is lost, or maybe for reporting needs on another application
1289 domain.
1290
1291 With this architecture diagram, here's the summary that we obtain:
1292
1293   $ pg_autoctl show settings
1294 Context | Name | Setting | Value
1295 ----------+---------+---------------------------+-------------------------------------------------------------
1296 formation | default | number_sync_standbys | 1
1297 primary | node_A | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_4, pgautofailover_standby_2)'
1298 node | node_A | replication quorum | true
1299 node | node_B | replication quorum | true
1300 node | node_C | replication quorum | false
1301 node | node_D | replication quorum | true
1302 node | node_A | candidate priority | 50
1303 node | node_B | candidate priority | 50
1304 node | node_C | candidate priority | 0
1305 node | node_D | candidate priority | 50
1306
1308 Introduction
1309 pg_auto_failover uses a state machine for highly controlled execution.
1310 As keepers inform the monitor about new events (or fail to contact it
1311 at all), the monitor assigns each node both a current state and a goal
1312 state. A node's current state is a strong guarantee of its capabili‐
1313 ties. States themselves do not cause any actions; actions happen during
1314 state transitions. The assigned goal states inform keepers of what
1315 transitions to attempt.
1316
1317 Example of state transitions in a new cluster
1318 A good way to get acquainted with the states is by examining the tran‐
1319 sitions of a cluster from birth to high availability.
1320
1321 After starting a monitor and running keeper init for the first data
1322 node ("node A"), the monitor registers the state of that node as "init"
1323 with a goal state of "single." The init state means the monitor knows
1324 nothing about the node other than its existence because the keeper is
1325 not yet continuously running there to report node health.
1326
1327 Once the keeper runs and reports its health to the monitor, the monitor
1328 assigns it the state "single," meaning it is just an ordinary Postgres
1329 server with no failover. Because there are not yet other nodes in the
1330 cluster, the monitor also assigns node A the goal state of single --
1331 there's nothing that node A's keeper needs to change.
1332
1333 As soon as a new node ("node B") is initialized, the monitor assigns
1334 node A the goal state of "wait_primary." This means the node still has
1335 no failover, but there's hope for a secondary to synchronize with it
1336 soon. To accomplish the transition from single to wait_primary, node
1337 A's keeper adds node B's hostname to pg_hba.conf to allow a hot standby
1338 replication connection.
1339
1340 At the same time, node B transitions into wait_standby with the goal
1341 initially of staying in wait_standby. It can do nothing but wait until
1342 node A gives it access to connect. Once node A has transitioned to
1343 wait_primary, the monitor assigns B the goal of "catchingup," which
1344 gives B's keeper the green light to make the transition from
1345 wait_standby to catchingup. This transition involves running pg_base‐
1346 backup, editing recovery.conf and restarting PostgreSQL in Hot Standby
1347 mode.
1348
1349 Node B reports to the monitor when it's in hot standby mode and able to
1350 connect to node A. The monitor then assigns node B the goal state of
1351 "secondary" and A the goal of "primary." Postgres ships WAL logs from
1352 node A and replays them on B. Finally B is caught up and tells the mon‐
1353 itor (specifically B reports its pg_stat_replication.sync_state and WAL
1354 replay lag). At this glorious moment the monitor assigns A the state
1355 primary (goal: primary) and B secondary (goal: secondary).
1356
1357 State reference
1358 The following diagram shows the pg_auto_failover State Machine. It's
1359 missing links to the single state, which can always be reached when
1360 removing all the other nodes.
1361 [image: pg_auto_failover Finite State Machine diagram]
1363
1364 In the previous diagram we can see that we have a list of five states
1365 where the application can connect to a read-write Postgres service:
1366 single, wait_primary, primary, prepare_maintenance, and apply_set‐
1367 tings.
1368
1369 Init
1370 A node is assigned the "init" state when it is first registered with
1371 the monitor. Nothing is known about the node at this point beyond its
1372 existence. If no other node has been registered with the monitor for
1373 the same formation and group ID then this node is assigned a goal state
1374 of "single." Otherwise the node has the goal state of "wait_standby."
1375
1376 Single
1377 There is only one node in the group. It behaves as a regular PostgreSQL
1378 instance, with no high availability and no failover. If the administra‐
1379 tor removes a node, the other node will revert to the single state.
1380
1381 Wait_primary
1382 Applied to a node intended to be the primary but not yet in that posi‐
1383 tion. The primary-to-be at this point knows the secondary's node name
1384 or IP address, and has granted the node hot standby access in the
1385 pg_hba.conf file.
1386
1387 The wait_primary state may be caused either by a new potential sec‐
1388 ondary being registered with the monitor (good), or an existing sec‐
1389 ondary becoming unhealthy (bad). In the latter case, during the transi‐
1390 tion from primary to wait_primary, the primary node's keeper disables
1391 synchronous replication on the node. It also cancels currently blocked
1392 queries.
1393
1394 Join_primary
1395 Applied to a primary node when another standby is joining the group.
1396 This allows the primary node to apply necessary changes to its HBA
1397 setup before allowing the new node joining the system to run the
1398 pg_basebackup command.
1399
1400 IMPORTANT:
1401 This state has been deprecated, and is no longer assigned to nodes.
1402 Any time we would have used join_primary before, we now use primary
1403 instead.
1404
1405 Primary
1406 A healthy secondary node exists and has caught up with WAL replication.
1407 Specifically, the keeper reports the primary state only when it has
1408 verified that the secondary is reported "sync" in pg_stat_replica‐
1409 tion.sync_state, and with a WAL lag of 0.
1410
1411 The primary state is a strong assurance. It's the only state where we
1412 know we can fail over when required.
1413
1414 During the transition from wait_primary to primary, the keeper also en‐
1415 ables synchronous replication. This means that after a failover the
1416 secondary will be fully up to date.
1417
1418 Wait_standby
1419 The monitor decides this node is a standby. The node must wait until
1420 the primary authorizes it to connect and set up hot standby replication.
1421
1422 Catchingup
1423 The monitor assigns catchingup to the standby node when the primary is
1424 ready for a replication connection (pg_hba.conf has been properly
1425 edited, connection role added, etc).
1426
1427 The standby node keeper runs pg_basebackup, connecting to the primary's
1428 hostname and port. The keeper then edits recovery.conf and starts Post‐
1429 greSQL in hot standby mode.
1430
1431 Secondary
1432 A node with this state is acting as a hot standby for the primary, and
1433 is up to date with the WAL log there. In particular, it is within 16MB
1434 or 1 WAL segment of the primary.
1435
1436 Maintenance
1437 The cluster administrator can manually move a secondary into the main‐
1438 tenance state to gracefully take it offline. The primary will then
1439 transition from state primary to wait_primary, during which time the
1440 secondary remains online and the primary continues to accept writes.
1441 When the primary reaches the wait_primary state, the secondary is safe
1442 to take offline with minimal consequences.
1443
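       The commands involved here are pg_autoctl enable maintenance and
       pg_autoctl disable maintenance, documented later in this manual page.
       A minimal sketch, run from the secondary node to take offline, with
       the usual connection options elided:

          $ pg_autoctl enable maintenance
          # ... perform the maintenance work on this node ...
          $ pg_autoctl disable maintenance
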
1444 Prepare_maintenance
1445 The cluster administrator can manually move a primary node into the
1446 maintenance state to gracefully take it offline. The primary then tran‐
1447 sitions to the prepare_maintenance state to make sure the secondary is
1448 not missing any writes. In the prepare_maintenance state, the primary
1449 shuts down.
1450
1451 Wait_maintenance
1452 The cluster administrator can manually move a secondary into the mainte‐
1453 nance state to gracefully take it offline. Before reaching the mainte‐
1454 nance state though, we want to switch the primary node to asynchronous
1455 replication, in order to avoid writes being blocked. In the state
1456 wait_maintenance the standby waits until the primary has reached
1457 wait_primary.
1458
1459 Draining
1460 A state between primary and demoted where replication buffers finish
1461 flushing. A draining node will not accept new client writes, but will
1462 continue to send existing data to the secondary.
1463
1464 To implement that with Postgres we actually stop the service. When
1465 stopping, Postgres ensures that the current replication buffers are
1466 flushed correctly to synchronous standbys.
1467
1468 Demoted
1469 The primary keeper or its database were unresponsive past a certain
1470 threshold. The monitor assigns demoted state to the primary to avoid a
1471 split-brain scenario where there might be two nodes that don't communi‐
1472 cate with each other and both accept client writes.
1473
1474 In that state the keeper stops PostgreSQL and prevents it from running.
1475
1476 Demote_timeout
1477 If the monitor assigns the primary a demoted goal state but the primary
1478 keeper doesn't acknowledge transitioning to that state within a timeout
1479 window, then the monitor assigns demote_timeout to the primary.
1480
1481 This most commonly happens when the primary machine goes silent and the
1482 keeper is no longer reporting to the monitor.
1483
1484 Stop_replication
1485 The stop_replication state is meant to ensure that the primary goes to
1486 the demoted state before the standby goes to single and accepts writes
1487 (in case the primary can’t contact the monitor anymore). Before promot‐
1488 ing the secondary node, the keeper stops PostgreSQL on the primary to
1489 avoid split-brain situations.
1490
1491 For safety, when the primary fails to contact the monitor and fails to
1492 see the pg_auto_failover connection in pg_stat_replication, then it
1493 goes to the demoted state of its own accord.
1494
1495 Prepare_promotion
1496 The prepare_promotion state is meant to prepare the standby server to
1497 being promoted. This state allows synchronisation on the monitor, mak‐
1498 ing sure that the primary has stopped Postgres before promoting the
1499 secondary, hence preventing split brain situations.
1500
1501 Report_LSN
1502 The report_lsn state is assigned to standby nodes when a failover is
1503 orchestrated and there are several standby nodes. In order to pick the
1504 standby furthest along in replication, pg_auto_failover first needs a
1505 fresh report of the current LSN position reached on each standby node.
1506
1507 When a node reaches the report_lsn state, the replication stream is
1508 stopped, by restarting Postgres without a primary_conninfo. This allows
1509 the primary node to detect network partitions, i.e. when the primary
1510 can't connect to the monitor and there's no standby listed in
1511 pg_stat_replication.
1512
1513 Fast_forward
1514 The fast_forward state is assigned to the selected promotion candidate
1515 during a failover when it won the election thanks to the candidate pri‐
1516 ority settings, but the selected node is not the most advanced standby
1517 node as reported in the report_lsn state.
1518
1519 Missing WAL bytes are fetched from one of the most advanced standby
1520 nodes by using Postgres cascading replication features: it is possible
1521 to use any standby node in the primary_conninfo.
1522
1523 Dropped
1524 The dropped state is assigned to a node when the pg_autoctl drop node
1525 command is used. This allows the node to implement specific local ac‐
1526 tions before being entirely removed from the monitor database.
1527
1528 When a node reports reaching the dropped state, the monitor removes its
1529 entry. If a node is not reporting anymore, maybe because it's com‐
1530 pletely unavailable, then it's possible to run the pg_autoctl drop node
1531 --force command, and then the node entry is removed from the monitor.
1532
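       A minimal sketch of both variants of that command, with the usual
       connection options elided:

          # run on the node that is being removed
          $ pg_autoctl drop node

          # when the node is not reporting anymore
          $ pg_autoctl drop node --force ...
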
1533 Failover logic
1534 This section needs to be expanded further, but below is the failover
1535 state machine for each node that is implemented by the monitor:
1536 [image: Node state machine]
1537
1538 Since the state machines of the data nodes always move in tandem, a
1539 pair (group) of data nodes also implicitly has the following state
1540 machine:
1541 [image: Group state machine]
1542
1543 pg_auto_failover keeper's State Machine
1544 When built in TEST mode, it is then possible to use the following com‐
1545 mand to get a visual representation of the Keeper's Finite State Ma‐
1546 chine:
1547
1548 $ PG_AUTOCTL_DEBUG=1 pg_autoctl do fsm gv | dot -Tsvg > fsm.svg
1549
1550 The dot program is part of the Graphviz suite and produces the follow‐
1551 ing output:
1552 [image: Keeper state machine]
1553
1555 At the heart of the pg_auto_failover implementation is a State Machine.
1556 The state machine is driven by the monitor, and its transitions are im‐
1557 plemented in the keeper service, which then reports success to the mon‐
1558 itor.
1559
1560 The keeper is allowed to retry transitions as many times as needed
1561 until they succeed, and also reports failures to reach the assigned
1562 state to the monitor node. The monitor also implements frequent
1563 health-checks targeting the registered PostgreSQL nodes.
1564
1565 When the monitor detects something is not as expected, it takes action
1566 by assigning a new goal state to the keeper, which is then responsible
1567 for implementing the transition to this new state and reporting back.
1568
1569 Unhealthy Nodes
1570 The pg_auto_failover monitor is responsible for running regular
1571 health-checks with every PostgreSQL node it manages. A health-check is
1572 successful when it is able to connect to the PostgreSQL node using the
1573 PostgreSQL protocol (libpq), imitating the pg_isready command.
1574
1575 How frequent those health checks are (20s by default), the PostgreSQL
1576 connection timeout in use (5s by default), and how many times to retry
1577 in case of a failure before marking the node unhealthy (2 by default)
1578 are GUC variables that you can set on the Monitor node itself. Remem‐
1579 ber, the monitor is implemented as a PostgreSQL extension, so the setup
1580 is a set of PostgreSQL configuration settings:
1581
1582 SELECT name, setting
1583 FROM pg_settings
1584 WHERE name ~ 'pgautofailover\.health';
1585 name | setting
1586 -----------------------------------------+---------
1587 pgautofailover.health_check_max_retries | 2
1588 pgautofailover.health_check_period | 20000
1589 pgautofailover.health_check_retry_delay | 2000
1590 pgautofailover.health_check_timeout | 5000
1591 (4 rows)
1592
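       Those GUCs can be changed like any other Postgres parameter, in a
       psql session connected to the monitor. As a sketch, this relaxes the
       health check period to 30 seconds (the values are in milliseconds);
       depending on the parameter, a configuration reload may be enough,
       otherwise restart the monitor Postgres instance:

          ALTER SYSTEM SET pgautofailover.health_check_period TO 30000;
          SELECT pg_reload_conf();
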
1593 The pg_auto_failover keeper also reports if PostgreSQL is running as
1594 expected. This is useful for situations where the PostgreSQL server /
1595 OS is running fine and the keeper (pg_autoctl run) is still active, but
1596 PostgreSQL has failed. Situations might include a full file system on
1597 the WAL disk, some file system level corruption, missing files, etc.
1598
1599 Here's what happens to your PostgreSQL service when any single-node
1600 failure is observed:
1601
1602 • Primary node is monitored unhealthy
1603
1604 When the primary node is unhealthy, and only when the secondary
1605 node is itself in good health, then the primary node is asked to
1606 transition to the DRAINING state, and the attached secondary is
1607 asked to transition to the state PREPARE_PROMOTION. In this state,
1608 the secondary is asked to catch-up with the WAL traffic from the
1609 primary, and then report success.
1610
1611 The monitor then continues orchestrating the promotion of the
1612 standby: it stops the primary (implementing STONITH in order to
1613 prevent any data loss), and promotes the secondary into being a
1614 primary now.
1615
1616         Depending on what caused the primary to be flagged unhealthy,
1617         the secondary may fail to catch up with WAL from it. In that
1618         case, after the PREPARE_PROMOTION_CATCHUP_TIMEOUT has elapsed,
1619         the standby reports success anyway and the failover sequence
1620         continues from the monitor.
1621
1622 • Secondary node is monitored unhealthy
1623
1624 When the secondary node is unhealthy, the monitor assigns to it
1625 the state CATCHINGUP, and assigns the state WAIT_PRIMARY to the
1626 primary node. When implementing the transition from PRIMARY to
1627 WAIT_PRIMARY, the keeper disables synchronous replication.
1628
1629         When the keeper reports an acceptable WAL difference between the
1630         two nodes again, then the replication is upgraded back to being
1631         synchronous. While a secondary node is not in the SECONDARY state,
1632 secondary promotion is disabled.
1633
1634 • Monitor node has failed
1635
1636         Then the primary and secondary nodes just work as if you hadn't
1637         set up pg_auto_failover in the first place, as the keeper fails
1638         to report local state from the nodes. Also, health checks are
1639         not performed. This means that no automated failover may happen,
1640         even if needed.
1641
1642 Network Partitions
1643 Adding to those simple situations, pg_auto_failover is also resilient
1644 to Network Partitions. Here's the list of situations that have an impact
1645 on pg_auto_failover behavior, and the actions taken to ensure High
1646 Availability of your PostgreSQL service:
1647
1648 • Primary can't connect to Monitor
1649
1650 Then it could be that either the primary is alone on its side of a
1651 network split, or that the monitor has failed. The keeper decides
1652 depending on whether the secondary node is still connected to the
1653 replication slot, and if we have a secondary, continues to serve
1654 PostgreSQL queries.
1655
1656 Otherwise, when the secondary isn't connected, and after the NET‐
1657 WORK_PARTITION_TIMEOUT has elapsed, the primary considers it might
1658         be alone in a network partition: that's a potential split-brain
1659         situation, and there is only one way to prevent it. The primary
1660         stops, and reports a new state of DEMOTE_TIMEOUT.
1661
1662         The network_partition_timeout can be set up in the keeper's
1663         configuration and defaults to 20s.
1664
1665 • Monitor can't connect to Primary
1666
1667         Once all the retries have been done and the timeouts have
1668         elapsed, the primary node is considered unhealthy and the monitor
1669         begins the failover routine. This routine has several steps, each
1670         of which lets us check our expectations and step back if needed.
1671
1672 For the failover to happen, the secondary node needs to be healthy
1673         and caught-up with the primary. Only if we time out while waiting
1674         for the WAL delta to shrink (30s by default) is the secondary
1675         promoted, with uncertainty about the data durability in the
1676         group.
1677
1678 • Monitor can't connect to Secondary
1679
1680 As soon as the secondary is considered unhealthy then the monitor
1681 changes the replication setting to asynchronous on the primary, by
1682 assigning it the WAIT_PRIMARY state. Also the secondary is as‐
1683 signed the state CATCHINGUP, which means it can't be promoted in
1684 case of primary failure.
1685
1686         The monitor tracks the WAL delta between the two servers, which
1687         both report it independently. As soon as the standby has caught
1688         up with the primary again, it becomes eligible for promotion: it
1689         is assigned the SECONDARY state and the replication is switched
1690         back to synchronous.
1691
1692 Failure handling and network partition detection
1693 If a node cannot communicate with the monitor, either because the monitor
1694 is down or because there is a problem with the network, it will simply
1695 remain in the same state until the monitor comes back.
1696
1697 If there is a network partition, it might be that the monitor and sec‐
1698 ondary can still communicate and the monitor decides to promote the
1699 secondary since the primary is no longer responsive. Meanwhile, the
1700 primary is still up-and-running on the other side of the network parti‐
1701 tion. If a primary cannot communicate with the monitor, it starts checking
1702 whether the secondary is still connected. In PostgreSQL, the secondary
1703 connection automatically times out after 30 seconds. If last contact
1704 with the monitor and the last time a connection from the secondary was
1705 observed are both more than 30 seconds in the past, the primary con‐
1706 cludes it is on the losing side of a network partition and shuts itself
1707 down. It may be that the secondary and the monitor were actually down
1708 and the primary was the only node that was alive, but we currently do
1709 not have a way to distinguish such a situation. As with consensus al‐
1710 gorithms, availability can only be correctly preserved if at least 2
1711 out of 3 nodes are up.
1712
1713 In asymmetric network partitions, the primary might still be able to
1714 talk to the secondary, while unable to talk to the monitor. During
1715 failover, the monitor therefore assigns the secondary the stop_replica‐
1716 tion state, which will cause it to disconnect from the primary. After
1717 that, the primary is expected to shut down after at least 30 and at
1718 most 60 seconds. To factor in worst-case scenarios, the monitor waits
1719 for 90 seconds before promoting the secondary to become the new pri‐
1720 mary.
1721
1723 We provide native system packages for pg_auto_failover on most popular
1724 Linux distributions.
1725
1726 Use the steps below to install pg_auto_failover; the package names used
1727 in the examples target a specific Postgres major version. At the current
1728 time pg_auto_failover is compatible with PostgreSQL 10, 11, 12, and 13.
1729
1730 Ubuntu or Debian
1731 Quick install
1732 The following installation method downloads a bash script that auto‐
1733 mates several steps. The full script is available for review at our
1734 package cloud installation instructions page.
1735
1736 # add the required packages to your system
1737 curl https://install.citusdata.com/community/deb.sh | sudo bash
1738
1739 # install pg_auto_failover
1740 sudo apt-get install postgresql-11-auto-failover
1741
1742 # confirm installation
1743 /usr/bin/pg_autoctl --version
1744
1745 Manual Installation
1746 If you'd prefer to set up the package repository on your system manually,
1747 follow the instructions from the package cloud manual installation page.
1748 This page will guide you through the specific details of these 3 steps:
1749
1750 1. install CitusData GnuPG key for its package repository
1751
1752 2. install a new apt source for CitusData packages
1753
1754 3. update your available package list
1755
1756 Then when that's done, you can proceed with installing pg_auto_failover
1757 itself as in the previous case:
1758
1759 # install pg_auto_failover
1760 sudo apt-get install postgresql-11-auto-failover
1761
1762 # confirm installation
1763 /usr/bin/pg_autoctl --version
1764
1765 Fedora, CentOS, or Red Hat
1766 Quick install
1767 The following installation method downloads a bash script that auto‐
1768 mates several steps. The full script is available for review at our
1769 package cloud installation instructions page.
1770
1771 # add the required packages to your system
1772 curl https://install.citusdata.com/community/rpm.sh | sudo bash
1773
1774 # install pg_auto_failover
1775 sudo yum install -y pg-auto-failover14_12
1776
1777 # confirm installation
1778 /usr/pgsql-12/bin/pg_autoctl --version
1779
1780 Manual installation
1781 If you'd prefer to set up the package repository on your system manually,
1782 follow the instructions from the package cloud manual installation page.
1783 This page will guide you through the specific details of these 3 steps:
1784
1785        1. install the pygpgme and yum-utils packages for your distribution
1786
1787        2. install a new RPM repository for CitusData packages
1788
1789 3. update your local yum cache
1790
1791 Then when that's done, you can proceed with installing pg_auto_failover
1792 itself as in the previous case:
1793
1794 # install pg_auto_failover
1795 sudo yum install -y pg-auto-failover14_12
1796
1797 # confirm installation
1798 /usr/pgsql-12/bin/pg_autoctl --version
1799
1800 Installing a pgautofailover Systemd unit
1801 The command pg_autoctl show systemd outputs a systemd unit file that
1802 you can use to set up a boot-time registered service for
1803 pg_auto_failover on your machine.
1804
1805 Here's a sample output from the command:
1806
1807 $ export PGDATA=/var/lib/postgresql/monitor
1808 $ pg_autoctl show systemd
1809 13:44:34 INFO HINT: to complete a systemd integration, run the following commands:
1810 13:44:34 INFO pg_autoctl -q show systemd --pgdata "/var/lib/postgresql/monitor" | sudo tee /etc/systemd/system/pgautofailover.service
1811 13:44:34 INFO sudo systemctl daemon-reload
1812 13:44:34 INFO sudo systemctl start pgautofailover
1813 [Unit]
1814 Description = pg_auto_failover
1815
1816 [Service]
1817 WorkingDirectory = /var/lib/postgresql
1818 Environment = 'PGDATA=/var/lib/postgresql/monitor'
1819 User = postgres
1820 ExecStart = /usr/lib/postgresql/10/bin/pg_autoctl run
1821 Restart = always
1822 StartLimitBurst = 0
1823
1824 [Install]
1825 WantedBy = multi-user.target
1826
1827 Copy/pasting the commands given in the hint output from the command
1828 will enable the pgautofailover service on your system, when using
1829 systemd.
1830
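       The hint output above covers installing and starting the service; to
       also have it start at boot time, the service can additionally be
       enabled with standard systemd usage (this is not specific to
       pg_auto_failover):

          $ sudo systemctl enable pgautofailover
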
1831 It is important that PostgreSQL is started by pg_autoctl rather than by
1832 systemd itself, as it might be that a failover has been done during a
1833 reboot, for instance, and that once the reboot completes we want the
1834 local Postgres to re-join as a secondary node where it used to be a
1835 primary node.
1836
1838 In order to be able to orchestrate fully automated failovers,
1839 pg_auto_failover needs to be able to establish the following Postgres
1840 connections:
1841
1842 • from the monitor node to each Postgres node to check the node's
1843 “health”
1844
1845 • from each Postgres node to the monitor to implement our node_ac‐
1846 tive protocol and fetch the current assigned state for this node
1847
1848 • from the secondary node to the primary node for Postgres streaming
1849 replication.
1850
1851 Postgres Client authentication is controlled by a configuration file:
1852 pg_hba.conf. This file contains a list of rules where each rule may al‐
1853 low or reject a connection attempt.
1854
1855 For pg_auto_failover to work as intended, some HBA rules need to be
1856 added to each node configuration. You can choose to provision the
1857 pg_hba.conf file yourself thanks to the pg_autoctl option --skip-pg-hba,
1858 or you can use the following options to control which kind of rules
1859 are going to be added for you.
1860
1861 Postgres HBA rules
1862 For your application to be able to connect to the current Postgres pri‐
1863 mary servers, some application specific HBA rules have to be added to
1864 pg_hba.conf. There is no provision for doing that in pg_auto_failover.
1865
1866 In other words, it is expected that you have to edit pg_hba.conf to
1867 open connections for your application needs.
1868
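       As an illustration only, such an application specific rule could look
       like the following; the database name, role, and network are
       hypothetical and must be adapted to your setup:

          # hypothetical rule allowing an application role to connect with a password
          hostssl myappdb myappuser 10.0.0.0/8 scram-sha-256
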
1869 The trust security model
1870 As its name suggests, the trust security model does not enable any kind
1871 of security validation. This setting is popular for testing deployments
1872 though, as it makes it very easy to verify that everything works as in‐
1873 tended before putting security restrictions in place.
1874
1875 To enable a “trust” security model with pg_auto_failover, use the
1876 pg_autoctl option --auth trust when creating nodes:
1877
1878 $ pg_autoctl create monitor --auth trust ...
1879 $ pg_autoctl create postgres --auth trust ...
1880 $ pg_autoctl create postgres --auth trust ...
1881
1882 When using --auth trust pg_autoctl adds new HBA rules in the monitor
1883 and the Postgres nodes to enable connections as seen above.
1884
1885 Authentication with passwords
1886 To set up pg_auto_failover with passwords for connections, you can use
1887 one of the password based authentication methods supported by Postgres,
1888 such as password or scram-sha-256. We recommend the latter, as in the
1889 following example:
1890
1891 $ pg_autoctl create monitor --auth scram-sha-256 ...
1892
1893 pg_autoctl does not set the password for you. The first step is to
1894 set the database user password in the monitor database thanks to the
1895 following command:
1896
1897 $ psql postgres://monitor.host/pg_auto_failover
1898 > alter user autoctl_node password 'h4ckm3';
1899
1900 Now that the monitor is ready with our password set for the au‐
1901 toctl_node user, we can use the password in the monitor connection
1902 string used when creating Postgres nodes.
1903
1904 On the primary node, we can create the Postgres setup as usual, and
1905 then set our replication password, which we will use if we are demoted
1906 and later re-join as a standby:
1907
1908 $ pg_autoctl create postgres \
1909 --auth scram-sha-256 \
1910 ... \
1911 --monitor postgres://autoctl_node:h4ckm3@monitor.host/pg_auto_failover
1912
1913 $ pg_autoctl config set replication.password h4ckm3m0r3
1914
1915 The second Postgres node is going to be initialized as a secondary and
1916 pg_autoctl then calls pg_basebackup at create time. We need to have the
1917 replication password already set at this time, and we can achieve that
1918 the following way:
1919
1920 $ export PGPASSWORD=h4ckm3m0r3
1921 $ pg_autoctl create postgres \
1922 --auth scram-sha-256 \
1923 ... \
1924 --monitor postgres://autoctl_node:h4ckm3@monitor.host/pg_auto_failover
1925
1926 $ pg_autoctl config set replication.password h4ckm3m0r3
1927
1928 Note that you can use The Password File mechanism as discussed in the
1929 Postgres documentation in order to maintain your passwords in a sepa‐
1930 rate file, not in your main pg_auto_failover configuration file. This
1931 also avoids using passwords in the environment and in command lines.
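
       As a sketch, a ~/.pgpass file for the example above could contain the
       following entries, assuming the monitor listens on port 5432 and that
       the replication user is pgautofailover_replicator as used elsewhere
       in this document; the file must not be world readable (chmod 0600
       ~/.pgpass):

          monitor.host:5432:pg_auto_failover:autoctl_node:h4ckm3
          *:*:replication:pgautofailover_replicator:h4ckm3m0r3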
1932
1933 Encryption of network communications
1934 Postgres knows how to use SSL to enable network encryption of all com‐
1935 munications, including authentication with passwords and the whole data
1936 set when streaming replication is used.
1937
1938 To enable SSL on the server an SSL certificate is needed. It could be
1939 as simple as a self-signed certificate, and pg_autoctl creates such a
1940 certificate for you when using the --ssl-self-signed command line option:
1941
1942 $ pg_autoctl create monitor --ssl-self-signed ... \
1943 --auth scram-sha-256 ... \
1944 --ssl-mode require \
1945 ...
1946
1947 $ pg_autoctl create postgres --ssl-self-signed ... \
1948 --auth scram-sha-256 ... \
1949 ...
1950
1951 $ pg_autoctl create postgres --ssl-self-signed ... \
1952 --auth scram-sha-256 ... \
1953 ...
1954
1955 In that example we set up SSL connections to encrypt the network traf‐
1956 fic, and we still have to set up an authentication mechanism exactly as
1957 in the previous sections of this document. Here scram-sha-256 has been
1958 selected, and the password will be sent over an encrypted channel.
1959
1960 When using the --ssl-self-signed option, pg_autoctl creates a
1961 self-signed certificate, as per the Postgres documentation at the
1962 Creating Certificates page.
1963
1964 The certificate subject CN defaults to the --hostname parameter, which
1965 can be given explicitly or computed by pg_autoctl as either your host‐
1966 name when you have proper DNS resolution, or your current IP address.
1967
1968 Self-signed certificates provide protection against eavesdropping; this
1969 setup does NOT protect against Man-In-The-Middle attacks nor Imperson‐
1970 ation attacks. See PostgreSQL documentation page SSL Support for de‐
1971 tails.
1972
1973 Using your own SSL certificates
1974 In many cases you will want to install certificates provided by your
1975 local security department and signed by a trusted Certificate Author‐
1976 ity. In that case one solution is to use --skip-pg-hba and do the whole
1977 setup yourself.
1978
1979 It is still possible to give the certificates to pg_auto_failover and
1980 have it handle the Postgres setup for you:
1981
1982 $ pg_autoctl create monitor --ssl-ca-file root.crt \
1983 --ssl-crl-file root.crl \
1984 --server-cert server.crt \
1985 --server-key server.key \
1986 --ssl-mode verify-full \
1987 ...
1988
1989 $ pg_autoctl create postgres --ssl-ca-file root.crt \
1990 --server-cert server.crt \
1991 --server-key server.key \
1992 --ssl-mode verify-full \
1993 ...
1994
1995 $ pg_autoctl create postgres --ssl-ca-file root.crt \
1996 --server-cert server.crt \
1997 --server-key server.key \
1998 --ssl-mode verify-full \
1999 ...
2000
2001 The option --ssl-mode can be used to force connection strings used by
2002 pg_autoctl to contain your preferred ssl mode. It defaults to require
2003 when using --ssl-self-signed and to allow when --no-ssl is used. Here,
2004 we set --ssl-mode to verify-full which requires SSL Certificates Au‐
2005 thentication, covered next.
2006
2007 The default --ssl-mode when providing your own certificates (signed by
2008 your trusted CA) is then verify-full. This setup applies to the client
2009 connection where the server identity is going to be checked against the
2010 root certificate provided with --ssl-ca-file and the revocation list
2011 optionally provided with the --ssl-crl-file. Both those files are used
2012 as the respective parameters sslrootcert and sslcrl in pg_autoctl con‐
2013 nection strings to both the monitor and the streaming replication pri‐
2014 mary server.
2015
2016 SSL Certificates Authentication
2017 Given those files, it is then possible to use certificate based authen‐
2018 tication of client connections. For that, it is necessary to prepare
2019 client certificates signed by your root certificate private key and us‐
2020 ing the target user name as its CN, as per Postgres documentation for
2021 Certificate Authentication:
2022 The cn (Common Name) attribute of the certificate will be compared
2023 to the requested database user name, and if they match the login
2024 will be allowed
2025
2026 For enabling the cert authentication method with pg_auto_failover, you
2027 need to prepare a client certificate for the user postgres, used by
2028 pg_autoctl when connecting to the monitor. Place it in
2029 ~/.postgresql/postgresql.crt along with its key
2030 ~/.postgresql/postgresql.key, in the home directory of the user that
2031 runs the pg_autoctl service (which defaults to postgres).
2032
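       As an illustration only, and assuming the Certificate Authority
       private key is available as root.key next to the root.crt file used
       above, such a client certificate could be prepared with openssl along
       those lines:

          # create a client key and a certificate signing request with CN=postgres
          $ openssl req -new -nodes -subj "/CN=postgres" \
                -keyout ~/.postgresql/postgresql.key -out postgresql.csr

          # sign the request with the CA to produce the client certificate
          $ openssl x509 -req -in postgresql.csr -days 365 \
                -CA root.crt -CAkey root.key -CAcreateserial \
                -out ~/.postgresql/postgresql.crt
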
2033 Then you need to create a user name map as documented in Postgres page
2034 User Name Maps so that your certificate can be used to authenticate
2035 pg_autoctl users.
2036
2037 The ident map in pg_ident.conf on the pg_auto_failover monitor should
2038 then have the following entry, to allow postgres to connect as the au‐
2039 toctl_node user for pg_autoctl operations:
2040
2041 # MAPNAME SYSTEM-USERNAME PG-USERNAME
2042
2043 # pg_autoctl runs as postgres and connects to the monitor autoctl_node user
2044 pgautofailover postgres autoctl_node
2045
2046 To enable streaming replication, the pg_ident.conf file on each Post‐
2047 gres node should now allow the postgres user in the client certificate
2048 to connect as the pgautofailover_replicator database user:
2049
2050 # MAPNAME SYSTEM-USERNAME PG-USERNAME
2051
2052 # pg_autoctl runs as postgres and connects to the monitor autoctl_node user
2053 pgautofailover postgres pgautofailover_replicator
2054
2055 Given that user name map, you can then use the cert authentication
2056 method. As with the pg_ident.conf provisioning, it is best to now pro‐
2057 vision the HBA rules yourself, using the --skip-pg-hba option:
2058
2059 $ pg_autoctl create postgres --skip-pg-hba --ssl-ca-file ...
2060
2061 The HBA rule will use the authentication method cert with a map option,
2062 and might then look like the following on the monitor:
2063
2064 # allow certificate based authentication to the monitor
2065 hostssl pg_auto_failover autoctl_node 10.0.0.0/8 cert map=pgautofailover
2066
2067 Then your pg_auto_failover nodes on the 10.0.0.0 network are allowed to
2068 connect to the monitor with the user autoctl_node used by pg_autoctl,
2069 assuming they have a valid and trusted client certificate.
2070
2071 The HBA rule to use on the Postgres nodes to allow for Postgres stream‐
2072 ing replication connections looks like the following:
2073
2074 # allow streaming replication for pg_auto_failover nodes
2075 hostssl replication pgautofailover_replicator 10.0.0.0/8 cert map=pgautofailover
2076
2077 Because the Postgres server runs as the postgres system user, the con‐
2078 nection to the primary node can be made with SSL enabled and will then
2079 use the client certificates installed in the postgres home directory in
2080 ~/.postgresql/postgresql.{key,crt} locations.
2081
2082 Postgres HBA provisioning
2083 While pg_auto_failover knows how to manage the Postgres HBA rules that
2084 are necessary for your stream replication needs and for its monitor
2085 protocol, it will not manage the Postgres HBA rules that are needed for
2086 your applications.
2087
2088 If you have your own HBA provisioning solution, you can include the
2089 rules needed for pg_auto_failover and then use the --skip-pg-hba option
2090 to the pg_autoctl create commands.
2091
2092 Enable SSL connections on an existing setup
2093 Whether you upgrade pg_auto_failover from a previous version that did
2094 not have support for the SSL features, or when you started with
2095 --no-ssl and later change your mind, it is possible with
2096 pg_auto_failover to add SSL settings on a system that has already been
2097 set up without explicit SSL support.
2098
2099 In this section we detail how to upgrade to SSL settings.
2100
2101 Installing self-signed certificates on top of an already existing
2102 pg_auto_failover setup is done with one of the following pg_autoctl
2103 command variants, depending on whether you want self-signed certificates
2104 or fully verified SSL certificates:
2105
2106   $ pg_autoctl enable ssl --ssl-self-signed --ssl-mode require
2107
2108 $ pg_autoctl enable ssl --ssl-ca-file root.crt \
2109 --ssl-crl-file root.crl \
2110 --server-cert server.crt \
2111 --server-key server.key \
2112 --ssl-mode verify-full
2113
2114 The pg_autoctl enable ssl command edits the post‐
2115 gresql-auto-failover.conf Postgres configuration file to match the com‐
2116 mand line arguments given and enable SSL as instructed, and then up‐
2117 dates the pg_autoctl configuration.
2118
2119 The connection string to connect to the monitor is also automatically
2120 updated by the pg_autoctl enable ssl command. You can verify your new
2121 configuration with:
2122
2123 $ pg_autoctl config get pg_autoctl.monitor
2124
2125 Note that an already running pg_autoctl daemon will try to reload its
2126 configuration after pg_autoctl enable ssl has finished. In some cases
2127 this is not possible to do without a restart. So be sure to check the
2128 logs from a running daemon to confirm that the reload succeeded. If it
2129 did not you may need to restart the daemon to ensure the new connection
2130 string is used.
2131
2132 The HBA settings are not edited, irrespective of whether --skip-pg-hba
2133 was used at creation time. That's because the host records match either
2134 SSL or non-SSL connection attempts in the Postgres HBA file, so the
2135 pre-existing setup will continue to work. To enhance the SSL setup, you
2136 can manually edit the HBA files and change the existing lines from host
2137 to hostssl to disallow unencrypted connections at the server side.
2138
2139 In summary, to upgrade an existing pg_auto_failover setup to enable
2140 SSL:
2141
2142 1. run the pg_autoctl enable ssl command on your monitor and then
2143 all the Postgres nodes,
2144
2145 2. on the Postgres nodes, review your pg_autoctl logs to make sure
2146 that the reload operation has been effective, and review your
2147 Postgres settings to verify that you have the expected result,
2148
2149 3. review your HBA rules setup to change the pg_auto_failover rules
2150 from host to hostssl to disallow insecure connections.
2151
2153 The pg_autoctl tool hosts many commands and sub-commands. Each of them
2154 has its own manual page.
2155
2156 pg_autoctl
2157 pg_autoctl - control a pg_auto_failover node
2158
2159 Synopsis
2160 pg_autoctl provides the following commands:
2161
2162 + create Create a pg_auto_failover node, or formation
2163 + drop Drop a pg_auto_failover node, or formation
2164 + config Manages the pg_autoctl configuration
2165 + show Show pg_auto_failover information
2166 + enable Enable a feature on a formation
2167 + disable Disable a feature on a formation
2168 + get Get a pg_auto_failover node, or formation setting
2169 + set Set a pg_auto_failover node, or formation setting
2170 + perform Perform an action orchestrated by the monitor
2171 run Run the pg_autoctl service (monitor or keeper)
2172 watch Display a dashboard to watch monitor's events and state
2173 stop signal the pg_autoctl service for it to stop
2174 reload signal the pg_autoctl for it to reload its configuration
2175 status Display the current status of the pg_autoctl service
2176 help print help message
2177 version print pg_autoctl version
2178
2179 pg_autoctl create
2180 monitor Initialize a pg_auto_failover monitor node
2181 postgres Initialize a pg_auto_failover standalone postgres node
2182 formation Create a new formation on the pg_auto_failover monitor
2183
2184 pg_autoctl drop
2185 monitor Drop the pg_auto_failover monitor
2186 node Drop a node from the pg_auto_failover monitor
2187 formation Drop a formation on the pg_auto_failover monitor
2188
2189 pg_autoctl config
2190 check Check pg_autoctl configuration
2191 get Get the value of a given pg_autoctl configuration variable
2192 set Set the value of a given pg_autoctl configuration variable
2193
2194 pg_autoctl show
2195 uri Show the postgres uri to use to connect to pg_auto_failover nodes
2196 events Prints monitor's state of nodes in a given formation and group
2197 state Prints monitor's state of nodes in a given formation and group
2198 settings Print replication settings for a formation from the monitor
2199 standby-names Prints synchronous_standby_names for a given group
2200 file List pg_autoctl internal files (config, state, pid)
2201 systemd Print systemd service file for this node
2202
2203 pg_autoctl enable
2204 secondary Enable secondary nodes on a formation
2205 maintenance Enable Postgres maintenance mode on this node
2206 ssl Enable SSL configuration on this node
2207
2208 pg_autoctl disable
2209 secondary Disable secondary nodes on a formation
2210 maintenance Disable Postgres maintenance mode on this node
2211 ssl Disable SSL configuration on this node
2212
2213 pg_autoctl get
2214 + node get a node property from the pg_auto_failover monitor
2215 + formation get a formation property from the pg_auto_failover monitor
2216
2217 pg_autoctl get node
2218 replication-quorum get replication-quorum property from the monitor
2219 candidate-priority get candidate property from the monitor
2220
2221 pg_autoctl get formation
2222 settings get replication settings for a formation from the monitor
2223 number-sync-standbys get number_sync_standbys for a formation from the monitor
2224
2225 pg_autoctl set
2226 + node set a node property on the monitor
2227 + formation set a formation property on the monitor
2228
2229 pg_autoctl set node
2230 metadata set metadata on the monitor
2231 replication-quorum set replication-quorum property on the monitor
2232 candidate-priority set candidate property on the monitor
2233
2234 pg_autoctl set formation
2235 number-sync-standbys set number-sync-standbys for a formation on the monitor
2236
2237 pg_autoctl perform
2238 failover Perform a failover for given formation and group
2239 switchover Perform a switchover for given formation and group
2240 promotion Perform a failover that promotes a target node
2241
2242 Description
2243 The pg_autoctl tool is the client tool provided by pg_auto_failover to
2244 create and manage Postgres nodes and the pg_auto_failover monitor node.
2245 The command is built with many sub-commands that each have their own
2246 manual page.
2247
2248 Help
2249 To get the full recursive list of supported commands, use:
2250
2251 pg_autoctl help
2252
2253 Version
2254 To grab the version of pg_autoctl that you're using, use:
2255
2256 pg_autoctl --version
2257 pg_autoctl version
2258
2259 A typical output would be:
2260
2261 pg_autoctl version 1.4.2
2262 pg_autoctl extension version 1.4
2263 compiled with PostgreSQL 12.3 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit
2264 compatible with Postgres 10, 11, 12, and 13
2265
2266 The version is also available as a JSON document when using the --json
2267 option:
2268
2269 pg_autoctl --version --json
2270 pg_autoctl version --json
2271
2272 A typical JSON output would be:
2273
2274 {
2275 "pg_autoctl": "1.4.2",
2276 "pgautofailover": "1.4",
2277 "pg_major": "12",
2278 "pg_version": "12.3",
2279 "pg_version_str": "PostgreSQL 12.3 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit",
2280 "pg_version_num": 120003
2281 }
2282
2283 This is for version 1.4.2 of pg_auto_failover. This particular version
2284 of the pg_autoctl client tool has been compiled using libpq for Post‐
2285 greSQL 12.3 and is compatible with Postgres 10, 11, 12, and 13.
2286
2287 pg_autoctl create
2288 pg_autoctl create - Create a pg_auto_failover node, or formation
2289
2290 pg_autoctl create monitor
2291 pg_autoctl create monitor - Initialize a pg_auto_failover monitor node
2292
2293 Synopsis
2294 This command initializes a PostgreSQL cluster and installs the pgauto‐
2295 failover extension so that it's possible to use the new instance to
2296 monitor PostgreSQL services:
2297
2298 usage: pg_autoctl create monitor [ --pgdata --pgport --pgctl --hostname ]
2299
2300 --pgctl path to pg_ctl
2301 --pgdata path to data directory
2302 --pgport PostgreSQL's port number
2303 --hostname hostname by which postgres is reachable
2304 --auth authentication method for connections from data nodes
2305 --skip-pg-hba skip editing pg_hba.conf rules
2306 --run create node then run pg_autoctl service
2307 --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
2308 --ssl-mode use that sslmode in connection strings
2309 --ssl-ca-file set the Postgres ssl_ca_file to that file path
2310 --ssl-crl-file set the Postgres ssl_crl_file to that file path
2311 --no-ssl don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
2312 --server-key set the Postgres ssl_key_file to that file path
2313 --server-cert set the Postgres ssl_cert_file to that file path
2314
2315 Description
2316 The pg_autoctl tool is the client tool provided by pg_auto_failover to
2317 create and manage Postgres nodes and the pg_auto_failover monitor node.
2318 The command is built with many sub-commands that each have their own
2319 manual page.
2320
2321 Options
2322 The following options are available to pg_autoctl create monitor:
2323
2324 --pgctl
2325 Path to the pg_ctl tool to use for the version of PostgreSQL you
2326 want to use.
2327
2328 Defaults to the pg_ctl found in the PATH when there is a single
2329 entry for pg_ctl in the PATH. Check your setup using which -a
2330 pg_ctl.
2331
2332 When using an RPM based distribution such as RHEL or CentOS, the
2333 path would usually be /usr/pgsql-13/bin/pg_ctl for Postgres 13.
2334
2335 When using a debian based distribution such as debian or ubuntu,
2336 the path would usually be /usr/lib/postgresql/13/bin/pg_ctl for
2337 Postgres 13. Those distributions also use the package post‐
2338 gresql-common which provides /usr/bin/pg_config. This tool can
2339 be automatically used by pg_autoctl to discover the default ver‐
2340 sion of Postgres to use on your setup.
2341
2342 --pgdata
2343 Location where to initialize a Postgres database cluster, using
2344 either pg_ctl initdb or pg_basebackup. Defaults to the environ‐
2345 ment variable PGDATA.
2346
2347 --pgport
2348 Postgres port to use, defaults to 5432.
2349
2350 --hostname
2351 Hostname or IP address (both v4 and v6 are supported) to use
2352 from any other node to connect to this node.
2353
2354 When not provided, a default value is computed by running the
2355 following algorithm.
2356
2357 1. We get this machine's "public IP" by opening a connection
2358 to the 8.8.8.8:53 public service. Then we get TCP/IP
2359 client address that has been used to make that connection.
2360
2361 2. We then do a reverse DNS lookup on the IP address found in
2362 the previous step to fetch a hostname for our local ma‐
2363 chine.
2364
2365             3. If the reverse DNS lookup is successful, then pg_autoctl
2366 does a forward DNS lookup of that hostname.
2367
2368 When the forward DNS lookup response in step 3. is an IP address
2369 found in one of our local network interfaces, then pg_autoctl
2370 uses the hostname found in step 2. as the default --hostname.
2371 Otherwise it uses the IP address found in step 1.
2372
2373 You may use the --hostname command line option to bypass the
2374 whole DNS lookup based process and force the local node name to
2375 a fixed value.
2376
2377 --auth Authentication method used by pg_autoctl when editing the Post‐
2378 gres HBA file to open connections to other nodes. No default
2379        value, must be provided by the user. The value trust is only a
2380 good choice for testing and evaluation of pg_auto_failover, see
2381 security for more information.
2382
2383 --skip-pg-hba
2384 When this option is used then pg_autoctl refrains from any edit‐
2385 ing of the Postgres HBA file. Please note that editing the HBA
2386 file is still needed so that other nodes can connect using ei‐
2387 ther read privileges or replication streaming privileges.
2388
2389 When --skip-pg-hba is used, pg_autoctl still outputs the HBA en‐
2390 tries it needs in the logs, it only skips editing the HBA file.
2391
2392 --run Immediately run the pg_autoctl service after having created this
2393 node.
2394
2395 --ssl-self-signed
2396 Generate SSL self-signed certificates to provide network encryp‐
2397 tion. This does not protect against man-in-the-middle kinds of
2398 attacks. See security for more about our SSL settings.
2399
2400 --ssl-mode
2401 SSL Mode used by pg_autoctl when connecting to other nodes, in‐
2402 cluding when connecting for streaming replication.
2403
2404 --ssl-ca-file
2405 Set the Postgres ssl_ca_file to that file path.
2406
2407 --ssl-crl-file
2408 Set the Postgres ssl_crl_file to that file path.
2409
2410 --no-ssl
2411 Don't enable network encryption. This is not recommended, prefer
2412 --ssl-self-signed.
2413
2414 --server-key
2415 Set the Postgres ssl_key_file to that file path.
2416
2417 --server-cert
2418 Set the Postgres ssl_cert_file to that file path.
2419
2420 pg_autoctl create postgres
2421 pg_autoctl create postgres - Initialize a pg_auto_failover postgres
2422 node
2423
2424 Synopsis
2425 The command pg_autoctl create postgres initializes a standalone
2426 Postgres node and registers it to a pg_auto_failover monitor. The
2427 monitor then handles auto-failover for this Postgres node (as soon as a
2428 secondary has been registered too, and is known to be healthy).
2429
2430 usage: pg_autoctl create postgres
2431
2432 --pgctl path to pg_ctl
2433 --pgdata path to data directory
2434 --pghost PostgreSQL's hostname
2435 --pgport PostgreSQL's port number
2436 --listen PostgreSQL's listen_addresses
2437 --username PostgreSQL's username
2438 --dbname PostgreSQL's database name
2439 --name pg_auto_failover node name
2440 --hostname hostname used to connect from the other nodes
2441 --formation pg_auto_failover formation
2442 --monitor pg_auto_failover Monitor Postgres URL
2443 --auth authentication method for connections from monitor
2444 --skip-pg-hba skip editing pg_hba.conf rules
2445 --pg-hba-lan edit pg_hba.conf rules for --dbname in detected LAN
2446 --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
2447 --ssl-mode use that sslmode in connection strings
2448 --ssl-ca-file set the Postgres ssl_ca_file to that file path
2449 --ssl-crl-file set the Postgres ssl_crl_file to that file path
2450 --no-ssl don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
2451 --server-key set the Postgres ssl_key_file to that file path
2452 --server-cert set the Postgres ssl_cert_file to that file path
2453 --candidate-priority priority of the node to be promoted to become primary
2454 --replication-quorum true if node participates in write quorum
2455 --maximum-backup-rate maximum transfer rate of data transferred from the server during initial sync
2456
2457 Description
2458 Four different modes of initialization are supported by this command,
2459 corresponding to as many implementation strategies.
2460
2461 1. Initialize a primary node from scratch
2462
2463 This happens when --pgdata (or the environment variable PGDATA)
2464         points to a non-existing or empty directory. Then the given
2465 --hostname is registered to the pg_auto_failover --monitor as a
2466 member of the --formation.
2467
2468 The monitor answers to the registration call with a state to as‐
2469 sign to the new member of the group, either SINGLE or
2470 WAIT_STANDBY. When the assigned state is SINGLE, then pg_autoctl
2471         create postgres proceeds to initialize a new PostgreSQL in‐
2472 stance.
2473
2474 2. Initialize an already existing primary server
2475
2476 This happens when --pgdata (or the environment variable PGDATA)
2477 points to an already existing directory that belongs to a Post‐
2478 greSQL instance. The standard PostgreSQL tool pg_controldata is
2479 used to recognize whether the directory belongs to a PostgreSQL
2480 instance.
2481
2482 In that case, the given --hostname is registered to the monitor
2483 in the tentative SINGLE state. When the given --formation and
2484 --group are currently empty, then the monitor accepts the regis‐
2485 tration and pg_autoctl create prepares the already existing
2486 primary server for pg_auto_failover.
2487
2488 3. Initialize a secondary node from scratch
2489
2490 This happens when --pgdata (or the environment variable PGDATA)
2491 points to a non-existing or empty directory, and when the monitor
2492 registration call assigns the state WAIT_STANDBY in step 1.
2493
2494 In that case, the pg_autoctl create command steps through the
2495 initial states of registering a secondary server, which includes
2496 preparing the primary server PostgreSQL HBA rules and creating a
2497 replication slot.
2498
2499 When the command ends successfully, a PostgreSQL secondary server
2500 has been created with pg_basebackup and is now started, catch‐
2501 ing-up to the primary server.
2502
2503 4. Initialize a secondary node from an existing data directory
2504
2505 When the data directory pointed to by the option --pgdata or the
2506 environment variable PGDATA already exists, then pg_auto_failover
2507 verifies that the system identifier matches the one of the other
2508 nodes already existing in the same group.
2509
2510 The system identifier can be obtained with the command pg_con‐
2511 troldata. All nodes in a physical replication setting must have
2512 the same system identifier, and so in pg_auto_failover all the
2513 nodes in a same group have that constraint too.
2514
2515 When the system identifier matches the already registered system
2516 identifier of other nodes in the same group, then the node is
2517 set-up as a standby and Postgres is started with the primary con‐
2518 ninfo pointed at the current primary.
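
As a quick check before adding such a node, the system identifier of an
existing data directory can be inspected with pg_controldata; the data
directory path and identifier value below are only an illustration:

    $ pg_controldata /var/lib/postgres/node2 | grep 'system identifier'
    Database system identifier:           6941034382470571312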
2519
2520 The --auth option allows setting up the authentication method used
2521 when the monitor node connects to the data node with the pgauto‐
2522 failover_monitor user. As with the pg_autoctl_create_monitor command,
2523 you could use --auth trust when playing with pg_auto_failover at first
2524 and consider something production grade later. Also, consider using
2525 --skip-pg-hba if you already have your own provisioning tools with a
2526 security compliance process.
2527
2528 See security for notes on .pgpass.
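
As an example of putting these options together, here is a hedged
sketch that creates a primary node and registers it to an existing
monitor; the data directory, hostname, and monitor URI are placeholders
for illustration only, and --auth trust is only suitable for testing:

    $ pg_autoctl create postgres \
        --pgdata /var/lib/postgres/node1 \
        --hostname node1.example.com \
        --username ha-admin \
        --dbname appdb \
        --auth trust \
        --ssl-self-signed \
        --monitor 'postgres://autoctl_node@monitor.example.com:5432/pg_auto_failover?sslmode=require' \
        --run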
2529
2530 Options
2531 The following options are available to pg_autoctl create postgres:
2532
2533 --pgctl
2534 Path to the pg_ctl tool to use for the version of PostgreSQL you
2535 want to use.
2536
2537 Defaults to the pg_ctl found in the PATH when there is a single
2538 entry for pg_ctl in the PATH. Check your setup using which -a
2539 pg_ctl.
2540
2541 When using an RPM based distribution such as RHEL or CentOS, the
2542 path would usually be /usr/pgsql-13/bin/pg_ctl for Postgres 13.
2543
2544 When using a debian based distribution such as debian or ubuntu,
2545 the path would usually be /usr/lib/postgresql/13/bin/pg_ctl for
2546 Postgres 13. Those distributions also use the package post‐
2547 gresql-common which provides /usr/bin/pg_config. This tool can
2548 be automatically used by pg_autoctl to discover the default ver‐
2549 sion of Postgres to use on your setup.
2550
2551 --pgdata
2552 Location where to initialize a Postgres database cluster, using
2553 either pg_ctl initdb or pg_basebackup. Defaults to the environ‐
2554 ment variable PGDATA.
2555
2556 --pghost
2557 Hostname to use when connecting to the local Postgres instance
2558 from the pg_autoctl process. By default, this field is left
2559 blank in the connection string, allowing the use of Unix Domain
2560 Sockets with the default path compiled in your libpq version,
2561 usually provided by the Operating System. That would be
2562 /var/run/postgresql when using debian or ubuntu.
2563
2564 --pgport
2565 Postgres port to use, defaults to 5432.
2566
2567 --listen
2568 PostgreSQL's listen_addresses to set up. At the moment only one
2569 address is supported in this command line option.
2570
2571 --username
2572 PostgreSQL's username to use when connecting to the local Post‐
2573 gres instance to manage it.
2574
2575 --dbname
2576 PostgreSQL's database name to use in your application. Defaults
2577 to being the same as the --username, or to postgres when none of
2578 those options are used.
2579
2580 --name Node name used on the monitor to refer to this node. The host‐
2581 name is technical information, and given Postgres requirements
2582 on the HBA setup and DNS resolution (both forward and reverse
2583 lookups), IP addresses are often used for the hostname.
2584
2585 The --name option allows using a user-friendly name for your
2586 Postgres nodes.
2587
2588 --hostname
2589 Hostname or IP address (both v4 and v6 are supported) to use
2590 from any other node to connect to this node.
2591
2592 When not provided, a default value is computed by running the
2593 following algorithm.
2594
2595 1. We get this machine's "public IP" by opening a connection
2596 to the given monitor hostname or IP address. Then we get
2597 TCP/IP client address that has been used to make that con‐
2598 nection.
2599
2600 2. We then do a reverse DNS lookup on the IP address found in
2601 the previous step to fetch a hostname for our local ma‐
2602 chine.
2603
2604 3. If the reverse DNS lookup is successful, then pg_autoctl
2605 does a forward DNS lookup of that hostname.
2606
2607 When the forward DNS lookup response in step 3. is an IP address
2608 found in one of our local network interfaces, then pg_autoctl
2609 uses the hostname found in step 2. as the default --hostname.
2610 Otherwise it uses the IP address found in step 1.
2611
2612 You may use the --hostname command line option to bypass the
2613 whole DNS lookup based process and force the local node name to
2614 a fixed value.
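
To reproduce those lookup steps by hand, standard DNS tooling can be
used; the IP address and hostname below are purely illustrative:

    # step 2: reverse lookup of the public IP found in step 1
    $ dig +short -x 192.0.2.10

    # step 3: forward lookup of the hostname found in step 2
    $ dig +short node1.example.com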
2615
2616 --formation
2617 Formation to register the node into on the monitor. Defaults to
2618 the default formation, that is automatically created in the mon‐
2619 itor in the pg_autoctl_create_monitor command.
2620
2621 --monitor
2622 Postgres URI used to connect to the monitor. Must use the au‐
2623 toctl_node username and target the pg_auto_failover database
2624 name. It is possible to show the Postgres URI from the monitor
2625 node using the command pg_autoctl_show_uri.
2626
2627 --auth Authentication method used by pg_autoctl when editing the Post‐
2628 gres HBA file to open connections to other nodes. No default
2629 value, must be provided by the user. The value trust is only a
2630 good choice for testing and evaluation of pg_auto_failover, see
2631 security for more information.
2632
2633 --skip-pg-hba
2634 When this option is used then pg_autoctl refrains from any edit‐
2635 ing of the Postgres HBA file. Please note that editing the HBA
2636 file is still needed so that other nodes can connect using ei‐
2637 ther read privileges or replication streaming privileges.
2638
2639 When --skip-pg-hba is used, pg_autoctl still outputs the HBA en‐
2640 tries it needs in the logs, it only skips editing the HBA file.
2641
2642 --pg-hba-lan
2643 When this option is used pg_autoctl determines the local IP ad‐
2644 dress used to connect to the monitor, and retrieves its netmask,
2645 and uses that to compute your local area network CIDR. This CIDR
2646 is then opened for connections in the Postgres HBA rules.
2647
2648 For instance, when the monitor resolves to 192.168.0.1 and your
2649 local Postgres node uses an interface with IP address
2650 192.168.0.2/255.255.255.0 to connect to the monitor, then the
2651 LAN CIDR is computed to be 192.168.0.0/24.
2652
2653 --candidate-priority
2654 Sets this node replication setting for candidate priority to the
2655 given value (between 0 and 100) at node registration on the mon‐
2656 itor. Defaults to 50.
2657
2658 --replication-quorum
2659 Sets this node replication setting for replication quorum to the
2660 given value (either true or false) at node registration on the
2661 monitor. Defaults to true, which enables synchronous replica‐
2662 tion.
2663
2664 --maximum-backup-rate
2665 Sets the maximum transfer rate of data transferred from the
2666 server during initial sync. This is used by pg_basebackup. De‐
2667 faults to 100M.
2668
2669 --run Immediately run the pg_autoctl service after having created this
2670 node.
2671
2672 --ssl-self-signed
2673 Generate SSL self-signed certificates to provide network encryp‐
2674 tion. This does not protect against man-in-the-middle kinds of
2675 attacks. See security for more about our SSL settings.
2676
2677 --ssl-mode
2678 SSL Mode used by pg_autoctl when connecting to other nodes, in‐
2679 cluding when connecting for streaming replication.
2680
2681 --ssl-ca-file
2682 Set the Postgres ssl_ca_file to that file path.
2683
2684 --ssl-crl-file
2685 Set the Postgres ssl_crl_file to that file path.
2686
2687 --no-ssl
2688 Don't enable network encryption. This is not recommended, prefer
2689 --ssl-self-signed.
2690
2691 --server-key
2692 Set the Postgres ssl_key_file to that file path.
2693
2694 --server-cert
2695 Set the Postgres ssl_cert_file to that file path.
2696
2697 pg_autoctl create formation
2698 pg_autoctl create formation - Create a new formation on the
2699 pg_auto_failover monitor
2700
2701 Synopsis
2702 This command registers a new formation on the monitor, with the speci‐
2703 fied kind:
2704
2705 usage: pg_autoctl create formation [ --pgdata --monitor --formation --kind --dbname --with-secondary --without-secondary ]
2706
2707 --pgdata path to data directory
2708 --monitor pg_auto_failover Monitor Postgres URL
2709 --formation name of the formation to create
2710 --kind formation kind, either "pgsql" or "citus"
2711 --dbname name for postgres database to use in this formation
2712 --enable-secondary create a formation that has multiple nodes that can be
2713 used for fail over when others have issues
2714 --disable-secondary create a citus formation without nodes to fail over to
2715 --number-sync-standbys minimum number of standbys to confirm write
2716
2717 Description
2718 A single pg_auto_failover monitor may manage any number of formations,
2719 each composed of at least one Postgres service group. This command
2720 creates a new formation so that it is then possible to register Post‐
2721 gres nodes in the new formation.
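
For instance, and assuming the same monitor as in the previous sections
is reachable at the placeholder URI below, a separate formation for a
reporting service could be created like this:

    $ pg_autoctl create formation \
        --monitor 'postgres://autoctl_node@monitor.example.com:5432/pg_auto_failover?sslmode=require' \
        --formation reporting \
        --kind pgsql \
        --dbname reports \
        --enable-secondary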
2722
2723 Options
2724 The following options are available to pg_autoctl create formation:
2725
2726 --pgdata
2727 Location where to initialize a Postgres database cluster, using
2728 either pg_ctl initdb or pg_basebackup. Defaults to the environ‐
2729 ment variable PGDATA.
2730
2731 --monitor
2732 Postgres URI used to connect to the monitor. Must use the au‐
2733 toctl_node username and target the pg_auto_failover database
2734 name. It is possible to show the Postgres URI from the monitor
2735 node using the command pg_autoctl_show_uri.
2736
2737 --formation
2738 Name of the formation to create.
2739
2740 --kind A pg_auto_failover formation could be of kind pgsql or of kind
2741 citus. At the moment citus formation kinds are not managed in
2742 the Open Source version of pg_auto_failover.
2743
2744 --dbname
2745 Name of the database to use in the formation, mostly useful to
2746 formation kinds citus where the Citus extension is only in‐
2747 stalled in a single target database.
2748
2749 --enable-secondary
2750 The formation to be created allows using standby nodes. Defaults
2751 to true. Mostly useful for Citus formations.
2752
2753 --disable-secondary
2754 See --enable-secondary above.
2755
2756 --number-sync-standbys
2757 Postgres streaming replication uses synchronous_standby_names to
2758 set up how many standby nodes should have received a copy of the
2759 transaction data. When using pg_auto_failover this setup is han‐
2760 dled at the formation level.
2761
2762 Defaults to zero when creating the first two Postgres nodes in a
2763 formation in the same group. When set to zero pg_auto_failover
2764 uses synchronous replication only when a standby node is avail‐
2765 able: the idea is to allow failover, even though this setting
2766 does not provide proper HA guarantees for Postgres.
2767
2768 When adding a third node that participates in the quorum (one
2769 primary, two secondaries), the setting is automatically changed
2770 from zero to one.
2771
2772 pg_autoctl drop
2773 pg_autoctl drop - Drop a pg_auto_failover node, or formation
2774
2775 pg_autoctl drop monitor
2776 pg_autoctl drop monitor - Drop the pg_auto_failover monitor
2777
2778 Synopsis
2779 This command drops the pg_auto_failover monitor:
2781
2782 usage: pg_autoctl drop monitor [ --pgdata --destroy ]
2783
2784 --pgdata path to data directory
2785 --destroy also destroy Postgres database
2786
2787 Options
2788 --pgdata
2789 Location of the Postgres node being managed locally. Defaults to
2790 the environment variable PGDATA. Use --monitor to connect to a
2791 monitor from anywhere, rather than the monitor URI used by a lo‐
2792 cal Postgres node managed with pg_autoctl.
2793
2794 --destroy
2795 By default the pg_autoctl drop monitor command does not remove
2796 the Postgres database for the monitor. When using --destroy, the
2797 Postgres installation is also deleted.
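
As a hedged example, assuming a monitor that was created locally under
./monitor, the monitor and its Postgres instance can be removed
entirely with:

    $ pg_autoctl drop monitor --pgdata ./monitor --destroy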
2798
2799 pg_autoctl drop node
2800 pg_autoctl drop node - Drop a node from the pg_auto_failover monitor
2801
2802 Synopsis
2803 This command drops a Postgres node from the pg_auto_failover monitor:
2804
2805 usage: pg_autoctl drop node [ [ [ --pgdata ] [ --destroy ] ] | [ --monitor [ [ --hostname --pgport ] | [ --formation --name ] ] ] ]
2806
2807 --pgdata path to data directory
2808 --monitor pg_auto_failover Monitor Postgres URL
2809 --formation pg_auto_failover formation
2810 --name drop the node with the given node name
2811 --hostname drop the node with given hostname and pgport
2812 --pgport drop the node with given hostname and pgport
2813 --destroy also destroy Postgres database
2814 --force force dropping the node from the monitor
2815 --wait how many seconds to wait, default to 60
2816
2817 Description
2818 Two modes of operation are implemented in the pg_autoctl drop node
2819 command.
2820
2821 When removing a node that still exists, it is possible to use pg_au‐
2822 toctl drop node --destroy to remove the node from the monitor and
2823 also delete the local Postgres instance entirely.
2824
2825 When removing a node that doesn't exist physically anymore, or when the
2826 VM that used to host the node has been lost entirely, use either the
2827 pair of options --hostname and --pgport or the pair of options --forma‐
2828 tion and --name to match the node registration record on the monitor
2829 database, and get it removed from the known list of nodes on the moni‐
2830 tor.
2831
2832 The option --force can be used when the target node to remove does not
2833 exist anymore. When a node has been lost entirely, it is not going to
2834 be able to finish the procedure itself, and it is then possible to in‐
2835 form the monitor of the situation.
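
For that second mode of operation, a hedged sketch of the call uses the
monitor URI and the node name as registered on the monitor (both are
placeholders here):

    $ pg_autoctl drop node \
        --monitor 'postgres://autoctl_node@monitor.example.com:5432/pg_auto_failover?sslmode=require' \
        --formation default \
        --name node3 \
        --force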
2836
2837 Options
2838 --pgdata
2839 Location of the Postgres node being managed locally. Defaults to
2840 the environment variable PGDATA. Use --monitor to connect to a
2841 monitor from anywhere, rather than the monitor URI used by a lo‐
2842 cal Postgres node managed with pg_autoctl.
2843
2844 --monitor
2845 Postgres URI used to connect to the monitor. Must use the au‐
2846 toctl_node username and target the pg_auto_failover database
2847 name. It is possible to show the Postgres URI from the monitor
2848 node using the command pg_autoctl_show_uri.
2849
2850 --hostname
2851 Hostname of the Postgres node to remove from the monitor. Use
2852 either --name or --hostname --pgport, but not both.
2853
2854 --pgport
2855 Port of the Postgres node to remove from the monitor. Use either
2856 --name or --hostname --pgport, but not both.
2857
2858 --name Name of the node to remove from the monitor. Use either --name
2859 or --hostname --pgport, but not both.
2860
2861 --destroy
2862 By default the pg_autoctl drop node command does not remove the
2863 local Postgres database. When using --destroy, the Postgres in‐
2864 stallation is also deleted.
2865
2866 --force
2867 By default a node is expected to reach the assigned state
2868 DROPPED when it is removed from the monitor, and has the oppor‐
2869 tunity to implement clean-up actions. When the target node to
2870 remove is not available anymore, it is possible to use the op‐
2871 tion --force to immediately remove the node from the monitor.
2872
2873 --wait How many seconds to wait for the node to be dropped entirely.
2874 The command stops when the target node can no longer be found on
2875 the monitor, or when the timeout has elapsed, whichever
2876 comes first. The value 0 (zero) disables the timeout and dis‐
2877 ables waiting entirely, making the command async.
2878
2879 Examples
2880 $ pg_autoctl drop node --destroy --pgdata ./node3
2881 17:52:21 54201 INFO Reaching assigned state "secondary"
2882 17:52:21 54201 INFO Removing node with name "node3" in formation "default" from the monitor
2883 17:52:21 54201 WARN Postgres is not running and we are in state secondary
2884 17:52:21 54201 WARN Failed to update the keeper's state from the local PostgreSQL instance, see above for details.
2885 17:52:21 54201 INFO Calling node_active for node default/4/0 with current state: PostgreSQL is running is false, sync_state is "", latest WAL LSN is 0/0.
2886 17:52:21 54201 INFO FSM transition to "dropped": This node is being dropped from the monitor
2887 17:52:21 54201 INFO Transition complete: current state is now "dropped"
2888 17:52:21 54201 INFO This node with id 4 in formation "default" and group 0 has been dropped from the monitor
2889 17:52:21 54201 INFO Stopping PostgreSQL at "/Users/dim/dev/MS/pg_auto_failover/tmux/node3"
2890 17:52:21 54201 INFO /Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl --pgdata /Users/dim/dev/MS/pg_auto_failover/tmux/node3 --wait stop --mode fast
2891 17:52:21 54201 INFO /Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl status -D /Users/dim/dev/MS/pg_auto_failover/tmux/node3 [3]
2892 17:52:21 54201 INFO pg_ctl: no server running
2893 17:52:21 54201 INFO pg_ctl stop failed, but PostgreSQL is not running anyway
2894 17:52:21 54201 INFO Removing "/Users/dim/dev/MS/pg_auto_failover/tmux/node3"
2895 17:52:21 54201 INFO Removing "/Users/dim/dev/MS/pg_auto_failover/tmux/config/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node3/pg_autoctl.cfg"
2896
2897 pg_autoctl drop formation
2898 pg_autoctl drop formation - Drop a formation on the pg_auto_failover
2899 monitor
2900
2901 Synopsis
2902 This command drops an existing formation on the monitor:
2903
2904 usage: pg_autoctl drop formation [ --pgdata --formation ]
2905
2906 --pgdata path to data directory
2907 --monitor pg_auto_failover Monitor Postgres URL
2908 --formation name of the formation to drop
2909
2910 Options
2911 --pgdata
2912 Location of the Postgres node being managed locally. Defaults to
2913 the environment variable PGDATA. Use --monitor to connect to a
2914 monitor from anywhere, rather than the monitor URI used by a lo‐
2915 cal Postgres node managed with pg_autoctl.
2916
2917 --monitor
2918 Postgres URI used to connect to the monitor. Must use the au‐
2919 toctl_node username and target the pg_auto_failover database
2920 name. It is possible to show the Postgres URI from the monitor
2921 node using the command pg_autoctl_show_uri.
2922
2923 --formation
2924 Name of the formation to drop from the monitor.
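
Continuing the earlier illustration, the hypothetical reporting
formation could be dropped with:

    $ pg_autoctl drop formation \
        --monitor 'postgres://autoctl_node@monitor.example.com:5432/pg_auto_failover?sslmode=require' \
        --formation reporting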
2925
2926 pg_autoctl config
2927 pg_autoctl config - Manages the pg_autoctl configuration
2928
2929 pg_autoctl config get
2930 pg_autoctl config get - Get the value of a given pg_autoctl configura‐
2931 tion variable
2932
2933 Synopsis
2934 This command prints a pg_autoctl configuration setting:
2935
2936 usage: pg_autoctl config get [ --pgdata ] [ --json ] [ section.option ]
2937
2938 --pgdata path to data directory
2939
2940 Options
2941 --pgdata
2942 Location of the Postgres node being managed locally. Defaults to
2943 the environment variable PGDATA. Use --monitor to connect to a
2944 monitor from anywhere, rather than the monitor URI used by a lo‐
2945 cal Postgres node managed with pg_autoctl.
2946
2947 --json Output JSON formatted data.
2948
2949 Description
2950 When the argument section.option is used, this is the name of a config‐
2951 uration option. The configuration file for pg_autoctl is stored using
2952 the INI format.
2953
2954 When no argument is given to pg_autoctl config get the entire configu‐
2955 ration file is given in the output. To figure out where the configura‐
2956 tion file is stored, see pg_autoctl_show_file and use pg_autoctl show
2957 file --config.
2958
2959 Examples
2960 Without arguments, we get the entire file:
2961
2962 $ pg_autoctl config get --pgdata node1
2963 [pg_autoctl]
2964 role = keeper
2965 monitor = postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer
2966 formation = default
2967 group = 0
2968 name = node1
2969 hostname = localhost
2970 nodekind = standalone
2971
2972 [postgresql]
2973 pgdata = /Users/dim/dev/MS/pg_auto_failover/tmux/node1
2974 pg_ctl = /Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl
2975 dbname = demo
2976 host = /tmp
2977 port = 5501
2978 proxyport = 0
2979 listen_addresses = *
2980 auth_method = trust
2981 hba_level = app
2982
2983 [ssl]
2984 active = 1
2985 sslmode = require
2986 cert_file = /Users/dim/dev/MS/pg_auto_failover/tmux/node1/server.crt
2987 key_file = /Users/dim/dev/MS/pg_auto_failover/tmux/node1/server.key
2988
2989 [replication]
2990 maximum_backup_rate = 100M
2991 backup_directory = /Users/dim/dev/MS/pg_auto_failover/tmux/backup/node_1
2992
2993 [timeout]
2994 network_partition_timeout = 20
2995 prepare_promotion_catchup = 30
2996 prepare_promotion_walreceiver = 5
2997 postgresql_restart_failure_timeout = 20
2998 postgresql_restart_failure_max_retries = 3
2999
3000 It is possible to pipe JSON formatted output to the jq command line and
3001 filter the result down to a specific section of the file:
3002
3003 $ pg_autoctl config get --pgdata node1 --json | jq .pg_autoctl
3004 {
3005 "role": "keeper",
3006 "monitor": "postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer",
3007 "formation": "default",
3008 "group": 0,
3009 "name": "node1",
3010 "hostname": "localhost",
3011 "nodekind": "standalone"
3012 }
3013
3014 Finally, a single configuration element can be listed:
3015
3016 $ pg_autoctl config get --pgdata node1 ssl.sslmode --json
3017 require
3018
3019 pg_autoctl config set
3020 pg_autoctl config set - Set the value of a given pg_autoctl configura‐
3021 tion variable
3022
3023 Synopsis
3024 This command sets a pg_autoctl configuration setting to a new value:
3025
3026 usage: pg_autoctl config set [ --pgdata ] [ --json ] section.option [ value ]
3027
3028 --pgdata path to data directory
3029
3030 Options
3031 --pgdata
3032 Location of the Postgres node being managed locally. Defaults to
3033 the environment variable PGDATA. Use --monitor to connect to a
3034 monitor from anywhere, rather than the monitor URI used by a lo‐
3035 cal Postgres node managed with pg_autoctl.
3036
3037 --json Output JSON formatted data.
3038
3039 Description
3040 This command sets a pg_autoctl configuration setting to a new value.
3041 Most settings can be changed and reloaded online.
3042
3043 Such a change can then be applied to an already running process with
3044 the pg_autoctl reload command.
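
As a sketch, and assuming a node that was created with --pgdata node1
as in the earlier examples, lowering the pg_basebackup transfer rate
and then applying the change to the running process could look like
this:

    $ pg_autoctl config set --pgdata node1 replication.maximum_backup_rate 50M
    $ pg_autoctl reload --pgdata node1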
3045
3046 Settings
3047 pg_autoctl.role
3048 This setting can not be changed. It can be either monitor or keeper
3049 and the rest of the configuration file is read depending on this
3050 value.
3051
3052 pg_autoctl.monitor
3053 URI of the pg_autoctl monitor Postgres service. Can be changed with
3054 a reload.
3055
3056 To register an existing node to a new monitor, use pg_autoctl dis‐
3057 able monitor and then pg_autoctl enable monitor.
3058
3059 pg_autoctl.formation
3060 Formation to which this node has been registered. Changing this set‐
3061 ting is not supported.
3062
3063 pg_autoctl.group
3064 Group in which this node has been registered. Changing this setting
3065 is not supported.
3066
3067 pg_autoctl.name
3068 Name of the node as known to the monitor and listed in pg_autoctl
3069 show state. Can be changed with a reload.
3070
3071 pg_autoctl.hostname
3072 Hostname or IP address of the node, as known to the monitor. Can be
3073 changed with a reload.
3074
3075 pg_autoctl.nodekind
3076 This setting can not be changed and depends on the command that has
3077 been used to create this pg_autoctl node.
3078
3079 postgresql.pgdata
3080 Directory where the managed Postgres instance is to be created (or
3081 found) and managed. Can't be changed.
3082
3083 postgresql.pg_ctl
3084 Path to the pg_ctl tool used to manage this Postgres instance. Ab‐
3085 solute path depends on the major version of Postgres and looks like
3086 /usr/lib/postgresql/13/bin/pg_ctl when using a debian or ubuntu OS.
3087
3088 Can be changed after a major upgrade of Postgres.
3089
3090 postgresql.dbname
3091 Name of the database that is used to connect to Postgres. Can be
3092 changed, but then must be changed manually on the monitor's pgauto‐
3093 failover.formation table with a SQL command.
3094
3095 WARNING:
3096 When using pg_auto_failover enterprise edition with Citus sup‐
3097 port, this is the database where pg_autoctl maintains the list
3098 of Citus nodes on the coordinator. Using the same database name
3099 as your application that uses Citus is then crucial.
3100
3101 postgresql.host
3102 Hostname to use in connection strings when connecting from the local
3103 pg_autoctl process to the local Postgres database. Defaults to using
3104 the Operating System default value for the Unix Domain Socket direc‐
3105 tory, either /tmp or when using debian or ubuntu /var/run/post‐
3106 gresql.
3107
3108 Can be changed with a reload.
3109
3110 postgresql.port
3111 Port on which Postgres should be managed. Can be changed offline,
3112 between a pg_autoctl stop and a subsequent pg_autoctl start.
3113
3114 postgresql.listen_addresses
3115 Value to set for the Postgres parameter of the same name. At the moment
3116 pg_autoctl only supports a single address for this parameter.
3117
3118 postgresql.auth_method
3119 Authentication method to use when editing HBA rules to allow the
3120 Postgres nodes of a formation to connect to each other, and to the
3121 monitor, and to allow the monitor to connect to the nodes.
3122
3123 Can be changed online with a reload, but actually adding new HBA
3124 rules requires a restart of the "node-active" service.
3125
3126 postgresql.hba_level
3127 This setting reflects the choice of --skip-pg-hba or --pg-hba-lan
3128 that has been used when creating this pg_autoctl node. Can be
3129 changed with a reload, though the HBA rules that have been previ‐
3130 ously added will not get removed.
3131
3132 ssl.active, ssl.sslmode, ssl.cert_file, ssl.key_file, etc
3133 Please use the command pg_autoctl enable ssl or pg_autoctl disable
3134 ssl to manage the SSL settings in the ssl section of the configura‐
3135 tion. Using those commands, the settings can be changed online.
3136
3137 replication.maximum_backup_rate
3138 Used as a parameter to pg_basebackup, defaults to 100M. Can be
3139 changed with a reload. Changing this value does not affect an al‐
3140 ready running pg_basebackup command.
3141
3142 Limiting the bandwidth used by pg_basebackup makes the operation
3143 slower, and still has the advantage of limiting the impact on the
3144 disks of the primary server.
3145
3146 replication.backup_directory
3147 Target location of the pg_basebackup command used by pg_autoctl when
3148 creating a secondary node. When done fetching the data over the
3149 network, pg_autoctl uses the rename(2) system-call to rename
3150 the temporary download location to the target PGDATA location.
3151
3152 The rename(2) system-call is known to be atomic when both the source
3153 and the target of the operation are using the same file system /
3154 mount point.
3155
3156 Can be changed online with a reload, will not affect already running
3157 pg_basebackup sub-processes.
3158
3159 replication.password
3160 Used as a parameter in the connection string to the upstream Post‐
3161 gres node. The "replication" connection uses the password set-up in
3162 the pg_autoctl configuration file.
3163
3164 Changing the replication.password of a pg_autoctl configuration has
3165 no effect on the Postgres database itself. The password must match
3166 what the Postgres upstream node expects, which can be set with the
3167 following SQL command run on the upstream server (primary or other
3168 standby node):
3169
3170 alter user pgautofailover_replicator password 'h4ckm3m0r3';
3171
3172 The replication.password can be changed online with a reload, but
3173 requires restarting the Postgres service to be activated. Postgres
3174 only reads the primary_conninfo connection string at start-up, up to
3175 and including Postgres 12. With Postgres 13 and following, it is
3176 possible to reload this Postgres parameter.
3177
3178 timeout.network_partition_timeout
3179 Timeout (in seconds) that pg_autoctl waits before deciding that it
3180 is on the losing side of a network partition. When pg_autoctl fails
3181 to connect to the monitor and when the local Postgres instance
3182 pg_stat_replication system view is empty, and after this many sec‐
3183 onds have passed, then pg_autoctl demotes itself.
3184
3185 Can be changed with a reload.
3186
3187 timeout.prepare_promotion_catchup
3188 Currently not used in the source code. Can be changed with a reload.
3189
3190 timeout.prepare_promotion_walreceiver
3191 Currently not used in the source code. Can be changed with a reload.
3192
3193 timeout.postgresql_restart_failure_timeout
3194 When pg_autoctl fails to start Postgres for at least this duration
3195 from the first attempt, then it starts reporting that Postgres is
3196 not running to the monitor, which might then decide to implement a
3197 failover.
3198
3199 Can be changed with a reload.
3200
3201 timeout.postgresql_restart_failure_max_retries
3202 When pg_autoctl fails to start Postgres for at least this many times
3203 then it starts reporting that Postgres is not running to the moni‐
3204 tor, which might then decide to implement a failover.
3205
3206 Can be changed with a reload.
3207
3208 pg_autoctl config check
3209 pg_autoctl config check - Check pg_autoctl configuration
3210
3211 Synopsis
3212 This command implements a very basic list of sanity checks for a pg_au‐
3213 toctl node setup:
3214
3215 usage: pg_autoctl config check [ --pgdata ] [ --json ]
3216
3217 --pgdata path to data directory
3218 --json output data in the JSON format
3219
3220 Options
3221 --pgdata
3222 Location of the Postgres node being managed locally. Defaults to
3223 the environment variable PGDATA. Use --monitor to connect to a
3224 monitor from anywhere, rather than the monitor URI used by a lo‐
3225 cal Postgres node managed with pg_autoctl.
3226
3227 --json Output JSON formatted data.
3228
3229 Examples
3230 $ pg_autoctl config check --pgdata node1
3231 18:37:27 63749 INFO Postgres setup for PGDATA "/Users/dim/dev/MS/pg_auto_failover/tmux/node1" is ok, running with PID 5501 and port 99698
3232 18:37:27 63749 INFO Connection to local Postgres ok, using "port=5501 dbname=demo host=/tmp"
3233 18:37:27 63749 INFO Postgres configuration settings required for pg_auto_failover are ok
3234 18:37:27 63749 WARN Postgres 12.1 does not support replication slots on a standby node
3235 18:37:27 63749 INFO Connection to monitor ok, using "postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer"
3236 18:37:27 63749 INFO Monitor is running version "1.5.0.1", as expected
3237 pgdata: /Users/dim/dev/MS/pg_auto_failover/tmux/node1
3238 pg_ctl: /Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl
3239 pg_version: 12.3
3240 pghost: /tmp
3241 pgport: 5501
3242 proxyport: 0
3243 pid: 99698
3244 is in recovery: no
3245 Control Version: 1201
3246 Catalog Version: 201909212
3247 System Identifier: 6941034382470571312
3248 Latest checkpoint LSN: 0/6000098
3249 Postmaster status: ready
3250
3251 pg_autoctl show
3252 pg_autoctl show - Show pg_auto_failover information
3253
3254 pg_autoctl show uri
3255 pg_autoctl show uri - Show the postgres uri to use to connect to
3256 pg_auto_failover nodes
3257
3258 Synopsis
3259 This command outputs the monitor or the coordinator Postgres URI to use
3260 from an application to connect to Postgres:
3261
3262 usage: pg_autoctl show uri [ --pgdata --monitor --formation --json ]
3263
3264 --pgdata path to data directory
3265 --monitor monitor uri
3266 --formation show the coordinator uri of given formation
3267 --json output data in the JSON format
3268
3269 Options
3270 --pgdata
3271 Location of the Postgres node being managed locally. Defaults to
3272 the environment variable PGDATA. Use --monitor to connect to a
3273 monitor from anywhere, rather than the monitor URI used by a lo‐
3274 cal Postgres node managed with pg_autoctl.
3275
3276 --monitor
3277 Postgres URI used to connect to the monitor. Must use the au‐
3278 toctl_node username and target the pg_auto_failover database
3279 name. It is possible to show the Postgres URI from the monitor
3280 node using the command pg_autoctl show uri.
3281
3282 Defaults to the value of the environment variable PG_AU‐
3283 TOCTL_MONITOR.
3284
3285 --formation
3286 When --formation is used, only the Postgres URI of the given
3287 formation is shown (see the examples below).
3288
3289 --json Output JSON formatted data instead of a table formatted list.
3290
3291 Examples
3292 $ pg_autoctl show uri
3293 Type | Name | Connection String
3294 -------------+---------+-------------------------------
3295 monitor | monitor | postgres://autoctl_node@localhost:5500/pg_auto_failover
3296 formation | default | postgres://localhost:5502,localhost:5503,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer
3297
3298 $ pg_autoctl show uri --formation monitor
3299 postgres://autoctl_node@localhost:5500/pg_auto_failover
3300
3301 $ pg_autoctl show uri --formation default
3302 postgres://localhost:5503,localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer
3303
3304 $ pg_autoctl show uri --json
3305 [
3306 {
3307 "uri": "postgres://autoctl_node@localhost:5500/pg_auto_failover",
3308 "name": "monitor",
3309 "type": "monitor"
3310 },
3311 {
3312 "uri": "postgres://localhost:5503,localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer",
3313 "name": "default",
3314 "type": "formation"
3315 }
3316 ]
3317
3318 Multi-hosts Postgres connection strings
3319 PostgreSQL since version 10 includes support for multiple hosts in its
3320 connection driver libpq, with the special target_session_attrs connec‐
3321 tion property.
3322
3323 This multi-hosts connection string facility allows applications to keep
3324 using the same stable connection string over server-side failovers.
3325 That's why pg_autoctl show uri uses that format.
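
For instance, using the formation URI from the example above, a psql
session keeps working across a failover because libpq tries each listed
host and settles on the one that accepts read-write sessions:

    $ psql 'postgres://localhost:5502,localhost:5503,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer' \
           -c 'select pg_is_in_recovery();'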
3326
3327 pg_autoctl show events
3328 pg_autoctl show events - Prints monitor's state of nodes in a given
3329 formation and group
3330
3331 Synopsis
3332 This command outputs the events that the pg_auto_failover monitor
3333 records about state changes of the pg_auto_failover nodes managed by
3334 the monitor:
3335
3336 usage: pg_autoctl show events [ --pgdata --formation --group --count ]
3337
3338 --pgdata path to data directory
3339 --monitor pg_auto_failover Monitor Postgres URL
3340 --formation formation to query, defaults to 'default'
3341 --group group to query formation, defaults to all
3342 --count how many events to fetch, defaults to 10
3343 --watch display an auto-updating dashboard
3344 --json output data in the JSON format
3345
3346 Options
3347 --pgdata
3348 Location of the Postgres node being managed locally. Defaults to
3349 the environment variable PGDATA. Use --monitor to connect to a
3350 monitor from anywhere, rather than the monitor URI used by a lo‐
3351 cal Postgres node managed with pg_autoctl.
3352
3353 --monitor
3354 Postgres URI used to connect to the monitor. Must use the au‐
3355 toctl_node username and target the pg_auto_failover database
3356 name. It is possible to show the Postgres URI from the monitor
3357 node using the command pg_autoctl_show_uri.
3358
3359 --formation
3360 List the events recorded for nodes in the given formation. De‐
3361 faults to default.
3362
3363 --count
3364 By default only the last 10 events are printed.
3365
3366 --watch
3367 Take control of the terminal and display the current state of
3368 the system and the last events from the monitor. The display is
3369 updated automatically every 500 milliseconds (half a second) and
3370 reacts properly to window size change.
3371
3372 Depending on the terminal window size, a different set of col‐
3373 umns is visible in the state part of the output. See pg_au‐
3374 toctl_watch.
3375
3376 --json Output JSON formatted data instead of a table formatted list.
3377
3378 Examples
3379 $ pg_autoctl show events --count 2 --json
3380 [
3381 {
3382 "nodeid": 1,
3383 "eventid": 15,
3384 "groupid": 0,
3385 "nodehost": "localhost",
3386 "nodename": "node1",
3387 "nodeport": 5501,
3388 "eventtime": "2021-03-18T12:32:36.103467+01:00",
3389 "goalstate": "primary",
3390 "description": "Setting goal state of node 1 \"node1\" (localhost:5501) to primary now that at least one secondary candidate node is healthy.",
3391 "formationid": "default",
3392 "reportedlsn": "0/4000060",
3393 "reportedstate": "wait_primary",
3394 "reportedrepstate": "async",
3395 "candidatepriority": 50,
3396 "replicationquorum": true
3397 },
3398 {
3399 "nodeid": 1,
3400 "eventid": 16,
3401 "groupid": 0,
3402 "nodehost": "localhost",
3403 "nodename": "node1",
3404 "nodeport": 5501,
3405 "eventtime": "2021-03-18T12:32:36.215494+01:00",
3406 "goalstate": "primary",
3407 "description": "New state is reported by node 1 \"node1\" (localhost:5501): \"primary\"",
3408 "formationid": "default",
3409 "reportedlsn": "0/4000110",
3410 "reportedstate": "primary",
3411 "reportedrepstate": "quorum",
3412 "candidatepriority": 50,
3413 "replicationquorum": true
3414 }
3415 ]
3416
3417 pg_autoctl show state
3418 pg_autoctl show state - Prints monitor's state of nodes in a given for‐
3419 mation and group
3420
3421 Synopsis
3422 This command outputs the current state of the formation and groups reg‐
3423 istered to the pg_auto_failover monitor:
3424
3425 usage: pg_autoctl show state [ --pgdata --formation --group ]
3426
3427 --pgdata path to data directory
3428 --monitor pg_auto_failover Monitor Postgres URL
3429 --formation formation to query, defaults to 'default'
3430 --group group to query formation, defaults to all
3431 --local show local data, do not connect to the monitor
3432 --watch display an auto-updating dashboard
3433 --json output data in the JSON format
3434
3435 Options
3436 --pgdata
3437 Location of the Postgres node being managed locally. Defaults to
3438 the environment variable PGDATA. Use --monitor to connect to a
3439 monitor from anywhere, rather than the monitor URI used by a lo‐
3440 cal Postgres node managed with pg_autoctl.
3441
3442 --monitor
3443 Postgres URI used to connect to the monitor. Must use the au‐
3444 toctl_node username and target the pg_auto_failover database
3445 name. It is possible to show the Postgres URI from the monitor
3446 node using the command pg_autoctl_show_uri.
3447
3448 --formation
3449 List the state of the nodes in the given formation. Defaults
3450 to default.
3451
3452 --group
3453 Limit output to a single group in the formation. Default to in‐
3454 cluding all groups registered in the target formation.
3455
3456 --local
3457 Print the local state information without connecting to the mon‐
3458 itor.
3459
3460 --watch
3461 Take control of the terminal and display the current state of
3462 the system and the last events from the monitor. The display is
3463 updated automatically every 500 milliseconds (half a second) and
3464 reacts properly to window size change.
3465
3466 Depending on the terminal window size, a different set of col‐
3467 umns is visible in the state part of the output. See pg_au‐
3468 toctl_watch.
3469
3470 --json Output JSON formatted data instead of a table formatted list.
3471
3472 Description
3473 The pg_autoctl show state output includes the following columns:
3474
3475 • Name
3476 Name of the node.
3477
3478 • Node
3479 Node information. When the formation has a single group (group
3480 zero), then this column only contains the nodeId.
3481
3482 Only Citus formations allow several groups. When using a Citus
3483 formation the Node column contains the groupId and the nodeId,
3484 separated by a colon, such as 0:1 for the first coordinator
3485 node.
3486
3487 • Host:Port
3488 Hostname and port number used to connect to the node.
3489
3490 • TLI: LSN
3491 Timeline identifier (TLI) and Postgres Log Sequence Number
3492 (LSN).
3493
3494 The LSN is the current position in the Postgres WAL stream.
3495 This is a hexadecimal number. See pg_lsn for more information.
3496
3497 The current timeline is incremented each time a failover hap‐
3498 pens, or when doing Point In Time Recovery. A node can only
3499 reach the secondary state when it is on the same timeline as
3500 its primary node.
3501
3502 • Connection
3503 This output field contains two bits of information. First, the
3504 Postgres connection type that the node provides, either
3505 read-write or read-only. Then the mark ! is added when the
3506 monitor has failed to connect to this node, and ? when the
3507 monitor didn't connect to the node yet.
3508
3509 • Reported State
3510 The latest reported FSM state, as reported to the monitor by
3511 the pg_autoctl process running on the Postgres node.
3512
3513 • Assigned State
3514 The assigned FSM state on the monitor. When the assigned state
3515 is not the same as the reported state, then the pg_autoctl
3516 process running on the Postgres node might not have retrieved
3517 the assigned state yet, or might still be implementing the FSM
3518 transition from the current state to the assigned state.
3519
3520 Examples
3521 $ pg_autoctl show state
3522 Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
3523 ------+-------+----------------+----------------+--------------+---------------------+--------------------
3524 node1 | 1 | localhost:5501 | 1: 0/4000678 | read-write | primary | primary
3525 node2 | 2 | localhost:5502 | 1: 0/4000678 | read-only | secondary | secondary
3526 node3 | 3 | localhost:5503 | 1: 0/4000678 | read-only | secondary | secondary
3527
3528 $ pg_autoctl show state --local
3529 Name | Node | Host:Port | TLI: LSN | Connection | Reported State | Assigned State
3530 ------+-------+----------------+----------------+--------------+---------------------+--------------------
3531 node1 | 1 | localhost:5501 | 1: 0/4000678 | read-write ? | primary | primary
3532
3533 $ pg_autoctl show state --json
3534 [
3535 {
3536 "health": 1,
3537 "node_id": 1,
3538 "group_id": 0,
3539 "nodehost": "localhost",
3540 "nodename": "node1",
3541 "nodeport": 5501,
3542 "reported_lsn": "0/4000678",
3543 "reported_tli": 1,
3544 "formation_kind": "pgsql",
3545 "candidate_priority": 50,
3546 "replication_quorum": true,
3547 "current_group_state": "primary",
3548 "assigned_group_state": "primary"
3549 },
3550 {
3551 "health": 1,
3552 "node_id": 2,
3553 "group_id": 0,
3554 "nodehost": "localhost",
3555 "nodename": "node2",
3556 "nodeport": 5502,
3557 "reported_lsn": "0/4000678",
3558 "reported_tli": 1,
3559 "formation_kind": "pgsql",
3560 "candidate_priority": 50,
3561 "replication_quorum": true,
3562 "current_group_state": "secondary",
3563 "assigned_group_state": "secondary"
3564 },
3565 {
3566 "health": 1,
3567 "node_id": 3,
3568 "group_id": 0,
3569 "nodehost": "localhost",
3570 "nodename": "node3",
3571 "nodeport": 5503,
3572 "reported_lsn": "0/4000678",
3573 "reported_tli": 1,
3574 "formation_kind": "pgsql",
3575 "candidate_priority": 50,
3576 "replication_quorum": true,
3577 "current_group_state": "secondary",
3578 "assigned_group_state": "secondary"
3579 }
3580 ]
3581
3582 pg_autoctl show settings
3583 pg_autoctl show settings - Print replication settings for a formation
3584 from the monitor
3585
3586 Synopsis
3587 This command allows reviewing all the replication settings of a given
3588 formation (defaults to 'default' as usual):
3589
3590 usage: pg_autoctl show settings [ --pgdata ] [ --json ] [ --formation ]
3591
3592 --pgdata path to data directory
3593 --monitor pg_auto_failover Monitor Postgres URL
3594 --json output data in the JSON format
3595 --formation pg_auto_failover formation
3596
3597 Description
3598 See also pg_autoctl_get_formation_settings which is a synonym.
3599
3600 The output contains settings and values that apply in different con‐
3601 texts, as shown here with a formation of four nodes, where node_4 is
3602 not participating in the replication quorum and also not a candidate
3603 for failover:
3604
3605 $ pg_autoctl show settings
3606 Context | Name | Setting | Value
3607 ----------+---------+---------------------------+-------------------------------------------------------------
3608 formation | default | number_sync_standbys | 1
3609 primary | node_1 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_3, pgautofailover_standby_2)'
3610 node | node_1 | replication quorum | true
3611 node | node_2 | replication quorum | true
3612 node | node_3 | replication quorum | true
3613 node | node_4 | replication quorum | false
3614 node | node_1 | candidate priority | 50
3615 node | node_2 | candidate priority | 50
3616 node | node_3 | candidate priority | 50
3617 node | node_4 | candidate priority | 0
3618
3619 Three replication settings contexts are listed:
3620
3621 1. The "formation" context contains a single entry, the value of
3622 number_sync_standbys for the target formation.
3623
3624 2. The "primary" context contains one entry per group of Postgres
3625 nodes in the formation, and shows the current value of the syn‐
3626 chronous_standby_names Postgres setting as computed by the moni‐
3627 tor. It should match what is currently set on the primary node,
3628 except while a change is being applied, as shown by the primary
3629 being in the APPLY_SETTING state.
3630
3631 3. The "node" context contains two entries per node: one line shows
3632 the replication quorum setting of the node, and another line
3633 shows its candidate priority.
3634
3635 This command gives an overview of all the settings that apply to the
3636 current formation.
3637
3638 Options
3639 --pgdata
3640 Location of the Postgres node being managed locally. Defaults to
3641 the environment variable PGDATA. Use --monitor to connect to a
3642 monitor from anywhere, rather than the monitor URI used by a lo‐
3643 cal Postgres node managed with pg_autoctl.
3644
3645 --monitor
3646 Postgres URI used to connect to the monitor. Must use the au‐
3647 toctl_node username and target the pg_auto_failover database
3648 name. It is possible to show the Postgres URI from the monitor
3649 node using the command pg_autoctl_show_uri.
3650
3651 Defaults to the value of the environment variable PG_AU‐
3652 TOCTL_MONITOR.
3653
3654 --formation
3655 Show the current replication settings for the given formation.
3656 Defaults to the default formation.
3657
3658 --json Output JSON formatted data instead of a table formatted list.
3659
3660 Examples
3661 $ pg_autoctl show settings
3662 Context | Name | Setting | Value
3663 ----------+---------+---------------------------+-------------------------------------------------------------
3664 formation | default | number_sync_standbys | 1
3665 primary | node1 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)'
3666 node | node1 | candidate priority | 50
3667 node | node2 | candidate priority | 50
3668 node | node3 | candidate priority | 50
3669 node | node1 | replication quorum | true
3670 node | node2 | replication quorum | true
3671 node | node3 | replication quorum | true
3672
3673 pg_autoctl show standby-names
3674 pg_autoctl show standby-names - Prints synchronous_standby_names for a
3675 given group
3676
3677 Synopsis
3678 This command prints the current value for synchronous_standby_names for
3679 the primary Postgres server of the target group (default 0) in the tar‐
3680 get formation (default default), as computed by the monitor:
3681
3682 usage: pg_autoctl show standby-names [ --pgdata ] --formation --group
3683
3684 --pgdata path to data directory
3685 --monitor pg_auto_failover Monitor Postgres URL
3686 --formation formation to query, defaults to 'default'
3687 --group group to query formation, defaults to all
3688 --json output data in the JSON format
3689
3690 Options
3691 --pgdata
3692 Location of the Postgres node being managed locally. Defaults to
3693 the environment variable PGDATA. Use --monitor to connect to a
3694 monitor from anywhere, rather than the monitor URI used by a lo‐
3695 cal Postgres node managed with pg_autoctl.
3696
3697 --monitor
3698 Postgres URI used to connect to the monitor. Must use the au‐
3699 toctl_node username and target the pg_auto_failover database
3700 name. It is possible to show the Postgres URI from the monitor
3701 node using the command pg_autoctl_show_uri.
3702
3703 Defaults to the value of the environment variable PG_AU‐
3704 TOCTL_MONITOR.
3705
3706 --formation
3707 Show the current synchronous_standby_names value for the given
3708 formation. Defaults to the default formation.
3709
3710 --group
3711 Show the current synchronous_standby_names value for the given
3712 group in the given formation. Defaults to group 0.
3713
3714 --json Output JSON formatted data instead of a table formatted list.
3715
3716 Examples
3717 $ pg_autoctl show standby-names
3718 'ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)'
3719
3720 $ pg_autoctl show standby-names --json
3721 {
3722 "formation": "default",
3723 "group": 0,
3724 "synchronous_standby_names": "ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)"
3725 }
3726
3727 pg_autoctl show file
3728 pg_autoctl show file - List pg_autoctl internal files (config, state,
3729 pid)
3730
3731 Synopsis
3732 This command lists the files that pg_autoctl uses internally for its
3733 own configuration, state, and pid:
3734
3735 usage: pg_autoctl show file [ --pgdata --all --config | --state | --init | --pid --contents ]
3736
3737 --pgdata path to data directory
3738 --all show all pg_autoctl files
3739 --config show pg_autoctl configuration file
3740 --state show pg_autoctl state file
3741 --init show pg_autoctl initialisation state file
3742 --pid show pg_autoctl PID file
3743 --contents show selected file contents
3744 --json output data in the JSON format
3745
3746 Description
3747 The pg_autoctl command follows the XDG Base Directory Specification and
3748 places its internal and configuration files by default in places such
3749 as ~/.config/pg_autoctl and ~/.local/share/pg_autoctl.
3750
3751 It is possible to change the default XDG locations by using the envi‐
3752 ronment variables XDG_CONFIG_HOME, XDG_DATA_HOME, and XDG_RUNTIME_DIR.
3753
3754 Also, pg_autoctl uses sub-directories that are specific to a given PG‐
3755 DATA, making it possible to run several Postgres nodes on the same ma‐
3756 chine, which is very practical for testing and development purposes,
3757 though not advised for production setups.
3758
3759 Configuration File
3760 The pg_autoctl configuration file for an instance serving the data di‐
3761 rectory at /data/pgsql is found at ~/.config/pg_au‐
3762 toctl/data/pgsql/pg_autoctl.cfg, written in the INI format.
3763
3764 It is possible to get the location of the configuration file by using
3765 the command pg_autoctl show file --config --pgdata /data/pgsql and to
3766 output its content by using the command pg_autoctl show file --config
3767 --contents --pgdata /data/pgsql.
3768
3769 See also pg_autoctl_config_get and pg_autoctl_config_set.
3770
3771 State File
3772 The pg_autoctl state file for an instance serving the data directory at
3773 /data/pgsql is found at ~/.local/share/pg_autoctl/data/pgsql/pg_au‐
3774 toctl.state, written in a specific binary format.
3775
3776 This file is not intended to be written by anything other than pg_au‐
3777 toctl itself. In case of state corruption, see the trouble shooting
3778 section of the documentation.
3779
3780 It is possible to get the location of the state file by using the com‐
3781 mand pg_autoctl show file --state --pgdata /data/pgsql and to output
3782 its content by using the command pg_autoctl show file --state --con‐
3783 tents --pgdata /data/pgsql.
3784
3785 Init State File
3786 The pg_autoctl init state file for an instance serving the data direc‐
3787 tory at /data/pgsql is found at ~/.local/share/pg_au‐
3788 toctl/data/pgsql/pg_autoctl.init, written in a specific binary format.
3789
3790 This file is not intended to be written by anything other than pg_au‐
3791 toctl itself. In case of state corruption, see the trouble shooting
3792 section of the documentation.
3793
3794 This initialization state file only exists during the initialization of
3795 a pg_auto_failover node. In normal operations, this file does not ex‐
3796 ist.
3797
3798 It is possible to get the location of the state file by using the com‐
3799 mand pg_autoctl show file --init --pgdata /data/pgsql and to output its
3800 content by using the command pg_autoctl show file --init --contents
3801 --pgdata /data/pgsql.
3802
3803 PID File
3804 The pg_autoctl PID file for an instance serving the data directory at
3805 /data/pgsql is found at /tmp/pg_autoctl/data/pgsql/pg_autoctl.pid,
3806 written in a specific text format.
3807
3808 The PID file is located in a temporary directory by default, or in the
3809 XDG_RUNTIME_DIR directory when this is set up.
3810
3811 Options
3812 --pgdata
3813 Location of the Postgres node being managed locally. Defaults to
3814 the environment variable PGDATA. Use --monitor to connect to a
3815 monitor from anywhere, rather than the monitor URI used by a lo‐
3816 cal Postgres node managed with pg_autoctl.
3817
3818 --all List all the files that belong to this pg_autoctl node.
3819
3820 --config
3821 Show only the configuration file.
3822
3823 --state
3824 Show only the state file.
3825
3826 --init Show only the init state file, which only exists while the com‐
3827 mand pg_autoctl create postgres or the command pg_autoctl create
3828 monitor is running, or when that command failed (and can then be
3829 retried).
3830
3831 --pid Show only the pid file.
3832
3833 --contents
3834 When one of the options to show a specific file is in use, then
3835 --contents shows the contents of the selected file instead of
3836 showing its absolute file path.
3837
3838 --json Output JSON formatted data.
3839
3840 Examples
3841 The following examples are taken from a QA environment that has been
3842 prepared thanks to the make cluster command made available to the
3843 pg_auto_failover contributors. As a result, the XDG environment vari‐
3844 ables have been tweaked to obtain a self-contained test:
3845
3846 $ tmux show-env | grep XDG
3847 XDG_CONFIG_HOME=/Users/dim/dev/MS/pg_auto_failover/tmux/config
3848 XDG_DATA_HOME=/Users/dim/dev/MS/pg_auto_failover/tmux/share
3849 XDG_RUNTIME_DIR=/Users/dim/dev/MS/pg_auto_failover/tmux/run
3850
3851 Within that self-contained test location, we can see the following ex‐
3852 amples.
3853
3854 $ pg_autoctl show file --pgdata ./node1
3855 File | Path
3856 --------+----------------
3857 Config | /Users/dim/dev/MS/pg_auto_failover/tmux/config/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node1/pg_autoctl.cfg
3858 State | /Users/dim/dev/MS/pg_auto_failover/tmux/share/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node1/pg_autoctl.state
3859 Init | /Users/dim/dev/MS/pg_auto_failover/tmux/share/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node1/pg_autoctl.init
3860 Pid | /Users/dim/dev/MS/pg_auto_failover/tmux/run/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node1/pg_autoctl.pid
3862
3863 $ pg_autoctl show file --pgdata node1 --state
3864 /Users/dim/dev/MS/pg_auto_failover/tmux/share/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node1/pg_autoctl.state
3865
3866 $ pg_autoctl show file --pgdata node1 --state --contents
3867 Current Role: primary
3868 Assigned Role: primary
3869 Last Monitor Contact: Thu Mar 18 17:32:25 2021
3870 Last Secondary Contact: 0
3871 pg_autoctl state version: 1
3872 group: 0
3873 node id: 1
3874 nodes version: 0
3875 PostgreSQL Version: 1201
3876 PostgreSQL CatVersion: 201909212
3877 PostgreSQL System Id: 6940955496243696337
3878
3879 pg_autoctl show file --pgdata node1 --config --contents --json | jq .pg_autoctl
3880 {
3881 "role": "keeper",
3882 "monitor": "postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer",
3883 "formation": "default",
3884 "group": 0,
3885 "name": "node1",
3886 "hostname": "localhost",
3887 "nodekind": "standalone"
3888 }
3889
3890 pg_autoctl show systemd
3891 pg_autoctl show systemd - Print systemd service file for this node
3892
3893 Synopsis
3894 This command outputs a configuration unit that is suitable for regis‐
3895 tering pg_autoctl as a systemd service.
3896
3897 Examples
3898 $ pg_autoctl show systemd --pgdata node1
3899 17:38:29 99778 INFO HINT: to complete a systemd integration, run the following commands:
3900 17:38:29 99778 INFO pg_autoctl -q show systemd --pgdata "node1" | sudo tee /etc/systemd/system/pgautofailover.service
3901 17:38:29 99778 INFO sudo systemctl daemon-reload
3902 17:38:29 99778 INFO sudo systemctl enable pgautofailover
3903 17:38:29 99778 INFO sudo systemctl start pgautofailover
3904 [Unit]
3905 Description = pg_auto_failover
3906
3907 [Service]
3908 WorkingDirectory = /Users/dim
3909 Environment = 'PGDATA=node1'
3910 User = dim
3911 ExecStart = /Applications/Postgres.app/Contents/Versions/12/bin/pg_autoctl run
3912 Restart = always
3913 StartLimitBurst = 0
3914 ExecReload = /Applications/Postgres.app/Contents/Versions/12/bin/pg_autoctl reload
3915
3916 [Install]
3917 WantedBy = multi-user.target
3918
3919 To avoid the logs output, use the -q option:
3920
3921 $ pg_autoctl show systemd --pgdata node1 -q
3922 [Unit]
3923 Description = pg_auto_failover
3924
3925 [Service]
3926 WorkingDirectory = /Users/dim
3927 Environment = 'PGDATA=node1'
3928 User = dim
3929 ExecStart = /Applications/Postgres.app/Contents/Versions/12/bin/pg_autoctl run
3930 Restart = always
3931 StartLimitBurst = 0
3932 ExecReload = /Applications/Postgres.app/Contents/Versions/12/bin/pg_autoctl reload
3933
3934 [Install]
3935 WantedBy = multi-user.target
3936
3937 pg_autoctl enable
3938 pg_autoctl enable - Enable a feature on a formation
3939
3940 pg_autoctl enable secondary
3941 pg_autoctl enable secondary - Enable secondary nodes on a formation
3942
3943 Synopsis
3944 This feature makes the most sense when using the Enterprise Edition of
3945 pg_auto_failover, which is fully compatible with Citus formations. When
3946 secondary is enabled, the Citus worker creation policy is to assign
3947 a primary node and then a standby node for each group. When secondary
3948 is disabled, the Citus worker creation policy is to assign only the
3949 primary nodes.
3950
3951 usage: pg_autoctl enable secondary [ --pgdata --formation ]
3952
3953 --pgdata path to data directory
3954 --formation Formation to enable secondary on
3955
3956 Options
3957 --pgdata
3958 Location of the Postgres node being managed locally. Defaults to
3959 the environment variable PGDATA. Use --monitor to connect to a
3960 monitor from anywhere, rather than the monitor URI used by a lo‐
3961 cal Postgres node managed with pg_autoctl.
3962
3963 --formation
3964 Target formation in which to enable the secondary feature.
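
       As a sketch, enabling the secondary feature on the default formation
       could look like the following (the data directory name node1 is an
       assumption):

          $ pg_autoctl enable secondary --formation default --pgdata node1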
3965
3966 pg_autoctl enable maintenance
3967 pg_autoctl enable maintenance - Enable Postgres maintenance mode on
3968 this node
3969
3970 Synopsis
3971 A pg_auto_failover node can be put into a maintenance state. The
3972 Postgres node is then still registered to the monitor, and is known
3973 to be unreliable until maintenance is disabled. A node in the
3974 maintenance state is not a candidate for promotion.
3975
3976 Typical uses of the maintenance state include Operating System or
3977 Postgres reboots, e.g. when applying security upgrades.
3978
3979 usage: pg_autoctl enable maintenance [ --pgdata --allow-failover ]
3980
3981 --pgdata path to data directory
3982
3983 Options
3984 --pgdata
3985 Location of the Postgres node being managed locally. Defaults to
3986 the environment variable PGDATA. Use --monitor to connect to a
3987 monitor from anywhere, rather than the monitor URI used by a lo‐
3988 cal Postgres node managed with pg_autoctl.
3989
3990 --formation
3991 Target formation for this operation.
3992
3993 Examples
3994 pg_autoctl show state
3995 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
3996 ------+-------+----------------+-----------+--------------+---------------------+--------------------
3997 node1 | 1 | localhost:5501 | 0/4000760 | read-write | primary | primary
3998 node2 | 2 | localhost:5502 | 0/4000760 | read-only | secondary | secondary
3999 node3 | 3 | localhost:5503 | 0/4000760 | read-only | secondary | secondary
4000
4001 $ pg_autoctl enable maintenance --pgdata node3
4002 12:06:12 47086 INFO Listening monitor notifications about state changes in formation "default" and group 0
4003 12:06:12 47086 INFO Following table displays times when notifications are received
4004 Time | Name | Node | Host:Port | Current State | Assigned State
4005 ---------+-------+-------+----------------+---------------------+--------------------
4006 12:06:12 | node1 | 1 | localhost:5501 | primary | join_primary
4007 12:06:12 | node3 | 3 | localhost:5503 | secondary | wait_maintenance
4008 12:06:12 | node3 | 3 | localhost:5503 | wait_maintenance | wait_maintenance
4009 12:06:12 | node1 | 1 | localhost:5501 | join_primary | join_primary
4010 12:06:12 | node3 | 3 | localhost:5503 | wait_maintenance | maintenance
4011 12:06:12 | node1 | 1 | localhost:5501 | join_primary | primary
4012 12:06:13 | node3 | 3 | localhost:5503 | maintenance | maintenance
4013
4014 $ pg_autoctl show state
4015 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
4016 ------+-------+----------------+-----------+--------------+---------------------+--------------------
4017 node1 | 1 | localhost:5501 | 0/4000810 | read-write | primary | primary
4018 node2 | 2 | localhost:5502 | 0/4000810 | read-only | secondary | secondary
4019 node3 | 3 | localhost:5503 | 0/4000810 | none | maintenance | maintenance
4020
4021 pg_autoctl enable ssl
4022 pg_autoctl enable ssl - Enable SSL configuration on this node
4023
4024 Synopsis
4025 It is possible to manage Postgres SSL settings with the pg_autoctl
4026 command, both at pg_autoctl_create_postgres time and later at
4027 run-time, to update the SSL settings of an existing node.
4028
4029 usage: pg_autoctl enable ssl [ --pgdata ] [ --json ]
4030
4031 --pgdata path to data directory
4032 --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
4033 --ssl-mode use that sslmode in connection strings
4034 --ssl-ca-file set the Postgres ssl_ca_file to that file path
4035 --ssl-crl-file set the Postgres ssl_crl_file to that file path
4036 --no-ssl don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
4037 --server-key set the Postgres ssl_key_file to that file path
4038 --server-cert set the Postgres ssl_cert_file to that file path
4039
4040 Options
4041 --pgdata
4042 Location of the Postgres node being managed locally. Defaults to
4043 the environment variable PGDATA. Use --monitor to connect to a
4044 monitor from anywhere, rather than the monitor URI used by a lo‐
4045 cal Postgres node managed with pg_autoctl.
4046
4047 --ssl-self-signed
4048 Generate SSL self-signed certificates to provide network encryp‐
4049 tion. This does not protect against man-in-the-middle kinds of
4050 attacks. See security for more about our SSL settings.
4051
4052 --ssl-mode
4053 SSL Mode used by pg_autoctl when connecting to other nodes, in‐
4054 cluding when connecting for streaming replication.
4055
4056 --ssl-ca-file
4057 Set the Postgres ssl_ca_file to that file path.
4058
4059 --ssl-crl-file
4060 Set the Postgres ssl_crl_file to that file path.
4061
4062 --no-ssl
4063 Don't enable network encryption. This is not recommended, prefer
4064 --ssl-self-signed.
4065
4066 --server-key
4067 Set the Postgres ssl_key_file to that file path.
4068
4069 --server-cert
4070 Set the Postgres ssl_cert_file to that file path.
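
       As a sketch, switching a node to self-signed certificates could look
       like the following (the data directory name node1 is an assumption):

          $ pg_autoctl enable ssl --ssl-self-signed --ssl-mode require --pgdata node1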
4071
4072 pg_autoctl enable monitor
4073 pg_autoctl enable monitor - Enable a monitor for this node to be or‐
4074 chestrated from
4075
4076 Synopsis
4077 It is possible to disable the pg_auto_failover monitor and enable it
4078 again online in a running pg_autoctl Postgres node. The main use case
4079 for this operation is when the monitor node has to be replaced,
4080 either after a full crash of the previous monitor node, or for
4081 migrating to a new monitor node (hardware replacement, region or zone
4082 migration, etc).
4083
4084 usage: pg_autoctl enable monitor [ --pgdata --allow-failover ] postgres://autoctl_node@new.monitor.add.ress/pg_auto_failover
4085
4086 --pgdata path to data directory
4087
4088 Options
4089 --pgdata
4090 Location of the Postgres node being managed locally. Defaults to
4091 the environment variable PGDATA. Use --monitor to connect to a
4092 monitor from anywhere, rather than the monitor URI used by a lo‐
4093 cal Postgres node managed with pg_autoctl.
4094
4095 Examples
4096 $ pg_autoctl show state
4097 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
4098 ------+-------+----------------+-----------+--------------+---------------------+--------------------
4099 node1 | 1 | localhost:5501 | 0/4000760 | read-write | primary | primary
4100 node2 | 2 | localhost:5502 | 0/4000760 | read-only | secondary | secondary
4101
4102
4103 $ pg_autoctl enable monitor --pgdata node3 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=require'
4104 12:42:07 43834 INFO Registered node 3 (localhost:5503) with name "node3" in formation "default", group 0, state "wait_standby"
4105 12:42:07 43834 INFO Successfully registered to the monitor with nodeId 3
4106 12:42:08 43834 INFO Still waiting for the monitor to drive us to state "catchingup"
4107 12:42:08 43834 WARN Please make sure that the primary node is currently running `pg_autoctl run` and contacting the monitor.
4108
4109 $ pg_autoctl show state
4110 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
4111 ------+-------+----------------+-----------+--------------+---------------------+--------------------
4112 node1 | 1 | localhost:5501 | 0/4000810 | read-write | primary | primary
4113 node2 | 2 | localhost:5502 | 0/4000810 | read-only | secondary | secondary
4114 node3 | 3 | localhost:5503 | 0/4000810 | read-only | secondary | secondary
4115
4116 pg_autoctl disable
4117 pg_autoctl disable - Disable a feature on a formation
4118
4119 pg_autoctl disable secondary
4120 pg_autoctl disable secondary - Disable secondary nodes on a formation
4121
4122 Synopsis
4123 This feature makes the most sense when using the Enterprise Edition of
4124 pg_auto_failover, which is fully compatible with Citus formations. When
4125 secondary is enabled, the Citus worker creation policy is to assign
4126 a primary node and then a standby node for each group. When secondary
4127 is disabled, the Citus worker creation policy is to assign only the
4128 primary nodes.
4129
4130 usage: pg_autoctl disable secondary [ --pgdata --formation ]
4131
4132 --pgdata path to data directory
4133 --formation Formation to disable secondary on
4134
4135 Options
4136 --pgdata
4137 Location of the Postgres node being managed locally. Defaults to
4138 the environment variable PGDATA. Use --monitor to connect to a
4139 monitor from anywhere, rather than the monitor URI used by a lo‐
4140 cal Postgres node managed with pg_autoctl.
4141
4142 --formation
4143 Target formation in which to disable the secondary feature.
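
       As a sketch, disabling the secondary feature on the default
       formation could look like the following (the data directory name
       node1 is an assumption):

          $ pg_autoctl disable secondary --formation default --pgdata node1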
4144
4145 pg_autoctl disable maintenance
4146 pg_autoctl disable maintenance - Disable Postgres maintenance mode on
4147 this node
4148
4149 Synopsis
4150 A pg_auto_failover node can be put into a maintenance state. The
4151 Postgres node is then still registered to the monitor, and is known
4152 to be unreliable until maintenance is disabled. A node in the
4153 maintenance state is not a candidate for promotion.
4154
4155 Typical uses of the maintenance state include Operating System or
4156 Postgres reboots, e.g. when applying security upgrades.
4157
4158 usage: pg_autoctl disable maintenance [ --pgdata --allow-failover ]
4159
4160 --pgdata path to data directory
4161
4162 Options
4163 --pgdata
4164 Location of the Postgres node being managed locally. Defaults to
4165 the environment variable PGDATA. Use --monitor to connect to a
4166 monitor from anywhere, rather than the monitor URI used by a lo‐
4167 cal Postgres node managed with pg_autoctl.
4168
4169 --formation
4170 Target formation for this operation.
4171
4172 Examples
4173 $ pg_autoctl show state
4174 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
4175 ------+-------+----------------+-----------+--------------+---------------------+--------------------
4176 node1 | 1 | localhost:5501 | 0/4000810 | read-write | primary | primary
4177 node2 | 2 | localhost:5502 | 0/4000810 | read-only | secondary | secondary
4178 node3 | 3 | localhost:5503 | 0/4000810 | none | maintenance | maintenance
4179
4180 $ pg_autoctl disable maintenance --pgdata node3
4181 12:06:37 47542 INFO Listening monitor notifications about state changes in formation "default" and group 0
4182 12:06:37 47542 INFO Following table displays times when notifications are received
4183 Time | Name | Node | Host:Port | Current State | Assigned State
4184 ---------+-------+-------+----------------+---------------------+--------------------
4185 12:06:37 | node3 | 3 | localhost:5503 | maintenance | catchingup
4186 12:06:37 | node3 | 3 | localhost:5503 | catchingup | catchingup
4187 12:06:37 | node3 | 3 | localhost:5503 | catchingup | secondary
4188 12:06:37 | node3 | 3 | localhost:5503 | secondary | secondary
4189
4190 $ pg_autoctl show state
4191 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
4192 ------+-------+----------------+-----------+--------------+---------------------+--------------------
4193 node1 | 1 | localhost:5501 | 0/4000848 | read-write | primary | primary
4194 node2 | 2 | localhost:5502 | 0/4000848 | read-only | secondary | secondary
4195 node3 | 3 | localhost:5503 | 0/4000000 | read-only | secondary | secondary
4196
4197 pg_autoctl disable ssl
4198 pg_autoctl disable ssl - Disable SSL configuration on this node
4199
4200 Synopsis
4201 It is possible to manage Postgres SSL settings with the pg_autoctl
4202 command, both at pg_autoctl_create_postgres time and later at
4203 run-time, to update the SSL settings of an existing node.
4204
4205 usage: pg_autoctl disable ssl [ --pgdata ] [ --json ]
4206
4207 --pgdata path to data directory
4208 --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
4209 --ssl-mode use that sslmode in connection strings
4210 --ssl-ca-file set the Postgres ssl_ca_file to that file path
4211 --ssl-crl-file set the Postgres ssl_crl_file to that file path
4212 --no-ssl don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
4213 --server-key set the Postgres ssl_key_file to that file path
4214 --server-cert set the Postgres ssl_cert_file to that file path
4215
4216 Options
4217 --pgdata
4218 Location of the Postgres node being managed locally. Defaults to
4219 the environment variable PGDATA. Use --monitor to connect to a
4220 monitor from anywhere, rather than the monitor URI used by a lo‐
4221 cal Postgres node managed with pg_autoctl.
4222
4223 --ssl-self-signed
4224 Generate SSL self-signed certificates to provide network encryp‐
4225 tion. This does not protect against man-in-the-middle kinds of
4226 attacks. See security for more about our SSL settings.
4227
4228 --ssl-mode
4229 SSL Mode used by pg_autoctl when connecting to other nodes, in‐
4230 cluding when connecting for streaming replication.
4231
4232 --ssl-ca-file
4233 Set the Postgres ssl_ca_file to that file path.
4234
4235 --ssl-crl-file
4236 Set the Postgres ssl_crl_file to that file path.
4237
4238 --no-ssl
4239 Don't enable network encryption. This is not recommended,
4240 prefer --ssl-self-signed.
4241
4242 --server-key
4243 Set the Postgres ssl_key_file to that file path.
4244
4245 --server-cert
4246 Set the Postgres ssl_cert_file to that file path.
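
       As a sketch, turning network encryption off on a node (which is not
       recommended) could look like the following (the data directory name
       node1 is an assumption):

          $ pg_autoctl disable ssl --no-ssl --pgdata node1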
4247
4248 pg_autoctl disable monitor
4249 pg_autoctl disable monitor - Disable the monitor for this node
4250
4251 Synopsis
4252 It is possible to disable the pg_auto_failover monitor and enable it
4253 again online in a running pg_autoctl Postgres node. The main use case
4254 for this operation is when the monitor node has to be replaced,
4255 either after a full crash of the previous monitor node, or for
4256 migrating to a new monitor node (hardware replacement, region or zone
4257 migration, etc).
4258
4259 usage: pg_autoctl disable monitor [ --pgdata --force ]
4260
4261 --pgdata path to data directory
4262 --force force unregistering from the monitor
4263
4264 Options
4265 --pgdata
4266 Location of the Postgres node being managed locally. Defaults to
4267 the environment variable PGDATA. Use --monitor to connect to a
4268 monitor from anywhere, rather than the monitor URI used by a lo‐
4269 cal Postgres node managed with pg_autoctl.
4270
4271 --force
4272 The --force covers the two following situations:
4273
4274 1. By default, the command expects to be able to connect to
4275 the current monitor. When the current known monitor in the
4276 setup is not running anymore, use --force to skip this
4277 step.
4278
4279 2. When pg_autoctl could connect to the monitor and the node
4280 is found there, this is normally an error that prevents
4281 disabling the monitor. Using --force allows the command to
4282 drop the node from the monitor and continue with disabling
4283 the monitor.
4284
4285 Examples
4286 $ pg_autoctl show state
4287 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
4288 ------+-------+----------------+-----------+--------------+---------------------+--------------------
4289 node1 | 1 | localhost:5501 | 0/4000148 | read-write | primary | primary
4290 node2 | 2 | localhost:5502 | 0/4000148 | read-only | secondary | secondary
4291 node3 | 3 | localhost:5503 | 0/4000148 | read-only | secondary | secondary
4292
4293
4294 $ pg_autoctl disable monitor --pgdata node3
4295 12:41:21 43039 INFO Found node 3 "node3" (localhost:5503) on the monitor
4296 12:41:21 43039 FATAL Use --force to remove the node from the monitor
4297
4298 $ pg_autoctl disable monitor --pgdata node3 --force
4299 12:41:32 43219 INFO Removing node 3 "node3" (localhost:5503) from monitor
4300
4301 $ pg_autoctl show state
4302 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
4303 ------+-------+----------------+-----------+--------------+---------------------+--------------------
4304 node1 | 1 | localhost:5501 | 0/4000760 | read-write | primary | primary
4305 node2 | 2 | localhost:5502 | 0/4000760 | read-only | secondary | secondary
4306
4307 pg_autoctl get
4308 pg_autoctl get - Get a pg_auto_failover node, or formation setting
4309
4310 pg_autoctl get formation settings
4311 pg_autoctl get formation settings - get replication settings for a for‐
4312 mation from the monitor
4313
4314 Synopsis
4315 This command prints the pg_autoctl replication settings:
4316
4317 usage: pg_autoctl get formation settings [ --pgdata ] [ --json ] [ --formation ]
4318
4319 --pgdata path to data directory
4320 --json output data in the JSON format
4321 --formation pg_auto_failover formation
4322
4323 Description
4324 See also pg_autoctl_show_settings which is a synonym.
4325
4326 Options
4327 --pgdata
4328 Location of the Postgres node being managed locally. Defaults to
4329 the environment variable PGDATA. Use --monitor to connect to a
4330 monitor from anywhere, rather than the monitor URI used by a lo‐
4331 cal Postgres node managed with pg_autoctl.
4332
4333 --json Output JSON formatted data.
4334
4335 --formation
4336 Show replication settings for given formation. Defaults to de‐
4337 fault.
4338
4339 Examples
4340 $ pg_autoctl get formation settings
4341 Context | Name | Setting | Value
4342 ----------+---------+---------------------------+-------------------------------------------------------------
4343 formation | default | number_sync_standbys | 1
4344 primary | node1 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)'
4345 node | node1 | candidate priority | 50
4346 node | node2 | candidate priority | 50
4347 node | node3 | candidate priority | 50
4348 node | node1 | replication quorum | true
4349 node | node2 | replication quorum | true
4350 node | node3 | replication quorum | true
4351
4352 $ pg_autoctl get formation settings --json
4353 {
4354 "nodes": [
4355 {
4356 "value": "true",
4357 "context": "node",
4358 "node_id": 1,
4359 "setting": "replication quorum",
4360 "group_id": 0,
4361 "nodename": "node1"
4362 },
4363 {
4364 "value": "true",
4365 "context": "node",
4366 "node_id": 2,
4367 "setting": "replication quorum",
4368 "group_id": 0,
4369 "nodename": "node2"
4370 },
4371 {
4372 "value": "true",
4373 "context": "node",
4374 "node_id": 3,
4375 "setting": "replication quorum",
4376 "group_id": 0,
4377 "nodename": "node3"
4378 },
4379 {
4380 "value": "50",
4381 "context": "node",
4382 "node_id": 1,
4383 "setting": "candidate priority",
4384 "group_id": 0,
4385 "nodename": "node1"
4386 },
4387 {
4388 "value": "50",
4389 "context": "node",
4390 "node_id": 2,
4391 "setting": "candidate priority",
4392 "group_id": 0,
4393 "nodename": "node2"
4394 },
4395 {
4396 "value": "50",
4397 "context": "node",
4398 "node_id": 3,
4399 "setting": "candidate priority",
4400 "group_id": 0,
4401 "nodename": "node3"
4402 }
4403 ],
4404 "primary": [
4405 {
4406 "value": "'ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)'",
4407 "context": "primary",
4408 "node_id": 1,
4409 "setting": "synchronous_standby_names",
4410 "group_id": 0,
4411 "nodename": "node1"
4412 }
4413 ],
4414 "formation": {
4415 "value": "1",
4416 "context": "formation",
4417 "node_id": null,
4418 "setting": "number_sync_standbys",
4419 "group_id": null,
4420 "nodename": "default"
4421 }
4422 }
4423
4424 pg_autoctl get formation number-sync-standbys
4425 pg_autoctl get formation number-sync-standbys - get number_sync_stand‐
4426 bys for a formation from the monitor
4427
4428 Synopsis
4429 This command prints the pg_autoctl replication setting for number
4430 sync standbys:
4431
4432 usage: pg_autoctl get formation number-sync-standbys [ --pgdata ] [ --json ] [ --formation ]
4433
4434 --pgdata path to data directory
4435 --json output data in the JSON format
4436 --formation pg_auto_failover formation
4437
4438 Description
4439 See also pg_autoctl_show_settings for the full list of replication set‐
4440 tings.
4441
4442 Options
4443 --pgdata
4444 Location of the Postgres node being managed locally. Defaults to
4445 the environment variable PGDATA. Use --monitor to connect to a
4446 monitor from anywhere, rather than the monitor URI used by a lo‐
4447 cal Postgres node managed with pg_autoctl.
4448
4449 --json Output JSON formatted data.
4450
4451 --formation
4452 Show replication settings for given formation. Defaults to de‐
4453 fault.
4454
4455 Examples
4456 $ pg_autoctl get formation number-sync-standbys
4457 1
4458
4459 $ pg_autoctl get formation number-sync-standbys --json
4460 {
4461 "number-sync-standbys": 1
4462 }
4463
4464 pg_autoctl get node replication-quorum
4465 pg_autoctl get replication-quorum - get replication-quorum property
4466 from the monitor
4467
4468 Synopsis
4469 This command prints the pg_autoctl replication quorum for a given node:
4470
4471 usage: pg_autoctl get node replication-quorum [ --pgdata ] [ --json ] [ --formation ] [ --name ]
4472
4473 --pgdata path to data directory
4474 --formation pg_auto_failover formation
4475 --name pg_auto_failover node name
4476 --json output data in the JSON format
4477
4478 Description
4479 See also pg_autoctl_show_settings for the full list of replication set‐
4480 tings.
4481
4482 Options
4483 --pgdata
4484 Location of the Postgres node being managed locally. Defaults to
4485 the environment variable PGDATA. Use --monitor to connect to a
4486 monitor from anywhere, rather than the monitor URI used by a lo‐
4487 cal Postgres node managed with pg_autoctl.
4488
4489 --json Output JSON formatted data.
4490
4491 --formation
4492 Show replication settings for given formation. Defaults to de‐
4493 fault.
4494
4495 --name Show replication settings for given node, selected by name.
4496
4497 Examples
4498 $ pg_autoctl get node replication-quorum --name node1
4499 true
4500
4501 $ pg_autoctl get node replication-quorum --name node1 --json
4502 {
4503 "name": "node1",
4504 "replication-quorum": true
4505 }
4506
4507 pg_autoctl get node candidate-priority
4508 pg_autoctl get candidate-priority - get candidate-priority property
4509 from the monitor
4510
4511 Synopsis
4512 This command prints the pg_autoctl candidate priority for a given node:
4513
4514 usage: pg_autoctl get node candidate-priority [ --pgdata ] [ --json ] [ --formation ] [ --name ]
4515
4516 --pgdata path to data directory
4517 --formation pg_auto_failover formation
4518 --name pg_auto_failover node name
4519 --json output data in the JSON format
4520
4521 Description
4522 See also pg_autoctl_show_settings for the full list of replication set‐
4523 tings.
4524
4525 Options
4526 --pgdata
4527 Location of the Postgres node being managed locally. Defaults to
4528 the environment variable PGDATA. Use --monitor to connect to a
4529 monitor from anywhere, rather than the monitor URI used by a lo‐
4530 cal Postgres node managed with pg_autoctl.
4531
4532 --json Output JSON formatted data.
4533
4534 --formation
4535 Show replication settings for given formation. Defaults to de‐
4536 fault.
4537
4538 --name Show replication settings for given node, selected by name.
4539
4540 Examples
4541 $ pg_autoctl get node candidate-priority --name node1
4542 50
4543
4544 $ pg_autoctl get node candidate-priority --name node1 --json
4545 {
4546 "name": "node1",
4547 "candidate-priority": 50
4548 }
4549
4550 pg_autoctl set
4551 pg_autoctl set - Set a pg_auto_failover node, or formation setting
4552
4553 pg_autoctl set formation number-sync-standbys
4554 pg_autoctl set formation number-sync-standbys - set number_sync_stand‐
4555 bys for a formation from the monitor
4556
4557 Synopsis
4558 This command sets the pg_autoctl replication setting for number sync
4559 standbys:
4560
4561 usage: pg_autoctl set formation number-sync-standbys [ --pgdata ] [ --json ] [ --formation ] <number_sync_standbys>
4562
4563 --pgdata path to data directory
4564 --formation pg_auto_failover formation
4565 --json output data in the JSON format
4566
4567 Description
4568 The pg_auto_failover monitor ensures that at least N+1 candidate
4569 standby nodes are registered when number-sync-standbys is N. This means
4570 that to be able to run the following command, at least 3 standby nodes
4571 with a non-zero candidate priority must be registered to the monitor:
4572
4573 $ pg_autoctl set formation number-sync-standbys 2
4574
4575 See also pg_autoctl_show_settings for the full list of replication set‐
4576 tings.
4577
4578 Options
4579 --pgdata
4580 Location of the Postgres node being managed locally. Defaults to
4581 the environment variable PGDATA. Use --monitor to connect to a
4582 monitor from anywhere, rather than the monitor URI used by a lo‐
4583 cal Postgres node managed with pg_autoctl.
4584
4585 --json Output JSON formatted data.
4586
4587 --formation
4588 Show replication settings for given formation. Defaults to de‐
4589 fault.
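
       As a sketch, one way to check the current value before changing it
       is to combine the get and set commands documented in this manual:

          $ pg_autoctl get formation number-sync-standbys
          $ pg_autoctl set formation number-sync-standbys 2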
4590
4591 pg_autoctl set node replication-quorum
4592 pg_autoctl set replication-quorum - set replication-quorum property
4593 from the monitor
4594
4595 Synopsis
4596 This command sets the pg_autoctl replication quorum for a given node:
4597
4598 usage: pg_autoctl set node replication-quorum [ --pgdata ] [ --json ] [ --formation ] [ --name ] <true|false>
4599
4600 --pgdata path to data directory
4601 --formation pg_auto_failover formation
4602 --name pg_auto_failover node name
4603 --json output data in the JSON format
4604
4605 Description
4606 See also pg_autoctl_show_settings for the full list of replication set‐
4607 tings.
4608
4609 Options
4610 --pgdata
4611 Location of the Postgres node being managed locally. Defaults to
4612 the environment variable PGDATA. Use --monitor to connect to a
4613 monitor from anywhere, rather than the monitor URI used by a lo‐
4614 cal Postgres node managed with pg_autoctl.
4615
4616 --json Output JSON formatted data.
4617
4618 --formation
4619 Show replication settings for given formation. Defaults to de‐
4620 fault.
4621
4622 --name Show replication settings for given node, selected by name.
4623
4624 Examples
4625 $ pg_autoctl set node replication-quorum --name node1 false
4626 12:49:37 94092 INFO Waiting for the settings to have been applied to the monitor and primary node
4627 12:49:37 94092 INFO New state is reported by node 1 "node1" (localhost:5501): "apply_settings"
4628 12:49:37 94092 INFO Setting goal state of node 1 "node1" (localhost:5501) to primary after it applied replication properties change.
4629 12:49:37 94092 INFO New state is reported by node 1 "node1" (localhost:5501): "primary"
4630 false
4631
4632 $ pg_autoctl set node replication-quorum --name node1 true --json
4633 12:49:42 94199 INFO Waiting for the settings to have been applied to the monitor and primary node
4634 12:49:42 94199 INFO New state is reported by node 1 "node1" (localhost:5501): "apply_settings"
4635 12:49:42 94199 INFO Setting goal state of node 1 "node1" (localhost:5501) to primary after it applied replication properties change.
4636 12:49:43 94199 INFO New state is reported by node 1 "node1" (localhost:5501): "primary"
4637 {
4638 "replication-quorum": true
4639 }
4640
4641 pg_autoctl set node candidate-priority
4642 pg_autoctl set candidate-priority - set candidate-priority property
4643 from the monitor
4644
4645 Synopsis
4646 This command sets the pg_autoctl candidate priority for a given node:
4647
4648 usage: pg_autoctl set node candidate-priority [ --pgdata ] [ --json ] [ --formation ] [ --name ] <priority: 0..100>
4649
4650 --pgdata path to data directory
4651 --formation pg_auto_failover formation
4652 --name pg_auto_failover node name
4653 --json output data in the JSON format
4654
4655 Description
4656 See also pg_autoctl_show_settings for the full list of replication set‐
4657 tings.
4658
4659 Options
4660 --pgdata
4661 Location of the Postgres node being managed locally. Defaults to
4662 the environment variable PGDATA. Use --monitor to connect to a
4663 monitor from anywhere, rather than the monitor URI used by a lo‐
4664 cal Postgres node managed with pg_autoctl.
4665
4666 --json Output JSON formatted data.
4667
4668 --formation
4669 Show replication settings for given formation. Defaults to de‐
4670 fault.
4671
4672 --name Show replication settings for given node, selected by name.
4673
4674 Examples
4675 $ pg_autoctl set node candidate-priority --name node1 65
4676 12:47:59 92326 INFO Waiting for the settings to have been applied to the monitor and primary node
4677 12:47:59 92326 INFO New state is reported by node 1 "node1" (localhost:5501): "apply_settings"
4678 12:47:59 92326 INFO Setting goal state of node 1 "node1" (localhost:5501) to primary after it applied replication properties change.
4679 12:47:59 92326 INFO New state is reported by node 1 "node1" (localhost:5501): "primary"
4680 65
4681
4682 $ pg_autoctl set node candidate-priority --name node1 50 --json
4683 12:48:05 92450 INFO Waiting for the settings to have been applied to the monitor and primary node
4684 12:48:05 92450 INFO New state is reported by node 1 "node1" (localhost:5501): "apply_settings"
4685 12:48:05 92450 INFO Setting goal state of node 1 "node1" (localhost:5501) to primary after it applied replication properties change.
4686 12:48:05 92450 INFO New state is reported by node 1 "node1" (localhost:5501): "primary"
4687 {
4688 "candidate-priority": 50
4689 }
4690
4691 pg_autoctl perform
4692 pg_autoctl perform - Perform an action orchestrated by the monitor
4693
4694 pg_autoctl perform failover
4695 pg_autoctl perform failover - Perform a failover for given formation
4696 and group
4697
4698 Synopsis
4699 This command starts a Postgres failover orchestration from the
4700 pg_auto_failover monitor:
4701
4702 usage: pg_autoctl perform failover [ --pgdata --formation --group ]
4703
4704 --pgdata path to data directory
4705 --formation formation to target, defaults to 'default'
4706 --group group to target, defaults to 0
4707 --wait how many seconds to wait, defaults to 60
4708
4709 Description
4710 The pg_auto_failover monitor can be used to orchestrate a manual
4711 failover, sometimes also known as a switchover. When doing so,
4712 split-brain situations are prevented thanks to intermediary states
4713 being used in the Finite State Machine.
4714
4715 The pg_autoctl perform failover command waits until the failover is
4716 known complete on the monitor, or until the hard-coded 60s timeout has
4717 passed.
4718
4719 The failover orchestration is done in the background by the monitor, so
4720 even if the pg_autoctl perform failover stops on the timeout, the
4721 failover orchestration continues at the monitor.
4722
4723 Options
4724 --pgdata
4725 Location of the Postgres node being managed locally. Defaults to
4726 the environment variable PGDATA. Use --monitor to connect to a
4727 monitor from anywhere, rather than the monitor URI used by a lo‐
4728 cal Postgres node managed with pg_autoctl.
4729
4730 --formation
4731 Formation to target for the operation. Defaults to default.
4732
4733 --group
4734 Postgres group to target for the operation. Defaults to 0, only
4735 Citus formations may have more than one group.
4736
4737 --wait How many seconds to wait for notifications about the promotion.
4738 The command stops when the promotion is finished (a node is pri‐
4739 mary), or when the timeout has elapsed, whichever comes first.
4740 The value 0 (zero) disables the timeout and allows the command
4741 to wait forever.
4742
4743 Examples
4744 $ pg_autoctl perform failover
4745 12:57:30 3635 INFO Listening monitor notifications about state changes in formation "default" and group 0
4746 12:57:30 3635 INFO Following table displays times when notifications are received
4747 Time | Name | Node | Host:Port | Current State | Assigned State
4748 ---------+-------+-------+----------------+---------------------+--------------------
4749 12:57:30 | node1 | 1 | localhost:5501 | primary | draining
4750 12:57:30 | node1 | 1 | localhost:5501 | draining | draining
4751 12:57:30 | node2 | 2 | localhost:5502 | secondary | report_lsn
4752 12:57:30 | node3 | 3 | localhost:5503 | secondary | report_lsn
4753 12:57:36 | node3 | 3 | localhost:5503 | report_lsn | report_lsn
4754 12:57:36 | node2 | 2 | localhost:5502 | report_lsn | report_lsn
4755 12:57:36 | node2 | 2 | localhost:5502 | report_lsn | prepare_promotion
4756 12:57:36 | node2 | 2 | localhost:5502 | prepare_promotion | prepare_promotion
4757 12:57:36 | node2 | 2 | localhost:5502 | prepare_promotion | stop_replication
4758 12:57:36 | node1 | 1 | localhost:5501 | draining | demote_timeout
4759 12:57:36 | node3 | 3 | localhost:5503 | report_lsn | join_secondary
4760 12:57:36 | node1 | 1 | localhost:5501 | demote_timeout | demote_timeout
4761 12:57:36 | node3 | 3 | localhost:5503 | join_secondary | join_secondary
4762 12:57:37 | node2 | 2 | localhost:5502 | stop_replication | stop_replication
4763 12:57:37 | node2 | 2 | localhost:5502 | stop_replication | wait_primary
4764 12:57:37 | node1 | 1 | localhost:5501 | demote_timeout | demoted
4765 12:57:37 | node1 | 1 | localhost:5501 | demoted | demoted
4766 12:57:37 | node2 | 2 | localhost:5502 | wait_primary | wait_primary
4767 12:57:37 | node3 | 3 | localhost:5503 | join_secondary | secondary
4768 12:57:37 | node1 | 1 | localhost:5501 | demoted | catchingup
4769 12:57:38 | node3 | 3 | localhost:5503 | secondary | secondary
4770 12:57:38 | node2 | 2 | localhost:5502 | wait_primary | primary
4771 12:57:38 | node1 | 1 | localhost:5501 | catchingup | catchingup
4772 12:57:38 | node2 | 2 | localhost:5502 | primary | primary
4773
4774 $ pg_autoctl show state
4775 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
4776 ------+-------+----------------+-----------+--------------+---------------------+--------------------
4777 node1 | 1 | localhost:5501 | 0/4000F50 | read-only | secondary | secondary
4778 node2 | 2 | localhost:5502 | 0/4000F50 | read-write | primary | primary
4779 node3 | 3 | localhost:5503 | 0/4000F50 | read-only | secondary | secondary
4780
4781 pg_autoctl perform switchover
4782 pg_autoctl perform switchover - Perform a switchover for given forma‐
4783 tion and group
4784
4785 Synopsis
4786 This command starts a Postgres switchover orchestration from the
4787 pg_auto_failover monitor:
4788
4789 usage: pg_autoctl perform switchover [ --pgdata --formation --group ]
4790
4791 --pgdata path to data directory
4792 --formation formation to target, defaults to 'default'
4793 --group group to target, defaults to 0
4794
4795 Description
4796 The pg_auto_failover monitor can be used to orchestrate a manual
4797 switchover. When doing so, split-brain situations are prevented
4798 thanks to intermediary states being used in the Finite State
4799 Machine.
4800
4801 The pg_autoctl perform switchover command waits until the switchover is
4802 known complete on the monitor, or until the hard-coded 60s timeout has
4803 passed.
4804
4805 The switchover orchestration is done in the background by the monitor,
4806 so even if the pg_autoctl perform switchover stops on the timeout, the
4807 switchover orchestration continues at the monitor.
4808
4809 See also pg_autoctl_perform_failover, a synonym for this command.
4810
4811 Options
4812 --pgdata
4813 Location of the Postgres node being managed locally. Defaults to
4814 the environment variable PGDATA. Use --monitor to connect to a
4815 monitor from anywhere, rather than the monitor URI used by a lo‐
4816 cal Postgres node managed with pg_autoctl.
4817
4818 --formation
4819 Formation to target for the operation. Defaults to default.
4820
4821 --group
4822 Postgres group to target for the operation. Defaults to 0, only
4823 Citus formations may have more than one group.
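
       As a sketch, a switchover of the default formation can be triggered
       with the following command (group 0 is the default and is only shown
       for completeness):

          $ pg_autoctl perform switchover --formation default --group 0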
4824
4825 pg_autoctl perform promotion
4826 pg_autoctl perform promotion - Perform a failover that promotes a tar‐
4827 get node
4828
4829 Synopsis
4830 This command starts a Postgres failover orchestration from the
4831 pg_auto_failover monitor and targets a given node:
4832
4833 usage: pg_autoctl perform promotion [ --pgdata --formation --group ]
4834
4835 --pgdata path to data directory
4836 --formation formation to target, defaults to 'default'
4837 --name node name to target, defaults to current node
4838 --wait how many seconds to wait, defaults to 60
4839
4840 Description
4841 The pg_auto_failover monitor can be used to orchestrate a manual
4842 promotion of a given node. When doing so, split-brain situations are
4843 prevented thanks to intermediary states being used in the Finite
4844 State Machine.
4845
4846 The pg_autoctl perform promotion command waits until the promotion is
4847 known complete on the monitor, or until the hard-coded 60s timeout has
4848 passed.
4849
4850 The promotion orchestration is done in the background by the monitor,
4851 so even if the pg_autoctl perform promotion stops on the timeout, the
4852 promotion orchestration continues at the monitor.
4853
4854 Options
4855 --pgdata
4856 Location of the Postgres node being managed locally. Defaults to
4857 the environment variable PGDATA. Use --monitor to connect to a
4858 monitor from anywhere, rather than the monitor URI used by a lo‐
4859 cal Postgres node managed with pg_autoctl.
4860
4861 --formation
4862 Formation to target for the operation. Defaults to default.
4863
4864 --name Name of the node that should be elected as the new primary node.
4865
4866 --wait How many seconds to wait for notifications about the promotion.
4867 The command stops when the promotion is finished (a node is pri‐
4868 mary), or when the timeout has elapsed, whichever comes first.
4869 The value 0 (zero) disables the timeout and allows the command
4870 to wait forever.
4871
4872 Examples
4873 $ pg_autoctl show state
4874 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
4875 ------+-------+----------------+-----------+--------------+---------------------+--------------------
4876 node1 | 1 | localhost:5501 | 0/4000F88 | read-only | secondary | secondary
4877 node2 | 2 | localhost:5502 | 0/4000F88 | read-write | primary | primary
4878 node3 | 3 | localhost:5503 | 0/4000F88 | read-only | secondary | secondary
4879
4880
4881 $ pg_autoctl perform promotion --name node1
4882 13:08:13 15297 INFO Listening monitor notifications about state changes in formation "default" and group 0
4883 13:08:13 15297 INFO Following table displays times when notifications are received
4884 Time | Name | Node | Host:Port | Current State | Assigned State
4885 ---------+-------+-------+----------------+---------------------+--------------------
4886 13:08:13 | node1 | 0/1 | localhost:5501 | secondary | secondary
4887 13:08:13 | node2 | 0/2 | localhost:5502 | primary | draining
4888 13:08:13 | node2 | 0/2 | localhost:5502 | draining | draining
4889 13:08:13 | node1 | 0/1 | localhost:5501 | secondary | report_lsn
4890 13:08:13 | node3 | 0/3 | localhost:5503 | secondary | report_lsn
4891 13:08:19 | node3 | 0/3 | localhost:5503 | report_lsn | report_lsn
4892 13:08:19 | node1 | 0/1 | localhost:5501 | report_lsn | report_lsn
4893 13:08:19 | node1 | 0/1 | localhost:5501 | report_lsn | prepare_promotion
4894 13:08:19 | node1 | 0/1 | localhost:5501 | prepare_promotion | prepare_promotion
4895 13:08:19 | node1 | 0/1 | localhost:5501 | prepare_promotion | stop_replication
4896 13:08:19 | node2 | 0/2 | localhost:5502 | draining | demote_timeout
4897 13:08:19 | node3 | 0/3 | localhost:5503 | report_lsn | join_secondary
4898 13:08:19 | node2 | 0/2 | localhost:5502 | demote_timeout | demote_timeout
4899 13:08:19 | node3 | 0/3 | localhost:5503 | join_secondary | join_secondary
4900 13:08:20 | node1 | 0/1 | localhost:5501 | stop_replication | stop_replication
4901 13:08:20 | node1 | 0/1 | localhost:5501 | stop_replication | wait_primary
4902 13:08:20 | node2 | 0/2 | localhost:5502 | demote_timeout | demoted
4903 13:08:20 | node1 | 0/1 | localhost:5501 | wait_primary | wait_primary
4904 13:08:20 | node3 | 0/3 | localhost:5503 | join_secondary | secondary
4905 13:08:20 | node2 | 0/2 | localhost:5502 | demoted | demoted
4906 13:08:20 | node2 | 0/2 | localhost:5502 | demoted | catchingup
4907 13:08:21 | node3 | 0/3 | localhost:5503 | secondary | secondary
4908 13:08:21 | node1 | 0/1 | localhost:5501 | wait_primary | primary
4909 13:08:21 | node2 | 0/2 | localhost:5502 | catchingup | catchingup
4910 13:08:21 | node1 | 0/1 | localhost:5501 | primary | primary
4911
4912 $ pg_autoctl show state
4913 Name | Node | Host:Port | LSN | Connection | Current State | Assigned State
4914 ------+-------+----------------+-----------+--------------+---------------------+--------------------
4915 node1 | 1 | localhost:5501 | 0/40012F0 | read-write | primary | primary
4916 node2 | 2 | localhost:5502 | 0/40012F0 | read-only | secondary | secondary
4917 node3 | 3 | localhost:5503 | 0/40012F0 | read-only | secondary | secondary
4918
4919 pg_autoctl do
4920 pg_autoctl do - Internal commands and internal QA tooling
4921
4922 The debug commands for pg_autoctl are only available when the environ‐
4923 ment variable PG_AUTOCTL_DEBUG is set (to any value).
4924
4925 When testing pg_auto_failover, it is helpful to be able to play with
4926 the local nodes using the same lower-level API as used by the
4927 pg_auto_failover Finite State Machine transitions. Some commands could
4928 be useful in contexts other than pg_auto_failover development and QA
4929 work, so some documentation has been made available.
4930
4931 pg_autoctl do tmux
4932 pg_autoctl do tmux - Set of facilities to handle tmux interactive ses‐
4933 sions
4934
4935 Synopsis
4936 pg_autoctl do tmux provides the following commands:
4937
4938 pg_autoctl do tmux
4939 script Produce a tmux script for a demo or a test case (debug only)
4940 session Run a tmux session for a demo or a test case
4941 stop Stop pg_autoctl processes that belong to a tmux session
4942 wait Wait until a given node has been registered on the monitor
4943 clean Clean-up a tmux session processes and root dir
4944
4945 Description
4946 An easy way to get started with pg_auto_failover in a localhost only
4947 formation with four nodes is to run the following command:
4948
4949 $ PG_AUTOCTL_DEBUG=1 pg_autoctl do tmux session \
4950 --root /tmp/pgaf \
4951 --first-pgport 9000 \
4952 --nodes 4 \
4953 --layout tiled
4954
4955 This requires the command tmux to be available in your PATH. The
4956 pg_autoctl do tmux session command prepares a self-contained root
4957 directory in which to create pg_auto_failover nodes and their
4958 configuration, then prepares a tmux script, and finally runs the
4959 script with a command such as:
4960
4961 /usr/local/bin/tmux -v start-server ; source-file /tmp/pgaf/script-9000.tmux
4962
4963 The tmux session contains a single tmux window with multiple panes:
4964
4965 • one pane for the monitor
4966
4967 • one pane per Postgres node, here 4 of them
4968
4969 • one pane for running watch pg_autoctl show state
4970
4971 • one extra pane for an interactive shell.
4972
4973 Usually the first two commands to run in the interactive shell, once
4974 the formation is stable (one node is primary, the other ones are all
4975 secondary), are the following:
4976
4977 $ pg_autoctl get formation settings
4978 $ pg_autoctl perform failover
4979
4980 pg_autoctl do demo
4981 pg_autoctl do demo - Use a demo application for pg_auto_failover
4982
4983 Synopsis
4984 pg_autoctl do demo provides the following commands:
4985
4986 pg_autoctl do demo
4987 run Run the pg_auto_failover demo application
4988 uri Grab the application connection string from the monitor
4989 ping Attempt to connect to the application URI
4990 summary Display a summary of the previous demo app run
4991
4992 To run a demo, use pg_autoctl do demo run:
4993
4994 usage: pg_autoctl do demo run [option ...]
4995
4996 --monitor Postgres URI of the pg_auto_failover monitor
4997 --formation Formation to use (default)
4998 --group Group Id to failover (0)
4999 --username PostgreSQL's username
5000 --clients How many client processes to use (1)
5001 --duration Duration of the demo app, in seconds (30)
5002 --first-failover Timing of the first failover (10)
5003 --failover-freq Seconds between subsequent failovers (45)
5004
5005 Description
5006 The pg_autoctl debug tooling includes a demo application.
5007
5008 The demo prepares its Postgres schema on the target database, and
5009 then starts several clients (see --clients) that concurrently connect
5010 to the target application URI and record the time it takes to
5011 establish the Postgres connection to the current read-write node,
5012 with information about the retry policy metrics.
5013
5014 Example
5015 $ pg_autoctl do demo run --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer' --clients 10
5016 14:43:35 19660 INFO Using application connection string "postgres://localhost:5502,localhost:5503,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer"
5017 14:43:35 19660 INFO Using Postgres user PGUSER "dim"
5018 14:43:35 19660 INFO Preparing demo schema: drop schema if exists demo cascade
5019 14:43:35 19660 WARN NOTICE: schema "demo" does not exist, skipping
5020 14:43:35 19660 INFO Preparing demo schema: create schema demo
5021 14:43:35 19660 INFO Preparing demo schema: create table demo.tracking(ts timestamptz default now(), client integer, loop integer, retries integer, us bigint, recovery bool)
5022 14:43:36 19660 INFO Preparing demo schema: create table demo.client(client integer, pid integer, retry_sleep_ms integer, retry_cap_ms integer, failover_count integer)
5023 14:43:36 19660 INFO Starting 10 concurrent clients as sub-processes
5024 14:43:36 19675 INFO Failover client is started, will failover in 10s and every 45s after that
5025 ...
5026
5027 $ pg_autoctl do demo summary --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer' --clients 10
5028 14:44:27 22789 INFO Using application connection string "postgres://localhost:5503,localhost:5501,localhost:5502/demo?target_session_attrs=read-write&sslmode=prefer"
5029 14:44:27 22789 INFO Using Postgres user PGUSER "dim"
5030 14:44:27 22789 INFO Summary for the demo app running with 10 clients for 30s
5031 Client | Connections | Retries | Min Connect Time (ms) | max | p95 | p99
5032 ----------------------+-------------+---------+-----------------------+----------+---------+---------
5033 Client 1 | 136 | 14 | 58.318 | 2601.165 | 244.443 | 261.809
5034 Client 2 | 136 | 5 | 55.199 | 2514.968 | 242.362 | 259.282
5035 Client 3 | 134 | 6 | 55.815 | 2974.247 | 241.740 | 262.908
5036 Client 4 | 135 | 7 | 56.542 | 2970.922 | 238.995 | 251.177
5037 Client 5 | 136 | 8 | 58.339 | 2758.106 | 238.720 | 252.439
5038 Client 6 | 134 | 9 | 58.679 | 2813.653 | 244.696 | 254.674
5039 Client 7 | 134 | 11 | 58.737 | 2795.974 | 243.202 | 253.745
5040 Client 8 | 136 | 12 | 52.109 | 2354.952 | 242.664 | 254.233
5041 Client 9 | 137 | 19 | 59.735 | 2628.496 | 235.668 | 253.582
5042 Client 10 | 133 | 6 | 57.994 | 3060.489 | 242.156 | 256.085
5043 All Clients Combined | 1351 | 97 | 52.109 | 3060.489 | 241.848 | 258.450
5044 (11 rows)
5045
5046 Min Connect Time (ms) | max | freq | bar
5047 -----------------------+----------+------+-----------------------------------------------
5048 52.109 | 219.105 | 1093 | ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
5049 219.515 | 267.168 | 248 | ▒▒▒▒▒▒▒▒▒▒
5050 2354.952 | 2354.952 | 1 |
5051 2514.968 | 2514.968 | 1 |
5052 2601.165 | 2628.496 | 2 |
5053 2758.106 | 2813.653 | 3 |
5054 2970.922 | 2974.247 | 2 |
5055 3060.489 | 3060.489 | 1 |
5056 (8 rows)
5057
5058 pg_autoctl do service restart
5059 pg_autoctl do service restart - Run pg_autoctl sub-processes (services)
5060
5061 Synopsis
5062 pg_autoctl do service restart provides the following commands:
5063
5064 pg_autoctl do service restart
5065 postgres Restart the pg_autoctl postgres controller service
5066 listener Restart the pg_autoctl monitor listener service
5067 node-active Restart the pg_autoctl keeper node-active service
5068
5069 Description
5070 It is possible to restart the pg_autoctl or the Postgres service with‐
5071 out affecting the other running service. Typically, to restart the
5072 pg_autoctl parts without impacting Postgres:
5073
5074 $ pg_autoctl do service restart node-active --pgdata node1
5075 14:52:06 31223 INFO Sending the TERM signal to service "node-active" with pid 26626
5076 14:52:06 31223 INFO Service "node-active" has been restarted with pid 31230
5077 31230
5078
5079 The Postgres service has not been impacted by the restart of the pg_au‐
5080 toctl process.
5081
5082 pg_autoctl do show
5083 pg_autoctl do show - Show some debug level information
5084
5085 Synopsis
5086 The commands pg_autoctl_create_monitor and pg_autoctl_create_postgres
5087 both implement some level of automated detection of the node network
5088 settings when the option --hostname is not used.
5089
5090 In addition, when a new node is registered to the monitor, other
5091 nodes also edit their Postgres HBA rules to allow the new node to
5092 connect, unless the option --skip-pg-hba has been used.
5093
5094 The debug sub-commands for pg_autoctl do show can be used to see in
5095 detail the network discovery done by pg_autoctl.
5096
5097 pg_autoctl do show provides the following commands:
5098
5099 pg_autoctl do show
5100 ipaddr Print this node's IP address information
5101 cidr Print this node's CIDR information
5102 lookup Print this node's DNS lookup information
5103 hostname Print this node's default hostname
5104 reverse Lookup given hostname and check reverse DNS setup
5105
5106 pg_autoctl do show ipaddr
5107 Connects to an external IP address and uses getsockname(2) to retrieve
5108 the current address to which the socket is bound.
5109
5110 The external IP address defaults to 8.8.8.8, the IP address of a
5111 Google-provided public DNS server, or to the monitor IP address or
5112 hostname in the context of pg_autoctl_create_postgres.
5113
5114 $ pg_autoctl do show ipaddr
5115 16:42:40 62631 INFO ipaddr.c:107: Connecting to 8.8.8.8 (port 53)
5116 192.168.1.156
5117
5118 pg_autoctl do show cidr
5119 Connects to an external IP address in the same way as the previous com‐
5120 mand pg_autoctl do show ipaddr and then matches the local socket name
5121 with the list of local network interfaces. When a match is found, it
5122 uses the netmask of the interface to compute the CIDR notation from
5123 the IP address.
5124
5125 The computed CIDR notation is then used in HBA rules.
5126
5127 $ pg_autoctl do show cidr
5128 16:43:19 63319 INFO Connecting to 8.8.8.8 (port 53)
5129 192.168.1.0/24
5130
5131 pg_autoctl do show hostname
5132 Uses either its first (and only) argument or the result of
5133 gethostname(2) as the candidate hostname to use in HBA rules, and
5134 then checks that the hostname resolves to an IP address that belongs
5135 to one of the machine's network interfaces.
5136
5137 When the hostname forward-dns lookup resolves to an IP address that is
5138 local to the node where the command is run, then a reverse-lookup from
5139 the IP address is made to see if it matches with the candidate host‐
5140 name.
5141
5142 $ pg_autoctl do show hostname
5143 DESKTOP-IC01GOOS.europe.corp.microsoft.com
5144
5145 $ pg_autoctl -vv do show hostname 'postgres://autoctl_node@localhost:5500/pg_auto_failover'
5146 13:45:00 93122 INFO cli_do_show.c:256: Using monitor hostname "localhost" and port 5500
5147 13:45:00 93122 INFO ipaddr.c:107: Connecting to ::1 (port 5500)
5148 13:45:00 93122 DEBUG cli_do_show.c:272: cli_show_hostname: ip ::1
5149 13:45:00 93122 DEBUG cli_do_show.c:283: cli_show_hostname: host localhost
5150 13:45:00 93122 DEBUG cli_do_show.c:294: cli_show_hostname: ip ::1
5151 localhost
5152
5153 pg_autoctl do show lookup
5154 Checks that the given argument is a hostname that resolves to a local
5155 IP address, that is, an IP address associated with a local network
5156 interface.
5157
5158 $ pg_autoctl do show lookup DESKTOP-IC01GOOS.europe.corp.microsoft.com
5159 DESKTOP-IC01GOOS.europe.corp.microsoft.com: 192.168.1.156
5160
5161 pg_autoctl do show reverse
5162 Implements the same DNS checks as Postgres HBA matching code: first
5163 does a forward DNS lookup of the given hostname, and then a re‐
5164 verse-lookup from all the IP addresses obtained. Success is reached
5165 when at least one of the IP addresses from the forward lookup resolves
5166 back to the given hostname (as the first answer to the reverse DNS
5167 lookup).
5168
5169 $ pg_autoctl do show reverse DESKTOP-IC01GOOS.europe.corp.microsoft.com
5170 16:44:49 64910 FATAL Failed to find an IP address for hostname "DESKTOP-IC01GOOS.europe.corp.microsoft.com" that matches hostname again in a reverse-DNS lookup.
5171 16:44:49 64910 INFO Continuing with IP address "192.168.1.156"
5172
5173 $ pg_autoctl -vv do show reverse DESKTOP-IC01GOOS.europe.corp.microsoft.com
5174 16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 192.168.1.156
5175 16:44:45 64832 DEBUG ipaddr.c:733: reverse lookup for "192.168.1.156" gives "desktop-ic01goos.europe.corp.microsoft.com" first
5176 16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 192.168.1.156
5177 16:44:45 64832 DEBUG ipaddr.c:733: reverse lookup for "192.168.1.156" gives "desktop-ic01goos.europe.corp.microsoft.com" first
5178 16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 2a01:110:10:40c::2ad
5179 16:44:45 64832 DEBUG ipaddr.c:728: Failed to resolve hostname from address "192.168.1.156": nodename nor servname provided, or not known
5180 16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 2a01:110:10:40c::2ad
5181 16:44:45 64832 DEBUG ipaddr.c:728: Failed to resolve hostname from address "192.168.1.156": nodename nor servname provided, or not known
5182 16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 100.64.34.213
5183 16:44:45 64832 DEBUG ipaddr.c:728: Failed to resolve hostname from address "192.168.1.156": nodename nor servname provided, or not known
5184 16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 100.64.34.213
5185 16:44:45 64832 DEBUG ipaddr.c:728: Failed to resolve hostname from address "192.168.1.156": nodename nor servname provided, or not known
5186 16:44:45 64832 FATAL cli_do_show.c:333: Failed to find an IP address for hostname "DESKTOP-IC01GOOS.europe.corp.microsoft.com" that matches hostname again in a reverse-DNS lookup.
5187 16:44:45 64832 INFO cli_do_show.c:334: Continuing with IP address "192.168.1.156"
5188
5189 pg_autoctl do pgsetup
5190 pg_autoctl do pgsetup - Manage a local Postgres setup
5191
5192 Synopsis
5193 The main pg_autoctl commands implement low-level management tooling for
5194   a local Postgres instance. Some of these low-level Postgres commands
5195   can be used as standalone tools in some cases.
5196
5197 pg_autoctl do pgsetup provides the following commands:
5198
5199 pg_autoctl do pgsetup
5200 pg_ctl Find a non-ambiguous pg_ctl program and Postgres version
5201 discover Discover local PostgreSQL instance, if any
5202 ready Return true is the local Postgres server is ready
5203 wait Wait until the local Postgres server is ready
5204 logs Outputs the Postgres startup logs
5205 tune Compute and log some Postgres tuning options
5206
5207 pg_autoctl do pgsetup pg_ctl
5208   In a similar way to which -a, this command scans your PATH for pg_ctl
5209 commands. Then it runs the pg_ctl --version command and parses the out‐
5210 put to determine the version of Postgres that is available in the path.
5211
5212 $ pg_autoctl do pgsetup pg_ctl --pgdata node1
5213 16:49:18 69684 INFO Environment variable PG_CONFIG is set to "/Applications/Postgres.app//Contents/Versions/12/bin/pg_config"
5214 16:49:18 69684 INFO `pg_autoctl create postgres` would use "/Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl" for Postgres 12.3
5215 16:49:18 69684 INFO `pg_autoctl create monitor` would use "/Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl" for Postgres 12.3
5216
5217 pg_autoctl do pgsetup discover
5218 Given a PGDATA or --pgdata option, the command discovers if a running
5219 Postgres service matches the pg_autoctl setup, and prints the informa‐
5220 tion that pg_autoctl typically needs when managing a Postgres instance.
5221
5222 $ pg_autoctl do pgsetup discover --pgdata node1
5223 pgdata: /Users/dim/dev/MS/pg_auto_failover/tmux/node1
5224 pg_ctl: /Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl
5225 pg_version: 12.3
5226 pghost: /tmp
5227 pgport: 5501
5228 proxyport: 0
5229 pid: 21029
5230 is in recovery: no
5231 Control Version: 1201
5232 Catalog Version: 201909212
5233 System Identifier: 6942422768095393833
5234 Latest checkpoint LSN: 0/4059C18
5235 Postmaster status: ready
5236
5237 pg_autoctl do pgsetup ready
5238   Similar to the pg_isready command, though it uses the Postgres specifica‐
5239 tions found in the pg_autoctl node setup.
5240
5241 $ pg_autoctl do pgsetup ready --pgdata node1
5242 16:50:08 70582 INFO Postgres status is: "ready"
5243
5244 pg_autoctl do pgsetup wait
5245 When pg_autoctl do pgsetup ready would return false because Postgres is
5246 not ready yet, this command continues probing every second for 30 sec‐
5247   onds, and exits as soon as Postgres is ready.
5248
5249 $ pg_autoctl do pgsetup wait --pgdata node1
5250 16:50:22 70829 INFO Postgres is now serving PGDATA "/Users/dim/dev/MS/pg_auto_failover/tmux/node1" on port 5501 with pid 21029
5251 16:50:22 70829 INFO Postgres status is: "ready"
5252
5253 pg_autoctl do pgsetup logs
5254 Outputs the Postgres logs from the most recent log file in the PG‐
5255 DATA/log directory.
5256
5257 $ pg_autoctl do pgsetup logs --pgdata node1
5258 16:50:39 71126 WARN Postgres logs from "/Users/dim/dev/MS/pg_auto_failover/tmux/node1/startup.log":
5259 16:50:39 71126 INFO 2021-03-22 14:43:48.911 CET [21029] LOG: starting PostgreSQL 12.3 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit
5260 16:50:39 71126 INFO 2021-03-22 14:43:48.913 CET [21029] LOG: listening on IPv6 address "::", port 5501
5261 16:50:39 71126 INFO 2021-03-22 14:43:48.913 CET [21029] LOG: listening on IPv4 address "0.0.0.0", port 5501
5262 16:50:39 71126 INFO 2021-03-22 14:43:48.913 CET [21029] LOG: listening on Unix socket "/tmp/.s.PGSQL.5501"
5263 16:50:39 71126 INFO 2021-03-22 14:43:48.931 CET [21029] LOG: redirecting log output to logging collector process
5264 16:50:39 71126 INFO 2021-03-22 14:43:48.931 CET [21029] HINT: Future log output will appear in directory "log".
5265 16:50:39 71126 WARN Postgres logs from "/Users/dim/dev/MS/pg_auto_failover/tmux/node1/log/postgresql-2021-03-22_144348.log":
5266 16:50:39 71126 INFO 2021-03-22 14:43:48.937 CET [21033] LOG: database system was shut down at 2021-03-22 14:43:46 CET
5267 16:50:39 71126 INFO 2021-03-22 14:43:48.937 CET [21033] LOG: entering standby mode
5268 16:50:39 71126 INFO 2021-03-22 14:43:48.942 CET [21033] LOG: consistent recovery state reached at 0/4022E88
5269 16:50:39 71126 INFO 2021-03-22 14:43:48.942 CET [21033] LOG: invalid record length at 0/4022E88: wanted 24, got 0
5270 16:50:39 71126 INFO 2021-03-22 14:43:48.946 CET [21029] LOG: database system is ready to accept read only connections
5271 16:50:39 71126 INFO 2021-03-22 14:43:49.032 CET [21038] LOG: fetching timeline history file for timeline 4 from primary server
5272 16:50:39 71126 INFO 2021-03-22 14:43:49.037 CET [21038] LOG: started streaming WAL from primary at 0/4000000 on timeline 3
5273 16:50:39 71126 INFO 2021-03-22 14:43:49.046 CET [21038] LOG: replication terminated by primary server
5274 16:50:39 71126 INFO 2021-03-22 14:43:49.046 CET [21038] DETAIL: End of WAL reached on timeline 3 at 0/4022E88.
5275 16:50:39 71126 INFO 2021-03-22 14:43:49.047 CET [21033] LOG: new target timeline is 4
5276 16:50:39 71126 INFO 2021-03-22 14:43:49.049 CET [21038] LOG: restarted WAL streaming at 0/4000000 on timeline 4
5277 16:50:39 71126 INFO 2021-03-22 14:43:49.210 CET [21033] LOG: redo starts at 0/4022E88
5278 16:50:39 71126 INFO 2021-03-22 14:52:06.692 CET [21029] LOG: received SIGHUP, reloading configuration files
5279 16:50:39 71126 INFO 2021-03-22 14:52:06.906 CET [21029] LOG: received SIGHUP, reloading configuration files
5280 16:50:39 71126 FATAL 2021-03-22 15:34:24.920 CET [21038] FATAL: terminating walreceiver due to timeout
5281 16:50:39 71126 INFO 2021-03-22 15:34:24.973 CET [21033] LOG: invalid record length at 0/4059CC8: wanted 24, got 0
5282 16:50:39 71126 INFO 2021-03-22 15:34:25.105 CET [35801] LOG: started streaming WAL from primary at 0/4000000 on timeline 4
5283 16:50:39 71126 FATAL 2021-03-22 16:12:56.918 CET [35801] FATAL: terminating walreceiver due to timeout
5284 16:50:39 71126 INFO 2021-03-22 16:12:57.086 CET [38741] LOG: started streaming WAL from primary at 0/4000000 on timeline 4
5285 16:50:39 71126 FATAL 2021-03-22 16:23:39.349 CET [38741] FATAL: terminating walreceiver due to timeout
5286 16:50:39 71126 INFO 2021-03-22 16:23:39.497 CET [41635] LOG: started streaming WAL from primary at 0/4000000 on timeline 4
5287
5288 pg_autoctl do pgsetup tune
5289   Outputs the pg_autoctl automated tuning options. Depending on the num‐
5290   ber of CPUs and amount of RAM detected in the environment where it is
5291 run, pg_autoctl can adjust some very basic Postgres tuning knobs to get
5292 started.
5293
5294 $ pg_autoctl do pgsetup tune --pgdata node1 -vv
5295 13:25:25 77185 DEBUG pgtuning.c:85: Detected 12 CPUs and 16 GB total RAM on this server
5296 13:25:25 77185 DEBUG pgtuning.c:225: Setting autovacuum_max_workers to 3
5297 13:25:25 77185 DEBUG pgtuning.c:228: Setting shared_buffers to 4096 MB
5298 13:25:25 77185 DEBUG pgtuning.c:231: Setting work_mem to 24 MB
5299 13:25:25 77185 DEBUG pgtuning.c:235: Setting maintenance_work_mem to 512 MB
5300 13:25:25 77185 DEBUG pgtuning.c:239: Setting effective_cache_size to 12 GB
5301 # basic tuning computed by pg_auto_failover
5302 track_functions = pl
5303 shared_buffers = '4096 MB'
5304 work_mem = '24 MB'
5305 maintenance_work_mem = '512 MB'
5306 effective_cache_size = '12 GB'
5307 autovacuum_max_workers = 3
5308 autovacuum_vacuum_scale_factor = 0.08
5309 autovacuum_analyze_scale_factor = 0.02
5310
5311 The low-level API is made available through the following pg_autoctl do
5312 commands, only available in debug environments:
5313
5314 pg_autoctl do
5315 + monitor Query a pg_auto_failover monitor
5316 + fsm Manually manage the keeper's state
5317 + primary Manage a PostgreSQL primary server
5318 + standby Manage a PostgreSQL standby server
5319 + show Show some debug level information
5320 + pgsetup Manage a local Postgres setup
5321 + pgctl Signal the pg_autoctl postgres service
5322 + service Run pg_autoctl sub-processes (services)
5323 + tmux Set of facilities to handle tmux interactive sessions
5324 + azure Manage a set of Azure resources for a pg_auto_failover demo
5325 + demo Use a demo application for pg_auto_failover
5326
5327 pg_autoctl do monitor
5328 + get Get information from the monitor
5329 register Register the current node with the monitor
5330 active Call in the pg_auto_failover Node Active protocol
5331 version Check that monitor version is 1.5.0.1; alter extension update if not
5332 parse-notification parse a raw notification message
5333
5334 pg_autoctl do monitor get
5335 primary Get the primary node from pg_auto_failover in given formation/group
5336 others Get the other nodes from the pg_auto_failover group of hostname/port
5337 coordinator Get the coordinator node from the pg_auto_failover formation
5338
5339 pg_autoctl do fsm
5340 init Initialize the keeper's state on-disk
5341 state Read the keeper's state from disk and display it
5342 list List reachable FSM states from current state
5343 gv Output the FSM as a .gv program suitable for graphviz/dot
5344 assign Assign a new goal state to the keeper
5345 step Make a state transition if instructed by the monitor
5346 + nodes Manually manage the keeper's nodes list
5347
5348 pg_autoctl do fsm nodes
5349 get Get the list of nodes from file (see --disable-monitor)
5350 set Set the list of nodes to file (see --disable-monitor)
5351
5352 pg_autoctl do primary
5353 + slot Manage replication slot on the primary server
5354 + adduser Create users on primary
5355 defaults Add default settings to postgresql.conf
5356 identify Run the IDENTIFY_SYSTEM replication command on given host
5357
5358 pg_autoctl do primary slot
5359 create Create a replication slot on the primary server
5360 drop Drop a replication slot on the primary server
5361
5362 pg_autoctl do primary adduser
5363 monitor add a local user for queries from the monitor
5364 replica add a local user with replication privileges
5365
5366 pg_autoctl do standby
5367 init Initialize the standby server using pg_basebackup
5368 rewind Rewind a demoted primary server using pg_rewind
5369 promote Promote a standby server to become writable
5370
5371 pg_autoctl do show
5372 ipaddr Print this node's IP address information
5373 cidr Print this node's CIDR information
5374 lookup Print this node's DNS lookup information
5375 hostname Print this node's default hostname
5376 reverse Lookup given hostname and check reverse DNS setup
5377
5378 pg_autoctl do pgsetup
5379 pg_ctl Find a non-ambiguous pg_ctl program and Postgres version
5380 discover Discover local PostgreSQL instance, if any
5381 ready Return true is the local Postgres server is ready
5382 wait Wait until the local Postgres server is ready
5383 logs Outputs the Postgres startup logs
5384 tune Compute and log some Postgres tuning options
5385
5386 pg_autoctl do pgctl
5387 on Signal pg_autoctl postgres service to ensure Postgres is running
5388 off Signal pg_autoctl postgres service to ensure Postgres is stopped
5389
5390 pg_autoctl do service
5391 + getpid Get the pid of pg_autoctl sub-processes (services)
5392 + restart Restart pg_autoctl sub-processes (services)
5393 pgcontroller pg_autoctl supervised postgres controller
5394 postgres pg_autoctl service that start/stop postgres when asked
5395 listener pg_autoctl service that listens to the monitor notifications
5396 node-active pg_autoctl service that implements the node active protocol
5397
5398 pg_autoctl do service getpid
5399 postgres Get the pid of the pg_autoctl postgres controller service
5400 listener Get the pid of the pg_autoctl monitor listener service
5401 node-active Get the pid of the pg_autoctl keeper node-active service
5402
5403 pg_autoctl do service restart
5404 postgres Restart the pg_autoctl postgres controller service
5405 listener Restart the pg_autoctl monitor listener service
5406 node-active Restart the pg_autoctl keeper node-active service
5407
5408 pg_autoctl do tmux
5409 script Produce a tmux script for a demo or a test case (debug only)
5410 session Run a tmux session for a demo or a test case
5411 stop Stop pg_autoctl processes that belong to a tmux session
5412 wait Wait until a given node has been registered on the monitor
5413 clean Clean-up a tmux session processes and root dir
5414
5415 pg_autoctl do azure
5416 + provision provision azure resources for a pg_auto_failover demo
5417 + tmux Run a tmux session with an Azure setup for QA/testing
5418 + show show azure resources for a pg_auto_failover demo
5419 deploy Deploy a pg_autoctl VMs, given by name
5420 create Create an azure QA environment
5421 drop Drop an azure QA environment: resource group, network, VMs
5422 ls List resources in a given azure region
5423 ssh Runs ssh -l ha-admin <public ip address> for a given VM name
5424 sync Rsync pg_auto_failover sources on all the target region VMs
5425
5426 pg_autoctl do azure provision
5427 region Provision an azure region: resource group, network, VMs
5428 nodes Provision our pre-created VM with pg_autoctl Postgres nodes
5429
5430 pg_autoctl do azure tmux
5431 session Create or attach a tmux session for the created Azure VMs
5432 kill Kill an existing tmux session for Azure VMs
5433
5434 pg_autoctl do azure show
5435 ips Show public and private IP addresses for selected VMs
5436 state Connect to the monitor node to show the current state
5437
5438 pg_autoctl do demo
5439 run Run the pg_auto_failover demo application
5440 uri Grab the application connection string from the monitor
5441 ping Attempt to connect to the application URI
5442 summary Display a summary of the previous demo app run
5443
5444 pg_autoctl run
5445 pg_autoctl run - Run the pg_autoctl service (monitor or keeper)
5446
5447 Synopsis
5448   This command starts the processes needed to run a monitor node or a
5449 keeper node, depending on the configuration file that belongs to the
5450 --pgdata option or PGDATA environment variable.
5451
5452 usage: pg_autoctl run [ --pgdata --name --hostname --pgport ]
5453
5454 --pgdata path to data directory
5455 --name pg_auto_failover node name
5456 --hostname hostname used to connect from other nodes
5457 --pgport PostgreSQL's port number
5458
5459 Description
5460 When registering Postgres nodes to the pg_auto_failover monitor using
5461 the pg_autoctl_create_postgres command, the nodes are registered with
5462 metadata: the node name, hostname and Postgres port.
5463
5464 The node name is used mostly in the logs and pg_autoctl_show_state com‐
5465 mands and helps human administrators of the formation.
5466
5467 The node hostname and pgport are used by other nodes, including the
5468 pg_auto_failover monitor, to open a Postgres connection.
5469
5470   The node name, hostname, and port can all be changed after the node
5471   registration by using either this command (pg_autoctl run) or the
5472   pg_autoctl_config_set command.
5473
5474 Options
5475 --pgdata
5476 Location of the Postgres node being managed locally. Defaults to
5477 the environment variable PGDATA. Use --monitor to connect to a
5478 monitor from anywhere, rather than the monitor URI used by a lo‐
5479 cal Postgres node managed with pg_autoctl.
5480
5481 --name Node name used on the monitor to refer to this node. The host‐
5482          name is a technical detail, and given Postgres requirements
5483 on the HBA setup and DNS resolution (both forward and reverse
5484 lookups), IP addresses are often used for the hostname.
5485
5486 The --name option allows using a user-friendly name for your
5487 Postgres nodes.
5488
5489 --hostname
5490 Hostname or IP address (both v4 and v6 are supported) to use
5491 from any other node to connect to this node.
5492
5493 When not provided, a default value is computed by running the
5494 following algorithm.
5495
5496 1. We get this machine's "public IP" by opening a connection
5497 to the given monitor hostname or IP address. Then we get
5498 TCP/IP client address that has been used to make that con‐
5499 nection.
5500
5501 2. We then do a reverse DNS lookup on the IP address found in
5502 the previous step to fetch a hostname for our local ma‐
5503 chine.
5504
5505              3. If the reverse DNS lookup is successful, then pg_autoctl
5506 does a forward DNS lookup of that hostname.
5507
5508 When the forward DNS lookup response in step 3. is an IP address
5509 found in one of our local network interfaces, then pg_autoctl
5510 uses the hostname found in step 2. as the default --hostname.
5511 Otherwise it uses the IP address found in step 1.
5512
5513 You may use the --hostname command line option to bypass the
5514 whole DNS lookup based process and force the local node name to
5515 a fixed value.
5516
5517 --pgport
5518 Postgres port to use, defaults to 5432.
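
   Example
       As an illustrative sketch only, assuming a node whose data directory
       is ./node1 (the path is a placeholder for your own setup), the service
       can be run in the foreground with:

          $ pg_autoctl run --pgdata ./node1

       When the node is managed through the systemd integration mentioned
       later in this document, systemctl is used instead.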
5519
5520 pg_autoctl watch
5521 pg_autoctl watch - Display an auto-updating dashboard
5522
5523 Synopsis
5524   This command displays an auto-updating dashboard with the current
5525   state of the pg_auto_failover nodes managed by the monitor, and the
5526   events that the monitor records about their state changes:
5527
5528 usage: pg_autoctl watch [ --pgdata --formation --group ]
5529
5530 --pgdata path to data directory
5531 --monitor show the monitor uri
5532 --formation formation to query, defaults to 'default'
5533 --group group to query formation, defaults to all
5534 --json output data in the JSON format
5535
5536 Options
5537 --pgdata
5538 Location of the Postgres node being managed locally. Defaults to
5539 the environment variable PGDATA. Use --monitor to connect to a
5540 monitor from anywhere, rather than the monitor URI used by a lo‐
5541 cal Postgres node managed with pg_autoctl.
5542
5543 --monitor
5544 Postgres URI used to connect to the monitor. Must use the au‐
5545 toctl_node username and target the pg_auto_failover database
5546 name. It is possible to show the Postgres URI from the monitor
5547 node using the command pg_autoctl_show_uri.
5548
5549 --formation
5550 List the events recorded for nodes in the given formation. De‐
5551 faults to default.
5552
5553 --group
5554          Limit output to a single group in the formation. Defaults to in‐
5555 cluding all groups registered in the target formation.
5556
5557 Description
5558   The pg_autoctl watch output is divided into 3 sections.
5559
5560 The first section is a single header line which includes the name of
5561 the currently selected formation, the formation replication setting
5562   number_sync_standbys, and then in the rightmost position the current
5563 time.
5564
5565 The second section displays one line per node, and each line contains a
5566 list of columns that describe the current state for the node. This list
5567   can include the following columns, and which columns are part of the
5568 output depends on the terminal window size. This choice is dynamic and
5569 changes if your terminal window size changes:
5570
5571 • Name
5572 Name of the node.
5573
5574 • Node, or Id
5575 Node information. When the formation has a single group (group
5576 zero), then this column only contains the nodeId.
5577
5578 Only Citus formations allow several groups. When using a Citus
5579 formation the Node column contains the groupId and the nodeId,
5580 separated by a colon, such as 0:1 for the first coordinator
5581 node.
5582
5583 • Last Report, or Report
5584 Time interval between now and the last known time when a node
5585 has reported to the monitor, using the node_active protocol.
5586
5587            This value is expected to stay under about 2s, and is
5588 known to increment when either the pg_autoctl run service is
5589 not running, or when there is a network split.
5590
5591 • Last Check, or Check
5592            Time interval between now and the last known time when the
5593 monitor could connect to a node's Postgres instance, via its
5594 health check mechanism.
5595
5596 This value is known to increment when either the Postgres ser‐
5597 vice is not running on the target node, when there is a net‐
5598 work split, or when the internal machinery (the health check
5599 worker background process) implements jitter.
5600
5601 • Host:Port
5602 Hostname and port number used to connect to the node.
5603
5604 • TLI: LSN
5605 Timeline identifier (TLI) and Postgres Log Sequence Number
5606 (LSN).
5607
5608 The LSN is the current position in the Postgres WAL stream.
5609 This is a hexadecimal number. See pg_lsn for more information.
5610
5611 The current timeline is incremented each time a failover hap‐
5612 pens, or when doing Point In Time Recovery. A node can only
5613 reach the secondary state when it is on the same timeline as
5614 its primary node.
5615
5616 • Connection
5617 This output field contains two bits of information. First, the
5618 Postgres connection type that the node provides, either
5619 read-write or read-only. Then the mark ! is added when the
5620 monitor has failed to connect to this node, and ? when the
5621            monitor has not connected to the node yet.
5622
5623 • Reported State
5624 The current FSM state as reported to the monitor by the pg_au‐
5625 toctl process running on the Postgres node.
5626
5627 • Assigned State
5628 The assigned FSM state on the monitor. When the assigned state
5629            is not the same as the reported state, then the pg_autoctl
5630            process running on the Postgres node might not have retrieved
5631 the assigned state yet, or might still be implementing the FSM
5632 transition from the current state to the assigned state.
5633
5634   The third and last section lists the most recent events that the moni‐
5635   tor has registered; the most recent event is found at the bottom of the
5636   screen.
5637
5638 To quit the command hit either the F1 key or the q key.
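
   Example
       As an illustrative sketch only (the --pgdata path is a placeholder),
       the dashboard for the default formation can be displayed with:

          $ pg_autoctl watch --pgdata ./node1 --formation default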
5639
5640 pg_autoctl stop
5641 pg_autoctl stop - signal the pg_autoctl service for it to stop
5642
5643 Synopsis
5644   This command stops the processes needed to run a monitor node or a
5645 keeper node, depending on the configuration file that belongs to the
5646 --pgdata option or PGDATA environment variable.
5647
5648 usage: pg_autoctl stop [ --pgdata --fast --immediate ]
5649
5650 --pgdata path to data directory
5651 --fast fast shutdown mode for the keeper
5652 --immediate immediate shutdown mode for the keeper
5653
5654 Description
5655   The pg_autoctl stop command finds the PID of the running service for
5656 the given --pgdata, and if the process is still running, sends a
5657 SIGTERM signal to the process.
5658
5659   When pg_autoctl receives a shutdown signal, a shutdown sequence is
5660   triggered. Depending on the signal received, an operation that has been
5661   started (such as a state transition) is either run to completion,
5662   stopped at the next opportunity, or stopped immediately even when in
5663 the middle of the transition.
5664
5665 Options
5666 --pgdata
5667 Location of the Postgres node being managed locally. Defaults to
5668 the environment variable PGDATA. Use --monitor to connect to a
5669 monitor from anywhere, rather than the monitor URI used by a lo‐
5670 cal Postgres node managed with pg_autoctl.
5671
5672 --fast Fast Shutdown mode for pg_autoctl. Sends the SIGINT signal to
5673 the running service, which is the same as using C-c on an inter‐
5674 active process running as a foreground shell job.
5675
5676 --immediate
5677 Immediate Shutdown mode for pg_autoctl. Sends the SIGQUIT signal
5678 to the running service.
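
   Example
       As an illustrative sketch only (the --pgdata path is a placeholder),
       a fast shutdown of the local pg_autoctl service can be requested with:

          $ pg_autoctl stop --pgdata ./node1 --fast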
5679
5680 pg_autoctl reload
5681   pg_autoctl reload - signal the pg_autoctl service for it to reload its
5682   configuration
5683
5684 Synopsis
5685   This command signals a running pg_autoctl process to reload its con‐
5686   figuration from disk, and also signals the managed Postgres service to
5687 reload its configuration.
5688
5689 usage: pg_autoctl reload [ --pgdata ] [ --json ]
5690
5691 --pgdata path to data directory
5692
5693 Description
5694   The pg_autoctl reload command finds the PID of the running service for
5695 the given --pgdata, and if the process is still running, sends a SIGHUP
5696 signal to the process.
5697
5698 Options
5699 --pgdata
5700 Location of the Postgres node being managed locally. Defaults to
5701 the environment variable PGDATA. Use --monitor to connect to a
5702 monitor from anywhere, rather than the monitor URI used by a lo‐
5703 cal Postgres node managed with pg_autoctl.
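
   Example
       As an illustrative sketch only (the --pgdata path is a placeholder):

          $ pg_autoctl reload --pgdata ./node1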
5704
5705 pg_autoctl status
5706 pg_autoctl status - Display the current status of the pg_autoctl ser‐
5707 vice
5708
5709 Synopsis
5710   This command outputs the current process status for the pg_autoctl
5711 service running for the given --pgdata location.
5712
5713 usage: pg_autoctl status [ --pgdata ] [ --json ]
5714
5715 --pgdata path to data directory
5716 --json output data in the JSON format
5717
5718 Options
5719 --pgdata
5720 Location of the Postgres node being managed locally. Defaults to
5721 the environment variable PGDATA. Use --monitor to connect to a
5722 monitor from anywhere, rather than the monitor URI used by a lo‐
5723 cal Postgres node managed with pg_autoctl.
5724
5725   --json Output JSON formatted data instead of a table formatted list.
5726
5727 Example
5728 $ pg_autoctl status --pgdata node1
5729 11:26:30 27248 INFO pg_autoctl is running with pid 26618
5730 11:26:30 27248 INFO Postgres is serving PGDATA "/Users/dim/dev/MS/pg_auto_failover/tmux/node1" on port 5501 with pid 26725
5731
5732 $ pg_autoctl status --pgdata node1 --json
5733 11:26:37 27385 INFO pg_autoctl is running with pid 26618
5734 11:26:37 27385 INFO Postgres is serving PGDATA "/Users/dim/dev/MS/pg_auto_failover/tmux/node1" on port 5501 with pid 26725
5735 {
5736 "postgres": {
5737 "pgdata": "\/Users\/dim\/dev\/MS\/pg_auto_failover\/tmux\/node1",
5738 "pg_ctl": "\/Applications\/Postgres.app\/Contents\/Versions\/12\/bin\/pg_ctl",
5739 "version": "12.3",
5740 "host": "\/tmp",
5741 "port": 5501,
5742 "proxyport": 0,
5743 "pid": 26725,
5744 "in_recovery": false,
5745 "control": {
5746 "version": 0,
5747 "catalog_version": 0,
5748 "system_identifier": "0"
5749 },
5750 "postmaster": {
5751 "status": "ready"
5752 }
5753 },
5754 "pg_autoctl": {
5755 "pid": 26618,
5756 "status": "running",
5757 "pgdata": "\/Users\/dim\/dev\/MS\/pg_auto_failover\/tmux\/node1",
5758 "version": "1.5.0",
5759 "semId": 196609,
5760 "services": [
5761 {
5762 "name": "postgres",
5763 "pid": 26625,
5764 "status": "running",
5765 "version": "1.5.0",
5766 "pgautofailover": "1.5.0.1"
5767 },
5768 {
5769 "name": "node-active",
5770 "pid": 26626,
5771 "status": "running",
5772 "version": "1.5.0",
5773 "pgautofailover": "1.5.0.1"
5774 }
5775 ]
5776 }
5777 }
5778
5779 CONFIGURING PG_AUTO_FAILOVER
5780   Several default settings of pg_auto_failover can be reviewed and
5781   changed depending on the trade-offs you want to implement in your own
5782   production setup. The settings that you can change will have an impact
5783   on the following operations:
5784
5785 • Deciding when to promote the secondary
5786
5787 pg_auto_failover decides to implement a failover to the secondary
5788 node when it detects that the primary node is unhealthy. Changing
5789 the following settings will have an impact on when the
5790 pg_auto_failover monitor decides to promote the secondary Post‐
5791 greSQL node:
5792
5793 pgautofailover.health_check_max_retries
5794 pgautofailover.health_check_period
5795 pgautofailover.health_check_retry_delay
5796 pgautofailover.health_check_timeout
5797 pgautofailover.node_considered_unhealthy_timeout
5798
5799 • Time taken to promote the secondary
5800
5801 At secondary promotion time, pg_auto_failover waits for the fol‐
5802 lowing timeout to make sure that all pending writes on the primary
5803 server made it to the secondary at shutdown time, thus preventing
5804             data loss:
5805
5806 pgautofailover.primary_demote_timeout
5807
5808 • Preventing promotion of the secondary
5809
5810 pg_auto_failover implements a trade-off where data availability
5811 trumps service availability. When the primary node of a PostgreSQL
5812 service is detected unhealthy, the secondary is only promoted if
5813 it was known to be eligible at the moment when the primary is
5814 lost.
5815
5816             When synchronous replication was in use at the moment the primary
5817             node was lost, we know we can safely switch to the secondary, and
5818             the WAL lag is 0 in that case.
5819
5820             When the secondary server had been detected unhealthy before,
5821             the pg_auto_failover monitor switches it from the state
5822             SECONDARY to the state CATCHING-UP, and promotion is then
5823             prevented.
5824
5825             The following setting makes it possible to still promote the sec‐
5826             ondary, allowing for a window of data loss:
5827
5828 pgautofailover.promote_wal_log_threshold
5829
5830 pg_auto_failover Monitor
5831 The configuration for the behavior of the monitor happens in the Post‐
5832 greSQL database where the extension has been deployed:
5833
5834 pg_auto_failover=> select name, setting, unit, short_desc from pg_settings where name ~ 'pgautofailover.';
5835 -[ RECORD 1 ]----------------------------------------------------------------------------------------------------
5836 name | pgautofailover.enable_sync_wal_log_threshold
5837 setting | 16777216
5838 unit |
5839 short_desc | Don't enable synchronous replication until secondary xlog is within this many bytes of the primary's
5840 -[ RECORD 2 ]----------------------------------------------------------------------------------------------------
5841 name | pgautofailover.health_check_max_retries
5842 setting | 2
5843 unit |
5844 short_desc | Maximum number of re-tries before marking a node as failed.
5845 -[ RECORD 3 ]----------------------------------------------------------------------------------------------------
5846 name | pgautofailover.health_check_period
5847 setting | 5000
5848 unit | ms
5849 short_desc | Duration between each check (in milliseconds).
5850 -[ RECORD 4 ]----------------------------------------------------------------------------------------------------
5851 name | pgautofailover.health_check_retry_delay
5852 setting | 2000
5853 unit | ms
5854 short_desc | Delay between consecutive retries.
5855 -[ RECORD 5 ]----------------------------------------------------------------------------------------------------
5856 name | pgautofailover.health_check_timeout
5857 setting | 5000
5858 unit | ms
5859 short_desc | Connect timeout (in milliseconds).
5860 -[ RECORD 6 ]----------------------------------------------------------------------------------------------------
5861 name | pgautofailover.node_considered_unhealthy_timeout
5862 setting | 20000
5863 unit | ms
5864 short_desc | Mark node unhealthy if last ping was over this long ago
5865 -[ RECORD 7 ]----------------------------------------------------------------------------------------------------
5866 name | pgautofailover.primary_demote_timeout
5867 setting | 30000
5868 unit | ms
5869 short_desc | Give the primary this long to drain before promoting the secondary
5870 -[ RECORD 8 ]----------------------------------------------------------------------------------------------------
5871 name | pgautofailover.promote_wal_log_threshold
5872 setting | 16777216
5873 unit |
5874 short_desc | Don't promote secondary unless xlog is with this many bytes of the master
5875 -[ RECORD 9 ]----------------------------------------------------------------------------------------------------
5876 name | pgautofailover.startup_grace_period
5877 setting | 10000
5878 unit | ms
5879 short_desc | Wait for at least this much time after startup before initiating a failover.
5880
5881 You can edit the parameters as usual with PostgreSQL, either in the
5882 postgresql.conf file or using ALTER DATABASE pg_auto_failover SET pa‐
5883 rameter = value; commands, then issuing a reload.
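
       For instance, the following sketch applies the ALTER DATABASE approach
       described above to one of the settings listed above; the 2500 ms value
       is an illustration only, and the psql connection options are left to
       your environment:

          $ psql -d pg_auto_failover \
                 -c "ALTER DATABASE pg_auto_failover SET pgautofailover.health_check_period TO 2500"
          $ psql -d pg_auto_failover -c "SELECT pg_reload_conf()"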
5884
5885 pg_auto_failover Keeper Service
5886 For an introduction to the pg_autoctl commands relevant to the
5887 pg_auto_failover Keeper configuration, please see pg_autoctl_config.
5888
5889 An example configuration file looks like the following:
5890
5891 [pg_autoctl]
5892 role = keeper
5893 monitor = postgres://autoctl_node@192.168.1.34:6000/pg_auto_failover
5894 formation = default
5895 group = 0
5896 hostname = node1.db
5897 nodekind = standalone
5898
5899 [postgresql]
5900 pgdata = /data/pgsql/
5901 pg_ctl = /usr/pgsql-10/bin/pg_ctl
5902 dbname = postgres
5903 host = /tmp
5904 port = 5000
5905
5906 [replication]
5907 slot = pgautofailover_standby
5908 maximum_backup_rate = 100M
5909 backup_directory = /data/backup/node1.db
5910
5911 [timeout]
5912 network_partition_timeout = 20
5913 postgresql_restart_failure_timeout = 20
5914 postgresql_restart_failure_max_retries = 3
5915
5916 To output, edit and check entries of the configuration, the following
5917 commands are provided:
5918
5919 pg_autoctl config check [--pgdata <pgdata>]
5920 pg_autoctl config get [--pgdata <pgdata>] section.option
5921 pg_autoctl config set [--pgdata <pgdata>] section.option value
5922
5923 The [postgresql] section is discovered automatically by the pg_autoctl
5924 command and is not intended to be changed manually.
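
       For instance, the following sketch reads and then changes one of the
       replication settings documented below; the 50M value is an illustration
       only, and PGDATA is assumed to point at the local node:

          $ pg_autoctl config get replication.maximum_backup_rate
          $ pg_autoctl config set replication.maximum_backup_rate 50M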
5925
5926 pg_autoctl.monitor
5927
5928 PostgreSQL service URL of the pg_auto_failover monitor, as given in the
5929 output of the pg_autoctl show uri command.
5930
5931 pg_autoctl.formation
5932
5933 A single pg_auto_failover monitor may handle several postgres forma‐
5934 tions. The default formation name default is usually fine.
5935
5936 pg_autoctl.group
5937
5938 This information is retrieved by the pg_auto_failover keeper when reg‐
5939 istering a node to the monitor, and should not be changed afterwards.
5940 Use at your own risk.
5941
5942 pg_autoctl.hostname
5943
5944 Node hostname used by all the other nodes in the cluster to contact
5945 this node. In particular, if this node is a primary then its standby
5946   uses that address to set up streaming replication.
5947
5948 replication.slot
5949
5950 Name of the PostgreSQL replication slot used in the streaming replica‐
5951 tion setup automatically deployed by pg_auto_failover. Replication
5952 slots can't be renamed in PostgreSQL.
5953
5954 replication.maximum_backup_rate
5955
5956 When pg_auto_failover (re-)builds a standby node using the pg_base‐
5957 backup command, this parameter is given to pg_basebackup to throttle
5958 the network bandwidth used. Defaults to 100Mbps.
5959
5960 replication.backup_directory
5961
5962 When pg_auto_failover (re-)builds a standby node using the pg_base‐
5963   backup command, this parameter is the target directory where the data
5964   from the primary server is copied. When the copy has been successful,
5965   the directory is renamed to the value of postgresql.pgdata.
5966
5967 The default value is computed from ${PGDATA}/../backup/${hostname} and
5968 can be set to any value of your preference. Remember that the directory
5969 renaming is an atomic operation only when both the source and the tar‐
5970 get of the copy are in the same filesystem, at least in Unix systems.
5971
5972 timeout
5973
5974   This section allows setting up the behavior of the pg_auto_failover
5975   keeper in interesting scenarios.
5976
5977 timeout.network_partition_timeout
5978
5979 Timeout in seconds before we consider failure to communicate with other
5980 nodes indicates a network partition. This check is only done on a PRI‐
5981 MARY server, so other nodes mean both the monitor and the standby.
5982
5983 When a PRIMARY node is detected to be on the losing side of a network
5984 partition, the pg_auto_failover keeper enters the DEMOTE state and
5985 stops the PostgreSQL instance in order to protect against split brain
5986 situations.
5987
5988 The default is 20s.
5989
5990 timeout.postgresql_restart_failure_timeout
5991
5992 timeout.postgresql_restart_failure_max_retries
5993
5994 When PostgreSQL is not running, the first thing the pg_auto_failover
5995 keeper does is try to restart it. In case of a transient failure (e.g.
5996 file system is full, or other dynamic OS resource constraint), the best
5997 course of action is to try again for a little while before reaching out
5998   to the monitor and asking for a failover.
5999
6000 The pg_auto_failover keeper tries to restart PostgreSQL timeout.post‐
6001 gresql_restart_failure_max_retries times in a row (default 3) or up to
6002   timeout.postgresql_restart_failure_timeout (defaults to 20s) since it de‐
6003 tected that PostgreSQL is not running, whichever comes first.
6004
6005 OPERATING PG_AUTO_FAILOVER
6006   This section is not yet complete. Please contact us with any questions.
6007
6008 Deployment
6009 pg_auto_failover is a general purpose tool for setting up PostgreSQL
6010 replication in order to implement High Availability of the PostgreSQL
6011 service.
6012
6013 Provisioning
6014 It is also possible to register pre-existing PostgreSQL instances with
6015 a pg_auto_failover monitor. The pg_autoctl create command honors the
6016 PGDATA environment variable, and checks whether PostgreSQL is already
6017 running. If Postgres is detected, the new node is registered in SINGLE
6018 mode, bypassing the monitor's role assignment policy.
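
       As an illustrative sketch only, registering such a pre-existing
       instance could look like the following; the PGDATA path, the monitor
       connection string, and the --auth choice are placeholders to adapt to
       your own setup, and further options (hostname, SSL, ...) may be
       required:

          $ export PGDATA=/var/lib/postgres/data
          $ pg_autoctl create postgres \
                --monitor postgres://autoctl_node@monitor.example.com/pg_auto_failover \
                --auth trust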
6019
6020 Upgrading pg_auto_failover, from versions 1.4 onward
6021 When upgrading a pg_auto_failover setup, the procedure is different on
6022 the monitor and on the Postgres nodes:
6023
6024 • on the monitor, the internal pg_auto_failover database schema
6025 might have changed and needs to be upgraded to its new definition,
6026 porting the existing data over. The pg_auto_failover database con‐
6027 tains the registration of every node in the system and their cur‐
6028 rent state.
6029 It is not possible to trigger a failover during the monitor
6030 update. Postgres operations on the Postgres nodes continue
6031 normally.
6032
6033 During the restart of the monitor, the other nodes might have
6034 trouble connecting to the monitor. The pg_autoctl command is
6035 designed to retry connecting to the monitor and handle errors
6036 gracefully.
6037
6038 • on the Postgres nodes, the pg_autoctl command connects to the mon‐
6039 itor every once in a while (every second by default), and then
6040 calls the node_active protocol, a stored procedure in the monitor
6041 databases.
6042         pg_autoctl also verifies at each connection to the monitor
6043 that it's running the expected version of the extension. When
6044 that's not the case, the "node-active" sub-process quits, to
6045 be restarted with the possibly new version of the pg_autoctl
6046 binary found on-disk.
6047
6048 As a result, here is the standard upgrade plan for pg_auto_failover:
6049
6050      1. Upgrade the pg_auto_failover package on all the nodes, monitor
6051         included.
6052         When using a debian based OS, this looks like the following
6053         commands when upgrading from 1.4 to 1.5:
6054
6055 sudo apt-get remove pg-auto-failover-cli-1.4 postgresql-11-auto-failover-1.4
6056 sudo apt-get install -q -y pg-auto-failover-cli-1.5 postgresql-11-auto-failover-1.5
6057
6058 2. Restart the pgautofailover service on the monitor.
6059 When using the systemd integration, all we need to do is:
6060
6061 sudo systemctl restart pgautofailover
6062
6063 Then we may use the following commands to make sure that the
6064 service is running as expected:
6065
6066 sudo systemctl status pgautofailover
6067 sudo journalctl -u pgautofailover
6068
6069 At this point it is expected that the pg_autoctl logs show
6070 that an upgrade has been performed by using the ALTER EXTEN‐
6071 SION pgautofailover UPDATE TO ... command. The monitor is
6072 ready with the new version of pg_auto_failover.
6073
6074 When the Postgres nodes pg_autoctl process connects to the new monitor
6075 version, the check for version compatibility fails, and the "node-ac‐
6076 tive" sub-process exits. The main pg_autoctl process supervisor then
6077   restarts the "node-active" sub-process from its on-disk binary exe‐
6078 cutable file, which has been upgraded to the new version. That's why we
6079 first install the new packages for pg_auto_failover on every node, and
6080 only then restart the monitor.
6081
6082 IMPORTANT:
6083 Before upgrading the monitor, which is a simple restart of the
6084 pg_autoctl process, it is important that the OS packages for pgauto‐
6085 failover be updated on all the Postgres nodes.
6086
6087 When that's not the case, pg_autoctl on the Postgres nodes will
6088 still detect a version mismatch with the monitor extension, and the
6089 "node-active" sub-process will exit. And when restarted automati‐
6090 cally, the same version of the local pg_autoctl binary executable is
6091 found on-disk, leading to the same version mismatch with the monitor
6092 extension.
6093
6094 After restarting the "node-active" process 5 times, pg_autoctl quits
6095 retrying and stops. This includes stopping the Postgres service too,
6096 and a service downtime might then occur.
6097
6098 And when the upgrade is done we can use pg_autoctl show state on the
6099   monitor to see that everything is as expected.
6100
6101 Upgrading from previous pg_auto_failover versions
6102 The new upgrade procedure described in the previous section is part of
6103 pg_auto_failover since version 1.4. When upgrading from a previous ver‐
6104 sion of pg_auto_failover, up to and including version 1.3, then all the
6105 pg_autoctl processes have to be restarted fully.
6106
6107 To prevent triggering a failover during the upgrade, it's best to put
6108 your secondary nodes in maintenance. The procedure then looks like the
6109 following:
6110
6111 1. Enable maintenance on your secondary node(s):
6112
6113 pg_autoctl enable maintenance
6114
6115 2. Upgrade the OS packages for pg_auto_failover on every node, as
6116         per the previous section.
6117
6118 3. Restart the monitor to upgrade it to the new pg_auto_failover
6119 version:
6120 When using the systemd integration, all we need to do is:
6121
6122 sudo systemctl restart pgautofailover
6123
6124 Then we may use the following commands to make sure that the
6125 service is running as expected:
6126
6127 sudo systemctl status pgautofailover
6128 sudo journalctl -u pgautofailover
6129
6130 At this point it is expected that the pg_autoctl logs show
6131 that an upgrade has been performed by using the ALTER EXTEN‐
6132 SION pgautofailover UPDATE TO ... command. The monitor is
6133 ready with the new version of pg_auto_failover.
6134
6135 4. Restart pg_autoctl on all Postgres nodes on the cluster.
6136 When using the systemd integration, all we need to do is:
6137
6138 sudo systemctl restart pgautofailover
6139
6140 As in the previous point in this list, make sure the service
6141 is now running as expected.
6142
6143      5. Disable maintenance on your secondary node(s):
6144
6145 pg_autoctl disable maintenance
6146
6147 Extension dependencies when upgrading the monitor
6148 Since version 1.4.0 the pgautofailover extension requires the Postgres
6149 contrib extension btree_gist. The pg_autoctl command arranges for the
6150 creation of this dependency, and has been buggy in some releases.
6151
6152   As a result, you might have trouble upgrading the pg_auto_failover moni‐
6153 tor to a recent version. It is possible to fix the error by connecting
6154 to the monitor Postgres database and running the create extension com‐
6155 mand manually:
6156
6157 # create extension btree_gist;
6158
6159 Cluster Management and Operations
6160 It is possible to operate pg_auto_failover formations and groups di‐
6161 rectly from the monitor. All that is needed is an access to the monitor
6162 Postgres database as a client, such as psql. It's also possible to add
6163 those management SQL function calls in your own ops application if you
6164 have one.
6165
6166   For security reasons, the autoctl_node role is not allowed to perform
6167   maintenance operations. This user is limited to what pg_autoctl needs.
6168   You can either create a specific user and authentication rule to expose
6169   for management, or edit the default HBA rules for the autoctl user. In
6170   the following examples we connect directly as the autoctl role.
6171
6172 The main operations with pg_auto_failover are node maintenance and man‐
6173 ual failover, also known as a controlled switchover.
6174
6175 Maintenance of a secondary node
6176 It is possible to put a secondary node in any group in a MAINTENANCE
6177 state, so that the Postgres server is not doing synchronous replication
6178 anymore and can be taken down for maintenance purposes, such as secu‐
6179 rity kernel upgrades or the like.
6180
6181 The command line tool pg_autoctl exposes an API to schedule maintenance
6182 operations on the current node, which must be a secondary node at the
6183 moment when maintenance is requested.
6184
6185 Here's an example of using the maintenance commands on a secondary
6186 node, including the output. Of course, when you try that on your own
6187 nodes, dates and PID information might differ:
6188
6189 $ pg_autoctl enable maintenance
6190 17:49:19 14377 INFO Listening monitor notifications about state changes in formation "default" and group 0
6191 17:49:19 14377 INFO Following table displays times when notifications are received
6192 Time | ID | Host | Port | Current State | Assigned State
6193 ---------+-----+-----------+--------+---------------------+--------------------
6194 17:49:19 | 1 | localhost | 5001 | primary | wait_primary
6195 17:49:19 | 2 | localhost | 5002 | secondary | wait_maintenance
6196 17:49:19 | 2 | localhost | 5002 | wait_maintenance | wait_maintenance
6197 17:49:20 | 1 | localhost | 5001 | wait_primary | wait_primary
6198 17:49:20 | 2 | localhost | 5002 | wait_maintenance | maintenance
6199 17:49:20 | 2 | localhost | 5002 | maintenance | maintenance
6200
6201 The command listens to the state changes in the current node's forma‐
6202 tion and group on the monitor and displays those changes as it receives
6203 them. The operation is done when the node has reached the maintenance
6204 state.
6205
6206 It is now possible to disable maintenance to allow pg_autoctl to manage
6207 this standby node again:
6208
6209 $ pg_autoctl disable maintenance
6210 17:49:26 14437 INFO Listening monitor notifications about state changes in formation "default" and group 0
6211 17:49:26 14437 INFO Following table displays times when notifications are received
6212 Time | ID | Host | Port | Current State | Assigned State
6213 ---------+-----+-----------+--------+---------------------+--------------------
6214 17:49:27 | 2 | localhost | 5002 | maintenance | catchingup
6215 17:49:27 | 2 | localhost | 5002 | catchingup | catchingup
6216 17:49:28 | 2 | localhost | 5002 | catchingup | secondary
6217 17:49:28 | 1 | localhost | 5001 | wait_primary | primary
6218 17:49:28 | 2 | localhost | 5002 | secondary | secondary
6219 17:49:29 | 1 | localhost | 5001 | primary | primary
6220
6221 When a standby node is in maintenance, the monitor sets the primary
6222 node replication to WAIT_PRIMARY: in this role, the PostgreSQL stream‐
6223 ing replication is now asynchronous and the standby PostgreSQL server
6224 may be stopped, rebooted, etc.
6225
6226 Maintenance of a primary node
6227 A primary node must be available at all times in any formation and
6228   group in pg_auto_failover: that is the invariant provided by the whole
6229 solution. With that in mind, the only way to allow a primary node to go
6230 to a maintenance mode is to first failover and promote the secondary
6231 node.
6232
6233 The same command pg_autoctl enable maintenance implements that opera‐
6234 tion when run on a primary node with the option --allow-failover. Here
6235 is an example of such an operation:
6236
6237 $ pg_autoctl enable maintenance
6238 11:53:03 50526 WARN Enabling maintenance on a primary causes a failover
6239 11:53:03 50526 FATAL Please use --allow-failover to allow the command proceed
6240
6241   As we can see, the option --allow-failover is mandatory. In the next
6242 example we use it:
6243
6244 $ pg_autoctl enable maintenance --allow-failover
6245 13:13:42 1614 INFO Listening monitor notifications about state changes in formation "default" and group 0
6246 13:13:42 1614 INFO Following table displays times when notifications are received
6247 Time | ID | Host | Port | Current State | Assigned State
6248 ---------+-----+-----------+--------+---------------------+--------------------
6249 13:13:43 | 2 | localhost | 5002 | primary | prepare_maintenance
6250 13:13:43 | 1 | localhost | 5001 | secondary | prepare_promotion
6251 13:13:43 | 1 | localhost | 5001 | prepare_promotion | prepare_promotion
6252 13:13:43 | 2 | localhost | 5002 | prepare_maintenance | prepare_maintenance
6253 13:13:44 | 1 | localhost | 5001 | prepare_promotion | stop_replication
6254 13:13:45 | 1 | localhost | 5001 | stop_replication | stop_replication
6255 13:13:46 | 1 | localhost | 5001 | stop_replication | wait_primary
6256 13:13:46 | 2 | localhost | 5002 | prepare_maintenance | maintenance
6257 13:13:46 | 1 | localhost | 5001 | wait_primary | wait_primary
6258 13:13:47 | 2 | localhost | 5002 | maintenance | maintenance
6259
6260 When the operation is done we can have the old primary re-join the
6261 group, this time as a secondary:
6262
6263 $ pg_autoctl disable maintenance
6264 13:14:46 1985 INFO Listening monitor notifications about state changes in formation "default" and group 0
6265 13:14:46 1985 INFO Following table displays times when notifications are received
6266 Time | ID | Host | Port | Current State | Assigned State
6267 ---------+-----+-----------+--------+---------------------+--------------------
6268 13:14:47 | 2 | localhost | 5002 | maintenance | catchingup
6269 13:14:47 | 2 | localhost | 5002 | catchingup | catchingup
6270 13:14:52 | 2 | localhost | 5002 | catchingup | secondary
6271 13:14:52 | 1 | localhost | 5001 | wait_primary | primary
6272 13:14:52 | 2 | localhost | 5002 | secondary | secondary
6273 13:14:53 | 1 | localhost | 5001 | primary | primary
6274
6275 Triggering a failover
6276 It is possible to trigger a manual failover, or a switchover, using the
6277 command pg_autoctl perform failover. Here's an example of what happens
6278 when running the command:
6279
6280 $ pg_autoctl perform failover
6281 11:58:00 53224 INFO Listening monitor notifications about state changes in formation "default" and group 0
6282 11:58:00 53224 INFO Following table displays times when notifications are received
6283 Time | ID | Host | Port | Current State | Assigned State
6284 ---------+-----+-----------+--------+--------------------+-------------------
6285 11:58:01 | 1 | localhost | 5001 | primary | draining
6286 11:58:01 | 2 | localhost | 5002 | secondary | prepare_promotion
6287 11:58:01 | 1 | localhost | 5001 | draining | draining
6288 11:58:01 | 2 | localhost | 5002 | prepare_promotion | prepare_promotion
6289 11:58:02 | 2 | localhost | 5002 | prepare_promotion | stop_replication
6290 11:58:02 | 1 | localhost | 5001 | draining | demote_timeout
6291 11:58:03 | 1 | localhost | 5001 | demote_timeout | demote_timeout
6292 11:58:04 | 2 | localhost | 5002 | stop_replication | stop_replication
6293 11:58:05 | 2 | localhost | 5002 | stop_replication | wait_primary
6294 11:58:05 | 1 | localhost | 5001 | demote_timeout | demoted
6295 11:58:05 | 2 | localhost | 5002 | wait_primary | wait_primary
6296 11:58:05 | 1 | localhost | 5001 | demoted | demoted
6297 11:58:06 | 1 | localhost | 5001 | demoted | catchingup
6298 11:58:06 | 1 | localhost | 5001 | catchingup | catchingup
6299 11:58:08 | 1 | localhost | 5001 | catchingup | secondary
6300 11:58:08 | 2 | localhost | 5002 | wait_primary | primary
6301 11:58:08 | 1 | localhost | 5001 | secondary | secondary
6302 11:58:08 | 2 | localhost | 5002 | primary | primary
6303
6304 Again, timings and PID numbers are not expected to be the same when you
6305 run the command on your own setup.
6306
6307   Also note in the output that the command shows the whole set of tran‐
6308   sitions, including those where the old primary becomes a secondary
6309   node. The database is available for read-write traffic as soon as we
6310   reach the state wait_primary.
6311
6312 Implementing a controlled switchover
6313   It is generally useful to distinguish a controlled switchover from a
6314   failover. In a controlled switchover situation it is possible to organ‐
6315   ise the sequence of events in a way that avoids data loss and lowers
6316   downtime to a minimum.
6317
6318 In the case of pg_auto_failover, because we use synchronous replica‐
6319 tion, we don't face data loss risks when triggering a manual failover.
6320 Moreover, our monitor knows the current primary health at the time when
6321 the failover is triggered, and drives the failover accordingly.
6322
6323 So to trigger a controlled switchover with pg_auto_failover you can use
6324 the same API as for a manual failover:
6325
6326 $ pg_autoctl perform switchover
6327
6328   Because the subtleties of orchestrating either a controlled switchover
6329   or an unplanned failover are all handled by the monitor, rather than
6330   the client side command line, at the client level the two commands
6331 pg_autoctl perform failover and pg_autoctl perform switchover are syn‐
6332 onyms, or aliases.
6333
6334 Current state, last events
6335 The following commands display information from the pg_auto_failover
6336 monitor tables pgautofailover.node and pgautofailover.event:
6337
6338 $ pg_autoctl show state
6339 $ pg_autoctl show events
6340
6341   When run on the monitor, the commands output all the known states and
6342 events for the whole set of formations handled by the monitor. When run
6343 on a PostgreSQL node, the command connects to the monitor and outputs
6344 the information relevant to the service group of the local node only.
6345
6346 For interactive debugging it is helpful to run the following command
6347 from the monitor node while e.g. initializing a formation from scratch,
6348 or performing a manual failover:
6349
6350 $ watch pg_autoctl show state
6351
6352 Monitoring pg_auto_failover in Production
6353 The monitor reports every state change decision to a LISTEN/NOTIFY
6354 channel named state. PostgreSQL logs on the monitor are also stored in
6355 a table, pgautofailover.event, and broadcast by NOTIFY in the channel
6356 log.
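
       As a minimal sketch, assuming a psql session connected to the monitor
       database (connection options omitted), it is possible to subscribe to
       those notification channels and to query the events table directly:

          $ psql -d pg_auto_failover
          pg_auto_failover=> LISTEN state;
          pg_auto_failover=> LISTEN log;
          pg_auto_failover=> SELECT * FROM pgautofailover.event;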
6357
6358 Replacing the monitor online
6359 When the monitor node is not available anymore, it is possible to cre‐
6360 ate a new monitor node and then switch existing nodes to a new monitor
6361 by using the following commands.
6362
6363 1. Apply the STONITH approach on the old monitor to make sure this
6364 node is not going to show up again during the procedure. This
6365         step is sometimes referred to as “fencing”.
6366
6367 2. On every node, ending with the (current) Postgres primary node
6368 for each group, disable the monitor while pg_autoctl is still
6369 running:
6370
6371 $ pg_autoctl disable monitor --force
6372
6373 3. Create a new monitor node:
6374
6375 $ pg_autoctl create monitor ...
6376
6377 4. On the current primary node first, so that it's registered first
6378 and as a primary still, for each group in your formation(s), en‐
6379 able the monitor online again:
6380
6381 $ pg_autoctl enable monitor postgresql://autoctl_node@.../pg_auto_failover
6382
6383 5. On every other (secondary) node, enable the monitor online again:
6384
6385 $ pg_autoctl enable monitor postgresql://autoctl_node@.../pg_auto_failover
6386
6387 See pg_autoctl_disable_monitor and pg_autoctl_enable_monitor for de‐
6388 tails about those commands.
6389
6390   This operation relies on the fact that a pg_autoctl node can be operated
6391   without a monitor, and when reconnecting to a new monitor, this process
6392   resets the parts of the node state that come from the monitor, such as
6393   the node identifier.
6394
6395 Trouble-Shooting Guide
6396 pg_auto_failover commands can be run repeatedly. If initialization
6397 fails the first time -- for instance because a firewall rule hasn't
6398 been activated yet -- it's possible to try pg_autoctl create again.
6399 pg_auto_failover will review its previous progress and repeat
6400 idempotent operations (create database, create extension, etc.),
6401 gracefully handling errors.
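
       As an illustration, after fixing the underlying problem (for example
       opening the firewall), run the very same command again and it resumes
       the initialization; the options and connection string below are
       hypothetical:

          $ pg_autoctl create postgres --pgdata /var/lib/postgres/node1 \
                --hostname node1.example.com \
                --monitor 'postgres://autoctl_node@monitor.example.com:5432/pg_auto_failover'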
6402
6403 Frequently Asked Questions
6404 Those questions have been asked in GitHub issues for the project by
6405 several people. If you have more questions, feel free to open a new
6406 issue, and your question and its answer might make it to this FAQ.
6407
6408 I stopped the primary and no failover is happening for 20s to 30s, why?
6409 In order to avoid spurious failovers when the network connectivity is
6410 not stable, pg_auto_failover implements a timeout of 20s before acting
6411 on a node that is known unavailable. Add to that the delay between
6412 health checks and the retry policy, and the total delay can reach 30s.
6413
6414 See the configuration part for more information about how to set up the
6415 different delays and timeouts that are involved in the decision making.
6416
6417 See also pg_autoctl_watch for a dashboard that helps you understand the
6418 system and what's going on at the moment.
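
       To review the timeouts that apply to your installation, you can list
       the pg_auto_failover related GUCs on the monitor and dump a node's
       pg_autoctl settings; the connection string and data directory below
       are hypothetical:

          # on the monitor: list the pgautofailover.* settings
          $ psql 'postgres://autoctl_node@monitor.example.com:5432/pg_auto_failover' \
                -c "SELECT name, setting, unit FROM pg_settings
                    WHERE name LIKE 'pgautofailover.%';"

          # on a Postgres node: print the whole local pg_autoctl configuration
          $ pg_autoctl config get --pgdata /var/lib/postgres/node1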
6419
6420 The secondary is blocked in the CATCHING_UP state, what should I do?
6421 In the pg_auto_failover design, the following two things are needed for
6422 the monitor to be able to fully orchestrate the integration of a node:
6423
6424 1. Health Checks must be successful
6425
6426 The monitor runs periodic health checks with all the nodes regis‐
6427 tered in the system. Those health checks are Postgres connections
6428 from the monitor to the registered Postgres nodes, and use the
6429 hostname and port as registered.
6430
6431 The Reachable column of the pg_autoctl show state command output
6432 contains "yes" when the monitor could connect to a specific node,
6433 "no" when this connection failed, and "unknown" when no connection
6434 has been attempted yet since the last startup of the monitor.
6435
6436 The Reachable column must show a "yes" entry before a new standby
6437 node can be orchestrated up to the "secondary" goal state.
6439
6440 2. pg_autoctl service must be running
6441
6442 The pg_auto_failover monitor works by assigning goal states to
6443 individual Postgres nodes. The monitor will not assign a new goal
6444 state until the current one has been reached.
6445
6446 To implement a transition from the current state to the goal
6447 state assigned by the monitor, the pg_autoctl service must be
6448 running on every node.
6449
6450 When your new standby node stays in the "catchingup" state for a long
6451 time, please check that the node is reachable from the monitor using
6452 the hostname and port registered on the monitor, and check that the
6453 pg_autoctl run command is running for this node, as sketched below.
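
       A hedged way to check both conditions from the command line; the
       hostname and the systemd unit name are hypothetical and depend on your
       installation:

          # from the monitor host: is the standby reachable at its registered
          # hostname and port?
          $ pg_autoctl show state
          $ pg_isready -h node2.example.com -p 5432

          # on the standby itself: is the pg_autoctl service running?
          $ ps -ef | grep '[p]g_autoctl run'
          $ systemctl status pgautofailover  # unit name may differ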
6454
6455 When things are not obvious, the next step is to go read the logs. Both
6456 the output of the pg_autoctl command and the Postgres logs are rele‐
6457 vant. See the Should I read the logs? Where are the logs? question for
6458 details.
6459
6460 Should I read the logs? Where are the logs?
6461 Yes. If anything seems strange to you, please do read the logs.
6462
6463 As maintainers of the pg_autoctl tool, we can't foresee everything that
6464 may happen to your production environment. Still, a lot of effort is
6465 spent on producing meaningful output. So when you're in a situation
6466 that's hard to understand, please make sure to read the pg_autoctl logs
6467 and the Postgres logs.
6468
6469 When using the systemd integration, the pg_autoctl logs are handled
6470 entirely by the journal facility of systemd. Please refer to
6471 journalctl for viewing the logs.
6472
6473 The Postgres logs are to be found in the $PGDATA/log directory with the
6474 default configuration deployed by pg_autoctl create .... When a custom
6475 Postgres setup is used, please refer to your actual setup to find Post‐
6476 gres logs.
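
       For example, assuming the systemd unit is named pgautofailover (an
       assumption, adjust to your setup):

          # pg_autoctl logs, when running under systemd
          $ journalctl -u pgautofailover -f

          # Postgres logs, with the default pg_autoctl generated configuration
          $ tail -f "$PGDATA"/log/*.log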
6477
6478 The state of the system is blocked, what should I do?
6479 This is the general case of the previous situation, which was reached
6480 when adding a new standby to a group of Postgres nodes. Please check
6481 the same two elements: the monitor health checks are successful, and
6482 the pg_autoctl run command is running.
6483
6484 When things are not obvious, the next step is to go read the logs. Both
6485 the output of the pg_autoctl command and the Postgres logs are rele‐
6486 vant. See the Should I read the logs? Where are the logs? question for
6487 details.
6488
6489 The monitor is a SPOF in pg_auto_failover design, how should we handle
6490 that?
6491 When using pg_auto_failover, the monitor is needed to make decisions
6492 and orchestrate changes in all the registered Postgres groups.
6493 Decisions are transmitted to the Postgres nodes by the monitor, which
6494 assigns each node a goal state that differs from its current state.
6495
6496 Consequences of the monitor being unavailable
6497 Nodes contact the monitor each second and call the node_active stored
6498 procedure, which returns a goal state that is possibly different from
6499 the current state.
6500
6501 The monitor only assigns a new goal state to a Postgres node when a
6502 cluster-wide operation is needed. In practice, only the following
6503 operations require the monitor to assign a new goal state to a Postgres
6504 node:
6505
6506 • a new node is registered
6507
6508 • a failover needs to happen, either triggered automatically or man‐
6509 ually
6510
6511 • a node is being put into maintenance
6512
6513 • a node replication setting is being changed.
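
       You can observe the assigned goal state next to the last reported
       state with pg_autoctl show state, or directly on the monitor; the
       connection string and the column names below (reportedstate,
       goalstate) are assumptions to verify against your version of the
       pgautofailover.node table:

          $ pg_autoctl show state

          $ psql 'postgres://autoctl_node@monitor.example.com:5432/pg_auto_failover' \
                -c "SELECT nodename, reportedstate, goalstate
                    FROM pgautofailover.node;"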
6514
6515 When the monitor node is not available, the pg_autoctl processes on the
6516 Postgres nodes fail to contact the monitor every second and log that
6517 failure. Moreover, no orchestration is possible.
6518
6519 Postgres streaming replication does not need the monitor to be
6520 available in order to deliver its service guarantees to your
6521 application, so your Postgres service remains available even when the
6522 monitor is not.
6523
6524 To repair your installation after losing the monitor, consider the
6525 following scenarios.
6526
6527 The monitor node can be brought up again without data having been lost
6528 This is typically the case in Cloud Native environments such as Kuber‐
6529 netes, where you could have a service migrated to another pod and
6530 re-attached to its disk volume. This scenario is well supported by
6531 pg_auto_failover, and no intervention is needed.
6532
6533 It is also possible to use synchronous archiving with the monitor, so
6534 that you can recover from the current archives and continue operating
6535 without any intervention on the Postgres nodes, except for updating
6536 their monitor URI. This requires an archiving setup that uses
6537 synchronous replication, so that any transaction committed on the
6538 monitor is known to have been replicated in your WAL archive.
6539
6540 At the moment, you have to take care of that setup yourself. Here's a
6541 quick summary of what needs to be done:
6542
6543 1. Schedule base backups
6544
6545 Use pg_basebackup every once in a while to have a full copy of
6546 the monitor Postgres database available.
6547
6548 2. Archive WAL files in a synchronous fashion
6549
6550 Use pg_receivewal --sync ... as a service to keep a WAL archive
6551 in sync with the monitor Postgres instance at all times.
6552
6553 3. Prepare a recovery tool on top of your archiving strategy
6554
6555 Write a utility that knows how to create a new monitor node from
6556 your most recent pg_basebackup copy and the archived WAL files.
6557
6558 Bonus points if that tool/script is tested at least once a day,
6559 so that you avoid surprises on the unfortunate day that you actu‐
6560 ally need to use it in production.
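
       A hedged sketch of the three steps above; the hostnames, paths, and
       the replication role are hypothetical, and the options should be
       checked against your Postgres version:

          # 1. periodic base backup of the monitor, e.g. from cron
          $ pg_basebackup -h monitor.example.com -U replicator \
                -D /backups/monitor/base-$(date +%Y%m%d) -Ft -z -X stream

          # 2. continuous WAL archiving, run as a service; --synchronous only
          #    makes commits wait if the monitor's synchronous replication
          #    settings include this WAL receiver
          $ pg_receivewal -h monitor.example.com -U replicator \
                -D /backups/monitor/wal --synchronous

          # 3. the restore tool combines the most recent base backup with the
          #    archived WAL files to rebuild the monitor (site specific)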
6561
6562 A future version of pg_auto_failover will include this facility, but
6563 the current versions don't.
6564
6565 The monitor node can only be built from scratch again
6566 If you don't have synchronous archiving set up for the monitor, then
6567 you might not be able to restore a monitor database with the expected
6568 up-to-date node metadata. Specifically, the node states need to be in
6569 sync with what each pg_autoctl process received the last time it could
6570 contact the monitor, before it became unavailable.
6571
6572 It is possible to register currently running nodes to a new monitor
6573 without restarting Postgres on the primary. To do so, follow the
6574 procedure mentioned in replacing_monitor_online, using the following
6575 commands:
6576
6577 $ pg_autoctl disable monitor
6578 $ pg_autoctl enable monitor
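
       Earlier in this page the disable step is shown with --force and the
       enable step with the new monitor's connection string; here is the same
       sequence with a hypothetical URI spelled out:

          $ pg_autoctl disable monitor --force
          $ pg_autoctl enable monitor \
                'postgres://autoctl_node@new-monitor.example.com:5432/pg_auto_failover'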
6579
6580 AUTHOR
6581 Microsoft
6582
6583 COPYRIGHT
6584 Copyright (c) Microsoft Corporation. All rights reserved.
6585
6586
6587
6588
6589 1.6                          Nov 24, 2021             PG_AUTO_FAILOVER(1)