PG_AUTOCTL(5) pg_auto_failover PG_AUTOCTL(5)

pg_autoctl - pg_auto_failover Configuration

Several default settings of pg_auto_failover can be reviewed and
changed depending on the trade-offs you want to implement in your own
production setup. The settings that you can change have an impact on
the following operations:

• Deciding when to promote the secondary

  pg_auto_failover decides to implement a failover to the secondary
  node when it detects that the primary node is unhealthy. Changing
  the following settings has an impact on when the pg_auto_failover
  monitor decides to promote the secondary PostgreSQL node:

    pgautofailover.health_check_max_retries
    pgautofailover.health_check_period
    pgautofailover.health_check_retry_delay
    pgautofailover.health_check_timeout
    pgautofailover.node_considered_unhealthy_timeout

• Time taken to promote the secondary

  At secondary promotion time, pg_auto_failover waits for the
  following timeout to make sure that all pending writes on the
  primary server made it to the secondary at shutdown time, thus
  preventing data loss:

    pgautofailover.primary_demote_timeout

• Preventing promotion of the secondary

  pg_auto_failover implements a trade-off where data availability
  trumps service availability. When the primary node of a PostgreSQL
  service is detected as unhealthy, the secondary is only promoted if
  it was known to be eligible at the moment when the primary was
  lost.

  When synchronous replication was in use at the moment the primary
  node was lost, we know we can switch to the secondary safely, and
  the WAL lag is 0 in that case.

  When the secondary server had been detected as unhealthy before,
  the pg_auto_failover monitor switches it from the state SECONDARY
  to the state CATCHING-UP, and promotion is then prevented.

  The following setting makes it possible to still promote the
  secondary, allowing for a window of data loss:

    pgautofailover.promote_wal_log_threshold

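The promotion rules above can be read as a small decision function. The following is only an illustrative sketch of the text, not the monitor's actual code: the function name, the use of plain strings for node states, and the comparison of a reported WAL lag in bytes against the threshold are all assumptions. The text does not say whether the threshold also applies to a node in the CATCHING-UP state; this sketch assumes it does not. The default of 16777216 bytes is the value the monitor reports for pgautofailover.promote_wal_log_threshold.

```python
# Hypothetical sketch of the promotion decision described in the text.
# Function name, state strings and comparison are assumptions, not
# pg_auto_failover source code.

# Default reported for pgautofailover.promote_wal_log_threshold (bytes).
PROMOTE_WAL_LOG_THRESHOLD = 16777216


def can_promote(secondary_state, wal_lag_bytes,
                threshold=PROMOTE_WAL_LOG_THRESHOLD):
    """Return True when the secondary may be promoted after the primary is lost."""
    if secondary_state != "SECONDARY":
        # For example CATCHING-UP: promotion is prevented (assumed to be
        # independent of the threshold).
        return False
    # A lag of 0 (synchronous replication) always passes; a non-zero lag
    # passes only inside the accepted window of data loss.
    return wal_lag_bytes <= threshold
```

With these assumptions, `can_promote("SECONDARY", 0)` is true, while `can_promote("CATCHING-UP", 0)` and `can_promote("SECONDARY", 16777217)` are both false.
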
The configuration for the behavior of the monitor happens in the
PostgreSQL database where the extension has been deployed:

    pg_auto_failover=> select name, setting, unit, short_desc from pg_settings where name ~ '^pgautofailover\.';
    -[ RECORD 1 ]----------------------------------------------------------------------------------------------------
    name       | pgautofailover.enable_sync_wal_log_threshold
    setting    | 16777216
    unit       |
    short_desc | Don't enable synchronous replication until secondary xlog is within this many bytes of the primary's
    -[ RECORD 2 ]----------------------------------------------------------------------------------------------------
    name       | pgautofailover.health_check_max_retries
    setting    | 2
    unit       |
    short_desc | Maximum number of re-tries before marking a node as failed.
    -[ RECORD 3 ]----------------------------------------------------------------------------------------------------
    name       | pgautofailover.health_check_period
    setting    | 5000
    unit       | ms
    short_desc | Duration between each check (in milliseconds).
    -[ RECORD 4 ]----------------------------------------------------------------------------------------------------
    name       | pgautofailover.health_check_retry_delay
    setting    | 2000
    unit       | ms
    short_desc | Delay between consecutive retries.
    -[ RECORD 5 ]----------------------------------------------------------------------------------------------------
    name       | pgautofailover.health_check_timeout
    setting    | 5000
    unit       | ms
    short_desc | Connect timeout (in milliseconds).
    -[ RECORD 6 ]----------------------------------------------------------------------------------------------------
    name       | pgautofailover.node_considered_unhealthy_timeout
    setting    | 20000
    unit       | ms
    short_desc | Mark node unhealthy if last ping was over this long ago
    -[ RECORD 7 ]----------------------------------------------------------------------------------------------------
    name       | pgautofailover.primary_demote_timeout
    setting    | 30000
    unit       | ms
    short_desc | Give the primary this long to drain before promoting the secondary
    -[ RECORD 8 ]----------------------------------------------------------------------------------------------------
    name       | pgautofailover.promote_wal_log_threshold
    setting    | 16777216
    unit       |
    short_desc | Don't promote secondary unless xlog is with this many bytes of the master
    -[ RECORD 9 ]----------------------------------------------------------------------------------------------------
    name       | pgautofailover.startup_grace_period
    setting    | 10000
    unit       | ms
    short_desc | Wait for at least this much time after startup before initiating a failover.

You can edit the parameters as usual with PostgreSQL, either in the
postgresql.conf file or using ALTER DATABASE pg_auto_failover SET
parameter = value; commands, then issuing a reload.

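As a minimal sketch of the ALTER DATABASE route, the helper below only builds the SQL text to run against the monitor database. The helper name and its validation rules are hypothetical. The statement shape follows the paragraph above, and the reload is issued with PostgreSQL's pg_reload_conf() function. The sketch assumes an integer-valued setting, as all the settings listed here are.

```python
import re

# Hypothetical helper: builds the SQL for changing one monitor setting.
# Only the ALTER DATABASE ... SET form and pg_reload_conf() come from
# PostgreSQL itself; the validation rules are assumptions for this sketch.
_SETTING_NAME = re.compile(r"^pgautofailover\.[a-z_]+$")


def alter_setting_sql(name, value, database="pg_auto_failover"):
    """Return the statements that set `name` to an integer `value` and reload."""
    if not _SETTING_NAME.match(name):
        raise ValueError(f"not a pgautofailover setting: {name!r}")
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    return [
        f"ALTER DATABASE {database} SET {name} = {value};",
        "SELECT pg_reload_conf();",
    ]


if __name__ == "__main__":
    # Example: check node health every 10 seconds instead of every 5.
    for statement in alter_setting_sql("pgautofailover.health_check_period", 10000):
        print(statement)
```

The printed statements can then be run with psql while connected to the monitor.
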
For an introduction to the pg_autoctl commands relevant to the
pg_auto_failover Keeper configuration, please see pg_autoctl config.

An example configuration file looks like the following:

    [pg_autoctl]
    role = keeper
    monitor = postgres://autoctl_node@192.168.1.34:6000/pg_auto_failover
    formation = default
    group = 0
    hostname = node1.db
    nodekind = standalone

    [postgresql]
    pgdata = /data/pgsql/
    pg_ctl = /usr/pgsql-10/bin/pg_ctl
    dbname = postgres
    host = /tmp
    port = 5000

    [replication]
    slot = pgautofailover_standby
    maximum_backup_rate = 100M
    backup_directory = /data/backup/node1.db

    [timeout]
    network_partition_timeout = 20
    postgresql_restart_failure_timeout = 20
    postgresql_restart_failure_max_retries = 3

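Because the file uses an INI-style layout, it can be inspected from a script. The sketch below reads a shortened copy of the example with Python's standard configparser module. That pg_autoctl's own parser accepts exactly the same syntax as configparser is an assumption, and the function name is made up for the illustration.

```python
import configparser

# Shortened copy of the example configuration shown above.
EXAMPLE = """\
[pg_autoctl]
role = keeper
monitor = postgres://autoctl_node@192.168.1.34:6000/pg_auto_failover
formation = default
group = 0
hostname = node1.db

[replication]
slot = pgautofailover_standby
maximum_backup_rate = 100M

[timeout]
network_partition_timeout = 20
postgresql_restart_failure_timeout = 20
postgresql_restart_failure_max_retries = 3
"""


def load_keeper_config(text):
    """Parse keeper configuration text into a ConfigParser object."""
    # Restrict the delimiter to "=" so the ":" characters in the monitor
    # URL can never be taken for a key/value separator.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.read_string(text)
    return parser


if __name__ == "__main__":
    config = load_keeper_config(EXAMPLE)
    print(config["pg_autoctl"]["monitor"])
    print(config.getint("timeout", "network_partition_timeout"))
```

Reading the file this way is only meant for inspection; changes are better made through the pg_autoctl config commands so that the keeper validates them.
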
To output, edit and check entries of the configuration, the following
commands are provided:

    pg_autoctl config check [--pgdata <pgdata>]
    pg_autoctl config get [--pgdata <pgdata>] section.option
    pg_autoctl config set [--pgdata <pgdata>] section.option value

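When these commands are driven from a script, it helps to build the argument list in one place. The helper below is a hypothetical sketch that only assembles the command line following the three synopsis lines above; it does not run pg_autoctl, and its name and checks are assumptions.

```python
import shlex

# Hypothetical helper that assembles a `pg_autoctl config` command line
# following the three synopsis forms above. It never executes anything.


def config_command(action, pgdata=None, option=None, value=None):
    """Return the argv list for `pg_autoctl config check|get|set`."""
    if action not in ("check", "get", "set"):
        raise ValueError(f"unknown action: {action!r}")
    if action in ("get", "set") and option is None:
        raise ValueError(f"'{action}' needs a section.option argument")
    if action == "set" and value is None:
        raise ValueError("'set' needs a value argument")

    argv = ["pg_autoctl", "config", action]
    if pgdata is not None:
        argv += ["--pgdata", pgdata]
    if action in ("get", "set"):
        argv.append(option)
    if action == "set":
        argv.append(str(value))
    return argv


if __name__ == "__main__":
    argv = config_command("set", "/data/pgsql",
                          "timeout.network_partition_timeout", 30)
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(argv))
```

The returned list can be passed to subprocess.run on a host where pg_autoctl is installed.
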
The [postgresql] section is discovered automatically by the pg_autoctl
command and is not intended to be changed manually.

pg_autoctl.monitor

PostgreSQL service URL of the pg_auto_failover monitor, as given in the
output of the pg_autoctl show uri command.

pg_autoctl.formation

A single pg_auto_failover monitor may handle several Postgres
formations. The default formation name, default, is usually fine.

pg_autoctl.group

This information is retrieved by the pg_auto_failover keeper when
registering a node to the monitor, and should not be changed
afterwards. Use at your own risk.

pg_autoctl.hostname

Node hostname used by all the other nodes in the cluster to contact
this node. In particular, if this node is a primary then its standby
uses that address to set up streaming replication.

replication.slot

Name of the PostgreSQL replication slot used in the streaming
replication setup automatically deployed by pg_auto_failover.
Replication slots can't be renamed in PostgreSQL.

replication.maximum_backup_rate

When pg_auto_failover (re-)builds a standby node using the
pg_basebackup command, this parameter is given to pg_basebackup to
throttle the network bandwidth used. Defaults to 100M, which
pg_basebackup reads as 100 megabytes per second.

replication.backup_directory

When pg_auto_failover (re-)builds a standby node using the
pg_basebackup command, this parameter is the target directory into
which the data is copied from the primary server. When the copy has
succeeded, the directory is renamed to postgresql.pgdata.

The default value is computed from ${PGDATA}/../backup/${hostname} and
can be set to any value of your preference. Remember that the directory
renaming is an atomic operation only when both the source and the
target of the copy are on the same filesystem, at least on Unix systems.

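The default path rule and the same-filesystem caveat can be checked with a few lines of Python. This is a sketch under stated assumptions: both function names are hypothetical, the path is normalized lexically with os.path.normpath (symbolic links are not resolved), and comparing st_dev device numbers is one common way on Unix to test whether two existing paths live on the same filesystem.

```python
import os

# Hypothetical helpers illustrating the default backup_directory rule
# and the same-filesystem condition for an atomic rename.


def default_backup_directory(pgdata, hostname):
    """Compute ${PGDATA}/../backup/${hostname} as a normalized path."""
    # normpath collapses the ".." lexically; it does not follow symlinks.
    return os.path.normpath(os.path.join(pgdata, "..", "backup", hostname))


def same_filesystem(path_a, path_b):
    """Return True when two existing paths are on the same device (Unix)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Values taken from the example configuration file.
    print(default_backup_directory("/data/pgsql/", "node1.db"))
```

With the values from the example configuration, the computed default is /data/backup/node1.db, which matches the backup_directory entry shown there. Since the backup directory may not exist yet, same_filesystem is best called on the parent directories of the two paths.
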
timeout

This section configures the behavior of the pg_auto_failover keeper in
specific failure scenarios.

timeout.network_partition_timeout

Timeout, in seconds, after which a failure to communicate with other
nodes is considered to indicate a network partition. This check is only
done on a PRIMARY server, so "other nodes" means both the monitor and
the standby.

When a PRIMARY node is detected to be on the losing side of a network
partition, the pg_auto_failover keeper enters the DEMOTE state and
stops the PostgreSQL instance in order to protect against split-brain
situations.

The default is 20s.

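The check described above can be sketched as a comparison of timestamps. This is an illustration only: the function name and arguments are hypothetical, and the sketch assumes that the primary considers itself partitioned only when it has heard from neither the monitor nor the standby for longer than the timeout, which is one reading of "both the monitor and the standby".

```python
# Hypothetical sketch of the network partition check on a PRIMARY node.
# The "neither node reachable" reading is an assumption, not a statement
# about the keeper's real implementation.

NETWORK_PARTITION_TIMEOUT = 20  # seconds, the documented default


def primary_should_demote(now, last_monitor_contact, last_standby_contact,
                          timeout=NETWORK_PARTITION_TIMEOUT):
    """Return True when the primary should enter DEMOTE and stop PostgreSQL.

    All time arguments are in seconds on the same clock.
    """
    # The most recent successful contact with any other node.
    last_contact = max(last_monitor_contact, last_standby_contact)
    return now - last_contact > timeout
```

For example, at time 100 with the last contacts at 75 and 70 the function returns true, because 25 seconds of silence exceeds the 20 second timeout. With contacts at 75 and 95 it returns false, because the standby was heard from 5 seconds ago.
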
timeout.postgresql_restart_failure_timeout

timeout.postgresql_restart_failure_max_retries

When PostgreSQL is not running, the first thing the pg_auto_failover
keeper does is try to restart it. In case of a transient failure (e.g.
file system is full, or other dynamic OS resource constraint), the best
course of action is to try again for a little while before reaching out
to the monitor and asking for a failover.

The pg_auto_failover keeper tries to restart PostgreSQL
timeout.postgresql_restart_failure_max_retries times in a row (default
3) or up to timeout.postgresql_restart_failure_timeout (default 20s)
since it detected that PostgreSQL is not running, whichever comes first.

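The "whichever comes first" rule can be sketched as a bounded retry loop. This is a minimal illustration, not the keeper's code: the function name, the injected restart callable, the injectable clock and the pause between attempts are all assumptions made so that the example is self-contained.

```python
import time

# Hypothetical sketch of the restart policy: stop after max_retries
# attempts or once the timeout has elapsed, whichever comes first.


def try_restart(restart, max_retries=3, timeout=20.0, delay=1.0,
                clock=time.monotonic, sleep=time.sleep):
    """Call restart() until it succeeds or a limit is reached.

    Returns True on success, or False when the caller should report the
    failure to the monitor.
    """
    deadline = clock() + timeout
    attempts = 0
    while attempts < max_retries and clock() < deadline:
        attempts += 1
        if restart():
            return True
        # The pause between attempts is an assumption of this sketch.
        sleep(delay)
    return False
```

Passing `clock` and `sleep` as parameters keeps the loop testable without waiting: a test can supply a fake clock that advances 15 seconds per call and check that only one attempt is made before the 20 second limit stops the loop.
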
Microsoft

Copyright (c) Microsoft Corporation. All rights reserved.

2.0 Sep 13, 2023 PG_AUTOCTL(5)