PG_BASEBACKUP(1)          PostgreSQL 13.4 Documentation          PG_BASEBACKUP(1)

NAME
pg_basebackup - take a base backup of a PostgreSQL cluster

SYNOPSIS
pg_basebackup [option...]

DESCRIPTION
pg_basebackup is used to take a base backup of a running PostgreSQL
database cluster. The backup is taken without affecting other clients
of the database, and can be used both for point-in-time recovery (see
Section 25.3) and as the starting point for a log-shipping or
streaming-replication standby server (see Section 26.2).

pg_basebackup makes an exact copy of the database cluster's files,
while making sure the server is put into and out of backup mode
automatically. Backups are always taken of the entire database cluster;
it is not possible to back up individual databases or database objects.
For selective backups, another tool such as pg_dump(1) must be used.

The backup is made over a regular PostgreSQL connection that uses the
replication protocol. The connection must be made with a user ID that
has REPLICATION permissions (see Section 21.2) or is a superuser, and
pg_hba.conf must permit the replication connection. The server must
also be configured with max_wal_senders set high enough to provide at
least one walsender for the backup plus one for WAL streaming (if
used).
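The walsender requirement can be illustrated with a small helper. This is
a minimal sketch in POSIX shell; the function name required_wal_senders is
hypothetical and not part of PostgreSQL, and it counts only the connections
needed by a single pg_basebackup run, not those of other replication
clients.

```shell
# Hypothetical helper (not part of PostgreSQL): number of walsender
# processes one pg_basebackup run needs for a given -X method.
required_wal_senders() {
    case "$1" in
        s|stream) echo 2 ;;         # one for the backup, one for WAL streaming
        n|none|f|fetch) echo 1 ;;   # the backup connection only
        *) echo "unknown WAL method: $1" >&2; return 1 ;;
    esac
}

required_wal_senders stream   # prints 2
```

The result is a lower bound for max_wal_senders; standbys and other
backups running at the same time need walsenders of their own.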

There can be multiple pg_basebackups running at the same time, but it
is usually better from a performance point of view to take only one
backup, and copy the result.

pg_basebackup can make a base backup from not only a primary server but
also a standby. To take a backup from a standby, set up the standby so
that it can accept replication connections (that is, set
max_wal_senders and hot_standby, and configure its pg_hba.conf
appropriately). You will also need to enable full_page_writes on the
primary.

Note that there are some limitations in taking a backup from a standby:

  • The backup history file is not created in the database cluster
    backed up.

  • pg_basebackup cannot force the standby to switch to a new WAL file
    at the end of backup. When you are using -X none, if write activity
    on the primary is low, pg_basebackup may need to wait a long time
    for the last WAL file required for the backup to be switched and
    archived. In this case, it may be useful to run pg_switch_wal on
    the primary in order to trigger an immediate WAL file switch.

  • If the standby is promoted to be primary during backup, the backup
    fails.

  • All WAL records required for the backup must contain sufficient
    full-page writes, which requires you to enable full_page_writes on
    the primary and not to use a tool like pg_compresslog as
    archive_command to remove full-page writes from WAL files.

Whenever pg_basebackup is taking a base backup, the server's
pg_stat_progress_basebackup view will report the progress of the
backup. See Section 27.4.5 for details.
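As an illustration of reading that view, the following sketch holds a
query in a shell variable and converts the byte counters into a
percentage. The column names backup_streamed and backup_total are those of
the view; the variable PROGRESS_SQL and the function progress_percent are
hypothetical names chosen for this example.

```shell
# Query an operator might run on the server while a backup is in progress.
PROGRESS_SQL='SELECT pid, phase, backup_streamed, backup_total FROM pg_stat_progress_basebackup'

# Hypothetical helper: turn streamed/total byte counts into a percentage.
# An empty total stands for NULL (for example with --no-estimate-size).
progress_percent() {
    streamed=$1
    total=$2
    if [ -z "$total" ] || [ "$total" -eq 0 ]; then
        echo "unknown"
        return 0
    fi
    pct=$(( streamed * 100 / total ))
    # The figure is only approximate, so cap what is displayed at 100.
    if [ "$pct" -gt 100 ]; then
        pct=100
    fi
    echo "${pct}%"
}

progress_percent 512 2048   # prints 25%
```

Assuming psql can reach the server, the query could be issued with
psql -h mydbserver -c "$PROGRESS_SQL" and the two counters passed to the
helper.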

OPTIONS
The following command-line options control the location and format of
the output:

-D directory
--pgdata=directory
    Sets the target directory to write the output to. pg_basebackup
    will create this directory (and any missing parent directories) if
    it does not exist. If it already exists, it must be empty.

    When the backup is in tar format, the target directory may be
    specified as - (dash), causing the tar file to be written to
    stdout.

    This option is required.

-F format
--format=format
    Selects the format for the output. format can be one of the
    following:

    p
    plain
        Write the output as plain files, with the same layout as the
        source server's data directory and tablespaces. When the
        cluster has no additional tablespaces, the whole database will
        be placed in the target directory. If the cluster contains
        additional tablespaces, the main data directory will be placed
        in the target directory, but all other tablespaces will be
        placed in the same absolute path as they have on the source
        server. (See --tablespace-mapping to change that.)

        This is the default format.

    t
    tar
        Write the output as tar files in the target directory. The main
        data directory's contents will be written to a file named
        base.tar, and each other tablespace will be written to a
        separate tar file named after that tablespace's OID.

        If the target directory is specified as - (dash), the tar
        contents will be written to standard output, suitable for
        piping to (for example) gzip. This is only allowed if the
        cluster has no additional tablespaces and WAL streaming is not
        used.

-R
--write-recovery-conf
    Creates a standby.signal file and appends connection settings to
    the postgresql.auto.conf file in the target directory (or within
    the base archive file when using tar format). This eases setting up
    a standby server using the results of the backup.

    The postgresql.auto.conf file will record the connection settings
    and, if specified, the replication slot that pg_basebackup is
    using, so that streaming replication will use the same settings
    later on.

-T olddir=newdir
--tablespace-mapping=olddir=newdir
    Relocates the tablespace in directory olddir to newdir during the
    backup. To be effective, olddir must exactly match the path
    specification of the tablespace as it is defined on the source
    server. (But it is not an error if there is no tablespace in olddir
    on the source server.) Meanwhile newdir is a directory in the
    receiving host's filesystem. As with the main target directory,
    newdir need not exist already, but if it does exist it must be
    empty. Both olddir and newdir must be absolute paths. If either
    path needs to contain an equal sign (=), precede that with a
    backslash. This option can be specified multiple times for multiple
    tablespaces.

    If a tablespace is relocated in this way, the symbolic links inside
    the main data directory are updated to point to the new location.
    So the new data directory is ready to be used for a new server
    instance with all tablespaces in the updated locations.

    Currently, this option only works with plain output format; it is
    ignored if tar format is selected.
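The quoting rule for --tablespace-mapping is easy to get wrong in
scripts, so a small sketch may help. The helpers escape_equals and
tablespace_mapping below are hypothetical (they are not part of
pg_basebackup); they only build the argument text and reject relative
paths.

```shell
# Hypothetical helpers (not part of pg_basebackup).

# Precede every equal sign in a path with a backslash, as -T requires.
escape_equals() {
    printf '%s\n' "$1" | sed 's/=/\\=/g'
}

# Print a --tablespace-mapping argument for OLDDIR ($1) and NEWDIR ($2),
# rejecting relative paths because both must be absolute.
tablespace_mapping() {
    case "$1" in /*) ;; *) echo "olddir must be absolute: $1" >&2; return 1 ;; esac
    case "$2" in /*) ;; *) echo "newdir must be absolute: $2" >&2; return 1 ;; esac
    printf '%s=%s=%s\n' "--tablespace-mapping" \
        "$(escape_equals "$1")" "$(escape_equals "$2")"
}

tablespace_mapping /opt/ts /backup/ts    # --tablespace-mapping=/opt/ts=/backup/ts
tablespace_mapping /opt/a=b /backup/ts   # --tablespace-mapping=/opt/a\=b=/backup/ts
```

The printed argument can then be passed to pg_basebackup in plain format,
once per tablespace that needs to be relocated.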

--waldir=waldir
    Sets the directory to write WAL (write-ahead log) files to. By
    default WAL files will be placed in the pg_wal subdirectory of the
    target directory, but this option can be used to place them
    elsewhere. waldir must be an absolute path. As with the main
    target directory, waldir need not exist already, but if it does
    exist it must be empty. This option can only be specified when the
    backup is in plain format.

-X method
--wal-method=method
    Includes the required WAL (write-ahead log) files in the backup.
    This will include all write-ahead logs generated during the backup.
    Unless the method none is specified, it is possible to start a
    postmaster in the target directory without the need to consult the
    log archive, thus making the output a completely standalone backup.

    The following methods for collecting the write-ahead logs are
    supported:

    n
    none
        Don't include write-ahead logs in the backup.

    f
    fetch
        The write-ahead log files are collected at the end of the
        backup. Therefore, it is necessary for the source server's
        wal_keep_size parameter to be set high enough that the required
        log data is not removed before the end of the backup. If the
        required log data has been recycled before it's time to
        transfer it, the backup will fail and be unusable.

        When tar format is used, the write-ahead log files will be
        included in the base.tar file.

    s
    stream
        Stream write-ahead log data while the backup is being taken.
        This method will open a second connection to the server and
        start streaming the write-ahead log in parallel while running
        the backup. Therefore, it will require two replication
        connections not just one. As long as the client can keep up
        with the write-ahead log data, using this method requires no
        extra write-ahead logs to be saved on the source server.

        When tar format is used, the write-ahead log files will be
        written to a separate file named pg_wal.tar (if the server is a
        version earlier than 10, the file will be named pg_xlog.tar).

        This value is the default.

-z
--gzip
    Enables gzip compression of tar file output, with the default
    compression level. Compression is only available when using the tar
    format, and the suffix .gz will automatically be added to all tar
    filenames.

-Z level
--compress=level
    Enables gzip compression of tar file output, and specifies the
    compression level (0 through 9, 0 being no compression and 9 being
    best compression). Compression is only available when using the tar
    format, and the suffix .gz will automatically be added to all tar
    filenames.

The following command-line options control the generation of the backup
and the invocation of the program:

-c fast|spread
--checkpoint=fast|spread
    Sets checkpoint mode to fast (immediate) or spread (the default)
    (see Section 25.3.3).

-C
--create-slot
    Specifies that the replication slot named by the --slot option
    should be created before starting the backup. An error is raised if
    the slot already exists.

-l label
--label=label
    Sets the label for the backup. If none is specified, a default
    value of “pg_basebackup base backup” will be used.

-n
--no-clean
    By default, when pg_basebackup aborts with an error, it removes any
    directories it might have created before discovering that it cannot
    finish the job (for example, the target directory and write-ahead
    log directory). This option inhibits tidying-up and is thus useful
    for debugging.

    Note that tablespace directories are not cleaned up either way.

-N
--no-sync
    By default, pg_basebackup will wait for all files to be written
    safely to disk. This option causes pg_basebackup to return without
    waiting, which is faster, but means that a subsequent operating
    system crash can leave the base backup corrupt. Generally, this
    option is useful for testing but should not be used when creating a
    production installation.

-P
--progress
    Enables progress reporting. Turning this on will deliver an
    approximate progress report during the backup. Since the database
    may change during the backup, this is only an approximation and may
    not end at exactly 100%. In particular, when WAL is included in
    the backup, the total amount of data cannot be estimated in
    advance, and in this case the estimated target size will increase
    once it passes the total estimate without WAL.

-r rate
--max-rate=rate
    Sets the maximum transfer rate at which data is collected from the
    source server. This can be useful to limit the impact of
    pg_basebackup on the server. Values are in kilobytes per second.
    Use a suffix of M to indicate megabytes per second. A suffix of k
    is also accepted, and has no effect. Valid values are between 32
    kilobytes per second and 1024 megabytes per second.

    This option always affects transfer of the data directory. Transfer
    of WAL files is only affected if the collection method is fetch.
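The unit and range rules for --max-rate can be checked before a long
backup is started. The sketch below is illustrative only: max_rate_kb is a
hypothetical helper, and it assumes whole-number values and treats one
megabyte as 1024 kilobytes.

```shell
# Hypothetical helper: convert a --max-rate value to kilobytes per second
# and check it against the documented range (32 kB/s to 1024 MB/s).
max_rate_kb() {
    case "$1" in
        *M) num=${1%M}; mult=1024 ;;   # megabytes per second
        *k) num=${1%k}; mult=1 ;;      # the k suffix has no effect
        *)  num=$1;     mult=1 ;;      # plain kilobytes per second
    esac
    # Assumption of this sketch: accept whole numbers only.
    case "$num" in
        ''|*[!0-9]*) echo "invalid rate: $1" >&2; return 1 ;;
    esac
    kb=$(( num * mult ))
    if [ "$kb" -lt 32 ] || [ "$kb" -gt 1048576 ]; then
        echo "rate out of range: $1" >&2
        return 1
    fi
    echo "$kb"
}

max_rate_kb 5M   # prints 5120
```

A script could call the helper first and pass the original value to
--max-rate only if the check succeeds.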

-S slotname
--slot=slotname
    This option can only be used together with -X stream. It causes WAL
    streaming to use the specified replication slot. If the base backup
    is intended to be used as a streaming-replication standby using a
    replication slot, the standby should then use the same replication
    slot name as primary_slot_name. This ensures that the primary
    server does not remove any necessary WAL data in the time between
    the end of the base backup and the start of streaming replication
    on the new standby.

    The specified replication slot has to exist unless the option -C is
    also used.

    If this option is not specified and the server supports temporary
    replication slots (version 10 and later), then a temporary
    replication slot is automatically used for WAL streaming.

-v
--verbose
    Enables verbose mode. pg_basebackup will output some extra steps
    during startup and shutdown, and will show the exact file name that
    is currently being processed if progress reporting is also enabled.

--manifest-checksums=algorithm
    Specifies the checksum algorithm that should be applied to each
    file included in the backup manifest. Currently, the available
    algorithms are NONE, CRC32C, SHA224, SHA256, SHA384, and SHA512.
    The default is CRC32C.

    If NONE is selected, the backup manifest will not contain any
    checksums. Otherwise, it will contain a checksum of each file in
    the backup using the specified algorithm. In addition, the manifest
    will always contain a SHA256 checksum of its own contents. The SHA
    algorithms are significantly more CPU-intensive than CRC32C, so
    selecting one of them may increase the time required to complete
    the backup.

    Using a SHA hash function provides a cryptographically secure
    digest of each file for users who wish to verify that the backup
    has not been tampered with, while the CRC32C algorithm provides a
    checksum that is much faster to calculate; it is good at catching
    errors due to accidental changes but is not resistant to malicious
    modifications. Note that, to be useful against an adversary who has
    access to the backup, the backup manifest would need to be stored
    securely elsewhere or otherwise verified not to have been modified
    since the backup was taken.

    pg_verifybackup(1) can be used to check the integrity of a backup
    against the backup manifest.

--manifest-force-encode
    Forces all filenames in the backup manifest to be hex-encoded. If
    this option is not specified, only non-UTF8 filenames are
    hex-encoded. This option is mostly intended to test that tools
    which read a backup manifest file properly handle this case.

--no-estimate-size
    Prevents the server from estimating the total amount of backup data
    that will be streamed, resulting in the backup_total column in the
    pg_stat_progress_basebackup view always being NULL.

    Without this option, the backup will start by enumerating the size
    of the entire database, and then go back and send the actual
    contents. This may make the backup take slightly longer, and in
    particular it will take longer before the first data is sent. This
    option is useful to avoid such estimation time if it's too long.

    This option is not allowed when using --progress.

--no-manifest
    Disables generation of a backup manifest. If this option is not
    specified, the server will generate and send a backup manifest
    which can be verified using pg_verifybackup(1). The manifest is a
    list of every file present in the backup with the exception of any
    WAL files that may be included. It also stores the size, last
    modification time, and an optional checksum for each file.

--no-slot
    Prevents the creation of a temporary replication slot for the
    backup.

    By default, if log streaming is selected but no slot name is given
    with the -S option, then a temporary replication slot is created
    (if supported by the source server).

    The main purpose of this option is to allow taking a base backup
    when the server has no free replication slots. Using a replication
    slot is almost always preferred, because it prevents needed WAL
    from being removed by the server during the backup.

--no-verify-checksums
    Disables verification of checksums, if they are enabled on the
    server the base backup is taken from.

    By default, checksums are verified and checksum failures will
    result in a non-zero exit status. However, the base backup will not
    be removed in such a case, as if the --no-clean option had been
    used. Checksum verification failures will also be reported in the
    pg_stat_database view.
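Several of the output and generation options constrain one another, and a
wrapper script can check the combinations before connecting to the server.
The following is only a sketch of such a pre-flight check. The function
check_backup_options and its key=value settings are hypothetical and not
part of pg_basebackup; it covers only the restrictions stated in the
option descriptions and cannot tell whether the cluster has additional
tablespaces.

```shell
# Hypothetical pre-flight check, not part of pg_basebackup. Settings are
# given as key=value words; anything omitted keeps the documented default.
check_backup_options() {
    format=plain; target=; wal=stream; waldir=; slot=
    create_slot=no; compress=no; progress=no; no_estimate=no
    for kv in "$@"; do
        case "$kv" in
            format=*)      format=${kv#*=} ;;
            target=*)      target=${kv#*=} ;;
            wal=*)         wal=${kv#*=} ;;
            waldir=*)      waldir=${kv#*=} ;;
            slot=*)        slot=${kv#*=} ;;
            create_slot=*) create_slot=${kv#*=} ;;
            compress=*)    compress=${kv#*=} ;;
            progress=*)    progress=${kv#*=} ;;
            no_estimate=*) no_estimate=${kv#*=} ;;
            *) echo "unknown setting: $kv" >&2; return 2 ;;
        esac
    done
    rc=0
    if [ "$target" = "-" ] && [ "$format" != "tar" ]; then
        echo "writing to stdout requires tar format" >&2; rc=1
    fi
    if [ "$target" = "-" ] && [ "$wal" = "stream" ]; then
        echo "writing to stdout cannot be combined with WAL streaming" >&2; rc=1
    fi
    if [ -n "$waldir" ] && [ "$format" != "plain" ]; then
        echo "--waldir requires plain format" >&2; rc=1
    fi
    if [ "$compress" = "yes" ] && [ "$format" != "tar" ]; then
        echo "compression requires tar format" >&2; rc=1
    fi
    if [ -n "$slot" ] && [ "$wal" != "stream" ]; then
        echo "--slot requires -X stream" >&2; rc=1
    fi
    if [ "$create_slot" = "yes" ] && [ -z "$slot" ]; then
        echo "--create-slot requires --slot" >&2; rc=1
    fi
    if [ "$progress" = "yes" ] && [ "$no_estimate" = "yes" ]; then
        echo "--no-estimate-size cannot be used with --progress" >&2; rc=1
    fi
    return "$rc"
}

check_backup_options format=tar target=- wal=fetch && echo "options look consistent"
```

The function returns 0 when no documented restriction is violated, 1 when
at least one is, and 2 for a setting it does not recognize. POSIX shell
has no local variables, so the names used inside the function are global.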

The following command-line options control the connection to the source
server:

-d connstr
--dbname=connstr
    Specifies parameters used to connect to the server, as a connection
    string; these will override any conflicting command line options.

    The option is called --dbname for consistency with other client
    applications, but because pg_basebackup doesn't connect to any
    particular database in the cluster, any database name in the
    connection string will be ignored.

-h host
--host=host
    Specifies the host name of the machine on which the server is
    running. If the value begins with a slash, it is used as the
    directory for a Unix domain socket. The default is taken from the
    PGHOST environment variable, if set, else a Unix domain socket
    connection is attempted.

-p port
--port=port
    Specifies the TCP port or local Unix domain socket file extension
    on which the server is listening for connections. Defaults to the
    PGPORT environment variable, if set, or a compiled-in default.

-s interval
--status-interval=interval
    Specifies the number of seconds between status packets sent back to
    the source server. Smaller values allow more accurate monitoring of
    backup progress from the server. A value of zero disables periodic
    status updates completely, although an update will still be sent
    when requested by the server, to avoid timeout-based disconnects.
    The default value is 10 seconds.

-U username
--username=username
    Specifies the user name to connect as.

-w
--no-password
    Prevents issuing a password prompt. If the server requires password
    authentication and a password is not available by other means such
    as a .pgpass file, the connection attempt will fail. This option
    can be useful in batch jobs and scripts where no user is present to
    enter a password.

-W
--password
    Forces pg_basebackup to prompt for a password before connecting to
    the source server.

    This option is never essential, since pg_basebackup will
    automatically prompt for a password if the server demands password
    authentication. However, pg_basebackup will waste a connection
    attempt finding out that the server wants a password. In some cases
    it is worth typing -W to avoid the extra connection attempt.

Other options are also available:

-V
--version
    Prints the pg_basebackup version and exits.

-?
--help
    Shows help about pg_basebackup command line arguments, and exits.

ENVIRONMENT
This utility, like most other PostgreSQL utilities, uses the
environment variables supported by libpq (see Section 33.14).

The environment variable PG_COLOR specifies whether to use color in
diagnostic messages. Possible values are always, auto and never.

NOTES
At the beginning of the backup, a checkpoint needs to be performed on
the source server. This can take some time (especially if the option
--checkpoint=fast is not used), during which pg_basebackup will appear
to be idle.

The backup will include all files in the data directory and
tablespaces, including the configuration files and any additional files
placed in the directory by third parties, except certain temporary
files managed by PostgreSQL. But only regular files and directories are
copied, except that symbolic links used for tablespaces are preserved.
Symbolic links pointing to certain directories known to PostgreSQL are
copied as empty directories. Other symbolic links and special device
files are skipped. See Section 52.4 for the precise details.

In plain format, tablespaces will be backed up to the same path they
have on the source server, unless the option --tablespace-mapping is
used. Without this option, running a plain format base backup on the
same host as the server will not work if tablespaces are in use,
because the backup would have to be written to the same directory
locations as the original tablespaces.

When tar format is used, it is the user's responsibility to unpack each
tar file before starting a PostgreSQL server that uses the data. If
there are additional tablespaces, the tar files for them need to be
unpacked in the correct locations. In this case the symbolic links for
those tablespaces will be created by the server according to the
contents of the tablespace_map file that is included in the base.tar
file.
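The unpacking step for a tar-format backup might look like the following
sketch. It assumes an uncompressed backup taken with -X stream, so that
WAL is in pg_wal.tar, and it does not handle tablespace archives, which
still have to be unpacked in their correct locations by hand. The
function name unpack_base_backup is hypothetical, and restricting the
directory to mode 0700 is an assumption of this example.

```shell
# Hypothetical helper: unpack base.tar, and pg_wal.tar if present, from
# BACKUP_DIR ($1) into a new data directory DATA_DIR ($2).
unpack_base_backup() {
    backup_dir=$1
    data_dir=$2
    mkdir -p "$data_dir" || return 1
    # Assumption: restrict the new data directory to its owner.
    chmod 700 "$data_dir" || return 1
    tar -xf "$backup_dir/base.tar" -C "$data_dir" || return 1
    if [ -f "$backup_dir/pg_wal.tar" ]; then
        # Streamed WAL is shipped separately and belongs in pg_wal.
        mkdir -p "$data_dir/pg_wal" || return 1
        tar -xf "$backup_dir/pg_wal.tar" -C "$data_dir/pg_wal" || return 1
    fi
}
```

For example, unpack_base_backup backup /usr/local/pgsql/data would unpack
a backup stored in the directory backup. A backup compressed with -z or -Z
would first need its .gz files decompressed.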

pg_basebackup works with servers of the same or an older major version,
down to 9.1. However, WAL streaming mode (-X stream) only works with
server version 9.3 and later, and tar format (--format=tar) only works
with server version 9.5 and later.
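These version limits can be encoded in a short check. The sketch below is
illustrative; server_supports is a hypothetical helper, the mode names
basic, stream and tar are invented for this example, and the version is
expected in a form such as 9.4 or 13.

```shell
# Hypothetical helper: succeed if a server of the given version supports
# the requested mode (basic, stream or tar), using the limits stated above.
server_supports() {
    version=$1
    mode=$2
    case "$version" in
        *.*) major=${version%%.*}; minor=${version#*.}; minor=${minor%%.*} ;;
        *)   major=$version; minor=0 ;;
    esac
    case "$mode" in
        basic)  need_major=9; need_minor=1 ;;
        stream) need_major=9; need_minor=3 ;;
        tar)    need_major=9; need_minor=5 ;;
        *) echo "unknown mode: $mode" >&2; return 2 ;;
    esac
    [ "$major" -gt "$need_major" ] ||
        { [ "$major" -eq "$need_major" ] && [ "$minor" -ge "$need_minor" ]; }
}

if server_supports 9.4 tar; then echo "supported"; else echo "not supported"; fi
```

With the arguments shown, the last line prints "not supported", because
tar format needs a 9.5 or later server.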

pg_basebackup will preserve group permissions for data files if group
permissions are enabled on the source cluster.

EXAMPLES
To create a base backup of the server at mydbserver and store it in the
local directory /usr/local/pgsql/data:

    $ pg_basebackup -h mydbserver -D /usr/local/pgsql/data

To create a backup of the local server with one compressed tar file for
each tablespace, and store it in the directory backup, showing a
progress report while running:

    $ pg_basebackup -D backup -Ft -z -P

To create a backup of a single-tablespace local database and compress
this with bzip2:

    $ pg_basebackup -D - -Ft -X fetch | bzip2 > backup.tar.bz2

(This command will fail if there are multiple tablespaces in the
database.)

To create a backup of a local database where the tablespace in /opt/ts
is relocated to ./backup/ts:

    $ pg_basebackup -D backup/data -T /opt/ts=$(pwd)/backup/ts
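The options -R, -C and -S are described as easing standby setup, so a
combined invocation may be a useful further illustration. The sketch below
only prints the command instead of running it. The function
standby_backup_command, the directory /usr/local/pgsql/standby and the
slot name standby1_slot are hypothetical, and the arguments are assumed to
contain no spaces or shell metacharacters.

```shell
# Hypothetical helper: print, rather than run, a pg_basebackup command that
# seeds a streaming-replication standby from HOST ($1) into DIR ($2),
# creating and using replication slot SLOT ($3).
standby_backup_command() {
    printf 'pg_basebackup -h %s -D %s -X stream -C -S %s -R -P\n' "$1" "$2" "$3"
}

standby_backup_command mydbserver /usr/local/pgsql/standby standby1_slot
```

Printing the command first makes it possible to review it before running
it against a real server.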

SEE ALSO
pg_dump(1)

PostgreSQL 13.4                        2021                        PG_BASEBACKUP(1)