1COPY(7) PostgreSQL 10.7 Documentation COPY(7)
2
3
4
6 COPY - copy data between a file and a table
7
9 COPY table_name [ ( column_name [, ...] ) ]
10 FROM { 'filename' | PROGRAM 'command' | STDIN }
11 [ [ WITH ] ( option [, ...] ) ]
12
13 COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
14 TO { 'filename' | PROGRAM 'command' | STDOUT }
15 [ [ WITH ] ( option [, ...] ) ]
16
17 where option can be one of:
18
19 FORMAT format_name
20 OIDS [ boolean ]
21 FREEZE [ boolean ]
22 DELIMITER 'delimiter_character'
23 NULL 'null_string'
24 HEADER [ boolean ]
25 QUOTE 'quote_character'
26 ESCAPE 'escape_character'
27 FORCE_QUOTE { ( column_name [, ...] ) | * }
28 FORCE_NOT_NULL ( column_name [, ...] )
29 FORCE_NULL ( column_name [, ...] )
30 ENCODING 'encoding_name'
31
33 COPY moves data between PostgreSQL tables and standard file-system
34 files. COPY TO copies the contents of a table to a file, while COPY
35 FROM copies data from a file to a table (appending the data to whatever
36 is in the table already). COPY TO can also copy the results of a
37 SELECT query.
38
39 If a list of columns is specified, COPY will only copy the data in the
40 specified columns to or from the file. If there are any columns in the
41 table that are not in the column list, COPY FROM will insert the
42 default values for those columns.
43
44 COPY with a file name instructs the PostgreSQL server to directly read
45 from or write to a file. The file must be accessible by the PostgreSQL
46 user (the user ID the server runs as) and the name must be specified
47 from the viewpoint of the server. When PROGRAM is specified, the server
48 executes the given command and reads from the standard output of the
49 program, or writes to the standard input of the program. The command
50 must be specified from the viewpoint of the server, and be executable
51 by the PostgreSQL user. When STDIN or STDOUT is specified, data is
52 transmitted via the connection between the client and the server.
53
55 table_name
56 The name (optionally schema-qualified) of an existing table.
57
58 column_name
59 An optional list of columns to be copied. If no column list is
60 specified, all columns of the table will be copied.
61
62 query
63 A SELECT(7), VALUES(7), INSERT(7), UPDATE(7) or DELETE(7) command
64 whose results are to be copied. Note that parentheses are required
65 around the query.
66
67 For INSERT, UPDATE and DELETE queries a RETURNING clause must be
68 provided, and the target relation must not have a conditional rule,
69 nor an ALSO rule, nor an INSTEAD rule that expands to multiple
70 statements.
71
72 filename
73 The path name of the input or output file. An input file name can
74 be an absolute or relative path, but an output file name must be an
75 absolute path. Windows users might need to use an E'' string and
76 double any backslashes used in the path name.
77
78 PROGRAM
79 A command to execute. In COPY FROM, the input is read from standard
80 output of the command, and in COPY TO, the output is written to the
81 standard input of the command.
82
83 Note that the command is invoked by the shell, so if you need to
84 pass any arguments to shell command that come from an untrusted
85 source, you must be careful to strip or escape any special
86 characters that might have a special meaning for the shell. For
87 security reasons, it is best to use a fixed command string, or at
88 least avoid passing any user input in it.
89
90 STDIN
91 Specifies that input comes from the client application.
92
93 STDOUT
94 Specifies that output goes to the client application.
95
96 boolean
97 Specifies whether the selected option should be turned on or off.
98 You can write TRUE, ON, or 1 to enable the option, and FALSE, OFF,
99 or 0 to disable it. The boolean value can also be omitted, in which
100 case TRUE is assumed.
101
102 FORMAT
103 Selects the data format to be read or written: text, csv (Comma
104 Separated Values), or binary. The default is text.
105
106 OIDS
107 Specifies copying the OID for each row. (An error is raised if OIDS
108 is specified for a table that does not have OIDs, or in the case of
109 copying a query.)
110
111 FREEZE
112 Requests copying the data with rows already frozen, just as they
113 would be after running the VACUUM FREEZE command. This is intended
114 as a performance option for initial data loading. Rows will be
115 frozen only if the table being loaded has been created or truncated
116 in the current subtransaction, there are no cursors open and there
117 are no older snapshots held by this transaction. It is currently
118 not possible to perform a COPY FREEZE on a partitioned table.
119
120 Note that all other sessions will immediately be able to see the
121 data once it has been successfully loaded. This violates the normal
122 rules of MVCC visibility and users specifying should be aware of
123 the potential problems this might cause.
124
125 DELIMITER
126 Specifies the character that separates columns within each row
127 (line) of the file. The default is a tab character in text format,
128 a comma in CSV format. This must be a single one-byte character.
129 This option is not allowed when using binary format.
130
131 NULL
132 Specifies the string that represents a null value. The default is
133 \N (backslash-N) in text format, and an unquoted empty string in
134 CSV format. You might prefer an empty string even in text format
135 for cases where you don't want to distinguish nulls from empty
136 strings. This option is not allowed when using binary format.
137
138 Note
139 When using COPY FROM, any data item that matches this string
140 will be stored as a null value, so you should make sure that
141 you use the same string as you used with COPY TO.
142
143 HEADER
144 Specifies that the file contains a header line with the names of
145 each column in the file. On output, the first line contains the
146 column names from the table, and on input, the first line is
147 ignored. This option is allowed only when using CSV format.
148
149 QUOTE
150 Specifies the quoting character to be used when a data value is
151 quoted. The default is double-quote. This must be a single one-byte
152 character. This option is allowed only when using CSV format.
153
154 ESCAPE
155 Specifies the character that should appear before a data character
156 that matches the QUOTE value. The default is the same as the QUOTE
157 value (so that the quoting character is doubled if it appears in
158 the data). This must be a single one-byte character. This option is
159 allowed only when using CSV format.
160
161 FORCE_QUOTE
162 Forces quoting to be used for all non-NULL values in each specified
163 column. NULL output is never quoted. If * is specified, non-NULL
164 values will be quoted in all columns. This option is allowed only
165 in COPY TO, and only when using CSV format.
166
167 FORCE_NOT_NULL
168 Do not match the specified columns' values against the null string.
169 In the default case where the null string is empty, this means that
170 empty values will be read as zero-length strings rather than nulls,
171 even when they are not quoted. This option is allowed only in COPY
172 FROM, and only when using CSV format.
173
174 FORCE_NULL
175 Match the specified columns' values against the null string, even
176 if it has been quoted, and if a match is found set the value to
177 NULL. In the default case where the null string is empty, this
178 converts a quoted empty string into NULL. This option is allowed
179 only in COPY FROM, and only when using CSV format.
180
181 ENCODING
182 Specifies that the file is encoded in the encoding_name. If this
183 option is omitted, the current client encoding is used. See the
184 Notes below for more details.
185
187 On successful completion, a COPY command returns a command tag of the
188 form
189
190 COPY count
191
192 The count is the number of rows copied.
193
194 Note
195 psql will print this command tag only if the command was not COPY
196 ... TO STDOUT, or the equivalent psql meta-command \copy ... to
197 stdout. This is to prevent confusing the command tag with the data
198 that was just printed.
199
201 COPY TO can only be used with plain tables, not with views. However,
202 you can write COPY (SELECT * FROM viewname) TO ... to copy the current
203 contents of a view.
204
205 COPY FROM can be used with plain tables and with views that have
206 INSTEAD OF INSERT triggers.
207
208 COPY only deals with the specific table named; it does not copy data to
209 or from child tables. Thus for example COPY table TO shows the same
210 data as SELECT * FROM ONLY table. But COPY (SELECT * FROM table) TO ...
211 can be used to dump all of the data in an inheritance hierarchy.
212
213 You must have select privilege on the table whose values are read by
214 COPY TO, and insert privilege on the table into which values are
215 inserted by COPY FROM. It is sufficient to have column privileges on
216 the column(s) listed in the command.
217
218 If row-level security is enabled for the table, the relevant SELECT
219 policies will apply to COPY table TO statements. Currently, COPY FROM
220 is not supported for tables with row-level security. Use equivalent
221 INSERT statements instead.
222
223 Files named in a COPY command are read or written directly by the
224 server, not by the client application. Therefore, they must reside on
225 or be accessible to the database server machine, not the client. They
226 must be accessible to and readable or writable by the PostgreSQL user
227 (the user ID the server runs as), not the client. Similarly, the
228 command specified with PROGRAM is executed directly by the server, not
229 by the client application, must be executable by the PostgreSQL user.
230 COPY naming a file or command is only allowed to database superusers,
231 since it allows reading or writing any file that the server has
232 privileges to access.
233
234 Do not confuse COPY with the psql instruction \copy. \copy invokes
235 COPY FROM STDIN or COPY TO STDOUT, and then fetches/stores the data in
236 a file accessible to the psql client. Thus, file accessibility and
237 access rights depend on the client rather than the server when \copy is
238 used.
239
240 It is recommended that the file name used in COPY always be specified
241 as an absolute path. This is enforced by the server in the case of COPY
242 TO, but for COPY FROM you do have the option of reading from a file
243 specified by a relative path. The path will be interpreted relative to
244 the working directory of the server process (normally the cluster's
245 data directory), not the client's working directory.
246
247 Executing a command with PROGRAM might be restricted by the operating
248 system's access control mechanisms, such as SELinux.
249
250 COPY FROM will invoke any triggers and check constraints on the
251 destination table. However, it will not invoke rules.
252
253 For identity columns, the COPY FROM command will always write the
254 column values provided in the input data, like the INSERT option
255 OVERRIDING SYSTEM VALUE.
256
257 COPY input and output is affected by DateStyle. To ensure portability
258 to other PostgreSQL installations that might use non-default DateStyle
259 settings, DateStyle should be set to ISO before using COPY TO. It is
260 also a good idea to avoid dumping data with IntervalStyle set to
261 sql_standard, because negative interval values might be misinterpreted
262 by a server that has a different setting for IntervalStyle.
263
264 Input data is interpreted according to ENCODING option or the current
265 client encoding, and output data is encoded in ENCODING or the current
266 client encoding, even if the data does not pass through the client but
267 is read from or written to a file directly by the server.
268
269 COPY stops operation at the first error. This should not lead to
270 problems in the event of a COPY TO, but the target table will already
271 have received earlier rows in a COPY FROM. These rows will not be
272 visible or accessible, but they still occupy disk space. This might
273 amount to a considerable amount of wasted disk space if the failure
274 happened well into a large copy operation. You might wish to invoke
275 VACUUM to recover the wasted space.
276
277 FORCE_NULL and FORCE_NOT_NULL can be used simultaneously on the same
278 column. This results in converting quoted null strings to null values
279 and unquoted null strings to empty strings.
280
282 Text Format
283 When the text format is used, the data read or written is a text file
284 with one line per table row. Columns in a row are separated by the
285 delimiter character. The column values themselves are strings generated
286 by the output function, or acceptable to the input function, of each
287 attribute's data type. The specified null string is used in place of
288 columns that are null. COPY FROM will raise an error if any line of
289 the input file contains more or fewer columns than are expected. If
290 OIDS is specified, the OID is read or written as the first column,
291 preceding the user data columns.
292
293 End of data can be represented by a single line containing just
294 backslash-period (\.). An end-of-data marker is not necessary when
295 reading from a file, since the end of file serves perfectly well; it is
296 needed only when copying data to or from client applications using
297 pre-3.0 client protocol.
298
299 Backslash characters (\) can be used in the COPY data to quote data
300 characters that might otherwise be taken as row or column delimiters.
301 In particular, the following characters must be preceded by a backslash
302 if they appear as part of a column value: backslash itself, newline,
303 carriage return, and the current delimiter character.
304
305 The specified null string is sent by COPY TO without adding any
306 backslashes; conversely, COPY FROM matches the input against the null
307 string before removing backslashes. Therefore, a null string such as \N
308 cannot be confused with the actual data value \N (which would be
309 represented as \\N).
310
311 The following special backslash sequences are recognized by COPY FROM:
312
313 ┌─────────┬────────────────────────────┐
314 │Sequence │ Represents │
315 ├─────────┼────────────────────────────┤
316 │\b │ Backspace (ASCII 8) │
317 ├─────────┼────────────────────────────┤
318 │\f │ Form feed (ASCII 12) │
319 ├─────────┼────────────────────────────┤
320 │\n │ Newline (ASCII 10) │
321 ├─────────┼────────────────────────────┤
322 │\r │ Carriage return (ASCII 13) │
323 ├─────────┼────────────────────────────┤
324 │\t │ Tab (ASCII 9) │
325 ├─────────┼────────────────────────────┤
326 │\v │ Vertical tab (ASCII 11) │
327 ├─────────┼────────────────────────────┤
328 │\digits │ Backslash followed by one │
329 │ │ to three octal digits │
330 │ │ specifies │
331 │ │ the character with │
332 │ │ that numeric code │
333 ├─────────┼────────────────────────────┤
334 │\xdigits │ Backslash x followed by │
335 │ │ one or two hex digits │
336 │ │ specifies │
337 │ │ the character with │
338 │ │ that numeric code │
339 └─────────┴────────────────────────────┘
340 Presently, COPY TO will never emit an octal or hex-digits backslash
341 sequence, but it does use the other sequences listed above for those
342 control characters.
343
344 Any other backslashed character that is not mentioned in the above
345 table will be taken to represent itself. However, beware of adding
346 backslashes unnecessarily, since that might accidentally produce a
347 string matching the end-of-data marker (\.) or the null string (\N by
348 default). These strings will be recognized before any other backslash
349 processing is done.
350
351 It is strongly recommended that applications generating COPY data
352 convert data newlines and carriage returns to the \n and \r sequences
353 respectively. At present it is possible to represent a data carriage
354 return by a backslash and carriage return, and to represent a data
355 newline by a backslash and newline. However, these representations
356 might not be accepted in future releases. They are also highly
357 vulnerable to corruption if the COPY file is transferred across
358 different machines (for example, from Unix to Windows or vice versa).
359
360 COPY TO will terminate each row with a Unix-style newline (“\n”).
361 Servers running on Microsoft Windows instead output carriage
362 return/newline (“\r\n”), but only for COPY to a server file; for
363 consistency across platforms, COPY TO STDOUT always sends “\n”
364 regardless of server platform. COPY FROM can handle lines ending with
365 newlines, carriage returns, or carriage return/newlines. To reduce the
366 risk of error due to un-backslashed newlines or carriage returns that
367 were meant as data, COPY FROM will complain if the line endings in the
368 input are not all alike.
369
370 CSV Format
371 This format option is used for importing and exporting the Comma
372 Separated Value (CSV) file format used by many other programs, such as
373 spreadsheets. Instead of the escaping rules used by PostgreSQL's
374 standard text format, it produces and recognizes the common CSV
375 escaping mechanism.
376
377 The values in each record are separated by the DELIMITER character. If
378 the value contains the delimiter character, the QUOTE character, the
379 NULL string, a carriage return, or line feed character, then the whole
380 value is prefixed and suffixed by the QUOTE character, and any
381 occurrence within the value of a QUOTE character or the ESCAPE
382 character is preceded by the escape character. You can also use
383 FORCE_QUOTE to force quotes when outputting non-NULL values in specific
384 columns.
385
386 The CSV format has no standard way to distinguish a NULL value from an
387 empty string. PostgreSQL's COPY handles this by quoting. A NULL is
388 output as the NULL parameter string and is not quoted, while a non-NULL
389 value matching the NULL parameter string is quoted. For example, with
390 the default settings, a NULL is written as an unquoted empty string,
391 while an empty string data value is written with double quotes ("").
392 Reading values follows similar rules. You can use FORCE_NOT_NULL to
393 prevent NULL input comparisons for specific columns. You can also use
394 FORCE_NULL to convert quoted null string data values to NULL.
395
396 Because backslash is not a special character in the CSV format, \., the
397 end-of-data marker, could also appear as a data value. To avoid any
398 misinterpretation, a \. data value appearing as a lone entry on a line
399 is automatically quoted on output, and on input, if quoted, is not
400 interpreted as the end-of-data marker. If you are loading a file
401 created by another application that has a single unquoted column and
402 might have a value of \., you might need to quote that value in the
403 input file.
404
405 Note
406 In CSV format, all characters are significant. A quoted value
407 surrounded by white space, or any characters other than DELIMITER,
408 will include those characters. This can cause errors if you import
409 data from a system that pads CSV lines with white space out to some
410 fixed width. If such a situation arises you might need to
411 preprocess the CSV file to remove the trailing white space, before
412 importing the data into PostgreSQL.
413
414 Note
415 CSV format will both recognize and produce CSV files with quoted
416 values containing embedded carriage returns and line feeds. Thus
417 the files are not strictly one line per table row like text-format
418 files.
419
420 Note
421 Many programs produce strange and occasionally perverse CSV files,
422 so the file format is more a convention than a standard. Thus you
423 might encounter some files that cannot be imported using this
424 mechanism, and COPY might produce files that other programs cannot
425 process.
426
427 Binary Format
428 The binary format option causes all data to be stored/read as binary
429 format rather than as text. It is somewhat faster than the text and CSV
430 formats, but a binary-format file is less portable across machine
431 architectures and PostgreSQL versions. Also, the binary format is very
432 data type specific; for example it will not work to output binary data
433 from a smallint column and read it into an integer column, even though
434 that would work fine in text format.
435
436 The binary file format consists of a file header, zero or more tuples
437 containing the row data, and a file trailer. Headers and data are in
438 network byte order.
439
440 Note
441 PostgreSQL releases before 7.4 used a different binary file format.
442
443 File Header
444 The file header consists of 15 bytes of fixed fields, followed by a
445 variable-length header extension area. The fixed fields are:
446
447 Signature
448 11-byte sequence PGCOPY\n\377\r\n\0 — note that the zero byte
449 is a required part of the signature. (The signature is designed
450 to allow easy identification of files that have been munged by
451 a non-8-bit-clean transfer. This signature will be changed by
452 end-of-line-translation filters, dropped zero bytes, dropped
453 high bits, or parity changes.)
454
455 Flags field
456 32-bit integer bit mask to denote important aspects of the file
457 format. Bits are numbered from 0 (LSB) to 31 (MSB). Note that
458 this field is stored in network byte order (most significant
459 byte first), as are all the integer fields used in the file
460 format. Bits 16-31 are reserved to denote critical file format
461 issues; a reader should abort if it finds an unexpected bit set
462 in this range. Bits 0-15 are reserved to signal
463 backwards-compatible format issues; a reader should simply
464 ignore any unexpected bits set in this range. Currently only
465 one flag bit is defined, and the rest must be zero:
466
467 Bit 16
468 if 1, OIDs are included in the data; if 0, not
469
470 Header extension area length
471 32-bit integer, length in bytes of remainder of header, not
472 including self. Currently, this is zero, and the first tuple
473 follows immediately. Future changes to the format might allow
474 additional data to be present in the header. A reader should
475 silently skip over any header extension data it does not know
476 what to do with.
477
478 The header extension area is envisioned to contain a sequence of
479 self-identifying chunks. The flags field is not intended to tell
480 readers what is in the extension area. Specific design of header
481 extension contents is left for a later release.
482
483 This design allows for both backwards-compatible header additions
484 (add header extension chunks, or set low-order flag bits) and
485 non-backwards-compatible changes (set high-order flag bits to
486 signal such changes, and add supporting data to the extension area
487 if needed).
488
489 Tuples
490 Each tuple begins with a 16-bit integer count of the number of
491 fields in the tuple. (Presently, all tuples in a table will have
492 the same count, but that might not always be true.) Then, repeated
493 for each field in the tuple, there is a 32-bit length word followed
494 by that many bytes of field data. (The length word does not include
495 itself, and can be zero.) As a special case, -1 indicates a NULL
496 field value. No value bytes follow in the NULL case.
497
498 There is no alignment padding or any other extra data between
499 fields.
500
501 Presently, all data values in a binary-format file are assumed to
502 be in binary format (format code one). It is anticipated that a
503 future extension might add a header field that allows per-column
504 format codes to be specified.
505
506 To determine the appropriate binary format for the actual tuple
507 data you should consult the PostgreSQL source, in particular the
508 *send and *recv functions for each column's data type (typically
509 these functions are found in the src/backend/utils/adt/ directory
510 of the source distribution).
511
512 If OIDs are included in the file, the OID field immediately follows
513 the field-count word. It is a normal field except that it's not
514 included in the field-count. In particular it has a length word —
515 this will allow handling of 4-byte vs. 8-byte OIDs without too much
516 pain, and will allow OIDs to be shown as null if that ever proves
517 desirable.
518
519 File Trailer
520 The file trailer consists of a 16-bit integer word containing -1.
521 This is easily distinguished from a tuple's field-count word.
522
523 A reader should report an error if a field-count word is neither -1
524 nor the expected number of columns. This provides an extra check
525 against somehow getting out of sync with the data.
526
528 The following example copies a table to the client using the vertical
529 bar (|) as the field delimiter:
530
531 COPY country TO STDOUT (DELIMITER '|');
532
533 To copy data from a file into the country table:
534
535 COPY country FROM '/usr1/proj/bray/sql/country_data';
536
537 To copy into a file just the countries whose names start with 'A':
538
539 COPY (SELECT * FROM country WHERE country_name LIKE 'A%') TO '/usr1/proj/bray/sql/a_list_countries.copy';
540
541 To copy into a compressed file, you can pipe the output through an
542 external compression program:
543
544 COPY country TO PROGRAM 'gzip > /usr1/proj/bray/sql/country_data.gz';
545
546 Here is a sample of data suitable for copying into a table from STDIN:
547
548 AF AFGHANISTAN
549 AL ALBANIA
550 DZ ALGERIA
551 ZM ZAMBIA
552 ZW ZIMBABWE
553
554 Note that the white space on each line is actually a tab character.
555
556 The following is the same data, output in binary format. The data is
557 shown after filtering through the Unix utility od -c. The table has
558 three columns; the first has type char(2), the second has type text,
559 and the third has type integer. All the rows have a null value in the
560 third column.
561
562 0000000 P G C O P Y \n 377 \r \n \0 \0 \0 \0 \0 \0
563 0000020 \0 \0 \0 \0 003 \0 \0 \0 002 A F \0 \0 \0 013 A
564 0000040 F G H A N I S T A N 377 377 377 377 \0 003
565 0000060 \0 \0 \0 002 A L \0 \0 \0 007 A L B A N I
566 0000100 A 377 377 377 377 \0 003 \0 \0 \0 002 D Z \0 \0 \0
567 0000120 007 A L G E R I A 377 377 377 377 \0 003 \0 \0
568 0000140 \0 002 Z M \0 \0 \0 006 Z A M B I A 377 377
569 0000160 377 377 \0 003 \0 \0 \0 002 Z W \0 \0 \0 \b Z I
570 0000200 M B A B W E 377 377 377 377 377 377
571
573 There is no COPY statement in the SQL standard.
574
575 The following syntax was used before PostgreSQL version 9.0 and is
576 still supported:
577
578 COPY table_name [ ( column_name [, ...] ) ]
579 FROM { 'filename' | STDIN }
580 [ [ WITH ]
581 [ BINARY ]
582 [ OIDS ]
583 [ DELIMITER [ AS ] 'delimiter' ]
584 [ NULL [ AS ] 'null string' ]
585 [ CSV [ HEADER ]
586 [ QUOTE [ AS ] 'quote' ]
587 [ ESCAPE [ AS ] 'escape' ]
588 [ FORCE NOT NULL column_name [, ...] ] ] ]
589
590 COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }
591 TO { 'filename' | STDOUT }
592 [ [ WITH ]
593 [ BINARY ]
594 [ OIDS ]
595 [ DELIMITER [ AS ] 'delimiter' ]
596 [ NULL [ AS ] 'null string' ]
597 [ CSV [ HEADER ]
598 [ QUOTE [ AS ] 'quote' ]
599 [ ESCAPE [ AS ] 'escape' ]
600 [ FORCE QUOTE { column_name [, ...] | * } ] ] ]
601
602 Note that in this syntax, BINARY and CSV are treated as independent
603 keywords, not as arguments of a FORMAT option.
604
605 The following syntax was used before PostgreSQL version 7.3 and is
606 still supported:
607
608 COPY [ BINARY ] table_name [ WITH OIDS ]
609 FROM { 'filename' | STDIN }
610 [ [USING] DELIMITERS 'delimiter' ]
611 [ WITH NULL AS 'null string' ]
612
613 COPY [ BINARY ] table_name [ WITH OIDS ]
614 TO { 'filename' | STDOUT }
615 [ [USING] DELIMITERS 'delimiter' ]
616 [ WITH NULL AS 'null string' ]
617
618
619
620
621PostgreSQL 10.7 2019 COPY(7)