1PDFTK(1) General Commands Manual PDFTK(1)
2
3
4
NAME
6 pdftk - A handy tool for manipulating PDF
7
SYNOPSIS
9 pdftk <input PDF files | - | PROMPT>
10 [ input_pw <input PDF owner passwords | PROMPT> ]
11 [ <operation> <operation arguments> ]
12 [ output <output filename | - | PROMPT> ]
13 [ encrypt_40bit | encrypt_128bit | encrypt_aes128 ]
14 [ allow <permissions> ]
15 [ owner_pw <owner password | PROMPT> ]
16 [ user_pw <user password | PROMPT> ]
17 [ flatten ] [ need_appearances ]
18 [ compress | uncompress ]
19 [ keep_first_id | keep_final_id ] [ drop_xfa ] [ drop_xmp ]
20 [ replacement_font <font name> ]
21 [ verbose ] [ dont_ask | do_ask ]
22 Where:
23 <operation> may be empty, or:
24 [ cat | shuffle | burst | rotate |
25 generate_fdf | fill_form |
26 background | multibackground |
27 stamp | multistamp |
28 dump_data | dump_data_utf8 |
29 dump_data_fields | dump_data_fields_utf8 |
30 dump_data_annots |
31 update_info | update_info_utf8 |
32 attach_files | unpack_files ]
33
34 For Complete Help: pdftk --help
35
DESCRIPTION
37 If PDF is electronic paper, then pdftk is an electronic staple-remover,
38 hole-punch, binder, secret-decoder-ring, and X-Ray-glasses. Pdftk is a
39 simple tool for doing everyday things with PDF documents. Use it to:
40
41 * Merge PDF Documents or Collate PDF Page Scans
42 * Split PDF Pages into a New Document
43 * Rotate PDF Documents or Pages
44 * Decrypt Input as Necessary (Password Required)
45 * Encrypt Output as Desired
46 * Fill PDF Forms with X/FDF Data and/or Flatten Forms
47 * Generate FDF Data Stencils from PDF Forms
48 * Apply a Background Watermark or a Foreground Stamp
49 * Report PDF Metrics, Bookmarks and Metadata
50 * Add/Update PDF Metrics, Bookmarks or Metadata
51 * Attach Files to PDF Pages or the PDF Document
52 * Unpack PDF Attachments
53 * Burst a PDF Document into Single Pages
54 * Uncompress and Re-Compress Page Streams
55 * Repair Corrupted PDF (Where Possible)
56
OPTIONS
58 A summary of options is included below.
59
60 --help, -h
61 Show this summary of options.
62
63 <input PDF files | - | PROMPT>
64 A list of the input PDF files. If you plan to combine these PDFs
65 (without using handles) then list files in the order you want
66 them combined. Use - to pass a single PDF into pdftk via stdin.
67 Input files can be associated with handles, where a handle is
68 one or more upper-case letters:
69
70 <input PDF handle>=<input PDF filename>
71
72 Handles are often omitted. They are useful when specifying PDF
73 passwords or page ranges, later.
74
75 For example: A=input1.pdf QT=input2.pdf M=input3.pdf
76
77 [input_pw <input PDF owner passwords | PROMPT>]
78 Input PDF owner passwords, if necessary, are associated with
79 files by using their handles:
80
81 <input PDF handle>=<input PDF file owner password>
82
83 If handles are not given, then passwords are associated with in‐
84 put files by order.
85
86 Most pdftk features require that an encrypted input PDF be
87 accompanied by its owner password. If the input PDF has no owner
88 password, then the user password must be given, instead. If the
89 input PDF has no passwords, then no password should be given.
90
91 When running in do_ask mode, pdftk will prompt you for a pass‐
92 word if the supplied password is incorrect or none was given.
93
94 [<operation> <operation arguments>]
95 Available operations are: cat, shuffle, burst, rotate, gener‐
96 ate_fdf, fill_form, background, multibackground, stamp, multi‐
97 stamp, dump_data, dump_data_utf8, dump_data_fields,
98 dump_data_fields_utf8, dump_data_annots, update_info, up‐
99 date_info_utf8, attach_files, unpack_files. Some operations
100 take additional arguments, described below.
101
102 If this optional argument is omitted, then pdftk runs in 'fil‐
103 ter' mode. Filter mode takes only one PDF input and creates a
104 new PDF after applying all of the output options, like encryp‐
105 tion and compression.
106
107 cat [<page ranges>]
108 Assembles (catenates) pages from input PDFs to create a new
109 PDF. Use cat to merge PDF pages or to split PDF pages from
110 documents. You can also use it to rotate PDF pages. Page or‐
111 der in the new PDF is specified by the order of the given
112 page ranges. Page ranges are described like this:
113
114 <input PDF handle>[<begin page number>[-<end page num‐
115 ber>[<qualifier>]]][<page rotation>]
116
117 Where the handle identifies one of the input PDF files, and
118 the beginning and ending page numbers are one-based refer‐
119 ences to pages in the PDF file. The qualifier can be even,
120 odd, or ~, and the page rotation can be north, south, east,
121 west, left, right, or down.
122
123 If a PDF handle is given but no pages are specified, then the
124 entire PDF is used. If no pages are specified for any of the
125 input PDFs, then the input PDFs' bookmarks are also merged
126 and included in the output.
127
128 If the handle is omitted from the page range, then the pages
129 are taken from the first input PDF.
130
131 The even qualifier causes pdftk to use only the even-numbered
132 PDF pages, so 1-6even yields pages 2, 4 and 6 in that order.
133 6-1even yields pages 6, 4 and 2 in that order.
134
135 The odd qualifier works the same way for odd-numbered pages.
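The effect of the even and odd qualifiers on a simple range can be sketched in POSIX shell. The helper name expand_range is hypothetical and is not part of pdftk; it only models the selection order described above and ignores handles, the ~ qualifier, and rotation.

```shell
# expand_range BEGIN END [even|odd]
# Hypothetical helper (not part of pdftk): prints the page numbers that a
# simple range selects, walking from BEGIN to END in either direction.
expand_range() {
    begin=$1
    end=$2
    qualifier=${3:-}
    if [ "$begin" -le "$end" ]; then step=1; else step=-1; fi
    page=$begin
    out=""
    while :; do
        keep=yes
        if [ "$qualifier" = even ] && [ $((page % 2)) -ne 0 ]; then keep=no; fi
        if [ "$qualifier" = odd ] && [ $((page % 2)) -ne 1 ]; then keep=no; fi
        if [ "$keep" = yes ]; then out="${out:+$out }$page"; fi
        if [ "$page" -eq "$end" ]; then break; fi
        page=$((page + step))
    done
    printf '%s\n' "$out"
}

expand_range 1 6 even   # prints: 2 4 6
expand_range 6 1 even   # prints: 6 4 2
```

The direction of the walk is what makes 6-1even come out in descending order.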
136
137 Pages can be subtracted from a page range using the ~ quali‐
138 fier followed by a page range. For instance, 1-20~5-6 and
139 1-20~5~6 are equivalent to 1-4 7-20, and ~5 yields all pages
140 except page 5. Depending on your shell, you may need to quote
141 this argument because of the ~ at the beginning.
142
143 The page rotation setting can cause pdftk to rotate pages and
144 documents. Each option sets the page rotation as follows (in
145 degrees): north: 0, east: 90, south: 180, west: 270, left:
146 -90, right: +90, down: +180. left, right, and down make rela‐
147 tive adjustments to a page's rotation.
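The difference between the absolute keywords (north, east, south, west) and the relative ones (left, right, down) can be illustrated with a small POSIX shell sketch. The function apply_rotation is hypothetical, and wrapping the result into the range 0 to 359 is an assumption made for the illustration, not something this page states.

```shell
# apply_rotation CURRENT KEYWORD
# Hypothetical helper (not part of pdftk): prints the rotation, in degrees,
# that a page ends up with when KEYWORD is applied to a page whose rotation
# is CURRENT.
apply_rotation() {
    current=$1
    case $2 in
        north) new=0 ;;                       # absolute settings
        east)  new=90 ;;
        south) new=180 ;;
        west)  new=270 ;;
        left)  new=$((current - 90)) ;;       # relative adjustments
        right) new=$((current + 90)) ;;
        down)  new=$((current + 180)) ;;
        *) echo "unknown rotation: $2" >&2; return 1 ;;
    esac
    # Assumption: wrap the result into 0..359.
    echo $(( (new % 360 + 360) % 360 ))
}

apply_rotation 90 west    # prints: 270 (the old rotation is ignored)
apply_rotation 90 left    # prints: 0   (90 - 90)
```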
148
149 If no arguments are passed to cat, then pdftk combines all
150 input PDFs in the order they were given to create the output.
151
152 NOTES:
153 * <end page number> may be less than <begin page number>.
154 * The keyword end may be used to reference the final page of
155 a document instead of a page number.
156 * Reference a single page by omitting the ending page number.
157 * The handle may be used alone to represent the entire PDF
158 document, e.g., B1-end is the same as B.
159 * You can reference page numbers in reverse order by prefix‐
160 ing them with the letter r. For example, page r1 is the last
161 page of the document, r2 is the next-to-last page of the doc‐
162 ument, and rend is the first page of the document. You can
163 use this prefix in ranges, too, for example r3-r1 is the last
164 three pages of a PDF.
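The end keyword and the r prefix from the notes above can be resolved to ordinary page numbers once the page count is known. The following POSIX shell sketch uses a hypothetical helper, resolve_page, and is only a model of the notation, not pdftk's own parser.

```shell
# resolve_page REF PAGE_COUNT
# Hypothetical helper (not part of pdftk): turns a page reference such as
# 7, end, r1 or rend into a one-based page number.
resolve_page() {
    ref=$1
    count=$2
    case $ref in
        end)  echo "$count" ;;                      # final page
        rend) echo 1 ;;                             # first page
        r*)   echo $((count - ${ref#r} + 1)) ;;     # counted from the back
        *)    echo "$ref" ;;                        # plain page number
    esac
}

resolve_page r1 10     # prints: 10
resolve_page r3 10     # prints: 8
resolve_page rend 10   # prints: 1
```

With a 10-page document, r3-r1 therefore covers pages 8 through 10, the last three pages.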
165
166 Page Range Examples without Handles:
167 1-endeast – rotate entire document 90 degrees
168 5 11 20 – take single pages from input PDF
169 5-25oddwest – take odd pages in range, set rotation to 270 degrees
170 6-1 – reverse pages in range from input PDF
171
172 Page Range Examples Using Handles:
173 Say A=in1.pdf B=in2.pdf, then:
174 A1-21 – take range from in1.pdf
175 Bend-1odd – take all odd pages from in2.pdf in reverse order
176 A72 – take a single page from in1.pdf
177 A1-21 Beven A72 – assemble pages from both in1.pdf and
178 in2.pdf
179 Awest – set rotation of entire in1.pdf document to 270 degrees
180 B – use all of in2.pdf
181 A2-30evenleft – take the even pages from the range, remove 90
182 degrees from each page's rotation
183 A A – catenate in1.pdf with in1.pdf
184 Aevenwest Aoddeast – apply rotations to even pages, odd pages
185 from in1.pdf
186 Awest Bwest Bdown – catenate rotated documents
187
188 shuffle [<page ranges>]
189 Collates pages from input PDFs to create a new PDF. Works
190 like the cat operation except that it takes one page at a
191 time from each page range to assemble the output PDF. If one
192 range runs out of pages, it continues with the remaining
193 ranges. Ranges can use all of the features described above
194 for cat, like reverse page ranges, multiple ranges from a
195 single PDF, and page rotation. This feature was designed to
196 help collate PDF pages after scanning paper documents.
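The collation order that shuffle produces for two page ranges can be modeled in POSIX shell. The helper shuffle_pages is hypothetical and works on space-separated labels rather than on real PDF pages; it is a sketch of the ordering described above, limited to two ranges.

```shell
# shuffle_pages "LIST1" "LIST2"
# Hypothetical helper (not part of pdftk): interleaves two space-separated
# page lists one page at a time. When one list runs out, the rest of the
# other list follows.
shuffle_pages() {
    a=$1
    b=$2
    out=""
    while [ -n "$a" ] || [ -n "$b" ]; do
        if [ -n "$a" ]; then
            set -- $a                 # split the list into words
            out="${out:+$out }$1"     # take its first page
            shift
            a="$*"                    # keep the remainder
        fi
        if [ -n "$b" ]; then
            set -- $b
            out="${out:+$out }$1"
            shift
            b="$*"
        fi
    done
    printf '%s\n' "$out"
}

shuffle_pages "A1 A2 A3" "B1 B2"   # prints: A1 B1 A2 B2 A3
```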
197
198 burst Splits a single input PDF document into individual pages.
199 Also creates a report named doc_data.txt which is the same as
200 the output from dump_data. The output section can contain a
201 printf-styled format string to name these pages. For exam‐
202 ple, if you want pages named page_01.pdf, page_02.pdf, etc.,
203 pass output page_%02d.pdf to pdftk. If the pattern is omitted,
204 then a default pattern pg_%04d.pdf is used and produces
205 pages named pg_0001.pdf, pg_0002.pdf, etc. Encryption
206 can be applied to the output by appending output options such
207 as owner_pw, e.g.:
208
209 pdftk in.pdf burst owner_pw foopass
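To preview the filenames a printf-style pattern would yield, the shell's own printf can be used. This is only a sketch: it assumes the pattern receives the page number as its single argument, as the description above suggests, and the helper name burst_name is hypothetical.

```shell
# burst_name PATTERN PAGE_NUMBER
# Hypothetical helper (not part of pdftk): formats one page number with a
# printf-style pattern, as burst is described to do for output filenames.
burst_name() {
    printf "$1\n" "$2"
}

for n in 1 2 10; do
    burst_name 'page_%02d.pdf' "$n"
done
# prints:
# page_01.pdf
# page_02.pdf
# page_10.pdf
```

The matching pdftk call would be: pdftk in.pdf burst output page_%02d.pdf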
210
211 rotate [<page ranges>]
212 Takes a single input PDF and rotates just the specified
213 pages. All other pages remain unchanged. The page order re‐
214 mains unchanged. Specify the pages to rotate using the same
215 notation as you would with cat, except you omit the pages
216 that you aren't rotating:
217
218 [<begin page number>[-<end page number>[<qualifier>]]][<page
219 rotation>]
220
221 The qualifier can be even or odd, and the page rotation can
222 be north, south, east, west, left, right, or down.
223
224 Each option sets the page rotation as follows (in degrees):
225 north: 0, east: 90, south: 180, west: 270, left: -90, right:
226 +90, down: +180. left, right, and down make relative adjust‐
227 ments to a page's rotation.
228
229 The given order of the pages doesn't change the page order in
230 the output.
231
232 generate_fdf
233 Reads a single input PDF file and generates an FDF file suit‐
234 able for fill_form out of it to the given output filename or
235 (if no output is given) to stdout. Does not create a new
236 PDF.
237
238 fill_form <FDF data filename | XFDF data filename | - | PROMPT>
239 Fills the single input PDF's form fields with the data from
240 an FDF file, XFDF file or stdin. Enter the data filename af‐
241 ter fill_form, or use - to pass the data via stdin, like so:
242
243 pdftk form.pdf fill_form data.fdf output form.filled.pdf
244
245 If the input FDF file includes Rich Text formatted data in
246 addition to plain text, then the Rich Text data is packed
247 into the form fields as well as the plain text. Pdftk also
248 sets a flag that cues Reader/Acrobat to generate new field
249 appearances based on the Rich Text data. So when the user
250 opens the PDF, the viewer will create the Rich Text appear‐
251 ance on the spot. If the user's PDF viewer does not support
252 Rich Text, then the user will see the plain text data in‐
253 stead. If you flatten this form before Acrobat has a chance
254 to create (and save) new field appearances, then the plain
255 text field data is what you'll see.
256
257 Also see the flatten, need_appearances, and replacement_font
258 options.
259
260 background <background PDF filename | - | PROMPT>
261 Applies a PDF watermark to the background of a single input
262 PDF. Pass the background PDF's filename after background
263 like so:
264
265 pdftk in.pdf background back.pdf output out.pdf
266
267 Pdftk uses only the first page from the background PDF and
268 applies it to every page of the input PDF. This page is
269 scaled and rotated as needed to fit the input page. You can
270 use - to pass a background PDF into pdftk via stdin.
271
272 If the input PDF does not have a transparent background (such
273 as a PDF created from page scans) then the resulting back‐
274 ground won't be visible – use the stamp operation instead.
275
276 multibackground <background PDF filename | - | PROMPT>
277 Same as the background operation, but applies each page of
278 the background PDF to the corresponding page of the input
279 PDF. If the input PDF has more pages than the background PDF,
280 then the final background page is repeated across these remaining
281 pages in the input PDF.
282
283 stamp <stamp PDF filename | - | PROMPT>
284 This behaves just like the background operation except it
285 overlays the stamp PDF page on top of the input PDF docu‐
286 ment's pages. This works best if the stamp PDF page has a
287 transparent background.
288
289 multistamp <stamp PDF filename | - | PROMPT>
290 Same as the stamp operation, but applies each page of the
291 stamp PDF to the corresponding page of the input PDF.
292 If the input PDF has more pages than the stamp PDF, then the
293 final stamp page is repeated across these remaining pages in
294 the input PDF.
295
296 dump_data
297 Reads a single input PDF file and reports its metadata, book‐
298 marks (a/k/a outlines), page metrics (media, rotation and la‐
299 bels), data embedded by STAMPtk (see STAMPtk's embed option)
300 and other data to the given output filename or (if no output
301 is given) to stdout. Non-ASCII characters are encoded as XML
302 numerical entities. Does not create a new PDF.
303
304 dump_data_utf8
305 Same as dump_data except that the output is encoded as UTF-8.
306
307 dump_data_fields
308 Reads a single input PDF file and reports form field statis‐
309 tics to the given output filename or (if no output is given)
310 to stdout. Non-ASCII characters are encoded as XML numerical
311 entities. Does not create a new PDF.
312
313 dump_data_fields_utf8
314 Same as dump_data_fields except that the output is encoded as
315 UTF-8.
316
317 dump_data_annots
318 This operation currently reports only link annotations.
319 Reads a single input PDF file and reports annotation informa‐
320 tion to the given output filename or (if no output is given)
321 to stdout. Non-ASCII characters are encoded as XML numerical
322 entities. Does not create a new PDF.
323
324 update_info <info data filename | - | PROMPT>
325 Changes the bookmarks, page labels, page sizes, page rota‐
326 tions, and metadata in a single PDF's Info dictionary to
327 match the input data file. The input data file uses the same
328 syntax as the output from dump_data. Non-ASCII characters
329 should be encoded as XML numerical entities.
330
331 This operation does not change the metadata stored in the
332 PDF's XMP stream, if it has one. (For this reason you should
333 include a ModDate entry in your updated info with a current
334 date/timestamp, format: D:YYYYMMDDHHmmSS, e.g. D:201307241346
335 – omitted data after YYYY revert to default values.)
336
337 For example:
338
339 pdftk in.pdf update_info in.info output out.pdf
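A ModDate entry with the current time can be generated with date(1). In the sketch below, the InfoBegin/InfoKey/InfoValue record layout is an assumption based on typical dump_data output, so compare it with what your version of pdftk prints before relying on it. The file names in.info, in.pdf and out.pdf follow the example above.

```shell
# Current local time in the D:YYYYMMDDHHmmSS layout described above.
stamp=$(date +D:%Y%m%d%H%M%S)

# Assumed record layout (check it against your own dump_data output).
cat > in.info <<EOF
InfoBegin
InfoKey: ModDate
InfoValue: $stamp
EOF

# Apply the update only if pdftk and the input file are actually present.
if command -v pdftk >/dev/null 2>&1 && [ -f in.pdf ]; then
    pdftk in.pdf update_info in.info output out.pdf
fi
```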
340
341 update_info_utf8 <info data filename | - | PROMPT>
342 Same as update_info except that the input is encoded as
343 UTF-8.
344
345 attach_files <attachment filenames | PROMPT> [to_page <page number |
346 PROMPT> | relation <relationship>]
347 Packs arbitrary files into a PDF using PDF's file attachment
348 features. More than one attachment may be listed after at‐
349 tach_files. Attachments are added at the document level un‐
350 less the optional to_page option is given, in which case the
351 files are attached to the given page number (the first page
352 is 1, the final page is end). Attachments at the document
353 level may be tagged with a relationship among Source, Data,
354 Alternative, Supplement, and Unspecified (default).
355
356 For example:
357
358 pdftk in.pdf attach_files table1.html table2.html to_page 6
359 output out.pdf
360
361 pdftk in.pdf attach_files in.tex relation Source output
362 out.pdf
363
364 unpack_files
365 Copies all of the attachments from the input PDF into the
366 current folder or to an output directory given after output.
367 For example:
368
369 pdftk report.pdf unpack_files output ~/atts/
370
371 or, interactively:
372
373 pdftk report.pdf unpack_files output PROMPT
374
375 [output <output filename | - | PROMPT>]
376 The output PDF filename may not be set to the name of an input
377 filename. Use - to output to stdout. When using the dump_data
378 operation, use output to set the name of the output data file.
379 When using the unpack_files operation, use output to set the
380 name of an output directory. When using the burst operation,
381 you can use output to control the resulting PDF page filenames
382 (described above).
383
384 [encrypt_40bit | encrypt_128bit | encrypt_aes128]
385 If an output PDF user or owner password is given, the output PDF
386 encryption algorithm defaults to AES-128. The weaker RC4 40-bit
387 and RC4 128-bit algorithms can be chosen by specifying en‐
388 crypt_40bit or encrypt_128bit (discouraged).
389
390 [allow <permissions>]
391 Permissions are applied to the output PDF only if an encryption
392 strength is specified or an owner or user password is given. If
393 permissions are not specified, they default to 'none,' which
394 means all of the following features are disabled.
395
396 The permissions section may include one or more of the following
397 features:
398
399 Printing
400 Top Quality Printing
401
402 DegradedPrinting
403 Lower Quality Printing
404
405 ModifyContents
406 Also allows Assembly
407
408 Assembly
409
410 CopyContents
411 Also allows ScreenReaders
412
413 ScreenReaders
414
415 ModifyAnnotations
416 Also allows FillIn
417
418 FillIn
419
420 AllFeatures
421 Allows the user to perform all of the above, and top
422 quality printing.
423
424 [owner_pw <owner password | PROMPT>]
425
426 [user_pw <user password | PROMPT>]
427 If an encryption strength is given but no passwords are sup‐
428 plied, then the owner and user passwords remain empty, which
429 means that the resulting PDF may be opened and its security pa‐
430 rameters altered by anybody.
431
432 [compress | uncompress]
433 These are only useful when you want to edit PDF code in a text
434 editor like vim or emacs. Remove PDF page stream compression by
435 applying the uncompress filter. Use the compress filter to re‐
436 store compression.
437
438 [flatten]
439 Use this option to merge an input PDF's interactive form fields
440 (and their data) with the PDF's pages. Only one input PDF may be
441 given. Sometimes used with the fill_form operation.
442
443 [need_appearances]
444 Sets a flag that cues Reader/Acrobat to generate new field ap‐
445 pearances based on the form field values. Use this when filling
446 a form with non-ASCII text to ensure the best presentation in
447 Adobe Reader or Acrobat. It won't work when combined with the
448 flatten option.
449
450 [replacement_font <font name>]
451 Use the specified font to display text in form fields. This op‐
452 tion is useful when filling a form with non-ASCII text that is
453 not supported by the fonts included in the input PDF. font name
454 may be either the file name or the family name of a font, but
455 using a file name is more reliable. Currently only TrueType
456 fonts with Unicode text are supported.
457
458 [keep_first_id | keep_final_id]
459 When combining pages from multiple PDFs, use one of these op‐
460 tions to copy the document ID from either the first or final in‐
461 put document into the new output PDF. Otherwise pdftk creates a
462 new document ID for the output PDF. When no operation is given,
463 pdftk always uses the ID from the (single) input PDF.
464
465 [drop_xfa]
466 If your input PDF is a form created using Acrobat 7 or Adobe De‐
467 signer, then it probably has XFA data. Filling such a form us‐
468 ing pdftk yields a PDF with data that fails to display in Acro‐
469 bat 7 (and 6?). The workaround solution is to remove the form's
470 XFA data, either before you fill the form using pdftk or at the
471 time you fill the form. Using this option causes pdftk to omit
472 the XFA data from the output PDF form.
473
474 This option is only useful when running pdftk on a single input
475 PDF. When assembling a PDF from multiple inputs using pdftk,
476 any XFA data in the input is automatically omitted.
477
478 [drop_xmp]
479 Many PDFs store document metadata using both an Info dictionary
480 (old school) and an XMP stream (new school). Pdftk's up‐
481 date_info operation can update the Info dictionary, but not the
482 XMP stream. The proper remedy for this is to include a ModDate
483 entry in your updated info with a current date/timestamp. The
484 date/timestamp format is: D:YYYYMMDDHHmmSS, e.g. D:201307241346
485 – omitted data after YYYY revert to default values. This newer
486 ModDate should cue PDF viewers that the Info metadata is more
487 current than the XMP data.
488
489 Alternatively, you might prefer to remove the XMP stream from
490 the PDF altogether – that's what this option does. Note that
491 objects inside the PDF might have their own, separate XMP meta‐
492 data streams, and that drop_xmp does not remove those. It only
493 removes the PDF's document-level XMP stream.
494
495 [verbose]
496 By default, pdftk runs quietly. Append verbose to the end and it
497 will speak up.
498
499 [dont_ask | do_ask]
500 Depending on the compile-time settings (see ASK_ABOUT_WARNINGS),
501 pdftk might prompt you for further input when it encounters a
502 problem, such as a bad password. Override this default behavior
503 by adding dont_ask (so pdftk won't ask you what to do) or do_ask
504 (so pdftk will ask you what to do).
505
506 When running in dont_ask mode, pdftk will overwrite files with
507 its output without notice.
508
EXAMPLES
510 Collate scanned pages
511 pdftk A=even.pdf B=odd.pdf shuffle A B output collated.pdf
512 or if odd.pdf is in reverse order:
513 pdftk A=even.pdf B=odd.pdf shuffle A Bend-1 output collated.pdf
514
515 The following examples use actual passwords as command line parameters,
516 which is discouraged (see the SECURITY CONSIDERATIONS section).
517
518 Decrypt a PDF
519 pdftk secured.pdf input_pw foopass output unsecured.pdf
520
521 Encrypt a PDF using AES-128 (the default), withhold all permissions
522 (the default)
523 pdftk 1.pdf output 1.128.pdf owner_pw foopass
524
525 Same as above, except password 'baz' must also be used to open output
526 PDF
527 pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz
528
529 Same as above, except printing is allowed (once the PDF is open)
530 pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz allow printing
531
532 Apply RC4 40-bit encryption to output, revoking all permissions (the
533 default). Set the owner PW to 'foopass'.
534 pdftk 1.pdf 2.pdf cat output 3.pdf encrypt_40bit owner_pw foopass
535
536 Join two files, one of which requires the password 'foopass'. The out‐
537 put is not encrypted.
538 pdftk A=secured.pdf 2.pdf input_pw A=foopass cat output 3.pdf
539
540 Join in1.pdf and in2.pdf into a new PDF, out1.pdf
541 pdftk in1.pdf in2.pdf cat output out1.pdf
542 or (using handles):
543 pdftk A=in1.pdf B=in2.pdf cat A B output out1.pdf
544 or (using wildcards):
545 pdftk *.pdf cat output combined.pdf
546
547 Remove page 13 from in1.pdf to create out1.pdf
548 pdftk in1.pdf cat 1-12 14-end output out1.pdf
549 or:
550 pdftk A=in1.pdf cat A1-12 A14-end output out1.pdf
551
552 Uncompress PDF page streams for editing the PDF in a text editor (e.g.,
553 vim, emacs)
554 pdftk doc.pdf output doc.unc.pdf uncompress
555
556 Repair a PDF's corrupted XREF table and stream lengths, if possible
557 pdftk broken.pdf output fixed.pdf
558
559 Burst a single PDF document into pages and dump its data to
560 doc_data.txt
561 pdftk in.pdf burst
562
563 Burst a single PDF document into encrypted pages. Allow low-quality
564 printing
565 pdftk in.pdf burst owner_pw foopass allow DegradedPrinting
566
567 Write a report on PDF document metadata and bookmarks to report.txt
568 pdftk in.pdf dump_data output report.txt
569
570 Rotate the first PDF page to 90 degrees clockwise
571 pdftk in.pdf cat 1east 2-end output out.pdf
572
573 Rotate an entire PDF document to 180 degrees
574 pdftk in.pdf cat 1-endsouth output out.pdf
575
NOTES
577 This is a port of pdftk to Java. See
578 https://gitlab.com/pdftk-java/pdftk
579 The original program can be found at www.pdftk.com
580
AUTHOR
582 The original author of pdftk is Sid Steward (sid.steward at pdflabs dot
583 com).
584
SECURITY CONSIDERATIONS
586 Passing a password as a command line parameter is insecure because it
587 can get saved into the shell's history and be accessible by other users
588 via /proc. Use the keyword PROMPT and input any passwords via standard
589 input instead.
590
591
592
593 December 7, 2020 PDFTK(1)