PDFTK(1)                    General Commands Manual                   PDFTK(1)

NAME

6       pdftk - A handy tool for manipulating PDF
7

SYNOPSIS

       pdftk <input PDF files | - | PROMPT>
            [ input_pw <input PDF owner passwords | PROMPT> ]
            [ <operation> <operation arguments> ]
            [ output <output filename | - | PROMPT> ]
            [ encrypt_40bit | encrypt_128bit | encrypt_aes128 ]
            [ allow <permissions> ]
            [ owner_pw <owner password | PROMPT> ]
            [ user_pw <user password | PROMPT> ]
            [ flatten ] [ need_appearances ]
            [ compress | uncompress ]
            [ keep_first_id | keep_final_id ] [ drop_xfa ] [ drop_xmp ]
            [ replacement_font <font name> ]
            [ verbose ] [ dont_ask | do_ask ]
       Where:
            <operation> may be empty, or:
            [ cat | shuffle | burst | rotate |
              generate_fdf | fill_form |
              background | multibackground |
              stamp | multistamp |
              dump_data | dump_data_utf8 |
              dump_data_fields | dump_data_fields_utf8 |
              dump_data_annots |
              update_info | update_info_utf8 |
              attach_files | unpack_files ]

       For Complete Help: pdftk --help

DESCRIPTION

37       If PDF is electronic paper, then pdftk is an electronic staple-remover,
38       hole-punch, binder, secret-decoder-ring, and X-Ray-glasses.  Pdftk is a
39       simple tool for doing everyday things with PDF documents.  Use it to:
40
41       * Merge PDF Documents or Collate PDF Page Scans
42       * Split PDF Pages into a New Document
43       * Rotate PDF Documents or Pages
44       * Decrypt Input as Necessary (Password Required)
45       * Encrypt Output as Desired
46       * Fill PDF Forms with X/FDF Data and/or Flatten Forms
47       * Generate FDF Data Stencils from PDF Forms
48       * Apply a Background Watermark or a Foreground Stamp
49       * Report PDF Metrics, Bookmarks and Metadata
50       * Add/Update PDF Metrics, Bookmarks or Metadata
51       * Attach Files to PDF Pages or the PDF Document
52       * Unpack PDF Attachments
53       * Burst a PDF Document into Single Pages
54       * Uncompress and Re-Compress Page Streams
55       * Repair Corrupted PDF (Where Possible)
56

OPTIONS

58       A summary of options is included below.
59
60       --help, -h
61              Show this summary of options.
62
63       <input PDF files | - | PROMPT>
64              A list of the input PDF files. If you plan to combine these PDFs
65              (without using handles) then list files in the order you want
66              them combined.  Use - to pass a single PDF into pdftk via stdin.
67              Input files can be associated with handles, where a handle is
68              one or more upper-case letters:
69
70              <input PDF handle>=<input PDF filename>
71
72              Handles are often omitted.  They are useful when specifying PDF
73              passwords or page ranges, later.
74
75              For example: A=input1.pdf QT=input2.pdf M=input3.pdf
76
77       [input_pw <input PDF owner passwords | PROMPT>]
78              Input PDF owner passwords, if necessary, are associated with
79              files by using their handles:
80
81              <input PDF handle>=<input PDF file owner password>
82
83              If handles are not given, then passwords are associated with in‐
84              put files by order.
85
              Most pdftk features require that encrypted input PDFs be
              accompanied by the owner password. If the input PDF has no owner
              password, then the user password must be given instead.  If the
              input PDF has no passwords, then no password should be given.
90
91              When running in do_ask mode, pdftk will prompt you for a pass‐
92              word if the supplied password is incorrect or none was given.
93
94       [<operation> <operation arguments>]
              Available operations are: cat, shuffle, burst, rotate,
              generate_fdf, fill_form, background, multibackground, stamp,
              multistamp, dump_data, dump_data_utf8, dump_data_fields,
              dump_data_fields_utf8, dump_data_annots, update_info,
              update_info_utf8, attach_files, unpack_files. Some operations
              take additional arguments, described below.
101
102              If this optional argument is omitted, then pdftk runs in 'fil‐
103              ter' mode.  Filter mode takes only one PDF input and creates a
104              new PDF after applying all of the output options, like encryp‐
105              tion and compression.
106
107          cat [<page ranges>]
108                 Assembles (catenates) pages from input PDFs to create a new
109                 PDF. Use cat to merge PDF pages or to split PDF pages from
110                 documents. You can also use it to rotate PDF pages. Page or‐
111                 der in the new PDF is specified by the order of the given
112                 page ranges. Page ranges are described like this:
113
114                 <input PDF handle>[<begin page number>[-<end page num‐
115                 ber>[<qualifier>]]][<page rotation>]
116
117                 Where the handle identifies one of the input PDF files, and
118                 the beginning and ending page numbers are one-based refer‐
119                 ences to pages in the PDF file.  The qualifier can be even,
120                 odd, or ~, and the page rotation can be north, south, east,
121                 west, left, right, or down.
122
123                 If a PDF handle is given but no pages are specified, then the
124                 entire PDF is used. If no pages are specified for any of the
125                 input PDFs, then the input PDFs' bookmarks are also merged
126                 and included in the output.
127
128                 If the handle is omitted from the page range, then the pages
129                 are taken from the first input PDF.
130
131                 The even qualifier causes pdftk to use only the even-numbered
132                 PDF pages, so 1-6even yields pages 2, 4 and 6 in that order.
133                 6-1even yields pages 6, 4 and 2 in that order.
134
135                 The odd qualifier works similarly to the even.
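
The even and odd qualifiers can be pictured with a small shell sketch. The helper select_pages below is hypothetical and not part of pdftk; it only mimics the selection order described above, on the assumption that pages are visited from the begin page to the end page.

```shell
# select_pages BEGIN END even|odd
# Hypothetical helper: prints the page numbers that a range such as
# 1-6even or 6-1even would select, in the order they are visited.
select_pages() (
    begin=$1; end=$2; want=$3
    if [ "$begin" -le "$end" ]; then step=1; else step=-1; fi
    out=""
    p=$begin
    while :; do
        if [ "$want" = even ] && [ $((p % 2)) -eq 0 ]; then out="$out $p"; fi
        if [ "$want" = odd ]  && [ $((p % 2)) -eq 1 ]; then out="$out $p"; fi
        [ "$p" -eq "$end" ] && break
        p=$((p + step))
    done
    echo $out   # unquoted on purpose: drops the leading space
)

select_pages 1 6 even   # prints: 2 4 6
select_pages 6 1 even   # prints: 6 4 2
select_pages 1 6 odd    # prints: 1 3 5
```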
136
137                 Pages can be subtracted from a page range using the ~ quali‐
138                 fier followed by a page range. For instance, 1-20~5-6 and
139                 1-20~5~6 are equivalent to 1-4 7-20, and ~5 yields all pages
140                 except page 5. Depending on your shell, you may need to quote
141                 this argument because of the ~ at the beginning.
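
A minimal sketch of the quoting advice, assuming a POSIX-style shell and the hypothetical file names in.pdf and out.pdf. Single quotes keep the shell from attempting tilde expansion on the argument; the command is only printed here, not executed.

```shell
# Keep the leading ~ literal by quoting it.
range='~5'

# Print the command that would drop page 5 (remove "echo" to run it).
echo pdftk in.pdf cat "$range" output out.pdf
```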
142
143                 The page rotation setting can cause pdftk to rotate pages and
144                 documents.  Each option sets the page rotation as follows (in
145                 degrees): north: 0, east: 90, south: 180, west: 270, left:
146                 -90, right: +90, down: +180. left, right, and down make rela‐
147                 tive adjustments to a page's rotation.
148
149                 If no arguments are passed to cat, then pdftk combines all
150                 input PDFs in the order they were given to create the output.
151
152                 NOTES:
153                 * <end page number> may be less than <begin page number>.
154                 * The keyword end may be used to reference the final page of
155                 a document instead of a page number.
156                 * Reference a single page by omitting the ending page number.
157                 * The handle may be used alone to represent the entire PDF
158                 document, e.g., B1-end is the same as B.
159                 * You can reference page numbers in reverse order by prefix‐
160                 ing them with the letter r. For example, page r1 is the last
161                 page of the document, r2 is the next-to-last page of the doc‐
162                 ument, and rend is the first page of the document. You can
163                 use this prefix in ranges, too, for example r3-r1 is the last
164                 three pages of a PDF.
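
The r prefix amounts to simple arithmetic: in a document with TOTAL pages, rN refers to page TOTAL - N + 1. The helper resolve_rpage is a hypothetical illustration of that rule, not something pdftk provides.

```shell
# resolve_rpage TOTAL N
# Hypothetical helper: prints the ordinary page number that rN names
# in a document of TOTAL pages.
resolve_rpage() {
    echo $(( $1 - $2 + 1 ))
}

resolve_rpage 10 1    # r1 in a 10-page PDF is page 10 (the last page)
resolve_rpage 10 3    # r3 is page 8, so r3-r1 covers pages 8 to 10
resolve_rpage 10 10   # rend is page 1 (the first page)
```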
165
                 Page Range Examples without Handles:
                 1-endeast – set rotation of entire document to 90 degrees
                 5 11 20 – take single pages from input PDF
                 5-25oddwest – take odd pages in range, set rotation to 270
                 degrees
                 6-1 – reverse pages in range from input PDF
171
172                 Page Range Examples Using Handles:
173                 Say A=in1.pdf B=in2.pdf, then:
174                 A1-21 – take range from in1.pdf
175                 Bend-1odd – take all odd pages from in2.pdf in reverse order
176                 A72 – take a single page from in1.pdf
177                 A1-21 Beven A72 – assemble pages from both in1.pdf and
178                 in2.pdf
                 Awest – set rotation of entire in1.pdf document to 270 degrees
180                 B – use all of in2.pdf
181                 A2-30evenleft – take the even pages from the range, remove 90
182                 degrees from each page's rotation
183                 A A – catenate in1.pdf with in1.pdf
184                 Aevenwest Aoddeast – apply rotations to even pages, odd pages
185                 from in1.pdf
186                 Awest Bwest Bdown – catenate rotated documents
187
188          shuffle [<page ranges>]
189                 Collates pages from input PDFs to create a new PDF.  Works
190                 like the cat operation except that it takes one page at a
191                 time from each page range to assemble the output PDF.  If one
192                 range runs out of pages, it continues with the remaining
193                 ranges.  Ranges can use all of the features described above
194                 for cat, like reverse page ranges, multiple ranges from a
195                 single PDF, and page rotation.  This feature was designed to
196                 help collate PDF pages after scanning paper documents.
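
The collation order can be sketched in plain shell. The interleave function is a hypothetical model of the behavior described above for two ranges (one page from each range in turn, continuing with whichever range still has pages); it operates on page labels, not on PDF files.

```shell
# interleave "RANGE-A PAGES" "RANGE-B PAGES"
# Hypothetical model of shuffle for two ranges given as
# space-separated page labels.
interleave() (
    a=$1; b=$2; out=""
    while [ -n "$a" ] || [ -n "$b" ]; do
        if [ -n "$a" ]; then
            out="$out ${a%% *}"                                # first label of a
            case $a in *" "*) a=${a#* } ;; *) a="" ;; esac     # drop it
        fi
        if [ -n "$b" ]; then
            out="$out ${b%% *}"                                # first label of b
            case $b in *" "*) b=${b#* } ;; *) b="" ;; esac
        fi
    done
    echo $out   # unquoted on purpose: drops the leading space
)

interleave "A1 A2 A3" "B1 B2"   # prints: A1 B1 A2 B2 A3
```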
197
          burst  Splits a single input PDF document into individual pages.
                 Also creates a report named doc_data.txt which is the same as
                 the output from dump_data.  The output section can contain a
                 printf-styled format string to name these pages.  For
                 example, if you want pages named page_01.pdf, page_02.pdf,
                 etc., pass output page_%02d.pdf to pdftk. If the pattern is
                 omitted, then a default pattern pg_%04d.pdf is appended and
                 produces pages named pg_0001.pdf, pg_0002.pdf, etc.
                 Encryption can be applied to the output by appending output
                 options such as owner_pw, e.g.:
208
209                 pdftk in.pdf burst owner_pw foopass
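
To preview the file names a burst pattern would yield, the shell's own printf can be used. This sketch assumes the pattern is expanded the way printf expands %02d, as the description above suggests; in.pdf is a hypothetical input file.

```shell
# Preview the names that an output pattern gives to the first pages.
pattern='page_%02d.pdf'
for n in 1 2 3; do
    printf "$pattern\n" "$n"
done
# prints page_01.pdf, page_02.pdf, page_03.pdf (one per line)

# The matching pdftk call would then be:
#   pdftk in.pdf burst output "$pattern"
```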
210
211          rotate [<page ranges>]
212                 Takes a single input PDF and rotates just the specified
213                 pages.  All other pages remain unchanged.  The page order re‐
214                 mains unchanged.  Specify the pages to rotate using the same
215                 notation as you would with cat, except you omit the pages
216                 that you aren't rotating:
217
218                 [<begin page number>[-<end page number>[<qualifier>]]][<page
219                 rotation>]
220
221                 The qualifier can be even or odd, and the page rotation can
222                 be north, south, east, west, left, right, or down.
223
224                 Each option sets the page rotation as follows (in degrees):
225                 north: 0, east: 90, south: 180, west: 270, left: -90, right:
226                 +90, down: +180. left, right, and down make relative adjust‐
227                 ments to a page's rotation.
228
229                 The given order of the pages doesn't change the page order in
230                 the output.
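
The difference between the absolute settings (north, east, south, west) and the relative ones (left, right, down) can be worked through with a little arithmetic. The helper new_rotation is hypothetical, and normalizing the result to the range 0 to 359 is an assumption made for this sketch.

```shell
# new_rotation CURRENT KEYWORD
# Hypothetical helper: prints the rotation (in degrees) a page would have
# after applying KEYWORD to a page whose rotation is CURRENT.
new_rotation() (
    cur=$1
    case $2 in
        north) r=0 ;;
        east)  r=90 ;;
        south) r=180 ;;
        west)  r=270 ;;
        left)  r=$(( (cur + 270) % 360 )) ;;   # -90, kept non-negative
        right) r=$(( (cur + 90) % 360 )) ;;
        down)  r=$(( (cur + 180) % 360 )) ;;
        *) echo "unknown rotation: $2" >&2; exit 1 ;;
    esac
    echo "$r"
)

new_rotation 90 east    # prints 90: east is absolute, the old value is ignored
new_rotation 90 right   # prints 180: right adds 90 to the current rotation
new_rotation 0 left     # prints 270: left subtracts 90
```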
231
232          generate_fdf
233                 Reads a single input PDF file and generates an FDF file suit‐
234                 able for fill_form out of it to the given output filename or
235                 (if no output is given) to stdout.  Does not create a new
236                 PDF.
237
238          fill_form <FDF data filename | XFDF data filename | - | PROMPT>
239                 Fills the single input PDF's form fields with the data from
240                 an FDF file, XFDF file or stdin. Enter the data filename af‐
241                 ter fill_form, or use - to pass the data via stdin, like so:
242
243                 pdftk form.pdf fill_form data.fdf output form.filled.pdf
244
245                 If the input FDF file includes Rich Text formatted data in
246                 addition to plain text, then the Rich Text data is packed
247                 into the form fields as well as the plain text.  Pdftk also
248                 sets a flag that cues Reader/Acrobat to generate new field
249                 appearances based on the Rich Text data.  So when the user
250                 opens the PDF, the viewer will create the Rich Text appear‐
251                 ance on the spot.  If the user's PDF viewer does not support
252                 Rich Text, then the user will see the plain text data in‐
253                 stead.  If you flatten this form before Acrobat has a chance
254                 to create (and save) new field appearances, then the plain
255                 text field data is what you'll see.
256
257                 Also see the flatten, need_appearances, and replacement_font
258                 options.
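
A typical form-filling round trip combines generate_fdf, fill_form and flatten. The sketch below uses hypothetical file names and a small dry-run wrapper (run, also hypothetical) that prints each command by default; set DRY_RUN=0 to execute them.

```shell
# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# 1. Produce an FDF template listing the form's fields.
run pdftk form.pdf generate_fdf output data.fdf

# 2. Edit the field values in data.fdf by hand or with a script.

# 3. Fill the form and merge the fields into the page content.
run pdftk form.pdf fill_form data.fdf output form.filled.pdf flatten
```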
259
260          background <background PDF filename | - | PROMPT>
261                 Applies a PDF watermark to the background of a single input
262                 PDF.  Pass the background PDF's filename after background
263                 like so:
264
265                 pdftk in.pdf background back.pdf output out.pdf
266
267                 Pdftk uses only the first page from the background PDF and
268                 applies it to every page of the input PDF.  This page is
269                 scaled and rotated as needed to fit the input page.  You can
270                 use - to pass a background PDF into pdftk via stdin.
271
272                 If the input PDF does not have a transparent background (such
273                 as a PDF created from page scans) then the resulting back‐
274                 ground won't be visible – use the stamp operation instead.
275
276          multibackground <background PDF filename | - | PROMPT>
                 Same as the background operation, but applies each page of
                 the background PDF to the corresponding page of the input
                 PDF.  If the input PDF has more pages than the background
                 PDF, then the final background page is repeated across these
                 remaining pages in the input PDF.
282
283          stamp <stamp PDF filename | - | PROMPT>
284                 This behaves just like the background operation except it
285                 overlays the stamp PDF page on top of the input PDF docu‐
286                 ment's pages.  This works best if the stamp PDF page has a
287                 transparent background.
288
289          multistamp <stamp PDF filename | - | PROMPT>
                 Same as the stamp operation, but applies each page of the
                 stamp PDF to the corresponding page of the input PDF.  If
                 the input PDF has more pages than the stamp PDF, then the
                 final stamp page is repeated across these remaining pages in
                 the input PDF.
295
296          dump_data
297                 Reads a single input PDF file and reports its metadata, book‐
298                 marks (a/k/a outlines), page metrics (media, rotation and la‐
299                 bels), data embedded by STAMPtk (see STAMPtk's embed option)
300                 and other data to the given output filename or (if no output
301                 is given) to stdout.  Non-ASCII characters are encoded as XML
302                 numerical entities.  Does not create a new PDF.
303
304          dump_data_utf8
305                 Same as dump_data except that the output is encoded as UTF-8.
306
307          dump_data_fields
308                 Reads a single input PDF file and reports form field statis‐
309                 tics to the given output filename or (if no output is given)
310                 to stdout. Non-ASCII characters are encoded as XML numerical
311                 entities. Does not create a new PDF.
312
313          dump_data_fields_utf8
314                 Same as dump_data_fields except that the output is encoded as
315                 UTF-8.
316
317          dump_data_annots
318                 This operation currently reports only link annotations.
319                 Reads a single input PDF file and reports annotation informa‐
320                 tion to the given output filename or (if no output is given)
321                 to stdout. Non-ASCII characters are encoded as XML numerical
322                 entities. Does not create a new PDF.
323
324          update_info <info data filename | - | PROMPT>
325                 Changes the bookmarks, page labels, page sizes, page rota‐
326                 tions, and metadata in a single PDF's Info dictionary to
327                 match the input data file. The input data file uses the same
328                 syntax as the output from dump_data. Non-ASCII characters
329                 should be encoded as XML numerical entities.
330
331                 This operation does not change the metadata stored in the
332                 PDF's XMP stream, if it has one. (For this reason you should
333                 include a ModDate entry in your updated info with a current
334                 date/timestamp, format: D:YYYYMMDDHHmmSS, e.g. D:201307241346
335                 – omitted data after YYYY revert to default values.)
336
337                 For example:
338
339                 pdftk in.pdf update_info in.info output out.pdf
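
A current ModDate value in the D:YYYYMMDDHHmmSS form can be generated with date. The record layout printed by make_info (InfoBegin, InfoKey, InfoValue) is an assumption about the dump_data syntax; compare it with the dump_data output of your own pdftk before relying on it. The function name and file names are hypothetical.

```shell
# Build a ModDate value such as D:20130724134600 from the current time.
moddate="D:$(date +%Y%m%d%H%M%S)"

# make_info VALUE: print one info record for update_info.
# Assumed layout; verify against your dump_data output.
make_info() {
    printf 'InfoBegin\nInfoKey: ModDate\nInfoValue: %s\n' "$1"
}

make_info "$moddate"

# To apply it:
#   make_info "$moddate" > in.info
#   pdftk in.pdf update_info in.info output out.pdf
```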
340
341          update_info_utf8 <info data filename | - | PROMPT>
342                 Same as update_info except that the input is encoded as
343                 UTF-8.
344
345          attach_files <attachment filenames | PROMPT> [to_page <page number |
346          PROMPT> | relation <relationship>]
                 Packs arbitrary files into a PDF using PDF's file attachment
                 features. More than one attachment may be listed after
                 attach_files. Attachments are added at the document level
                 unless the optional to_page option is given, in which case
                 the files are attached to the given page number (the first
                 page is 1, the final page is end). Attachments at the
                 document level may be tagged with one of the following
                 relationships: Source, Data, Alternative, Supplement, or
                 Unspecified (default).
355
356                 For example:
357
358                 pdftk in.pdf attach_files table1.html table2.html to_page 6
359                 output out.pdf
360
361                 pdftk in.pdf attach_files in.tex relation Source output
362                 out.pdf
363
364          unpack_files
365                 Copies all of the attachments from the input PDF into the
366                 current folder or to an output directory given after output.
367                 For example:
368
369                 pdftk report.pdf unpack_files output ~/atts/
370
371                 or, interactively:
372
373                 pdftk report.pdf unpack_files output PROMPT
374
375       [output <output filename | - | PROMPT>]
376              The output PDF filename may not be set to the name of an input
377              filename. Use - to output to stdout.  When using the dump_data
378              operation, use output to set the name of the output data file.
379              When using the unpack_files operation, use output to set the
380              name of an output directory.  When using the burst operation,
381              you can use output to control the resulting PDF page filenames
382              (described above).
383
384       [encrypt_40bit | encrypt_128bit | encrypt_aes128]
              If an output PDF user or owner password is given, the output PDF
              encryption algorithm defaults to AES-128. The weaker RC4 40-bit
              and RC4 128-bit algorithms can be chosen by specifying
              encrypt_40bit or encrypt_128bit (discouraged).
389
390       [allow <permissions>]
391              Permissions are applied to the output PDF only if an encryption
392              strength is specified or an owner or user password is given.  If
393              permissions are not specified, they default to 'none,' which
394              means all of the following features are disabled.
395
396              The permissions section may include one or more of the following
397              features:
398
399              Printing
400                     Top Quality Printing
401
402              DegradedPrinting
403                     Lower Quality Printing
404
405              ModifyContents
406                     Also allows Assembly
407
408              Assembly
409
410              CopyContents
411                     Also allows ScreenReaders
412
413              ScreenReaders
414
415              ModifyAnnotations
416                     Also allows FillIn
417
418              FillIn
419
420              AllFeatures
421                     Allows the user to perform all of the above, and top
422                     quality printing.
423
424       [owner_pw <owner password | PROMPT>]
425
426       [user_pw <user password | PROMPT>]
427              If an encryption strength is given but no passwords are sup‐
428              plied, then the owner and user passwords remain empty, which
429              means that the resulting PDF may be opened and its security pa‐
430              rameters altered by anybody.
431
432       [compress | uncompress]
433              These are only useful when you want to edit PDF code in a text
434              editor like vim or emacs.  Remove PDF page stream compression by
435              applying the uncompress filter. Use the compress filter to re‐
436              store compression.
437
438       [flatten]
439              Use this option to merge an input PDF's interactive form fields
440              (and their data) with the PDF's pages. Only one input PDF may be
441              given. Sometimes used with the fill_form operation.
442
443       [need_appearances]
444              Sets a flag that cues Reader/Acrobat to generate new field ap‐
445              pearances based on the form field values.  Use this when filling
446              a form with non-ASCII text to ensure the best presentation in
447              Adobe Reader or Acrobat.  It won't work when combined with the
448              flatten option.
449
450       [replacement_font <font name>]
451              Use the specified font to display text in form fields. This op‐
452              tion is useful when filling a form with non-ASCII text that is
453              not supported by the fonts included in the input PDF. font name
454              may be either the file name or the family name of a font, but
455              using a file name is more reliable. Currently only TrueType
456              fonts with Unicode text are supported.
457
458       [keep_first_id | keep_final_id]
459              When combining pages from multiple PDFs, use one of these op‐
460              tions to copy the document ID from either the first or final in‐
461              put document into the new output PDF. Otherwise pdftk creates a
462              new document ID for the output PDF. When no operation is given,
463              pdftk always uses the ID from the (single) input PDF.
464
465       [drop_xfa]
466              If your input PDF is a form created using Acrobat 7 or Adobe De‐
467              signer, then it probably has XFA data.  Filling such a form us‐
468              ing pdftk yields a PDF with data that fails to display in Acro‐
469              bat 7 (and 6?).  The workaround solution is to remove the form's
470              XFA data, either before you fill the form using pdftk or at the
471              time you fill the form. Using this option causes pdftk to omit
472              the XFA data from the output PDF form.
473
474              This option is only useful when running pdftk on a single input
475              PDF.  When assembling a PDF from multiple inputs using pdftk,
476              any XFA data in the input is automatically omitted.
477
478       [drop_xmp]
              Many PDFs store document metadata using both an Info dictionary
              (old school) and an XMP stream (new school).  Pdftk's
              update_info operation can update the Info dictionary, but not
              the XMP stream.  The proper remedy for this is to include a
              ModDate entry in your updated info with a current
              date/timestamp. The date/timestamp format is: D:YYYYMMDDHHmmSS,
              e.g. D:201307241346 – omitted data after YYYY revert to default
              values. This newer ModDate should cue PDF viewers that the Info
              metadata is more current than the XMP data.
488
489              Alternatively, you might prefer to remove the XMP stream from
490              the PDF altogether – that's what this option does.  Note that
491              objects inside the PDF might have their own, separate XMP meta‐
492              data streams, and that drop_xmp does not remove those.  It only
493              removes the PDF's document-level XMP stream.
494
495       [verbose]
496              By default, pdftk runs quietly. Append verbose to the end and it
497              will speak up.
498
499       [dont_ask | do_ask]
500              Depending on the compile-time settings (see ASK_ABOUT_WARNINGS),
501              pdftk might prompt you for further input when it encounters a
502              problem, such as a bad password. Override this default behavior
503              by adding dont_ask (so pdftk won't ask you what to do) or do_ask
504              (so pdftk will ask you what to do).
505
506              When running in dont_ask mode, pdftk will over-write files with
507              its output without notice.
508

EXAMPLES

       Collate scanned pages
         pdftk A=even.pdf B=odd.pdf shuffle A B output collated.pdf
         or if odd.pdf is in reverse order:
         pdftk A=even.pdf B=odd.pdf shuffle A Bend-1 output collated.pdf

       The following examples use actual passwords as command line parameters,
       which is discouraged (see the SECURITY CONSIDERATIONS section).

       Decrypt a PDF
         pdftk secured.pdf input_pw foopass output unsecured.pdf

       Encrypt a PDF using AES-128 (the default), withhold all permissions
       (the default)
         pdftk 1.pdf output 1.128.pdf owner_pw foopass

       Same as above, except password 'baz' must also be used to open output
       PDF
         pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz

       Same as above, except printing is allowed (once the PDF is open)
         pdftk 1.pdf output 1.128.pdf owner_pw foo user_pw baz allow printing

       Apply RC4 40-bit encryption to output, revoking all permissions (the
       default). Set the owner PW to 'foopass'.
         pdftk 1.pdf 2.pdf cat output 3.pdf encrypt_40bit owner_pw foopass

       Join two files, one of which requires the password 'foopass'. The
       output is not encrypted.
         pdftk A=secured.pdf 2.pdf input_pw A=foopass cat output 3.pdf

       Join in1.pdf and in2.pdf into a new PDF, out1.pdf
         pdftk in1.pdf in2.pdf cat output out1.pdf
         or (using handles):
         pdftk A=in1.pdf B=in2.pdf cat A B output out1.pdf
         or (using wildcards):
         pdftk *.pdf cat output combined.pdf

       Remove page 13 from in1.pdf to create out1.pdf
         pdftk in1.pdf cat 1-12 14-end output out1.pdf
         or:
         pdftk A=in1.pdf cat A1-12 A14-end output out1.pdf

       Uncompress PDF page streams for editing the PDF in a text editor (e.g.,
       vim, emacs)
         pdftk doc.pdf output doc.unc.pdf uncompress

       Repair a PDF's corrupted XREF table and stream lengths, if possible
         pdftk broken.pdf output fixed.pdf

       Burst a single PDF document into pages and dump its data to
       doc_data.txt
         pdftk in.pdf burst

       Burst a single PDF document into encrypted pages. Allow low-quality
       printing
         pdftk in.pdf burst owner_pw foopass allow DegradedPrinting

       Write a report on PDF document metadata and bookmarks to report.txt
         pdftk in.pdf dump_data output report.txt

       Rotate the first PDF page to 90 degrees clockwise
         pdftk in.pdf cat 1east 2-end output out.pdf

       Rotate an entire PDF document to 180 degrees
         pdftk in.pdf cat 1-endsouth output out.pdf

NOTES

       This is a port of pdftk to Java. See
       https://gitlab.com/pdftk-java/pdftk
       The original program can be found at www.pdftk.com
580

AUTHOR

582       Original author of pdftk is Sid Steward (sid.steward at pdflabs dot
583       com).
584

SECURITY CONSIDERATIONS

586       Passing a password as a command line parameter is insecure because it
587       can get saved into the shell's history and be accessible by other users
588       via /proc. Use the keyword PROMPT and input any passwords via standard
589       input instead.
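
A minimal sketch of the safer form, assuming the hypothetical files secured.pdf and unsecured.pdf. The PROMPT keyword replaces the literal password, so nothing secret appears in the shell history or the process list. The command runs only if pdftk and the input file are present; otherwise it is just printed.

```shell
# PROMPT makes pdftk ask for the password instead of reading it from argv.
cmd='pdftk secured.pdf input_pw PROMPT output unsecured.pdf'

if command -v pdftk >/dev/null 2>&1 && [ -f secured.pdf ]; then
    $cmd    # unquoted on purpose: the string is split into arguments
else
    echo "would run: $cmd"
fi
```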

                               December 7, 2020                       PDFTK(1)