pdftosrc(1)                 General Commands Manual                pdftosrc(1)

NAME
pdftosrc - extract source file or stream from PDF file

SYNOPSIS
pdftosrc PDF-file [stream-object-number]

DESCRIPTION
If only PDF-file is given, pdftosrc extracts the embedded source file
from the first stream object it finds with /Type /SourceFile in the
PDF file and writes it to a file named by the /SourceName entry of
that stream object (see the application example below).

If both PDF-file and stream-object-number are given, and
stream-object-number is positive, pdftosrc extracts and decompresses
the PDF stream of the object with that number and writes it to a file
named PDF-file.stream-object-number, with the ending .pdf or .PDF
stripped from the original PDF-file name.

XRef object streams, which are part of the PDF standard from PDF-1.5
onward, are a special case: if stream-object-number equals -1,
pdftosrc decompresses the XRef stream from the PDF file and writes it
in human-readable PDF cross-reference table format to a file named
PDF-file.xref. These XRef streams cannot be extracted just by giving
their object number.

In all cases, an existing file with the output file name is
overwritten.

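The naming rules above can be sketched in Python as a small helper that
predicts which file pdftosrc would write, so that a caller can check for
an existing file before it is overwritten. This is only an illustration:
the names expected_output_name and would_overwrite are hypothetical, the
helper never calls pdftosrc, and it assumes that the .pdf or .PDF ending
is also stripped for the .xref name, which the text above does not state
explicitly.

```python
import os
from typing import Optional


def expected_output_name(pdf_file: str,
                         stream_object_number: Optional[int] = None) -> Optional[str]:
    """Predict the output file name from the command-line arguments.

    Returns None when only the PDF file is given, because the name then
    comes from /SourceName inside the PDF and cannot be derived here.
    """
    if stream_object_number is None:
        return None
    # Strip a trailing .pdf or .PDF (assumption: no other spellings are stripped).
    base = pdf_file
    if base.endswith(".pdf") or base.endswith(".PDF"):
        base = base[:-4]
    if stream_object_number == -1:
        # Assumption: the suffix is stripped for the XRef case as well.
        return base + ".xref"
    if stream_object_number > 0:
        return f"{base}.{stream_object_number}"
    # The man page only describes positive numbers and -1; rejecting other
    # values is a choice made for this sketch.
    raise ValueError("stream-object-number must be positive or -1")


def would_overwrite(pdf_file: str,
                    stream_object_number: Optional[int] = None) -> bool:
    """Return True if the predicted output file already exists."""
    name = expected_output_name(pdf_file, stream_object_number)
    return name is not None and os.path.exists(name)
```

A script could call would_overwrite("paper.pdf", 12) and refuse to
continue when it returns True.
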
OPTIONS
None.

FILES
Just the executable pdftosrc.

ENVIRONMENT
None.

DIAGNOSTICS
On success, the exit code of pdftosrc is 0; otherwise it is 1.

All messages go to stderr. On startup, pdftosrc prints the version
number of xpdf, the program on which pdftosrc is based:

    pdftosrc version 3.01

When the output file has been written successfully, one of the
following messages is issued:

    Source file extracted to source-file-name

or

    Stream object extracted to PDF-file.stream-object-number

or

    Cross-reference table extracted to PDF-file.xref

When the object given by stream-object-number does not contain a
stream, pdftosrc issues the following error message:

    Not a Stream object

When the PDF-file cannot be opened, the error message is:

    Error: Couldn't open file 'PDF-file'.

When pdftosrc encounters an invalid PDF file, the error message
(several lines) is:

    Error: May not be a PDF file (continuing anyway)
    (more lines)
    Invalid PDF file

pdftosrc also issues further error messages for various kinds of
broken PDF files.

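Because every message goes to stderr, a calling script has to inspect
that text to learn what was written. The following Python sketch
classifies captured stderr output using only the message texts quoted
above. The function name classify_stderr and the returned labels are
hypothetical, and messages not listed here are reported as "unknown"
rather than guessed at.

```python
import re
from typing import Optional, Tuple

# Success messages as quoted in the diagnostics above; the captured group
# is the name of the file that was written.
_SUCCESS_PATTERNS = [
    ("source", re.compile(r"^Source file extracted to (.+)$")),
    ("stream", re.compile(r"^Stream object extracted to (.+)$")),
    ("xref", re.compile(r"^Cross-reference table extracted to (.+)$")),
]


def classify_stderr(stderr_text: str) -> Tuple[str, Optional[str]]:
    """Return (kind, detail) for the first recognised line of stderr output.

    kind is "source", "stream" or "xref" on success (detail is the output
    file name), "error" for a known error line (detail is that line), and
    "unknown" if nothing was recognised.
    """
    for raw_line in stderr_text.splitlines():
        line = raw_line.strip()
        for kind, pattern in _SUCCESS_PATTERNS:
            match = pattern.match(line)
            if match:
                return kind, match.group(1)
        if line == "Not a Stream object":
            return "error", line
        if line.startswith("Error: Couldn't open file"):
            return "error", line
        if line == "Invalid PDF file":
            return "error", line
    # The version banner and unlisted messages end up here.
    return "unknown", None
```

The exit code remains the primary success indicator; parsing the
messages is mainly useful for finding out the output file name.
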
NOTES
An embedded source file is written out unchanged, i.e. it is not
decompressed in the process.

Only the stream of the object is written, i.e. not the dictionary of
that object.

Knowing which stream-object-number to query requires information about
the PDF file that has to be obtained elsewhere, e.g. by looking into
the PDF file with an editor.

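As an alternative to reading the PDF file in an editor, candidate
object numbers can be found with a rough scan of the raw bytes. The
Python sketch below is a heuristic and not a PDF parser: the function
name list_stream_objects is hypothetical, it ignores the
cross-reference table and generation numbers, and it can be misled by
binary stream data that happens to contain the keywords it looks for.

```python
import re
from typing import List

# An indirect object looks like "N G obj ... endobj"; group 1 is the object
# number, group 3 is everything between "obj" and the first "endobj".
_OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj\b", re.DOTALL)

# A stream object has a dictionary followed by the "stream" keyword and an
# end-of-line marker.
_STREAM_RE = re.compile(rb">>\s*stream\r?\n")


def list_stream_objects(pdf_bytes: bytes) -> List[int]:
    """Return the numbers of objects that appear to contain a stream."""
    numbers = []
    for match in _OBJ_RE.finditer(pdf_bytes):
        if _STREAM_RE.search(match.group(3)):
            numbers.append(int(match.group(1)))
    return numbers
```

The bytes of a file can be read with open(path, "rb").read() and passed
to list_stream_objects; each returned number is a candidate value for
stream-object-number.
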
The stream extraction capabilities of pdftosrc (e.g. regarding
understood PDF versions and filter types) follow the capabilities of
the underlying xpdf program version.

Currently the generation number of the stream object is not supported.
The default value 0 (zero) is used.

The term stream-object-number has nothing to do with the `object
streams' described in the Adobe PDF Reference, 5th edition, version
1.6.

EXAMPLES
When using pdftex, a source file can be embedded into a PDF file with
pdftex primitives, as illustrated by the following example:

    \immediate\pdfobj
        stream attr {/Type /SourceFile /SourceName (myfile.zip)}
        file{myfile.zip}
    \pdfcatalog{/SourceObject \the\pdflastobj\space 0 R}

This zip file can then be extracted from the PDF file by calling
pdftosrc PDF-file.

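Calling pdftosrc from a script might look like the following Python
sketch. It assumes that an executable named pdftosrc is on the search
path; the function names build_argv and run_pdftosrc are hypothetical.
The sketch relies only on the documented usage (a PDF file plus an
optional stream object number), the exit code, and the fact that all
messages go to stderr.

```python
import subprocess
from typing import List, Optional, Tuple


def build_argv(pdf_file: str,
               stream_object_number: Optional[int] = None,
               executable: str = "pdftosrc") -> List[str]:
    """Build the command line: pdftosrc PDF-file [stream-object-number]."""
    argv = [executable, pdf_file]
    if stream_object_number is not None:
        argv.append(str(stream_object_number))
    return argv


def run_pdftosrc(pdf_file: str,
                 stream_object_number: Optional[int] = None) -> Tuple[bool, str]:
    """Run pdftosrc and return (succeeded, stderr_text).

    Raises FileNotFoundError if the executable cannot be found.
    """
    proc = subprocess.run(
        build_argv(pdf_file, stream_object_number),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.PIPE,
        text=True,
        errors="replace",  # tolerate bytes that do not decode as text
    )
    # Exit code 0 means success; any other value means failure.
    return proc.returncode == 0, proc.stderr
```

For the example above, run_pdftosrc("PDF-file") would attempt to extract
the embedded zip file, and run_pdftosrc("PDF-file", -1) would request
the cross-reference table instead.
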
BUGS
Only the first embedded source file found is extracted, not all of
them.

Email bug reports to pdftex@tug.org.

SEE ALSO
xpdf(1), pdfimages(1), pdftotext(1), pdftex(1)

AUTHORS
pdftosrc was written by Han The Thanh, using xpdf functionality from
Derek Noonburg.

Man page written by Hartmut Henkel.

COPYRIGHT
Copyright (c) 1996-2006 Han The Thanh, <thanh@pdftex.org>

This file is part of pdfTeX.

pdfTeX is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free
Software Foundation; either version 2 of the License, or (at your
option) any later version.

pdfTeX is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
for more details.

You should have received a copy of the GNU General Public License along
with pdfTeX; if not, write to the Free Software Foundation, Inc., 59
Temple Place, Suite 330, Boston, MA 02111-1307 USA

Web2C 2019                       16 June 2015                      pdftosrc(1)