1 PDF::Builder::Basic::PDF::File(3)  User Contributed Perl Documentation  PDF::Builder::Basic::PDF::File(3)
2
3
4
5 NAME
6 PDF::Builder::Basic::PDF::File - Holds the trailers and cross-reference
7 tables for a PDF file
8
9 SYNOPSIS
10 $p = PDF::Builder::Basic::PDF::File->open("filename.pdf", 1);
11 $p->new_obj($obj_ref);
12 $p->free_obj($obj_ref);
13 $p->append_file();
14 $p->close_file();
15 $p->release(); # IMPORTANT!
16
17 DESCRIPTION
18 This class keeps track of the directory aspects of a PDF file. There
19 are two parts to the directory: the main directory object, which is the
20 parent to all other objects, and a chain of cross-reference tables and
21 corresponding trailer dictionaries, starting with the main directory
22 object.
23
24 INSTANCE VARIABLES
25 Within this class hierarchy, rather than making everything visible via
26 methods, which would be a lot of work, there are various instance
27 variables which are accessible via associative array referencing. To
28 distinguish instance variables from content variables (which may come
29 from the PDF content itself), each such variable name will start with a
30 space.
31
32 Variable names which do not start with a space directly reflect
33 elements in a PDF dictionary. In the case of a
34 "PDF::Builder::Basic::PDF::File", the elements reflect those in the
35 trailer dictionary.
36
37 Since some variables are not designed for class users to access,
38 variables are marked in the documentation with (R) to indicate that
39 such an entry should only be used as read-only information. (P)
40 indicates that the information is private, and not designed for user
41 use at all, but is included in the documentation for completeness and
42 to ensure that nobody else tries to use it.
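As a rough illustration of the naming convention described above, the following sketch separates the two kinds of keys. The hash and its values are hypothetical stand-ins, not data taken from a real file object; only the leading-space rule comes from the text.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a file object: a plain hash whose keys follow
# the convention described above (all values are made up for illustration).
my %file = (
    ' fname'  => 'example.pdf',   # instance variable (leading space)
    ' maxobj' => 12,              # instance variable (leading space)
    'Root'    => 'catalog',       # trailer dictionary element
    'Size'    => 12,              # trailer dictionary element
);

# Keys starting with a space are instance variables; all others mirror
# entries of the PDF trailer dictionary.
my @instance_vars = grep { /^ / }  sort keys %file;
my @trailer_keys  = grep { !/^ / } sort keys %file;

print "instance variables: ", join(', ', map { "'$_'" } @instance_vars), "\n";
print "trailer elements:   ", join(', ', @trailer_keys), "\n";
```

On a real object the same distinction would apply to keys such as ' fname' (instance variable) and 'Root' (trailer element).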
43
44 newroot
45 This variable allows the user to create a new root entry to occur
46 in the trailer dictionary which is output when the file is written
47 or appended. To override the Root element in the dictionary you
48 have, set this entry; doing so indicates the override without
49 losing the current Root entry. Note that newroot should point to a
50 PDF-level object and not just to a dictionary, which does not have
51 object status.
52
53 INFILE (R)
54 Contains the filehandle used to read this information into this PDF
55 directory. It is an IO object.
56
57 fname (R)
58 This is the filename which is reflected by INFILE, or the original
59 IO object passed in.
60
61 update (R)
62 This indicates that the read file has been opened for update and
63 that at some point, "$p->append_file()" can be called to update the
64 file with the changes that have been made to the memory
65 representation.
66
67 maxobj (R)
68 Contains the first usable object number above any that have already
69 appeared in the file so far.
70
71 outlist (P)
72 This is a list of Objind objects which are to be output when the
73 next append_file() or out_file() occurs.
74
75 firstfree (P)
76 Contains the first free object in the free object list. Free
77 objects are removed from the front of the list and added to the
78 end.
79
80 lastfree (P)
81 Contains the last free object in the free list. It may be the same
82 as the "firstfree" if there is only one free object.
83
84 objcache (P)
85 All objects are held in the cache to ensure that a system only has
86 one occurrence of each object. In effect, the objind class acts as
87 a container type class to hold the PDF object structure, and it
88 would be unfortunate if there were two identical place-holders
89 floating around a system.
90
91 epos (P)
92 The end location of the read-file.
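The interplay of the free list and "maxobj" can be sketched as follows. This is a simplified model under stated assumptions, not the library's implementation: the free list is a plain array of object numbers, and next_objnum() and free_objnum() are hypothetical helpers. It shows reuse from the front of the free list, additions at the end, and the fall-back to a new number when nothing is free.

```perl
use strict;
use warnings;

# Simplified model (an assumption, not the library's data structure):
# the free list is a plain array of object numbers, and $maxobj is the
# first usable object number above any seen so far.
my @free   = (4, 9);
my $maxobj = 12;

# Take the next object number: reuse from the FRONT of the free list,
# or fall back to a brand-new number when nothing is free.
sub next_objnum {
    return shift @free if @free;
    return $maxobj++;
}

# Freed object numbers are added at the END of the free list.
sub free_objnum {
    my ($num) = @_;
    push @free, $num;
    return;
}

my @got = (next_objnum());              # reuses 4 from the front
free_objnum(7);                         # free list is now (9, 7)
push @got, next_objnum() for 1 .. 4;    # 9 and 7 are reused, then new numbers

print "allocated: @got\n";
print "next new object number: $maxobj\n";
```

In this model the first and last elements of the array play the roles of "firstfree" and "lastfree".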
93
94 Each trailer dictionary contains a number of private instance variables
95 which hold the chain together.
96
97 loc (P)
98 Contains the location of the start of the cross-reference table
99 preceding the trailer.
100
101 xref (P)
102 Contains an anonymous array of each cross-reference table entry.
103
104 prev (P)
105 A reference to the previous table. Note this differs from the Prev
106 entry which is in PDF, which contains the location of the previous
107 cross-reference table.
108
109 METHODS
110 PDF::Builder::Basic::PDF::File->new()
111 Creates a new, empty file object which can act as the host to other
112 PDF objects. Since there is no file associated with this object,
113 it is assumed that the object is created in readiness for creating
114 a new PDF file.
115
116 $p = PDF::Builder::Basic::PDF::File->open($filename, $update, %options)
117 Opens the file and reads all the trailers and cross reference
118 tables to build a complete directory of objects.
119
120 $filename may be a string or an IO object.
121
122 $update specifies whether this file is being opened for updating
123 and editing (TRUE value), or simply to be read (FALSE or undefined
124 value).
125
126 %options may include
127
128 diags => 1
129 If "diags" is set to 1, various warning messages will be given
130 if a suspicious PDF structure is found, and some fixup may be
131 attempted. There is no guarantee that any fixup will make the
132 PDF legitimate, or that there won't be other problems found
133 further down the line. If this flag is not given, and a
134 structural problem is found, it is fairly likely that errors
135 (and even a program crash) may happen further along. If you
136 experience crashes when reading in a PDF file, try running with
137 "diags" and see what is reported.
138
139 There are many PDF files out "in the wild" which, while failing
140 to conform to Adobe's standards, appear to be tolerated by PDF
141 Readers. Thus, Builder will no longer fail on them, but merely
142 comment on their existence.
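A minimal sketch of a read-only open with diagnostics enabled might look like this. The file name 'example.pdf' and the helper inspect_pdf() are hypothetical; the method calls are the ones documented on this page. The module is loaded at run time so that the script still starts when the file is missing.

```perl
use strict;
use warnings;

# Hypothetical input path; pass a real PDF as the first argument.
my $path = shift(@ARGV) // 'example.pdf';

# Hypothetical helper: open read-only with diagnostics and report the version.
sub inspect_pdf {
    my ($file) = @_;
    require PDF::Builder::Basic::PDF::File;

    # A false $update means read-only; diags => 1 asks for warnings
    # when a suspicious PDF structure is found.
    my $p = PDF::Builder::Basic::PDF::File->open($file, 0, diags => 1);
    my $version = $p->version();

    $p->release();    # free the object's memory once we are done with it
    return $version;
}

if (-e $path) {
    print "PDF version: ", inspect_pdf($path), "\n";
}
else {
    print "No such file: $path\n";
}
```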
143
144 $new_version = $p->version($version, %opts) # Set
145 $ver = $p->version() # Get
146 Gets/sets the PDF version (e.g., 1.5). Setting sets both the header
147 and trailer versions. Getting returns the higher of header and
148 trailer versions.
149
150 For compatibility with earlier releases, if no decimal point is
151 given, assume "1." precedes the number given.
152
153 A warning message is given if you attempt to decrease the PDF
154 version, as you might have already read in a higher level file, or
155 used a higher level feature. This message is suppressed if the
156 'silent' option is given with any value.
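The two rules above (a missing decimal point implies a leading "1.", and getting returns the higher of the two versions) can be sketched in plain Perl. Both helpers, normalize_version() and higher_version(), are hypothetical and only model the described behavior; they are not the library's code.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the "no decimal point means 1.x" rule.
sub normalize_version {
    my ($version) = @_;
    return $version =~ /\./ ? "$version" : "1.$version";
}

# Hypothetical helper: the higher of header and trailer versions, where
# either one may be undefined (the trailer version is undefined for a
# PDF 1.3 or lower file that was read in).
sub higher_version {
    my ($header, $trailer) = @_;
    return $header  unless defined $trailer;
    return $trailer unless defined $header;
    return $header >= $trailer ? $header : $trailer;
}

print normalize_version(5), "\n";            # no decimal point given
print normalize_version('1.7'), "\n";        # already has a decimal point
print higher_version('1.4', '1.6'), "\n";    # trailer is higher
print higher_version('1.4', undef), "\n";    # no trailer version
```

One Perl detail worth keeping in mind with a rule like this: the numeric literal 2.0 stringifies as "2", so this sketch would turn it into "1.2". Passing versions as strings ('2.0') avoids that in the sketch.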
157
158 $new_version = $p->header_version($version, %opts) # Set
159 $version = $p->header_version() # Get
160 Gets/sets the PDF version stored in the file header.
161
162 For compatibility with earlier releases, if no decimal point is
163 given, assume "1." precedes the number given.
164
165 A warning message is given if you attempt to decrease the PDF
166 version, as you might have already read in a higher level file, or
167 used a higher level feature. This message is suppressed if the
168 'silent' option is given with any value.
169
170 $new_version = $p->trailer_version($version, %opts) # Set
171 $version = $p->trailer_version() # Get
172 Gets/sets the PDF version stored in the document catalog.
173
174 Note that the minimum PDF level for a trailer version is 1.4. It is
175 not permitted to set a PDF level of 1.3 or lower. An existing PDF
176 (read in) of 1.3 or below returns undefined.
177
178 For compatibility with earlier releases, if no decimal point is
179 given, assume "1." precedes the number given.
180
181 A warning message is given if you attempt to decrease the PDF
182 version, as you might have already read in a higher level file, or
183 used a higher level feature. This message is suppressed if the
184 'silent' option is given with any value.
185
186 $prev_version = $p->require_version($version)
187 Ensures that the PDF version is at least $version. Silently sets
188 the version to the higher level.
189
190 $p->release()
191 Releases ALL of the memory used by the PDF document and all of its
192 component objects. After calling this method, do NOT expect to
193 have anything left in the "PDF::Builder::Basic::PDF::File" object
194 (so if you need to save, be sure to do it before calling this
195 method).
196
197 NOTE that it is important that you call this method on any
198 "PDF::Builder::Basic::PDF::File" object when you wish to destroy it
199 and free up its memory. Internally, PDF files have an enormous
200 number of cross-references, and this causes circular references
201 within the internal data structures. Calling release() causes a
202 brute-force cleanup of the data structures, freeing up all of the
203 memory. Once you've called this method, though, don't expect to be
204 able to do anything else with the "PDF::Builder::Basic::PDF::File"
205 object; it'll have no internal state whatsoever.
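Why an explicit cleanup is needed can be seen in a small, self-contained experiment. Perl frees memory by reference counting, so objects that refer to each other are not freed when they go out of scope. The Node class below is hypothetical and unrelated to the library's classes; emptying a hash by hand stands in for what a brute-force cleanup has to do.

```perl
use strict;
use warnings;

my $destroyed = 0;    # counts how many Node objects have been freed

{
    package Node;     # hypothetical class, for illustration only
    sub new     { my ($class) = @_; return bless {}, $class }
    sub DESTROY { $destroyed++ }
}

# Two objects that refer to each other: a circular reference.
{
    my $x = Node->new;
    my $y = Node->new;
    $x->{peer} = $y;
    $y->{peer} = $x;
}
my $after_cycle = $destroyed;      # both are out of scope, yet neither was freed

# The same cycle, but broken by hand before leaving the scope.
{
    my $x = Node->new;
    my $y = Node->new;
    $x->{peer} = $y;
    $y->{peer} = $x;
    %$x = ();                      # empty one object, breaking the cycle
}
my $after_release = $destroyed;    # both objects were freed this time

print "freed without cleanup: $after_cycle\n";
print "freed after cleanup:   $after_release\n";
```

The first pair stays in memory until the program exits, which is the kind of leak that calling release() is meant to prevent.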
206
207 $p->append_file()
208 Appends the objects for output to the read file and then appends
209 the appropriate table.
210
211 $p->out_file($fname)
212 Writes a PDF file to a file of the given filename, based on the
213 current list of objects to be output. It creates the trailer
214 dictionary based on information in $self.
215
216 $fname may be a string or an IO object.
217
218 $p->create_file($fname)
219 Creates a new output file (no check is made of an existing open
220 file) of the given filename or IO object. Note: make sure that
221 "$p->{' version'}" is set correctly before calling this function.
222
223 $p->close_file()
224 Closes up the open file for output, by outputting the trailer, etc.
225
226 ($value, $str) = $p->readval($str, %opts)
227 Reads a PDF value from the current position in the file. If $str is
228 too short, reads more from the current location in the file
229 until the whole object is read. This is a recursive call which may
230 slurp in a whole big stream (unprocessed).
231
232 Returns the recursive data structure read and also the current $str
233 that has been read from the file.
234
235 $ref = $p->read_obj($objind, %opts)
236 Given an indirect object reference, locates it, reads the object,
237 and returns the object read in.
238
239 $ref = $p->read_objnum($num, $gen, %opts)
240 Returns a fully read object of given number and generation in this
241 file.
242
243 $objind = $p->new_obj($obj)
244 Creates a new, free object reference based on free space in the
245 cross reference chain. If nothing is free, a new object number is
246 allocated. If $obj is given, that object is turned into the new
247 object rather than a new object being returned.
248
249 $p->out_obj($obj)
250 Indicates that the given object reference should appear in the
251 output xref table whether with data or freed.
252
253 $p->free_obj($obj)
254 Marks an object reference for output as being freed.
255
256 $p->remove_obj($objind)
257 Removes the object from all places where we might remember it.
258
259 $p->ship_out(@objects)
260 $p->ship_out()
261 Ships the given objects (or all objects for output if @objects is
262 empty) to the currently open output file (assuming there is one).
263 Freed objects are not shipped, and once an object is shipped it is
264 switched such that this file becomes its source and it will not be
265 shipped again unless out_obj is called again. Notice that a shipped
266 out object can be re-output or even freed, but that it will not
267 cause the data already output to be changed.
268
269 $p->copy($outpdf, \&filter)
270 Iterates over every object in the file reading the object, calling
271 "filter" with the object, and outputting the result. If "filter" is
272 not defined, just copies input to output.
273
274 PRIVATE METHODS & FUNCTIONS
275 The following methods and functions are considered private to this
276 class. This does not mean you cannot use them if you have a need, just
277 that they aren't really designed for users of this class.
278
279 $offset = $p->locate_obj($num, $gen)
280 Returns a file offset to the object asked for by following the
281 chain of cross reference tables until it finds the one you want.
282
283 update($fh, $str, $instream)
284 Keeps reading $fh for more data to ensure that $str has at least a
285 line full for "readval" to work on. At this point we also take the
286 opportunity to ignore comments.
287
288 $objind = $p->test_obj($num, $gen)
289 Tests the cache to see whether an object reference (which may or
290 may not have been getobj()ed) has been cached. Returns it if it
291 has.
292
293 $p->add_obj($objind)
294 Adds the given object to the internal object cache.
295
296 $tdict = $p->readxrtr($xpos, %options)
297 Recursive function which reads each of the cross-reference and
298 trailer tables in turn until there are no more.
299
300 Returns a dictionary corresponding to the trailer chain. Each
301 trailer also includes the corresponding cross-reference table.
302
303 The xref private element in a trailer dictionary is an anonymous
304 hash of cross-reference elements keyed by object number. Each
305 element is an array of 3 elements corresponding to the three
306 values read in: [location, generation number, free or used]. See
307 the PDF specification for details.
308
309 See "open" for options allowed.
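To make the trailer chain concrete, here is a sketch of a lookup that walks from the newest trailer back through the previous ones. The data is made up, find_offset() is a hypothetical helper, and the use of 'n' and 'f' as the used/free markers follows the PDF file syntax; how the library stores that third element, and how its own lookup treats free entries or generation mismatches, are assumptions here.

```perl
use strict;
use warnings;

# Hypothetical trailer chain. Each trailer holds ' xref', a hash keyed by
# object number whose values are [location, generation, free or used],
# and ' prev', a reference to the previous trailer (undef at the end).
my $older = {
    ' xref' => { 1 => [15, 0, 'n'], 2 => [120, 0, 'n'], 3 => [0, 1, 'f'] },
    ' prev' => undef,
};
my $newer = {
    ' xref' => { 2 => [480, 0, 'n'] },    # object 2 was rewritten by an update
    ' prev' => $older,
};

# Walk from the newest trailer backwards until the object number is found.
# Returns the file offset, or nothing if the object is free, has another
# generation number, or is not listed in any table.
sub find_offset {
    my ($tdict, $num, $gen) = @_;
    for (my $t = $tdict; defined $t; $t = $t->{' prev'}) {
        my $entry = $t->{' xref'}{$num};
        next unless defined $entry;
        my ($loc, $entry_gen, $state) = @$entry;
        return if $state eq 'f' || $entry_gen != $gen;
        return $loc;
    }
    return;
}

for my $num (1 .. 4) {
    my $offset = find_offset($newer, $num, 0);
    print "object $num: ", ($offset // 'not available'), "\n";
}
```

The newest table wins for object 2, while object 1 is only found by following ' prev' to the older table.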
310
311 $p->out_trailer($tdict, $update)
312 $p->out_trailer($tdict)
313 Outputs the body and trailer for a PDF file by outputting all the
314 objects in the ' outlist' and then outputting a xref table for
315 those objects and any freed ones. It then outputs the trailing
316 dictionary and the trailer code.
317
318 PDF::Builder::Basic::PDF::File->_new()
319 Creates a very empty PDF file object (used by new() and open()).
320
321 AUTHOR
322 Martin Hosken Martin_Hosken@sil.org
323
324 Copyright Martin Hosken 1999 and onwards
325
326 No warranty or expression of effectiveness, least of all regarding
327 anyone's safety, is implied in this software or documentation.
328
329
330
331 perl v5.38.0  2023-07-21  PDF::Builder::Basic::PDF::File(3)