PDF::Builder::Docs(3)    User Contributed Perl Documentation    PDF::Builder::Docs(3)

PDF::Builder::Docs - additional documentation for Builder module

Software Development Kit
10 There are four levels of involvement with PDF::Builder. Depending on
11 what you want to do, different kinds of installs are recommended.
12
13 1. Simply installing PDF::Builder as a prerequisite for running some
14 other package. All you need to do is install the CPAN package for
15 PDF::Builder, and it will load the .pm files into your Perl library. If
16 the other package prereqs PDF::Builder, its installer may download and
17 install PDF::Builder automatically.
18
19 2. You want to write a Perl program that uses PDF::Builder functions.
20 In addition to installing PDF::Builder from CPAN, you will want
21 documentation on it. Obtain a copy of the product from GitHub
22 (https://github.com/PhilterPaper/Perl-PDF-Builder) or as a gzipped tar
23 file from CPAN. This includes a utility to build (from POD) a library
24 of HTML documents, as well as examples (examples/ directory) and
25 contributed sample programs (contrib/ directory).
26
27 3. You want to modify PDF::Builder files. In addition to the CPAN and
28 GitHub distributions, you may choose to keep a local Git repository for
29 tracking your changes. Depending on whether or not your PDF::Builder
30 copy is being used for production purposes, you may want to do your
31 editing and testing in the Perl library installation (live) or in a
32 different place. The "t" tests (t/ directory) and examples provide good
33 regression tests to ensure that you haven't broken anything. If you do
34 your editing on the live code, don't forget when done to copy the
35 changes back into the master version you keep!
36
37 4. You want to contribute to the development of PDF::Builder. You will
38 need a local Git repository (and a GitHub account), so that when you've
39 got it all done, you can issue a "Pull Request" to bring it to our
40 attention. We can't guarantee that your work will be incorporated into
41 the project, but at least we will look at it. From time to time, a new
42 CPAN version will be issued.
43
44 If you want to make substantial changes for public use, and can't come
45 to a meeting of minds with us, you can even start your own GitHub
46 project and register a new CPAN project (that's what we did, forking
47 PDF::API2). Please don't just assume that we don't want your changes --
48 at least propose what you want to do in writing, so we can consider it.
49 We're always looking for people to help out and expand PDF::Builder.
50
51 Optional Libraries
52 PDF::Builder can make use of some optional libraries, which are not
53 required for a successful installation. If you want improved speed and
54 capabilities for certain functions, you may want to install and use
55 these libraries:
56
57 * Graphics::TIFF -- PDF::Builder inherited a rather slow, buggy, and
58 limited TIFF image library from PDF::API2. If Graphics::TIFF (available
59 on CPAN, uses libtiff.a) is installed, PDF::Builder will use that
60 instead, unless you specify that it is to use the old, pure Perl
61 library. The only time you might want to consider this is when you need
62 to pass an open filehandle to "image_tiff" instead of a file name. See
63 resolved bug reports RT 84665 and RT 118047, as well as "image_tiff",
64 for more information.
65
66 * Image::PNG::Libpng -- PDF::Builder inherited a rather slow and buggy
67 pure Perl PNG image library from PDF::API2. If Image::PNG::Libpng
68 (available on CPAN, uses libpng.a) is installed, PDF::Builder will use
69 that instead, unless you specify that it is to use the old, pure Perl
70 library. Using the new library will give you improved speed, the
71 ability to use 16 bit samples, and the ability to read interlaced PNG
72 files. See resolved bug report RT 124349, as well as "image_png", for
73 more information.
74
75 * HarfBuzz::Shaper -- This library enables PDF::Builder to handle
76 complex scripts (Arabic, Devanagari, etc.) as well as non-LTR writing
77 systems. It is also useful for Latin and other simple scripts, for
78 ligatures and improved kerning. HarfBuzz::Shaper is based on a set of
79 HarfBuzz libraries, which it will attempt to build if they are not
80 found. See "textHS" for more information.
81
Note that the installation process will attempt to install these
libraries automatically. If you don't wish to use one or more of them,
you are free to uninstall the optional libraries. If one or more
failed to install, there is no need to panic -- you simply won't be
able to use some advanced features, unless you are able to manually
install the modules (e.g., with "cpan install").
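
To see which of the optional libraries ended up installed, a quick probe
along the following lines may help. This is only a sketch: it tries to load
each module named above and reports the outcome. The variable names
(@optional, %available) are illustrative and not part of PDF::Builder.

```perl
use strict;
use warnings;

# Optional libraries named in the text above.
my @optional = qw(Graphics::TIFF Image::PNG::Libpng HarfBuzz::Shaper);

# Try to load each module; record 1 if the load succeeded, 0 otherwise.
my %available;
for my $module (@optional) {
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    $available{$module} = eval { require $file; 1 } ? 1 : 0;
}

for my $module (@optional) {
    printf "%-20s %s\n", $module,
        $available{$module} ? 'installed' : 'not installed';
}
```

A module reported as "not installed" only means that the related advanced
features are unavailable; PDF::Builder itself should still work.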
88
89 Strings (Character Text)
90 Perl, and hence PDF::Builder, use strings that support the full range
91 of Unicode characters. When importing strings into a Perl program, for
92 example by reading text from a file, you must be aware of what their
93 character encoding is. Single-byte encodings (default is 'latin1'),
94 represented as bytes of value 0x00 through 0xFF (0..255), will produce
95 different results if you do something that depends on the encoding,
96 such as sorting, searching, or comparing any two non-ASCII characters.
97 This also applies to any characters (text) hard coded into the Perl
98 program.
99
100 You can always decode the text from external encoding (ASCII, UTF-8,
101 Latin-3, etc.) into the Perl (internal) UTF-8 multibyte encoding. This
102 uses one to four bytes to represent each character. See pragma "utf8"
103 and module "Encode" for details about decoding text. Note that only
104 TrueType fonts ("ttfont") can make direct use of UTF-8-encoded text.
105 Other font types (core, T1, etc.) can only use single-byte encoded
106 text. If your text is ASCII, Latin-1, or CP-1252, you can just leave
107 the Perl strings as the default single-byte encoding.
108
109 Then, there is the matter of encoding the output to match up with
110 available font character sets. You're not actually translating the text
111 on output, but are telling the output system (and Reader) what encoding
112 the output byte stream represents, and what character glyphs they
113 should generate.
114
If you confine your text to plain ASCII (0x00 .. 0x7F byte values) or
even Latin-1 or CP-1252 (0x00 .. 0xFF byte values), you can use default
(non-UTF-8) Perl strings and use the default output encoding
(WinAnsiEncoding), which is more-or-less Windows CP-1252 (in turn, a
superset of ISO-8859-1 Latin-1). If your text uses any other characters,
you will need to be aware of what encoding your text strings are in
(both in the Perl string and for declaring output glyph generation). See
"Core Fonts", "PS Fonts" and "TrueType Fonts" in "FONT METHODS" for
additional information.
124
125 Some Internal Details
126
127 Some of the following may be a bit scary or confusing to beginners, so
128 don't be afraid to skip over it until you're ready for it...
129
130 Perl (and PDF::Builder) internally use strings which are either single-
131 byte (ISO-8859-1/Latin-1) or multibyte UTF-8 encoded (there is an
132 internal flag marking the string as UTF-8 or not). If you work
133 strictly in ASCII or Latin-1 or CP-1252 (each a superset of the
134 previous), you should be OK in not doing anything special about your
135 string encoding. You can just use the default Perl single byte strings
136 (internally marked as not UTF-8) and the default output encoding
137 (WinAnsiEncoding).
138
139 If you intend to use input from a variety of sources, you should
140 consider decoding (converting) your text to UTF-8, which will provide
141 an internally consistent representation (and your Perl code itself
142 should be saved in UTF-8, in case you want to use any hard coded non-
143 ASCII characters). In any string, non-ASCII characters (0x80 or higher)
144 would be converted to the Perl UTF-8 internal representation, via
145 "$string = Encode::decode(MY_ENCODING, $input);". "MY_ENCODING" would
146 be a string like 'latin1', 'cp-1252', 'utf8', etc. Similar capabilities
147 are available for declaring a file to be in a certain encoding.
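
As a minimal sketch of the decoding step described above (using only the
core "Encode" module), the same word arriving in two different external
encodings becomes one and the same Perl string once decoded. The sample
bytes and variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The word "cafe" with an acute accent on the e, as raw bytes.
my $latin1_bytes = "caf\xE9";        # ISO-8859-1: one byte for e-acute
my $utf8_bytes   = "caf\xC3\xA9";    # UTF-8: two bytes for e-acute

# As raw bytes the two inputs differ, so comparing them directly fails.
print "raw bytes equal? ", ($latin1_bytes eq $utf8_bytes ? "yes" : "no"), "\n";

# Decode each input from its own external encoding.
my $from_latin1 = decode('iso-8859-1', $latin1_bytes);
my $from_utf8   = decode('UTF-8',      $utf8_bytes);

# Both are now the same four-character string.
print "decoded lengths: ", length($from_latin1), " and ", length($from_utf8), "\n";
print "decoded equal?   ", ($from_latin1 eq $from_utf8 ? "yes" : "no"), "\n";
```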
148
Be aware that if you use UTF-8 encoding for your text, only
TrueType font output ("ttfont") can handle it directly. Corefont and
Type1 output require the text to be converted back into a single-byte
encoding (using "Encode::encode"), which may need to be declared with
"-encode" (for "corefont" or "psfont"). If you have any characters that
are not found in the selected single-byte encoding (but are found in
the font itself), you will need to use "automap" to break up the font
glyphs into 256 character planes, map such characters to 0x00 .. 0xFF
in the appropriate plane, and switch between font planes as necessary.
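
The conversion back to a single-byte encoding can be sketched with the core
"Encode" module alone. The example below is illustrative (the sample string
and variable names are assumptions): the Euro sign exists in CP-1252, so it
survives there, while in ISO-8859-1 it is replaced by a substitution
character, which is the situation where "automap" becomes relevant.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A character string ending in the Euro sign (U+20AC).
my $string = "Prix: 5 \x{20AC}";

# CP-1252 maps the Euro sign to byte 0x80, so nothing is lost.
my $bytes = encode('cp1252', $string);
printf "cp1252:     %d bytes, last byte 0x%02X\n",
    length($bytes), ord(substr($bytes, -1));

# ISO-8859-1 has no Euro sign; by default encode() substitutes '?'.
my $lossy = encode('iso-8859-1', $string);
printf "iso-8859-1: %d bytes, last byte '%s'\n",
    length($lossy), substr($lossy, -1);
```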
159
160 Core and Type1 fonts (output) use the byte values in the string
161 (single-byte encoding only!) and provide a byte-to-glyph mapping record
162 for each plane. TrueType outputs a group of four hexadecimal digits
163 representing the "CId" (character ID) of each character. The CId does
164 not correspond to either the single-byte or UTF-8 internal
165 representations of the characters.
166
167 The bottom line is that you need to know what the internal
168 representation of your text is, so that the output routines can tell
169 the PDF reader about it (via the PDF file). The text will not be
170 translated upon output, but the PDF reader needs to know what the
171 encoding in use is, so it knows what glyph to associate with each byte
172 (or byte sequence).
173
174 Note that some operating systems and Perl flavors are reputed to be
175 strict about encoding names. For example, latin1 (an alias) may be
176 rejected as invalid, while iso-8859-1 (a canonical value) will work.
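
If you are unsure whether a particular encoding name is accepted on your
system, one way to check is to ask "Encode" for the canonical name. A small
sketch, with an arbitrary list of names to try:

```perl
use strict;
use warnings;
use Encode ();

# Map each candidate name to its canonical name (false if not recognized).
my %canonical;
for my $name (qw(latin1 iso-8859-1 iso-8859-12)) {
    $canonical{$name} = Encode::resolve_alias($name);
    printf "%-12s %s\n", $name,
        $canonical{$name} ? "-> $canonical{$name}" : 'not recognized';
}
```

Using the canonical name reported here (rather than an alias) in your own
code sidesteps the strictness issue.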
177
178 By the way, it is recommended that you be using at least Perl 5.10 if
179 you are going to be using any non-ASCII characters. Perl 5.8 may be a
180 little unpredictable in handling such text.
181
182 Rendering Order
For better or worse, for compatibility purposes, PDF::Builder continues
the same rendering model as used by PDF::API2 (and possibly its
predecessors). That is, all graphics for one graphics object are put
into one record, and all text output for one text object goes into
another record. Whichever object is declared first is output first.
This can lead to unexpected results, where items appear to be rendered
in the wrong order: text and graphics items are not necessarily output
(rendered) in the same order as they were created in code. Two items in
the same object (e.g., $text) will be rendered in the same order as
they were coded, but items from different objects may not be rendered
in the expected order. The following example (source code and annotated
PDF excerpts) illustrates the issue:
195
    use strict;
    use warnings;
    use PDF::Builder;

    # demonstrate text and graphics object order
    #
    my $fname = "objorder";

    my $paper_size = "Letter";

    # see the text and graphics stream contents
    my $pdf = PDF::Builder->new(-compress => 'none');
    $pdf->mediabox($paper_size);
    my $page = $pdf->page();
    # adjust path for your operating system
    my $fontTR = $pdf->ttfont('C:\\Windows\\Fonts\\timesbd.ttf');
212
For the first group, you might expect the "under" line to be output,
then the filled circle (disc) partly covering it, then the "over" line
covering the disc, and finally a filled rectangle (bar) over both
lines. What actually happened is that the $grfx1 graphics object was
declared first, so everything in that object (the disc and bar) is
output first, and the text object $text1 (both lines) comes afterwards.
The result is that the text lines are on top of the graphics drawings.
220
    # ----------------------------
    # 1. text, orange ball over, text over, bar over

    my $grfx1 = $page->gfx();
    my $text1 = $page->text();
    $text1->font($fontTR, 20);  # 20 pt Times Roman bold

    $text1->fillcolor('black');
    $grfx1->strokecolor('blue');
    $grfx1->fillcolor('orange');

    $text1->translate(50,700);
    $text1->text_left("This text should be under everything.");

    $grfx1->circle(100,690, 30);
    $grfx1->fillstroke();

    $text1->translate(50,670);
    $text1->text_left("This text should be over the ball and under the bar.");

    $grfx1->rect(160,660, 20,70);
    $grfx1->fillstroke();

    % ---------------- group 1: define graphics object first, then text
    11 0 obj << /Length 690 >> stream   % obj 11 is graphics for (1)
    0 0 1 RG                            % stroke blue
    1 0.647059 0 rg                     % fill orange
    130 690 m ... c h B                 % draw and fill circle
    160 660 20 70 re B                  % draw and fill bar
    endstream endobj

    12 0 obj << /Length 438 >> stream   % obj 12 is text for (1)
    BT
    /TiCBA 20 Tf                        % Times Roman Bold 20pt
    0 0 0 rg                            % fill black
    1 0 0 1 50 700 Tm                   % position text
    <0037 ... 0011> Tj                  % "under" line
    1 0 0 1 50 670 Tm                   % position text
    <0037 ... 0011> Tj                  % "over" line
    ET
    endstream endobj
262
263 The second group is the same as the first, with the only difference
264 being that the text object was declared first, and then the graphics
265 object. The result is that the two text lines are rendered first, and
266 then the disc and bar are drawn over them.
267
    # ----------------------------
    # 2. (1) again, with graphics and text order reversed

    my $text2 = $page->text();
    my $grfx2 = $page->gfx();
    $text2->font($fontTR, 20);  # 20 pt Times Roman bold

    $text2->fillcolor('black');
    $grfx2->strokecolor('blue');
    $grfx2->fillcolor('orange');

    $text2->translate(50,600);
    $text2->text_left("This text should be under everything.");

    $grfx2->circle(100,590, 30);
    $grfx2->fillstroke();

    $text2->translate(50,570);
    $text2->text_left("This text should be over the ball and under the bar.");

    $grfx2->rect(160,560, 20,70);
    $grfx2->fillstroke();

    % ---------------- group 2: define text object first, then graphics
    13 0 obj << /Length 438 >> stream   % obj 13 is text for (2)
    BT
    /TiCBA 20 Tf                        % Times Roman Bold 20pt
    0 0 0 rg                            % fill black
    1 0 0 1 50 600 Tm                   % position text
    <0037 ... 0011> Tj                  % "under" line
    1 0 0 1 50 570 Tm                   % position text
    <0037 ... 0011> Tj                  % "over" line
    ET
    endstream endobj

    14 0 obj << /Length 690 >> stream   % obj 14 is graphics for (2)
    0 0 1 RG                            % stroke blue
    1 0.647059 0 rg                     % fill orange
    130 590 m ... h B                   % draw and fill circle
    160 560 20 70 re B                  % draw and fill bar
    endstream endobj
309
310 The third group defines two text and two graphics objects, in the order
311 that they are expected in. The "under" text line is output first, then
312 the orange disc graphics is output, partly covering the text. The
313 "over" text line is now output -- it's actually over the disc, but is
314 orange because the previous object stream (first graphics object) left
315 the fill color (also used for text) as orange, because we didn't
316 explicitly set the fill color before outputting the second text line.
317 This is not "inheritance" so much as it is whatever the graphics
318 (drawing) state (used for both "graphics" and "text") is left in at the
319 end of one object, it's the state at the beginning of the next object.
320 If you wish to control this, consider surrounding the graphics or text
321 calls with "save()" and "restore()" calls to save and restore (push and
322 pop) the graphics state to what it was at the "save()". Finally, the
323 bar is drawn over everything.
324
    # ----------------------------
    # 3. (2) again, with two graphics and two text objects

    my $text3 = $page->text();
    my $grfx3 = $page->gfx();
    $text3->font($fontTR, 20);  # 20 pt Times Roman bold
    my $text4 = $page->text();
    my $grfx4 = $page->gfx();
    $text4->font($fontTR, 20);  # 20 pt Times Roman bold

    $text3->fillcolor('black');
    $grfx3->strokecolor('blue');
    $grfx3->fillcolor('orange');
    # $text4->fillcolor('yellow');
    # $grfx4->strokecolor('red');
    # $grfx4->fillcolor('purple');

    $text3->translate(50,500);
    $text3->text_left("This text should be under everything.");

    $grfx3->circle(100,490, 30);
    $grfx3->fillstroke();

    $text4->translate(50,470);
    $text4->text_left("This text should be over the ball and under the bar.");

    $grfx4->rect(160,460, 20,70);
    $grfx4->fillstroke();

    % ---------------- group 3: define text1, graphics1, text2, graphics2
    15 0 obj << /Length 206 >> stream   % obj 15 is text1 for (3)
    BT
    /TiCBA 20 Tf                        % Times Roman Bold 20pt
    0 0 0 rg                            % fill black
    1 0 0 1 50 500 Tm                   % position text
    <0037 ... 0011> Tj                  % "under" line
    ET
    endstream endobj

    16 0 obj << /Length 671 >> stream   % obj 16 is graphics1 for (3) circle
    0 0 1 RG                            % stroke blue
    1 0.647059 0 rg                     % fill orange
    130 490 m ... h B                   % draw and fill circle
    endstream endobj

    17 0 obj << /Length 257 >> stream   % obj 17 is text2 for (3)
    BT
    /TiCBA 20 Tf                        % Times Roman Bold 20pt
    1 0 0 1 50 470 Tm                   % position text
    <0037 ... 0011> Tj                  % "over" line
    ET
    endstream endobj

    18 0 obj << /Length 20 >> stream    % obj 18 is graphics for (3) bar
    160 460 20 70 re B                  % draw and fill bar
    endstream endobj
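
The "save()" and "restore()" approach mentioned for the third group might
look like the following standalone sketch. It is not part of the objorder
example: it uses a core font so that no font file path is needed, and the
output file name is arbitrary. The color changes are confined to the
graphics object, so the text that follows is not drawn in orange.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf = PDF::Builder->new(-compress => 'none');
$pdf->mediabox('Letter');
my $page = $pdf->page();
my $font = $pdf->corefont('Helvetica-Bold');

my $grfx = $page->gfx();     # declared first, so rendered first
my $text = $page->text();    # declared second, so rendered on top

# save() pushes the graphics state and restore() pops it, so the
# blue stroke and orange fill do not carry over into $text.
$grfx->save();
$grfx->strokecolor('blue');
$grfx->fillcolor('orange');
$grfx->circle(100,490, 30);
$grfx->fillstroke();
$grfx->restore();

# No fillcolor() call here: the text uses the restored fill color.
$text->font($font, 20);
$text->translate(50,470);
$text->text('This text is drawn over the disc.');

$pdf->saveas('saverestore.pdf');
```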
381
382 The fourth group is the same as the third, except that we define the
383 fill color for the text in the second line. This makes it clear that
384 the "over" line (in yellow) was written after the orange disc, and
385 still before the bar.
386
    # ----------------------------
    # 4. (3) again, a new set of colors for second group
    # (new variable names, to avoid redeclaring those used in group 3)

    my $text5 = $page->text();
    my $grfx5 = $page->gfx();
    $text5->font($fontTR, 20);  # 20 pt Times Roman bold
    my $text6 = $page->text();
    my $grfx6 = $page->gfx();
    $text6->font($fontTR, 20);  # 20 pt Times Roman bold

    $text5->fillcolor('black');
    $grfx5->strokecolor('blue');
    $grfx5->fillcolor('orange');
    $text6->fillcolor('yellow');
    $grfx6->strokecolor('red');
    $grfx6->fillcolor('purple');

    $text5->translate(50,400);
    $text5->text_left("This text should be under everything.");

    $grfx5->circle(100,390, 30);
    $grfx5->fillstroke();

    $text6->translate(50,370);
    $text6->text_left("This text should be over the ball and under the bar.");

    $grfx6->rect(160,360, 20,70);
    $grfx6->fillstroke();

    % ---------------- group 4: define text1, graphics1, text2, graphics2 with colors for 2
    19 0 obj << /Length 206 >> stream   % obj 19 is text1 for (4)
    BT
    /TiCBA 20 Tf                        % Times Roman Bold 20pt
    0 0 0 rg                            % fill black
    1 0 0 1 50 400 Tm                   % position text
    <0037 ... 0011> Tj                  % "under" line
    ET
    endstream endobj

    20 0 obj << /Length 671 >> stream   % obj 20 is graphics1 for (4) circle
    0 0 1 RG                            % stroke blue
    1 0.647059 0 rg                     % fill orange
    130 390 m ... h B                   % draw and fill circle
    endstream endobj

    21 0 obj << /Length 266 >> stream   % obj 21 is text2 for (4)
    BT
    /TiCBA 20 Tf                        % Times Roman Bold 20pt
    1 1 0 rg                            % fill yellow
    1 0 0 1 50 370 Tm                   % position text
    <0037 ... 0011> Tj                  % "over" line
    ET
    endstream endobj

    22 0 obj << /Length 52 >> stream    % obj 22 is graphics for (4) bar
    1 0 0 RG                            % stroke red
    0.498039 0 0.498039 rg              % fill purple
    160 360 20 70 re B                  % draw and fill rectangle (bar)
    endstream endobj

    # ----------------------------
    $pdf->saveas("$fname.pdf");
449
450 The separation of text and graphics means that only some text methods
451 are available in a graphics object, and only some graphics methods are
452 available in a text object. There is much overlap, but they differ.
453 There's really no reason the code couldn't have been written (in
454 PDF::API2, or earlier) as outputting to a single object, which would
455 keep everything in the same order as the method calls. An advantage
456 would be less object and stream overhead in the PDF file. The only
457 drawback might be that an object might more easily overflow and require
458 splitting into multiple objects, but that should be rare.
459
460 You should always be able to manually split an object by simply ending
461 output to the first object, and picking up with output to the second
462 object, so long as it was created immediately after the first object.
463 The graphics state at the end of the first object should be the initial
464 state at the beginning of the second object. However, use caution when
465 dealing with text objects -- the PDF specification states that the Text
466 matrices are not carried over from one object to the next (BT resets
467 them), so you may need to reset some settings.
468
    $grfx1 = $page->gfx();
    $grfx2 = $page->gfx();
    # write a huge amount of stuff to $grfx1
    # write a huge amount of stuff to $grfx2, picking up where $grfx1 left off
473
474 In any case, now that you understand the rendering order and how the
475 order of object declarations affects it, how text and graphics are
476 drawn can now be completely controlled as desired. There is really no
477 need to add another "both" type object that will handle all graphics
478 and text objects, as that would probably be a major code bloat for very
479 little benefit. However, it could be considered in the future if there
480 is a demonstrated need for it, such as serious PDF file size bloat due
481 to the extra object overhead when interleaving text and graphics
482 output.
483
484 PDF Versions Supported
485 When creating a PDF file using the functions in PDF::Builder, the
486 output is marked as PDF 1.4. This does not mean that all PDF
487 functionality up through 1.4 is supported! There are almost surely
488 features missing as far back as the PDF 1.0 standard.
489
490 The big problem is when a PDF of version 1.5 or higher is imported or
491 opened in PDF::Builder. If it contains content that is actually
492 unsupported by this software, there is a chance that something will
493 break. This does not guarantee that a PDF marked as "1.7" will go down
494 in flames when read by PDF::Builder, or that a PDF written back out
495 will break in a Reader, but the possibility is there. Much PDF writer
496 software simply marks its output as the highest version of PDF at the
497 time (usually 1.7), even if there is no content beyond, say, 1.2.
498 There is some handling of PDF 1.5 items in PDF::Builder, such as cross
499 reference streams, but support beyond 1.4 is very limited. All we can
500 say is to be careful when handling PDFs whose version is above 1.4, and
501 test thoroughly, as they may break at some point.
502
PDF::Builder includes a simple version control mechanism, where the
initial PDF version to be output (default 1.4) can be set by the
programmer. Reading an input PDF whose version is higher than the
current output level (1.4 by default) will trigger a warning (which can
be suppressed) that the output level will be raised to that version.
The use of PDF features that require a version higher than the current
output level will likewise trigger a warning that the output level is
to be raised to the necessary level. If this is not desired, you should
avoid using PDF features that require a version higher than the desired
PDF output level.
512
513 History
514 PDF::API2 was originally written by Alfred Reibenschuh, derived from
515 Martin Hosken's Text::PDF via the Text::PDF::API wrapper. In 2009,
516 Otto Hirr started the PDF::API3 fork, but it never went anywhere. In
517 2011, PDF::API2 maintenance was taken over by Steve Simms. In 2017,
518 PDF::Builder was forked by Phil M. Perry, who desired a more aggressive
519 schedule of new features and bug fixes than Simms was providing.
520
521 At Simms's request, the name of the new offering was changed from
522 PDF::API4 to PDF::Builder, to reduce the chance of confusion due to
523 parallel development. Perry's intent is to keep all internal methods
524 as upwardly compatible with PDF::API2 as possible, although it is
525 likely that there will be some drift (incompatibilities) over time. At
526 least initially, any program written based on PDF::API2 should be
527 convertible to PDF::Builder simply by changing "API2" anywhere it
528 occurs to "Builder". See the INFO/KNOWN_INCOMP known incompatibilities
529 file for further information.
530
After saving a file...
Note that a PDF object such as $pdf cannot continue to be used after
saving an output PDF file or string with "$pdf->save()",
"$pdf->saveas()", or "$pdf->stringify()". Some cleanup and other
operations are done internally which make the object unusable for
further operations. If you try to keep using the PDF object, you will
likely receive an error message such as 'Can't call method "new_obj" on
an undefined value'.
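
One way to stay clear of this problem is to create a fresh PDF object for
each output file, for example by wrapping the work in a subroutine so that
the object goes out of scope right after it is saved. The following is only
a sketch; "write_pdf" and the file names are hypothetical.

```perl
use strict;
use warnings;
use PDF::Builder;

# Build and save one single-page PDF. The PDF object exists only
# inside this subroutine, so it cannot be reused after saveas().
sub write_pdf {
    my ($file, $message) = @_;

    my $pdf  = PDF::Builder->new();
    my $page = $pdf->page();
    my $text = $page->text();
    $text->font($pdf->corefont('Helvetica'), 12);
    $text->translate(72, 720);
    $text->text($message);

    $pdf->saveas($file);    # $pdf must not be used after this point
    return $file;
}

write_pdf('first.pdf',  'First document');
write_pdf('second.pdf', 'Second document');
```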
539
IntegrityCheck
The PDF::Builder methods that open an existing PDF file pass it through
the integrity checker method, "$self->IntegrityCheck(level, content)".
This method serves two purposes: 1) to find any "/Version" settings
that override the PDF version found in the PDF heading, and 2) to
perform some basic validations on the contents of the PDF.
546
The "level" parameter accepts the following values:

0 = Do not output any diagnostic messages; just return any version
    override.

1 = Output error-level (serious) diagnostic messages, as well as
    returning any version override. Errors include:

    * The /Root object was not specified anywhere, or it was specified
      but the indicated object was not found.
    * An object claims another object as its child (/Kids list), but
      another object has already claimed that child.
    * An object claims a child, but that child does not list a Parent,
      or the child lists a different Parent.

2 = Output error- (serious) and warning- (less serious) level
    diagnostic messages, as well as returning any version override.
    This is the default.

3 = Output error- (serious), warning- (less serious), and note-
    (informational) level diagnostic messages, as well as returning any
    version override. Notes include:

    * The (optional) /Info object was not specified anywhere, or it was
      specified but the indicated object was not found.
    * An object was referenced, but no entry for it was found among the
      objects. (This may be OK if the object is not defined, or is on
      the free list, as the reference will then be ignored.)
    * An object is defined, but it appears that no other object is
      referencing it.

4 = Output error-, warning-, and note-level diagnostic messages, as
    well as returning any version override. Also dump the diagnostic
    data structure.

5 = Output error-, warning-, and note-level diagnostic messages, as
    well as returning any version override. Also dump the diagnostic
    data structure and the $self data structure (generally useful only
    if you have already read in the PDF file).
580
581 The version is a string (e.g., '1.5') if found, otherwise "undef"
582 (undefined value) is returned.
583
584 For controlling the "automatic" call to IntegrityCheck (via opens), the
585 level may be given with the option (flag) "-diaglevel => n", where "n"
586 is between 0 and 5.
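
As a rough sketch of passing this option, the following writes a small PDF
and then reopens it with diagnostic messages switched off (level 0). The
file name is arbitrary, and it is assumed here that "open()" takes the
option after the file name, in the same way as other options.

```perl
use strict;
use warnings;
use PDF::Builder;

# Create a small one-page PDF to read back in.
my $file = 'diag_demo.pdf';
my $out  = PDF::Builder->new();
$out->page();
$out->saveas($file);

# Reopen it; level 0 silences the integrity checker's messages.
my $in = PDF::Builder->open($file, -diaglevel => 0);
print "opened $file\n" if defined $in;
```

Raising the level to 3 or higher on a problem file should show the notes
and warnings listed above.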
587
588 Preferences - set user display preferences
589 $pdf->preferences(%options)
590 Controls viewing preferences for the PDF.
591
592 Page Mode Options
593
594 -fullscreen
595 Full-screen mode, with no menu bar, window controls, or any
596 other window visible.
597
598 -thumbs
599 Thumbnail images visible.
600
601 -outlines
602 Document outline visible.
603
604 Page Layout Options
605
606 -singlepage
607 Display one page at a time.
608
609 -onecolumn
610 Display the pages in one column.
611
-twocolumnleft
    Display the pages in two columns, with odd-numbered pages on the
    left.

-twocolumnright
    Display the pages in two columns, with odd-numbered pages on the
    right.
619
620 Viewer Options
621
-hidetoolbar
    Specifies whether to hide tool bars.

-hidemenubar
    Specifies whether to hide menu bars.

-hidewindowui
    Specifies whether to hide user interface elements.

-fitwindow
    Specifies whether to resize the document's window to the size of
    the displayed page.

-centerwindow
    Specifies whether to position the document's window in the center
    of the screen.

-displaytitle
    Specifies whether the window's title bar should display the
    document title taken from the Title entry of the document
    information dictionary.
643
644 -afterfullscreenthumbs
645 Thumbnail images visible after Full-screen mode.
646
647 -afterfullscreenoutlines
648 Document outline visible after Full-screen mode.
649
650 -printscalingnone
651 Set the default print setting for page scaling to none.
652
653 -simplex
654 Print single-sided by default.
655
656 -duplexflipshortedge
657 Print duplex by default and flip on the short edge of the
658 sheet.
659
660 -duplexfliplongedge
661 Print duplex by default and flip on the long edge of the sheet.
662
663 Initial Page Options
664
-firstpage => [ $page, %options ]
    Specifies the page (either a page number or a page object) to be
    displayed, plus one of the following options:
668
669 -fit => 1
670 Display the page designated by page, with its contents
671 magnified just enough to fit the entire page within the window
672 both horizontally and vertically. If the required horizontal
673 and vertical magnification factors are different, use the
674 smaller of the two, centering the page within the window in the
675 other dimension.
676
677 -fith => $top
678 Display the page designated by page, with the vertical
679 coordinate top positioned at the top edge of the window and the
680 contents of the page magnified just enough to fit the entire
681 width of the page within the window.
682
683 -fitv => $left
684 Display the page designated by page, with the horizontal
685 coordinate left positioned at the left edge of the window and
686 the contents of the page magnified just enough to fit the
687 entire height of the page within the window.
688
689 -fitr => [ $left, $bottom, $right, $top ]
690 Display the page designated by page, with its contents
691 magnified just enough to fit the rectangle specified by the
692 coordinates left, bottom, right, and top entirely within the
693 window both horizontally and vertically. If the required
694 horizontal and vertical magnification factors are different,
695 use the smaller of the two, centering the rectangle within the
696 window in the other dimension.
697
698 -fitb => 1
699 Display the page designated by page, with its contents
700 magnified just enough to fit its bounding box entirely within
701 the window both horizontally and vertically. If the required
702 horizontal and vertical magnification factors are different,
703 use the smaller of the two, centering the bounding box within
704 the window in the other dimension.
705
706 -fitbh => $top
707 Display the page designated by page, with the vertical
708 coordinate top positioned at the top edge of the window and the
709 contents of the page magnified just enough to fit the entire
710 width of its bounding box within the window.
711
712 -fitbv => $left
713 Display the page designated by page, with the horizontal
714 coordinate left positioned at the left edge of the window and
715 the contents of the page magnified just enough to fit the
716 entire height of its bounding box within the window.
717
718 -xyz => [ $left, $top, $zoom ]
719 Display the page designated by page, with the coordinates
720 (left, top) positioned at the top-left corner of the window and
721 the contents of the page magnified by the factor zoom. A zero
722 (0) value for any of the parameters left, top, or zoom
723 specifies that the current value of that parameter is to be
724 retained unchanged.
725
726 Example
727
728 $pdf->preferences(
729 -fullscreen => 1,
730 -onecolumn => 1,
731 -afterfullscreenoutlines => 1,
732 -firstpage => [$page, -fit => 1],
733 );
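
The "use the smaller of the two" rule given for -fitr and -fitb above comes down to a little arithmetic. The following is only a sketch of the calculation a reader might perform. The fit_rect() helper and the window dimensions are hypothetical, and are not part of PDF::Builder.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper: how a viewer might fit a rectangle in a window.
# All values are in the same units (e.g., points).
sub fit_rect {
    my ($left, $bottom, $right, $top, $win_w, $win_h) = @_;

    my $zoom_h = $win_w / ($right - $left);   # factor that fits the width
    my $zoom_v = $win_h / ($top - $bottom);   # factor that fits the height
    my $zoom   = min($zoom_h, $zoom_v);       # the smaller factor fits both

    # leftover window space is split evenly, centering the rectangle
    my $pad_x = ($win_w - $zoom * ($right - $left)) / 2;
    my $pad_y = ($win_h - $zoom * ($top - $bottom)) / 2;
    return ($zoom, $pad_x, $pad_y);
}

# a 200 x 100 rectangle shown in a 400 x 400 window
my ($zoom, $pad_x, $pad_y) = fit_rect(0, 0, 200, 100, 400, 400);
print "zoom=$zoom pad_x=$pad_x pad_y=$pad_y\n";
```

Here the width is the limiting dimension, so the rectangle fills the window horizontally and is centered vertically.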
734
735 info Example
736 %h = $pdf->info(
737 'Author' => "Alfred Reibenschuh",
738 'CreationDate' => "D:20020911000000+01'00'",
739 'ModDate' => "D:YYYYMMDDhhmmssOHH'mm'",
740 'Creator' => "fredos-script.pl",
741 'Producer' => "PDF::Builder",
742 'Title' => "some Publication",
743 'Subject' => "perl ?",
744 'Keywords' => "all good things are pdf"
745 );
746 print "Author: $h{'Author'}\n";
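
The date values above follow the pattern "D:YYYYMMDDhhmmssOHH'mm'". As a minimal sketch, assuming you are content to record times in UTC, such a string can be built with the standard POSIX module. The pdf_date() helper is hypothetical, and is not part of PDF::Builder.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format an epoch time as a PDF date string in UTC.
sub pdf_date {
    my ($epoch) = @_;
    # D:YYYYMMDDhhmmss, followed by a zero offset from UTC
    return strftime("D:%Y%m%d%H%M%S", gmtime($epoch)) . "+00'00'";
}

print pdf_date(time()), "\n";
```

The result could then be given as the 'CreationDate' or 'ModDate' value in an info() call like the one above.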
747
748 XMP XML example
749 $xml = $pdf->xmpMetadata();
print "PDF metadata reads: $xml\n";
751 $xml=<<EOT;
752 <?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
753 <?adobe-xap-filters esc="CRLF"?>
754 <x:xmpmeta
755 xmlns:x='adobe:ns:meta/'
756 x:xmptk='XMP toolkit 2.9.1-14, framework 1.6'>
757 <rdf:RDF
758 xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
759 xmlns:iX='http://ns.adobe.com/iX/1.0/'>
760 <rdf:Description
761 rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
762 xmlns:pdf='http://ns.adobe.com/pdf/1.3/'
763 pdf:Producer='Acrobat Distiller 6.0.1 for Macintosh'></rdf:Description>
764 <rdf:Description
765 rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
766 xmlns:xap='http://ns.adobe.com/xap/1.0/'
767 xap:CreateDate='2004-11-14T08:41:16Z'
768 xap:ModifyDate='2004-11-14T16:38:50-08:00'
769 xap:CreatorTool='FrameMaker 7.0'
770 xap:MetadataDate='2004-11-14T16:38:50-08:00'></rdf:Description>
771 <rdf:Description
772 rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
773 xmlns:xapMM='http://ns.adobe.com/xap/1.0/mm/'
xapMM:DocumentID='uuid:919b9378-369c-11d9-a2b5-000393c97fd8'></rdf:Description>
775 <rdf:Description
776 rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
777 xmlns:dc='http://purl.org/dc/elements/1.1/'
778 dc:format='application/pdf'>
779 <dc:description>
780 <rdf:Alt>
781 <rdf:li xml:lang='x-default'>Adobe Portable Document Format (PDF)</rdf:li>
782 </rdf:Alt>
783 </dc:description>
784 <dc:creator>
785 <rdf:Seq>
786 <rdf:li>Adobe Systems Incorporated</rdf:li>
787 </rdf:Seq>
788 </dc:creator>
789 <dc:title>
790 <rdf:Alt>
791 <rdf:li xml:lang='x-default'>PDF Reference, version 1.6</rdf:li>
792 </rdf:Alt>
793 </dc:title>
794 </rdf:Description>
795 </rdf:RDF>
796 </x:xmpmeta>
797 <?xpacket end='w'?>
798 EOT
799
800 $xml = $pdf->xmpMetadata($xml);
801 print "PDF metadata now reads: $xml\n";
802
803 "BOX" METHODS
A general note: if you specify a Media Box (or other "box") for a page
that differs from the global "box" setting, take care to define the whole
"chain" of boxes on that page, to avoid surprises. For example, defining
a global Media Box (paper size) and a global Crop Box, and then
defining a new page-level Media Box without defining a new page-level
Crop Box, may give odd results in the resultant cropping. Such
combinations are not well defined.
811
812 All dimensions in boxes default to the default User Unit, which is
813 points (1/72 inch). Note that the PDF specification limits sizes and
814 coordinates to 14400 User Units (200 inches, for the default User Unit
815 of one point), and Adobe products (so far) follow this limit for
816 Acrobat and Distiller. It is worth noting that other PDF writers and
817 readers may choose to ignore the 14400 unit limit, with or without the
818 use of a specified User Unit. Therefore, PDF::Builder does not enforce
819 any limits on coordinates -- it's your responsibility to consider what
readers and other PDF tools may be used with a PDF you produce! Also
note that earlier Acrobat readers had coordinate limits as small as
3240 User Units (45 inches), and minimum media sizes of 72 or 3 User
Units, depending on the version.
824
825 User Units
826
827 $pdf->userunit($number)
828 The default User Unit in the PDF coordinate system is one point
829 (1/72 inch). You can think of it as a scale factor to enable larger
830 (or even, smaller) documents. This method may be used (for PDF 1.6
831 and higher) to set the User Unit to some number of points. For
832 example, "userunit(72)" will set the scale multiplier to 72.0
833 points per User Unit, or 1 inch to the User Unit. Any number
834 greater than zero is acceptable, although some readers and tools
835 may not handle User Units of less than 1.0 very well.
836
837 Not all readers respect the User Unit, if you give one, or handle
838 it in exactly the same way. Adobe Distiller, for one, does not use
839 it. How User Units are handled may vary from reader to reader.
840 Adobe Acrobat, at this writing, respects User Unit in version 7.0
841 and up, but limits it to 75000 (giving a maximum document size of
842 15 million inches or 236.7 miles or 381 km). Other readers and PDF
843 tools may allow a larger (or smaller) limit.
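
The sizes quoted above follow from the 14400 User Unit limit and the fact that a User Unit is given in points (1/72 inch). A short sketch of the arithmetic follows. The max_inches() helper is hypothetical, and is not part of PDF::Builder.

```perl
use strict;
use warnings;

# Largest page dimension, in inches, at the 14400 User Unit limit.
sub max_inches {
    my ($user_unit) = @_;              # points per User Unit
    return 14400 * $user_unit / 72;    # 72 points to the inch
}

for my $uu (1, 72, 75000) {
    my $inches = max_inches($uu);
    printf "User Unit %d: %d inches (%.1f miles, %.0f km)\n",
        $uu, $inches, $inches / 63360, $inches * 0.0254 / 1000;
}
```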
844
845 Your Mileage May Vary: Some readers ignore a global User Unit
846 setting and do not have pages inherit it (PDF::Builder duplicates
847 it on each page to simulate inheritance). Some readers may give
848 spurious warnings about truncated content when a Media Box is
849 changed while User Units are being used. Some readers do strange
850 things with Crop Boxes when a User Unit is in effect.
851
852 Depending on the reader used, the effect of a larger User Unit
853 (greater than 1) may mean lower resolution (chunkier or coarser
854 appearance) in the rendered document. If you're printing something
855 the size of a highway billboard, this may not matter to you, but
856 you should be aware of the possibility (even with fractional
857 coordinates). Conversely, a User Unit of less than 1.0 (if
858 permitted) reduces the allowable size of your document, but may
859 result in greater resolution.
860
A global (PDF level) User Unit setting is inherited by each page
(an action by PDF::Builder, not necessarily automatically done by
the reader), or can be overridden by calling userunit on the page.
Do not give more than one global userunit setting, as only the last
one will be used. Setting a page's User Unit (calling "$page->userunit"
instead) is permitted, and overrides the global setting for that
page. However, many sources recommend against doing this, as
results may not be as expected (once again, depending on the quirks
of the reader).
870
871 Remember to call "userunit" before calling anything having to do
872 with page or box sizes, or coordinates. Especially when setting
873 'named' box sizes, the methods need to know the current User Unit
874 so that named page sizes (in points) may be scaled down to the
875 current User Unit.
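
As a minimal sketch of the ordering described above, assuming PDF::Builder is installed and behaves as documented here, the User Unit is set first and a named Media Box afterwards. The choice of 72 points per User Unit and the 'Letter' page size are only for illustration.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf = PDF::Builder->new();

# set the User Unit before any box: 1 User Unit = 72 points = 1 inch
$pdf->userunit(72);

# a named size is defined in points, and is scaled to the User Unit;
# the resulting coordinates are returned by the same call
my @box = $pdf->mediabox('Letter');
print "Media Box in User Units: @box\n";
```

With this ordering, the returned width and height should be in inches (User Units) rather than in points.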
876
877 Media Box
878
879 $pdf->mediabox($name)
880 $pdf->mediabox($name, -orient => 'orientation' )
881 $pdf->mediabox($w,$h)
882 $pdf->mediabox($llx,$lly, $urx,$ury)
883 ($llx,$lly, $urx,$ury) = $pdf->mediabox()
884 Sets the global Media Box (or page's Media Box, if "$page->"
885 instead). This defines the width and height (or by corner
886 coordinates, or by standard name) of the output page itself, such
887 as the physical paper size. This is normally the largest of the
888 "boxes". If any subsidiary box (within it) exceeds the media box,
889 the portion of the material or boxes outside of the Media Box will
890 be ignored. That is, the Media Box is the One Box to Rule Them All,
891 and is the overall limit for other boxes (some documentation refers
892 to the Media Box as "clipping" other boxes). In addition, the Media
893 Box defines the overall coordinate system for text and graphics
894 operations.
895
896 If no arguments are given, the current Media Box (global or page)
897 coordinates are returned instead. The former "get_mediabox" (page
898 only) function is deprecated and will likely be removed some time
899 in the future. In addition, when setting the Media Box, the
900 resulting coordinates are returned. This permits you to specify the
901 page size by a name (alias) and get the dimensions back, all in one
902 call.
903
Note that many printers cannot print all the way to the physical
905 edge of the paper, so you should plan to leave some blank margin,
906 even outside of any crop marks and bleeds. Printers and on-screen
907 readers are free to discard any content found outside the Media
908 Box, and printers may discard some material just inside the Media
909 Box.
910
911 A global Media Box is required by the PDF spec; if not explicitly
912 given, PDF::Builder will set the global Media Box to US Letter size
913 (8.5in x 11in). This is the media size that will be used for all
914 pages if you do not specify a "mediabox" call on a page. That is, a
915 global (PDF level) mediabox setting is inherited by each page, or
916 can be overridden by setting mediabox in the page. Do not give more
917 than one global mediabox setting, as only the last one will be
918 used.
919
920 If you give a single string name (e.g., 'A4'), you may optionally
921 add an orientation to turn the page 90 degrees into Landscape mode:
922 "-orient => 'L'" or "-orient => 'l'". "-orient" is the only option
923 recognized, and a string beginning with an 'L' or 'l' (for
924 Landscape) is the only value of interest (anything else is treated
925 as Portrait mode). The y axis still runs from 0 at the bottom of
926 the page to what used to be the page width (now, height) at the
927 top, and likewise for the x axis: 0 at left to (former) height at
928 the right. That is, the coordinate system is the same as before,
929 except that the height and width are different.
930
931 The lower left corner does not have to be 0,0. It can be any values
932 you want, including negative values (so long as the resulting
933 media's sides are at least one point long). "mediabox" sets the
934 coordinate system (including the origin) of the graphics and text
935 that will be drawn, as well as for subsequent "boxes". It's even
936 possible to give any two opposite corners (such as upper left and
937 lower right). The coordinate system will be rearranged (by the
938 Reader) to still be the conventional minimum "x" and "y" in the
939 lower left (i.e., you can't make "y" increase from top to bottom!).
940
941 Example:
942
943 $pdf = PDF::Builder->new();
944 $pdf->mediabox('A4'); # A4 size (595 Pt wide by 842 Pt high)
945 ...
946 $pdf->saveas('our/new.pdf');
947
948 $pdf = PDF::Builder->new();
949 $pdf->mediabox(595, 842); # A4 size, with implicit 0,0 LL corner
950 ...
951 $pdf->saveas('our/new.pdf');
952
953 $pdf = PDF::Builder->new;
954 $pdf->mediabox(0, 0, 595, 842); # A4 size, with explicit 0,0 LL corner
955 ...
956 $pdf->saveas('our/new.pdf');
957
958 See the PDF::Builder::Resource::PaperSizes source code for the full
959 list of supported names (aliases) and their dimensions in points.
960 You are free to add additional paper sizes to this file, if you
961 wish. You might want to do this if you frequently use a standard
962 page size in rotated (Landscape) mode. See also the "getPaperSizes"
963 call in PDF::Builder::Util. These names (aliases) are also usable
964 in other "box" calls, although useful only if the "box" is the same
965 size as the full media (Media Box), and you don't mind their
966 starting at 0,0.
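
The following sketch, assuming PDF::Builder is installed and behaves as documented above, sets the global Media Box by name in Landscape mode and then reads it back. The 'A4' size is only an example.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf = PDF::Builder->new();

# set the global Media Box by name, turned into Landscape mode;
# the resulting coordinates are returned by the same call
my @set = $pdf->mediabox('A4', -orient => 'L');
print "set:   @set\n";

# with no arguments, the current coordinates are returned
my @got = $pdf->mediabox();
print "query: @got\n";
```

Both lists should give the A4 dimensions with width and height exchanged.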
967
968 Crop Box
969
970 $pdf->cropbox($name)
971 $pdf->cropbox($name, -orient => 'orientation')
972 $pdf->cropbox($w,$h)
973 $pdf->cropbox($llx,$lly, $urx,$ury)
974 ($llx,$lly, $urx,$ury) = $pdf->cropbox()
975 Sets the global Crop Box (or page's Crop Box, if "$page->"
976 instead). This will define the media size to which the output will
977 later be clipped. Note that this does not itself output any crop
978 marks to guide cutting of the paper! PDF Readers should consider
979 this to be the visible portion of the page, and anything found
980 outside it may be clipped (invisible). By default, it is equal to
981 the Media Box, but may be defined to be smaller, in the coordinate
982 system set by the Media Box. A global setting will be inherited by
983 each page, but can be overridden on a per-page basis.
984
985 A Reader or Printer may choose to discard any clipped (invisible)
986 part of the page, and show only the area within the Crop Box. For
987 example, if your page Media Box is A4 (0,0 to 595,842 Points), and
988 your Crop Box is (100,100 to 495,742), a reader such as Adobe
989 Acrobat Reader may show you a page 395 by 642 Points in size (i.e.,
990 just the visible area of your page). Other Readers may show you the
991 full media size (Media Box) and a 100 Point wide blank area (in
992 this example) around the visible content.
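
The numbers in this example are easy to check. The sketch below only does the arithmetic and makes no PDF::Builder calls. The box_size() helper is hypothetical.

```perl
use strict;
use warnings;

# width and height of a box given as (llx, lly, urx, ury)
sub box_size {
    my ($llx, $lly, $urx, $ury) = @_;
    return ($urx - $llx, $ury - $lly);
}

my @media = (0, 0, 595, 842);        # A4 Media Box
my @crop  = (100, 100, 495, 742);    # Crop Box from the example

my ($w, $h) = box_size(@crop);
print "visible area: $w by $h points\n";

# the blank border that some readers show around the visible content
printf "border: left %d, bottom %d, right %d, top %d\n",
    $crop[0] - $media[0], $crop[1] - $media[1],
    $media[2] - $crop[2], $media[3] - $crop[3];
```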
993
994 If no arguments are given, the current Crop Box (global or page)
995 coordinates are returned instead. The former "get_cropbox" (page
996 only) function is deprecated and will likely be removed some time
997 in the future. If a Crop Box has not been defined, the Media Box
998 coordinates (which always exist) will be returned instead. In
999 addition, when setting the Crop Box, the resulting coordinates are
1000 returned. This permits you to specify the crop box by a name
1001 (alias) and get the dimensions back, all in one call.
1002
1003 Do not confuse the Crop Box with the "Trim Box", which shows where
1004 printed paper is expected to actually be cut. Some PDF Readers may
1005 reduce the visible "paper" background to the size of the crop box;
1006 others may simply omit any content outside it. Either way, you
1007 would lose any trim or crop marks, printer instructions, color
alignment dots, or other content outside the Crop Box. A good use
of the Crop Box would be to limit printing to the area where a printer
1010 can reliably put down ink, and leave white the edge areas where
1011 paper-handling mechanisms prevent ink or toner from being applied.
1012 This would keep you from accidentally putting valuable content in
1013 an area where a printer will refuse to print, yet permit you to
1014 include a bleed area and space for printer's marks and
1015 instructions. Needless to say, if your printer cannot print to the
1016 very edge of the paper, you will need to trim (cut) the printed
1017 sheets to get true bleeds.
1018
1019 A global (PDF level) cropbox setting is inherited by each page, or
1020 can be overridden by setting cropbox in the page. As with
1021 "mediabox", only one crop box may be set at this (PDF) level. As
1022 with "mediabox", a named media size may have an orientation (l or
1023 L) for Landscape mode. Note that the PDF level global Crop Box
1024 will be used even if the page gets its own Media Box. That is, the
1025 page's Crop Box inherits the global Crop Box, not the page Media
1026 Box, even if the page has its own media size! If you set the page's
1027 own Media Box, you should consider also explicitly setting the page
1028 Crop Box (and other boxes).
1029
1030 Bleed Box
1031
1032 $pdf->bleedbox($name)
1033 $pdf->bleedbox($name, -orient => 'orientation')
1034 $pdf->bleedbox($w,$h)
1035 $pdf->bleedbox($llx,$lly, $urx,$ury)
1036 ($llx,$lly, $urx,$ury) = $pdf->bleedbox()
1037 Sets the global Bleed Box (or page's Bleed Box, if "$page->"
1038 instead). This is typically used in printing on paper, where you
1039 want ink or color (such as thumb tabs) to be printed a bit beyond
1040 the final paper size, to ensure that the cut paper bleeds (the cut
1041 goes through the ink), rather than accidentally leaving some white
1042 paper visible outside. Allow enough "bleed" over the expected trim
1043 line to account for minor variations in paper handling, folding,
and cutting, to avoid showing white paper at the edge. The Bleed
1045 Box is where printing could actually extend to; the Trim Box is
1046 normally within it, where the paper would actually be cut. The
1047 default value is equal to the Crop Box, but is often a bit smaller.
1048 The space between the Bleed Box and the Crop Box is available for
1049 printer instructions, color alignment dots, etc., while crop marks
1050 (trim guides) are at least partly within the bleed area (and should
1051 be printed after content is printed).
1052
1053 If no arguments are given, the current Bleed Box (global or page)
1054 coordinates are returned instead. The former "get_bleedbox" (page
1055 only) function is deprecated and will likely be removed some time
1056 in the future. If a Bleed Box has not been defined, the Crop Box
1057 coordinates (if defined) will be returned, otherwise the Media Box
1058 coordinates (which always exist) will be returned. In addition,
1059 when setting the Bleed Box, the resulting coordinates are returned.
1060 This permits you to specify the bleed box by a name (alias) and get
1061 the dimensions back, all in one call.
1062
1063 A global (PDF level) bleedbox setting is inherited by each page, or
1064 can be overridden by setting bleedbox in the page. As with
1065 "mediabox", only one bleed box may be set at this (PDF) level. As
1066 with "mediabox", a named media size may have an orientation (l or
1067 L) for Landscape mode. Note that the PDF level global Bleed Box
1068 will be used even if the page gets its own Crop Box. That is, the
1069 page's Bleed Box inherits the global Bleed Box, not the page Crop
1070 Box, even if the page has its own media size! If you set the page's
1071 own Media Box or Crop Box, you should consider also explicitly
1072 setting the page Bleed Box (and other boxes).
1073
1074 Trim Box
1075
1076 $pdf->trimbox($name)
1077 $pdf->trimbox($name, -orient => 'orientation')
1078 $pdf->trimbox($w,$h)
1079 $pdf->trimbox($llx,$lly, $urx,$ury)
1080 ($llx,$lly, $urx,$ury) = $pdf->trimbox()
1081 Sets the global Trim Box (or page's Trim Box, if "$page->"
1082 instead). This is supposed to be the actual dimensions of the
1083 finished page (after trimming of the paper). In some production
1084 environments, it is useful to have printer's instructions, cut
1085 marks, and so on outside of the trim box. The default value is
equal to the Crop Box, but is often a bit smaller than any Bleed Box,
1087 to allow the desired "bleed" effect.
1088
1089 If no arguments are given, the current Trim Box (global or page)
1090 coordinates are returned instead. The former "get_trimbox" (page
1091 only) function is deprecated and will likely be removed some time
1092 in the future. If a Trim Box has not been defined, the Crop Box
1093 coordinates (if defined) will be returned, otherwise the Media Box
1094 coordinates (which always exist) will be returned. In addition,
1095 when setting the Trim Box, the resulting coordinates are returned.
1096 This permits you to specify the trim box by a name (alias) and get
1097 the dimensions back, all in one call.
1098
1099 A global (PDF level) trimbox setting is inherited by each page, or
1100 can be overridden by setting trimbox in the page. As with
1101 "mediabox", only one trim box may be set at this (PDF) level. As
1102 with "mediabox", a named media size may have an orientation (l or
1103 L) for Landscape mode. Note that the PDF level global Trim Box
1104 will be used even if the page gets its own Crop Box. That is, the
1105 page's Trim Box inherits the global Trim Box, not the page Crop
1106 Box, even if the page has its own media size! If you set the page's
1107 own Media Box or Crop Box, you should consider also explicitly
1108 setting the page Trim Box (and other boxes).
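
Since the Bleed Box is normally a little larger than the Trim Box, one way to keep the two consistent is to derive one from the other. This is only a sketch: the expand_box() helper is hypothetical, and the Trim Box coordinates and the 9 point (1/8 inch) bleed are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: grow a box (llx, lly, urx, ury) by $margin on all sides.
sub expand_box {
    my ($llx, $lly, $urx, $ury, $margin) = @_;
    return ($llx - $margin, $lly - $margin, $urx + $margin, $ury + $margin);
}

my @trim  = (36, 36, 576, 756);      # finished page, within a US Letter sheet
my @bleed = expand_box(@trim, 9);    # assumed bleed of 9 points (1/8 inch)

print "Trim Box:  @trim\n";
print "Bleed Box: @bleed\n";
```

The two lists could then be given to the four-coordinate forms of "trimbox" and "bleedbox" shown above.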
1109
1110 Art Box
1111
1112 $pdf->artbox($name)
1113 $pdf->artbox($name, -orient => 'orientation')
1114 $pdf->artbox($w,$h)
1115 $pdf->artbox($llx,$lly, $urx,$ury)
1116 ($llx,$lly, $urx,$ury) = $pdf->artbox()
1117 Sets the global Art Box (or page's Art Box, if "$page->" instead).
1118 This is supposed to define "the extent of the page's meaningful
1119 content (including [margins])". It might exclude some content, such
1120 as Headlines or headings. Any binding or punched-holes margin would
1121 typically be outside of the Art Box, as would be page numbers and
1122 running headers and footers. The default value is equal to the Crop
1123 Box, although normally it would be no larger than any Trim Box. The
1124 Art Box may often be used for defining "important" content (e.g.,
1125 excluding advertisements) that may or may not be brought over to
1126 another page (e.g., N-up printing).
1127
1128 If no arguments are given, the current Art Box (global or page)
1129 coordinates are returned instead. The former "get_artbox" (page
1130 only) function is deprecated and will likely be removed some time
1131 in the future. If an Art Box has not been defined, the Crop Box
1132 coordinates (if defined) will be returned, otherwise the Media Box
1133 coordinates (which always exist) will be returned. In addition,
1134 when setting the Art Box, the resulting coordinates are returned.
1135 This permits you to specify the art box by a name (alias) and get
1136 the dimensions back, all in one call.
1137
1138 A global (PDF level) artbox setting is inherited by each page, or
1139 can be overridden by setting artbox in the page. As with
1140 "mediabox", only one art box may be set at this (PDF) level. As
1141 with "mediabox", a named media size may have an orientation (l or
1142 L) for Landscape mode. Note that the PDF level global Art Box will
1143 be used even if the page gets its own Crop Box. That is, the page's
1144 Art Box inherits the global Art Box, not the page Crop Box, even if
1145 the page has its own media size! If you set the page's own Media
1146 Box or Crop Box, you should consider also explicitly setting the
1147 page Art Box (and other boxes).
1148
1149 Suggested Box Usage
1150
1151 See "examples/Boxes.pl" for an example of using boxes.
1152
1153 How you define your boxes (or let them default) is up to you, depending
1154 on whether you're duplex printing US Letter or A4 on your laser
1155 printer, to be spiral bound on the bind margin, or engaging a
1156 professional printer. In the latter case, discuss in advance with the
1157 print firm what capabilities (and limitations) they have and what
1158 information they need from a PDF file. For instance, they may not want
1159 a Crop Box defined, and may call for very specific box sizes. For large
1160 press runs, they may print multiple pages (N-up) duplexed on large web
1161 roll "signatures", which are then intricately folded and guillotined
1162 (trimmed) and bound together into books or magazines. You would usually
1163 just supply a PDF with all the pages; they would take care of the
1164 signature layout (which includes offsets and 180 degree rotations).
1165
1166 (As an aside, don't count on a printer having any particular font
1167 available, so be sure to ask. Usually they will want you to embed all
1168 fonts used, but ask first, and double-check before handing over the
1169 print job! TTF/OTF fonts ("ttfont()") are embedded by default, but
1170 other fonts (core, ps, bdf, cjk) are not! A printer may have a core
1171 font collection, but they are free to substitute a "workalike" font for
1172 any given core font, and the results may not match what you saw on your
1173 PC!)
1174
1175 On the assumption that you're using a single sheet (US Letter or A4)
1176 laser or inkjet printer, are you planning to trim each sheet down to a
1177 smaller final size? If so, you can do true bleeds by defining a Trim
1178 Box and a slightly larger Bleed Box. You would print bleeds (all the
1179 way to the finished edge) out to the Bleed Box, but nothing is enforced
1180 about the Bleed Box. At the other end of the spectrum, you would define
1181 the Media Box to be the physical paper size being printed on. Most
1182 printers reserve a little space on the sides (and possibly top and
1183 bottom) for paper handling, so it is often good to define your Crop Box
1184 as the printable area. Remember that the Media Box sets the coordinate
1185 system used, so you still need to avoid going outside the Crop Box with
1186 content (most readers and printers will not show any ink outside of the
1187 Crop Box). Whether or not you define a Crop Box, you're going to almost
1188 always end up with white paper on at least the sides.
1189
1190 For small in-house jobs, you probably won't need color alignment dots
1191 and other such professional instructions and information between the
1192 Bleed Box and the Crop Box, but crop marks for trimming (if used)
1193 should go just outside the Trim Box (partly or wholly within the Bleed
1194 Box), and be drawn after all content. If you're not trimming the paper,
1195 don't try to do any bleed effects (including solid background color
1196 pages/covers), as you will usually have a white edge around the sheet
anyway. Don't count on a PDF document only ever being displayed (where
you can do things like bleed all the way to the media edge) and never
being physically printed. Finally, for single sheet printing, an Art Box is
1200 probably unnecessary, but if you're combining pages into N-up prints,
1201 or doing other manipulations, it may be useful.
1202
1203 Box Inheritance
1204
1205 What Media, Crop, Bleed, Trim, and Art Boxes a page gets can be a
1206 little complicated. Note that usually, only the Media and Crop Boxes
1207 will have a clear visual effect. The visual effect of the other boxes
1208 (if any) may be very subtle.
1209
1210 First, everything is set at the global (PDF) level. The Media Box is
1211 always defined, and defaults to US Letter (8.5 inches wide by 11 inches
1212 high). The global Crop Box inherits the Media Box, unless explicitly
1213 defined. The Bleed, Trim, and Art Boxes inherit the Crop Box, unless
1214 explicitly defined. A global box should only be defined once, as the
1215 last one defined is the one that will be written to the PDF!
1216
1217 Second, a page inherits the global boxes, for its initial settings. You
1218 may call any of the box set methods ("cropbox", "trimbox", etc.) to
1219 explicitly set (override) any box for this page. Note that setting a
1220 new Media Box for the page does not reset the page's Crop Box -- it
1221 still uses whatever it inherited from the global Crop Box. You would
1222 need to explicitly set the page's Crop Box if you want a different
1223 setting. Likewise, the page's Bleed, Trim, and Art Boxes will not be
1224 reset by a new page Crop Box -- they will still inherit from the global
1225 (PDF) settings.
1226
1227 Third, the page Media Box (the one actually used for output pages),
1228 clips or limits all the other boxes to extend no larger than its size.
1229 For example, if the Media Box is US Letter, and you set a Crop Box of
1230 A4 size, the smaller of the two heights (11 inches) would be effective,
1231 and the smaller of the two widths (8.26 inches, 595 Points) would be
1232 effective. The given dimensions of a box are returned on query (get),
1233 not the effective dimensions clipped by the Media Box.
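
The clipping described in the third step amounts to intersecting each box with the Media Box. The sketch below illustrates this with the US Letter and A4 example. The clip_to_media() helper is hypothetical, and shows what a reader is expected to do, not a PDF::Builder call.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper: the part of a box that lies within the Media Box.
# Both are array references (llx, lly, urx, ury), lower left corner first.
sub clip_to_media {
    my ($media, $box) = @_;
    return (
        max($media->[0], $box->[0]),
        max($media->[1], $box->[1]),
        min($media->[2], $box->[2]),
        min($media->[3], $box->[3]),
    );
}

my @letter = (0, 0, 612, 792);    # Media Box: US Letter
my @a4     = (0, 0, 595, 842);    # Crop Box as given: A4

my @effective = clip_to_media(\@letter, \@a4);
print "effective Crop Box: @effective\n";
```

The effective box is 595 points wide (from A4) and 792 points high (from US Letter), while a query of the Crop Box would still return the A4 dimensions as given.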
1234
1235 FONT METHODS
1236 Core Fonts
1237
1238 Core fonts are limited to single byte encodings. You cannot use UTF-8
1239 or other multibyte encodings with core fonts. The default encoding for
1240 the core fonts is WinAnsiEncoding (roughly the CP-1252 superset of
1241 ISO-8859-1). See the "-encode" option below to change this encoding.
See "font automap" in PDF::Builder::Resource::Font for
1243 information on accessing more than 256 glyphs in a font, using planes,
1244 although there is no guarantee that future changes to font files will
1245 permit consistent results.
1246
1247 Note that core fonts use fixed lists of expected glyphs, along with
1248 metrics such as their widths. This may not exactly match up with
1249 whatever local font file is used by the PDF reader. It's usually pretty
1250 close, but many cases have been found where the list of glyphs is
1251 different between the core fonts and various local font files, so be
1252 aware of this.
1253
1254 To allow UTF-8 text and extended glyph counts, you should consider
1255 replacing your use of core fonts with TrueType (.ttf) and OpenType
1256 (.otf) fonts. There are tools, such as FontForge, which can do a fairly
1257 good (though, not perfect) job of converting a Type1 font library to
1258 OTF.
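
To judge whether a piece of text can be shown with a core font in its default encoding, you can test whether it survives conversion to a single byte encoding. The sketch below uses the standard Encode module and CP-1252, which is only roughly the same as WinAnsiEncoding, so treat the result as an approximation. The fits_cp1252() helper is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: true if every character of $text has a CP-1252 code.
sub fits_cp1252 {
    my ($text) = @_;
    my $copy = $text;    # encode() may modify its input when CHECK is set
    return eval { encode('cp1252', $copy, Encode::FB_CROAK()); 1 } ? 1 : 0;
}

# e-acute and the Euro sign are in CP-1252; CJK characters are not
for my $text ("caf\x{e9}", "\x{20ac}5", "\x{4e2d}\x{6587}") {
    print fits_cp1252($text) ? "single byte OK\n" : "needs a TTF/OTF font\n";
}
```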
1259
1260 Examples:
1261
1262 $font1 = $pdf->corefont('Times-Roman', -encode => 'latin2');
1263 $font2 = $pdf->corefont('Times-Bold');
1264 $font3 = $pdf->corefont('Helvetica');
1265 $font4 = $pdf->corefont('ZapfDingbats');
1266
1267 Valid %options are:
1268
1269 -encode
1270 Changes the encoding of the font from its default. Notice that the
1271 encoding (not the entire font's glyph list) is shown in a PDF
1272 object (record), listing 256 glyphs associated with this encoding
1273 (and that are available in this font).
1274
1275 -dokern
1276 Enables kerning if data is available.
1277
1278 Notes:
1279
1280 Even though these are called "core" fonts, they are not shipped with
1281 PDF::Builder, but are expected to be found on the machine with the PDF
1282 reader. Most core fonts are installed with a PDF reader, and thus are
1283 not coordinated with PDF::Builder. PDF::Builder does ship with core
1284 font metrics files (width, glyph names, etc.), but these cannot be
1285 guaranteed to be in sync with what the PDF reader has installed!
1286
1287 There are some 14 core fonts (regular, italic, bold, and bold-italic
1288 for Times [serif], Helvetica [sans serif], Courier [fixed pitch]; plus
1289 two symbol fonts) that are supposed to be available on any PDF reader,
1290 although other fonts with very similar metrics are often substituted.
1291 You should not count on any of the 15 Windows core fonts (Bank Gothic,
1292 Georgia, Trebuchet, Verdana, and two more symbol fonts) being present,
1293 especially on Linux, Mac, or other non-Windows platforms. Be aware if
1294 you are producing PDFs to be read on a variety of different systems!
1295
1296 If you want to ensure the widest portability for a PDF document you
1297 produce, you should consider using TTF fonts (instead of core fonts)
1298 and embedding them in the document. This ensures that there will be no
1299 substitutions, that all metrics are known and match the glyphs, UTF-8
1300 encoding can be used, and that the glyphs will be available on the
1301 reader's machine. At least on Windows platforms, most of the fonts are
1302 TTF anyway, which are used behind the scenes for "core" fonts, while
1303 missing most of the capabilities of TTF (now or possibly later in
1304 PDF::Builder) such as embedding, ligatures, UTF-8, etc. The downside
1305 is, obviously, that the resulting PDF file will be larger because it
1306 includes the font(s). There might also be copyright or licensing issues
with the redistribution of font files in this manner (you might want to
check before widely distributing a PDF document with embedded fonts,
although many licenses do permit embedding the part of the font that is used).
1310
1311 See also PDF::Builder::Resource::Font::CoreFont.
1312
1313 PS Fonts
1314
1315 PS (T1) fonts are limited to single byte encodings. You cannot use
1316 UTF-8 or other multibyte encodings with T1 fonts. The default encoding
1317 for the T1 fonts is WinAnsiEncoding (roughly the CP-1252 superset of
1318 ISO-8859-1). See the "-encode" option below to change this encoding.
See "font automap" in PDF::Builder::Resource::Font for
1320 information on accessing more than 256 glyphs in a font, using planes,
1321 although there is no guarantee that future changes to font files will
1322 permit consistent results. Note: many Type1 fonts are limited to 256
1323 glyphs, but some are available with more than 256 glyphs. Still, a
1324 maximum of 256 at a time are usable.
1325
1326 "psfont" accepts both ASCII (.pfa) and binary (.pfb) Type1 glyph files.
1327 Font metrics can be supplied in either ASCII (.afm) or binary (.pfm)
1328 format, as can be seen in the examples given below. It is possible to
1329 use .pfa with .pfm and .pfb with .afm if that's what's available. The
1330 ASCII and binary files have the same content, just in different
1331 formats.
1332
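As a minimal sketch of the pairing rule above: any glyph file (.pfa or .pfb) may be combined with either kind of metrics file, and only the option name changes. The helper name psfont_args is hypothetical (it is not part of PDF::Builder); it simply picks "-afmfile" or "-pfmfile" from the metrics file's extension.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Builder): build the argument list
# for psfont() from a glyph file and a metrics file, choosing -afmfile or
# -pfmfile according to the metrics file's extension.
sub psfont_args {
    my ($glyph_file, $metrics_file) = @_;

    die "glyph file must be .pfa or .pfb: $glyph_file\n"
        unless $glyph_file =~ /\.pf[ab]$/i;

    my $option = $metrics_file =~ /\.afm$/i ? '-afmfile'
               : $metrics_file =~ /\.pfm$/i ? '-pfmfile'
               : die "metrics file must be .afm or .pfm: $metrics_file\n";

    return ($glyph_file, $option => $metrics_file);
}

# Mixed pairing: binary glyph file with ASCII metrics
my @args = psfont_args('/fonts/Synest-FB.pfb', '/fonts/Synest-FB.afm');
print join(' ', @args), "\n";
# With a PDF::Builder object in $pdf:  my $font = $pdf->psfont(@args);
```
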
1333 To allow UTF-8 text and extended glyph counts in one font, you should
1334 consider replacing your use of Type1 fonts with TrueType (.ttf) and
1335 OpenType (.otf) fonts. There are tools, such as FontForge, which can do
1336 a fairly good (though not perfect) job of converting your font library
1337 to OTF.
1338
1339 Examples:
1340
1341 $font1 = $pdf->psfont('Times-Book.pfa', -afmfile => 'Times-Book.afm');
1342 $font2 = $pdf->psfont('/fonts/Synest-FB.pfb', -pfmfile => '/fonts/Synest-FB.pfm');
1343
1344 Valid %options are:
1345
1346 -encode
1347 Changes the encoding of the font from its default. Notice that the
1348 encoding (not the entire font's glyph list) is shown in a PDF
1349 object (record), listing 256 glyphs associated with this encoding
1350 (and that are available in this font).
1351
1352 -afmfile
1353 Specifies the location of the ASCII font metrics file (.afm). It
1354 may be used with either an ASCII (.pfa) or binary (.pfb) glyph
1355 file.
1356
1357 -pfmfile
1358 Specifies the location of the binary font metrics file (.pfm). It
1359 may be used with either an ASCII (.pfa) or binary (.pfb) glyph
1360 file.
1361
1362 -dokern
1363 Enables kerning if data is available.
1364
1365 Note: these T1 (Type1) fonts are not shipped with PDF::Builder, but are
1366 expected to be found on the machine with the PDF reader. Most PDF
1367 readers do not install T1 fonts, and it is up to the user of the PDF
1368 reader to install the needed fonts. Unlike TrueType fonts, PS (T1)
1369 fonts are not embedded in the PDF, and must be supplied on the Reader
1370 end.
1371
1372 See also PDF::Builder::Resource::Font::Postscript.
1373
1374 TrueType Fonts
1375
1376 Warning: BaseEncoding is not set by default for TrueType fonts, so text
1377 in the PDF isn't searchable (by the PDF reader) unless a ToUnicode CMap
1378 is included. PDF::Builder includes a ToUnicode CMap by default
1379 (-unicodemap set to 1), but allows it to be disabled (for performance
1380 and file size reasons) by setting -unicodemap to 0. Doing so will produce
1381 non-searchable text, which, besides being annoying to users, may
1382 prevent screen readers and other aids to disabled users from working
1383 correctly!
1384
1385 Examples:
1386
1387 $font1 = $pdf->ttfont('Times.ttf');
1388 $font2 = $pdf->ttfont('Georgia.otf');
1389
1390 Valid %options are:
1391
1392 -encode
1393 Changes the encoding of the font from its default
1394 (WinAnsiEncoding).
1395
1396 Note that for a single byte encoding (e.g., 'latin1'), you are
1397 limited to 256 characters defined for that encoding. 'automap' does
1398 not work with TrueType. If you want more characters than that, use
1399 'utf8' encoding with a UTF-8 encoded text string.
1400
1401 -isocmap
1402 Use the ISO Unicode Map instead of the default MS Unicode Map.
1403
1404 -unicodemap
1405 If 1 (default), output ToUnicode CMap to permit text searches and
1406 screen readers. Set to 0 to save space by not including the
1407 ToUnicode CMap, but text searching and screen reading will not be
1408 possible.
1409
1410 -dokern
1411 Enables kerning if data is available.
1412
1413 -noembed
1414 Disables embedding of the font file. Note that this is potentially
1415 hazardous, as the glyphs provided on the PDF reader machine may not
1416 match what was used on the PDF writer machine (the one running
1417 PDF::Builder)! If you know for sure that all PDF readers will be
1418 using the same TTF or OTF file you're using with PDF::Builder, not
1419 embedding the font may be acceptable, in return for a smaller PDF
1420 file size. Note that the Reader needs to know where to find the
1421 font file -- it can't be in any random place, but typically needs
1422 to be listed in a path that the Reader follows. Otherwise, it will
1423 be unable to render the text!
1424
1425 The only value for the "-noembed" flag currently checked for is 1,
1426 which means to not embed the font file in the PDF. Any other value
1427 currently results in the font file being embedded (by default),
1428 although in the future, other values might be given significance
1429 (such as checking permission bits).
1430
1431 Some additional comments on embedding font file(s) into the PDF:
1432 besides substantially increasing the size of the PDF (even if the
1433 font is subsetted, by default), PDF::Builder does not check the
1434 font file for any flags indicating font licensing issues and
1435 limitations on use. A font foundry may not permit embedding at all,
1436 may permit a subset of the font to be embedded, may permit a full
1437 font to be embedded, and may specify what can be done with an
1438 embedded font (e.g., may or may not be extracted for further use
1439 beyond displaying this one PDF). When you choose to use (and embed)
1440 a font, you should be aware of any such licensing issues.
1441
1442 -nosubset
1443 Disables subsetting of a TTF/OTF font, when embedded. By default,
1444 only the glyphs used by a document are included in the file, and
1445 not the entire font. This can result in tremendous savings in
1446 PDF file size. If you intend to allow the PDF to be edited by
1447 users, not having the entire font glyph set available may cause
1448 problems, so be aware of that (and consider using "-nosubset => 1").
1449 Setting this flag to any value results in the entire font glyph set
1450 being embedded in the file. It might be a good idea to use only the
1451 value 1, in case other values are assigned roles in the future.
1452
1453 -debug
1454 If set to 1 (default is 0), diagnostic information is output about
1455 the CMap processing.
1456
1457 -usecmf
1458 If set to 1 (default is 0), the first priority is to make use of
1459 one of the four ".cmap" files for CJK fonts. This is the old way of
1460 processing TTF files. If, after all is said and done, a working
1461 internal CMap hasn't been found (for -usecmf=>0), "ttfont()" will
1462 fall back to using a ".cmap" file if possible.
1463
1464 -cmaps
1465 This flag may be set to a string listing the Platform/Encoding
1466 pairs to look for among the internal CMaps in the font file, in the
1467 desired order (highest priority first). If one list (comma and/or
1468 space-separated pairs) is given, it is used for both Windows and
1469 non-Windows platforms (on which PDF::Builder is running, not the
1470 PDF reader's). Two lists, separated by a semicolon ; may be given,
1471 with the first being used for a Windows platform and the second for
1472 non-Windows. The default list is "0/6 3/10 0/4 3/1 0/3; 0/6 0/4
1473 3/10 0/3 3/1". Finally, instead of a P/E list, a string "find_ms"
1474 may be given to tell it to simply call the Font::TTF "find_ms()"
1475 method to find a (preferably Windows) internal CMap. "-cmaps" set
1476 to 'find_ms' would emulate the old way of looking for CMaps. Symbol
1477 fonts (3/0) always use find_ms(), and the new default lookup is (if
1478 ".cmap" isn't used, see "-usecmf") to try to get a match with the
1479 default list for the appropriate OS. If none can be found,
1480 find_ms() is tried, and as last resort use the ".cmap" (if
1481 available), even if "-usecmf" is not 1.
1482
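The layout of a "-cmaps" string can be illustrated by splitting it up. This is only a sketch of the format as described above, not PDF::Builder's own parser; the function name cmap_priority_list and the use of $^O to detect Windows are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: return the Platform/Encoding pairs, in priority
# order, that a "-cmaps" string selects for the given platform.
sub cmap_priority_list {
    my ($cmaps, $is_windows) = @_;

    return ('find_ms') if $cmaps eq 'find_ms';

    # an optional second list (after the semicolon) is for non-Windows
    my ($windows, $other) = split /\s*;\s*/, $cmaps, 2;
    $other = $windows unless defined $other && $other =~ /\S/;

    my $list = $is_windows ? $windows : $other;
    # pairs may be separated by commas and/or spaces
    return grep { m{^\d+/\d+$} } split /[,\s]+/, $list;
}

my $default    = '0/6 3/10 0/4 3/1 0/3; 0/6 0/4 3/10 0/3 3/1';
my $on_windows = ($^O eq 'MSWin32');   # assumption: platform running this code
print join(', ', cmap_priority_list($default, $on_windows)), "\n";
```

A single list such as '3/1, 0/3' yields the same pairs for both platforms.
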
1483 CJK Fonts
1484
1485 Examples:
1486
1487 $font = $pdf->cjkfont('korean');
1488 $font = $pdf->cjkfont('traditional');
1489
1490 Valid %options are:
1491
1492 -encode
1493 Changes the encoding of the font from its default.
1494
1495 Warning: Unlike "ttfont", the font file is not embedded in the output
1496 PDF file. This is evidently behavior left over from the early days of
1497 CJK fonts, where the "Cmap" and "Data" were always external files,
1498 rather than internal tables. If you need a CJK-using PDF file to embed
1499 the font, for portability, you can create a PDF using "cjkfont", and
1500 then use an external utility (e.g., "pdfcairo") to embed the font in
1501 the PDF. It may also be possible to use "ttfont" instead, to produce
1502 the PDF, provided you can deduce the correct font file name from
1503 examining the PDF file (e.g., on my Windows system, the "Ming" font
1504 would be "$font = $pdf->ttfont("C:/Program Files (x86)/Adobe/Acrobat
1505 Reader DC/Resource/CIDFont/AdobeMingStd-Light.otf")"). Of course, the
1506 font file used would have to be ".ttf" or ".otf". It may act a little
1507 differently than "cjkfont" (due to a different CMap), but you should be
1508 able to embed the font file into the PDF.
1509
1510 See also PDF::Builder::Resource::CIDFont::CJKFont
1511
1512 Synthetic Fonts
1513
1514 Warning: BaseEncoding is not set by default for these fonts, so text in
1515 the PDF isn't searchable (by the PDF reader) unless a ToUnicode CMap is
1516 included. PDF::Builder includes a ToUnicode CMap by default
1517 (-unicodemap set to 1), but allows it to be disabled (for performance and
1518 file size reasons) by setting -unicodemap to 0. Doing so will produce
1519 non-searchable text, which, besides being annoying to users, may prevent
1520 screen readers and other aids to disabled users from working correctly!
1521
1522 Examples:
1523
1524 $cf = $pdf->corefont('Times-Roman', -encode => 'latin1');
1525 $sf = $pdf->synfont($cf, -condense => 0.85); # condensed to 85% width
1526 $sfb = $pdf->synfont($cf, -bold => 1); # embolden by 10em
1527 $sfi = $pdf->synfont($cf, -oblique => -12); # italic at -12 degrees
1528
1529 Valid %options are:
1530
1531 -condense
1532 Character width condense/expand factor (0.1-0.9 = condense, 1 =
1533 normal/default, 1.1+ = expand). It is the multiplier to apply to
1534 the width of each character.
1535
1536 -oblique
1537 Italic angle (+/- degrees, default 0), sets skew of character box.
1538
1539 -bold
1540 Emboldening factor (0.1+, bold = 1, heavy = 2, ...), additional
1541 thickness to draw outline of character (with a heavier line width)
1542 before filling.
1543
1544 -space
1545 Additional character spacing in milliems (0-1000)
1546
1547 -caps
1548 0 for normal text, 1 for small caps. Implemented by asking the
1549 font what the uppercased translation (single character) is for a
1550 given character, and outputting it at 80% height and 88% width
1551 (heavier vertical stems are better looking than a straight 80%
1552 scale).
1553
1554 Note that only lower case letters which appear in the "standard"
1555 font (plane 0 for core fonts and PS fonts) will be small-capped.
1556 This may include eszett (German sharp s), which becomes SS, and
1557 dotless i and j which become I and J respectively. There are many
1558 other accented Latin alphabet letters which may show up in planes 1
1559 and higher. Ligatures (e.g., ij and ffl) do not have uppercase
1560 equivalents, nor does a long s. If you have text which includes
1561 such characters, you may want to consider preprocessing it to
1562 replace them with Latin character expansions (e.g., i+j and f+f+l)
1563 before small-capping.
1564
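The preprocessing suggested for "-caps" might look like the following sketch. The function name expand_for_small_caps is hypothetical, the table covers only a handful of common Latin ligatures, and mapping the long s to a plain "s" is an assumption of this example.

```perl
use strict;
use warnings;

# Ligatures and the long s, with their plain Latin letter expansions
my %expansion = (
    "\x{FB00}" => 'ff',
    "\x{FB01}" => 'fi',
    "\x{FB02}" => 'fl',
    "\x{FB03}" => 'ffi',
    "\x{FB04}" => 'ffl',
    "\x{0133}" => 'ij',
    "\x{017F}" => 's',     # long s (assumed replacement)
);

# Hypothetical helper: replace each listed character by its expansion, so
# that every letter has an uppercase equivalent when small-capped.
sub expand_for_small_caps {
    my ($text) = @_;
    $text =~ s/([\x{FB00}-\x{FB04}\x{0133}\x{017F}])/$expansion{$1}/g;
    return $text;
}

print expand_for_small_caps("wa\x{FB04}es in the \x{FB01}eld"), "\n";
# prints "waffles in the field"
```
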
1565 Note that CJK fonts (created with the "cjkfont" method) do not work
1566 properly with "synfont". This is due to a different internal structure
1567 of the CJK fonts, as compared to corefont, ttfont, and psfont base
1568 fonts. If you require a synthesized (modified) CJK font, you might try
1569 finding the TTF or OTF original, using "ttfont" to create the base font,
1570 and running "synfont" against that, in the manner described for
1571 embedding "CJK Fonts".
1572
1573 See also PDF::Builder::Resource::Font::SynFont
1574
1575 IMAGE METHODS
1576 This is additional information on enhanced libraries available for TIFF
1577 and PNG images. See specific information listings for GD, GIF, JPEG,
1578 and PNM image formats. In addition, see "examples/Content.pl" for an
1579 example of placing an image on a page, as well as using it in a "Form".
1580
1581 Why is my image flipped or rotated?
1582
1583 Something not uncommonly seen when using JPEG photos in a PDF is that
1584 the images will be rotated and/or mirrored (flipped). This may happen
1585 when using TIFF images too. What happens is that the camera stores an
1586 image just as it comes off the CCD sensor, regardless of the camera
1587 orientation, and does not rotate it to the correct orientation! It does
1588 store a separate "orientation" flag to suggest how the image might be
1589 corrected, but not all image processing obeys this flag (PDF::Builder
1590 does not). For example, if you take a "portrait" (tall) photo of a
1591 tree (with the phone held vertically), and then use it in a PDF, the
1592 tree may appear to have been cut down (it appears in landscape mode)!
1593
1594 I have found some code that should allow the "image_jpeg" or "image"
1595 routine to auto-rotate to (supposedly) the correct orientation, by
1596 looking for the Exif metadata "Orientation" tag in the file. However,
1597 three problems arise: 1) if a photo has been edited, and rotated or
1598 flipped in the process, there is no guarantee that the Orientation tag
1599 has been corrected. 2) more than one Orientation tag may exist (e.g.,
1600 in the binary APP1/Exif header, and in XML data), and they may not
1601 agree with each other -- which should be used? 3) the code would need
1602 to uncompress the raster data, swap and/or transpose rows and/or
1603 columns, and recompress the raster data for inclusion into the PDF.
1604 This is costly and error-prone. In any case, the user would need to be
1605 able to override any auto-rotate function.
1606
1607 For the time being, PDF::Builder will simply leave it up to the user of
1608 the library to take care of rotating and/or flipping an image which
1609 displays incorrectly. It is possible that we will consider adding some
1610 sort of query or warning that the image appears to not be "normally"
1611 oriented (Orientation value 1 or "Top-left"), according to the
1612 Orientation flag. You can consider either (re-)saving the photo in an
1613 editor such as Photoshop or GIMP, or using PDF::Builder code similar to
1614 the following (for images rotated 180 degrees):
1615
1616 $pW = 612; $pH = 792; # page dimensions (US Letter)
1617 my $img = $pdf->image_jpeg("AliceLake.jpeg");
1618 # raw size WxH 4032x3024, scaled down to 504x378
1619 $sW = 4032/8; $sH = 3024/8;
1620 # intent is to center on US Letter sized page (LL at 54,207)
1621 # Orientation flag on this image is 3 (rotated 180 degrees).
1622 # if naively displayed (just $gfx->image call), it will be upside down
1623
1624 $gfx->save();
1625
1626 ## method 0: simple display, is rotated 180 degrees!
1627 #$gfx->image($img, ($pW-$sW)/2,($pH-$sH)/2, $sW,$sH);
1628
1629 ## method 1: translate, then rotate
1630 #$gfx->translate($pW,$pH); # to new origin (media UR corner)
1631 #$gfx->rotate(180); # rotate around new origin
1632 #$gfx->image($img, ($pW-$sW)/2,($pH-$sH)/2, $sW,$sH);
1633 # image's UR corner, not LL
1634
1635 # method 2: rotate, then translate
1636 $gfx->rotate(180); # rotate around current origin
1637 $gfx->translate(-$sW,-$sH); # translate in rotated coordinates
1638 $gfx->image($img, -($pW-$sW)/2,-($pH-$sH)/2, $sW,$sH);
1639 # image's UR corner, not LL
1640
1641 ## method 3: flip (mirror) twice
1642 #$scale = 1; # not rescaling here
1643 #$size_page = $pH/$scale;
1644 #$invScale = 1.0/$scale;
1645 #$gfx->add("-$invScale 0 0 -$invScale 0 $size_page cm");
1646 #$gfx->image($img, -($pW-$sW)/2-$sW,($pH-$sH)/2, $sW,$sH);
1647
1648 $gfx->restore();
1649
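Why methods 1 and 2 need different image coordinates can be checked with a little arithmetic, independent of PDF::Builder. Each transformation call changes the coordinate system for everything that follows, so the most recent call is the first one applied to a point. The sketch below uses illustrative helper names (translate, rotate180, page_rect are not library functions) and handles only translation and 180 degree rotation.

```perl
use strict;
use warnings;

my ($pW, $pH) = (612, 792);          # page dimensions (US Letter)
my ($sW, $sH) = (4032/8, 3024/8);    # scaled image size, 504 x 378

# Each transform maps a point in the new coordinate system back to the
# coordinate system that was current before the call.
sub translate { my ($tx, $ty) = @_; return sub { ($_[0] + $tx, $_[1] + $ty) } }
sub rotate180 { return sub { (-$_[0], -$_[1]) } }

# Map the rectangle (x, y, w, h) to page coordinates. Transforms are given
# in the order the $gfx calls were made; the most recent one is applied to
# a point first. Returns (left, bottom, right, top).
sub page_rect {
    my ($x, $y, $w, $h, @calls) = @_;
    my @p = ($x, $y);
    my @q = ($x + $w, $y + $h);
    for my $f (reverse @calls) {
        @p = $f->(@p);
        @q = $f->(@q);
    }
    my ($left, $right)  = sort { $a <=> $b } $p[0], $q[0];
    my ($bottom, $top)  = sort { $a <=> $b } $p[1], $q[1];
    return ($left, $bottom, $right, $top);
}

# method 1: translate, then rotate
print join(' ', page_rect(($pW-$sW)/2, ($pH-$sH)/2, $sW, $sH,
                          translate($pW, $pH), rotate180())), "\n";
# method 2: rotate, then translate
print join(' ', page_rect(-($pW-$sW)/2, -($pH-$sH)/2, $sW, $sH,
                          rotate180(), translate(-$sW, -$sH))), "\n";
# both print "54 207 558 585": the image is centered, LL at 54,207
```
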
1650 If your image is also mirrored (flipped about an axis), simple rotation
1651 will not suffice. You could do something with a reversal of the
1652 coordinate system, as in "method 3" above (see "Advanced Methods" in
1653 PDF::Builder::Content). To mirror only left/right, the second $invScale
1654 would be positive; to mirror only top/bottom, the first would be
1655 positive. If all else fails, you could save a mirrored copy in a photo
1656 editor. 90 or 270 degree rotations will require a "rotate" call,
1657 possibly with "cm" usage to reverse mirroring. Incidentally, do not
1658 confuse this issue with the coordinate flipping performed by some
1659 Chrome browsers when printing a page to PDF.
1660
1661 Note that TIFF images may have the same rotation/mirroring problems as
1662 JPEG, which is not surprising, as the Exif format was lifted from TIFF
1663 for use in JPEG. The cure will be similar to JPEG's.
1664
1665 TIFF Images
1666
1667 Note that the Graphics::TIFF support library does not currently permit
1668 a filehandle for $file.
1669
1670 PDF::Builder will use the Graphics::TIFF support library for TIFF
1671 functions, if it is available, unless explicitly told not to. Your code
1672 can test whether Graphics::TIFF is available by examining
1673 "$tiff->usesLib()" or "$pdf->LA_GT()".
1674
1675 = -1
1676 Graphics::TIFF is installed, but your code has specified
1677 "-nouseGT", to not use it. The old, pure Perl, code (buggy!) will
1678 be used instead, as if Graphics::TIFF was not installed.
1679
1680 = 0 Graphics::TIFF is not installed. Not all systems are able to
1681 successfully install this package, as it requires libtiff.a.
1682
1683 = 1 Graphics::TIFF is installed and is being used.
1684
1685 Options:
1686
1687 -nouseGT => 1
1688 Do not use the Graphics::TIFF library, even if it's available.
1689 Normally you would want to use this library, but there may be cases
1690 where you don't, such as when you want to use a file handle instead
1691 of a name.
1692
1693 -silent => 1
1694 Do not give the message that Graphics::TIFF is not installed. This
1695 message will be given only once, but you may want to suppress it,
1696 such as during t-tests.
1697
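Because Graphics::TIFF cannot read from a filehandle, code that accepts either a file name or an open handle may want to add "-nouseGT" automatically. The following is a sketch only; tiff_options is a hypothetical helper, and the in-memory handle merely stands in for a real TIFF stream.

```perl
use strict;
use warnings;
use Scalar::Util qw(openhandle);

# Hypothetical helper: return image_tiff() options for $file, forcing the
# pure Perl code path (-nouseGT) when $file is an open filehandle.
sub tiff_options {
    my ($file, %opts) = @_;
    $opts{'-nouseGT'} = 1 if defined openhandle($file);
    return %opts;
}

my %by_name = tiff_options('scan.tif', -silent => 1);
print join(', ', map { "$_ => $by_name{$_}" } sort keys %by_name), "\n";

my $buffer = '';                     # stand-in for real TIFF data
open(my $fh, '<', \$buffer) or die "cannot open in-memory handle: $!";
my %by_handle = tiff_options($fh, -silent => 1);
print join(', ', map { "$_ => $by_handle{$_}" } sort keys %by_handle), "\n";

# With a PDF::Builder object in $pdf:
#   my $tiff = $pdf->image_tiff($file, tiff_options($file));
```
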
1698 PNG Images
1699
1700 PDF::Builder will use the Image::PNG::Libpng support library for PNG
1701 functions, if it is available, unless explicitly told not to. Your code
1702 can test whether Image::PNG::Libpng is available by examining
1703 "$png->usesLib()" or "$pdf->LA_IPL()".
1704
1705 = -1
1706 Image::PNG::Libpng is installed, but your code has specified
1707 "-nouseIPL", to not use it. The old, pure Perl, code (slower and
1708 less capable) will be used instead, as if Image::PNG::Libpng was
1709 not installed.
1710
1711 = 0 Image::PNG::Libpng is not installed. Not all systems are able to
1712 successfully install this package, as it requires libpng.a.
1713
1714 = 1 Image::PNG::Libpng is installed and is being used.
1715
1716 Options:
1717
1718 -nouseIPL => 1
1719 Do not use the Image::PNG::Libpng library, even if it's available.
1720 Normally you would want to use this library, when available, but
1721 there may be cases where you don't.
1722
1723 -silent => 1
1724 Do not give the message that Image::PNG::Libpng is not installed.
1725 This message will be given only once, but you may want to suppress
1726 it, such as during t-tests.
1727
1728 -notrans => 1
1729 No transparency -- ignore tRNS chunk if provided, ignore Alpha
1730 channel if provided.
1731
1732 USING SHAPER (HarfBuzz::Shaper library)
1733 # if HarfBuzz::Shaper is not installed, either bail out, or try to
1734 # use regular TTF calls instead
1735 my $rc;
1736 $rc = eval {
1737 require HarfBuzz::Shaper;
1738 1;
1739 };
1740 if (!defined $rc) { $rc = 0; }
1741 if ($rc == 0) {
1742 # bail out in some manner
1743 } else {
1744 # can use Shaper
1745 }
1746
1747 my $fontfile = '/WINDOWS/Fonts/times.ttf'; # used by both Shaper and textHS
1748 my $fontsize = 15; # used by both Shaper and textHS
1749 my $font = $pdf->ttfont($fontfile);
1750 $text->font($font, $fontsize);
1751
1752 my $hb = HarfBuzz::Shaper->new(); # only need to set up once
1753 my %settings; # for textHS(), not Shaper
1754 $settings{'dump'} = 1; # see the diagnostics
1755 $settings{'script'} = 'Latn';
1756 $settings{'dir'} = 'L'; # LTR
1757 $settings{'features'} = []; # required (array ref of feature strings)
1758
1759 # -- set language (override automatic setting)
1760 #$settings{'language'} = 'en';
1761 #$hb->set_language( 'en_US' );
1762 # -- turn OFF ligatures
1763 #push @{ $settings{'features'} }, '-liga';
1764 #$hb->add_features( '-liga' );
1765 # -- turn OFF kerning
1766 #push @{ $settings{'features'} }, '-kern';
1767 #$hb->add_features( '-kern' );
1768 $hb->set_font($fontfile);
1769 $hb->set_size($fontsize);
1770 $hb->set_text("Let's eat waffles in the field for brunch.");
1771 # expect ffl and fi ligatures, and perhaps some kerning
1772
1773 my $info = $hb->shaper();
1774 $text->textHS($info, \%settings); # -strikethru, -underline allowed
1775
1776 The package HarfBuzz::Shaper may be optionally installed in order to
1777 use the text-shaping capabilities of the HarfBuzz library. These
1778 include kerning and ligatures in Western scripts (such as the Latin
1779 alphabet). More complex scripts can be handled, such as Arabic family
1780 and Indic scripts, where multiple forms of a character may be
1781 automatically selected, characters may be reordered, and other
1782 modifications made. The examples/HarfBuzz.pl script gives some examples
1783 of what may be done.
1784
1785 Keep in mind that HarfBuzz works only with TrueType (.ttf) and OpenType
1786 (.otf) font files. It will not work with PostScript (Type1), core,
1787 bitmapped, or CJK fonts. Not all .ttf fonts have the instructions
1788 necessary to guide HarfBuzz, but most proper .otf fonts do. In other
1789 words, there are no guarantees that a particular font file will work
1790 with Shaper!
1791
1792 The basic idea is to break up text into "chunks" which are of the same
1793 script (alphabet), language, direction, font face, font size, and
1794 variant (italic, bold, etc.). These could range from a single character
1795 to paragraph-length strings of text. These are fed to HarfBuzz::Shaper,
1796 along with flags, the font file to be used, and other supporting
1797 information, to create an array of output glyphs. Each element is a
1798 hash describing the glyph to be output, including its name (if
1799 available), its glyph ID (number) in the selected font, its x and y
1800 displacement (usually 0), and its "advance" x and y values, all in
1801 points. For horizontal languages (LTR and RTL), the y advance is
1802 normally 0 and the x advance is the font's character width, less any
1803 kerning amount.
1804
1805 Shaper will attempt to figure out the script used and the text
1806 direction, based on the Unicode range, and to make a reasonable guess
1807 at the language used. The language can be overridden, but currently the
1808 script and text direction cannot be overridden.
1809
1810 An important note: the number of glyphs (array elements) may not be
1811 equal to the number of Unicode points (characters) given in the chunk's
1812 text string! Sometimes a character will be decomposed into several
1813 pieces (multiple glyphs); sometimes multiple characters may be combined
1814 into a single ligature glyph; and characters may be reordered
1815 (especially in Indic and Southeast Asian languages). As well, for
1816 Right-to-Left (bidirectional) scripts such as Hebrew and Arabic
1817 families, the text is output in Left-to-Right order (reversed from the
1818 input).
1819
1820 With due care, a Shaper array can be manipulated in code. The elements
1821 are more or less independent of each other, so elements can be
1822 modified, rearranged, inserted, or deleted. You might adjust the
1823 position of a glyph with 'dx' and 'dy' hash elements. The 'ax' value
1824 should be left alone, so that the wrong kerning isn't calculated, but
1825 you might need to adjust the "advance x" value by means of one of the
1826 following:
1827
1828 axs is a value to be substituted for 'ax' (points)
1829 axsp is a substituted value (percentage) of the original 'ax'
1830 axr reduces 'ax' by the value (points). If negative, increase 'ax'
1831 axrp reduces 'ax' by the given percentage. Again, negative increases
1832 'ax'
1833
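The arithmetic behind these four keys can be sketched as follows. This is an illustration of the descriptions above, not the code used by "textHS()"; the function name effective_ax is hypothetical, and the order in which the keys are checked when more than one is present is an assumption.

```perl
use strict;
use warnings;

# Sketch: the x advance (in points) that results from one glyph hash.
sub effective_ax {
    my ($glyph) = @_;
    my $ax = $glyph->{'ax'};

    return $glyph->{'axs'}                      if defined $glyph->{'axs'};
    return $ax * $glyph->{'axsp'} / 100         if defined $glyph->{'axsp'};
    return $ax - $glyph->{'axr'}                if defined $glyph->{'axr'};
    return $ax * (100 - $glyph->{'axrp'}) / 100 if defined $glyph->{'axrp'};
    return $ax;                                 # no adjustment requested
}

for my $glyph (
    { ax => 10 },
    { ax => 10, axs  => 8 },      # substitute 8 points
    { ax => 10, axsp => 50 },     # 50% of the original
    { ax => 10, axr  => -2 },     # negative reduction widens
    { ax => 10, axrp => 25 },     # reduce by 25%
) {
    print effective_ax($glyph), "\n";
}
```
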
1834 Caution: a given character's glyph ID is not necessarily going to be
1835 the same between any two fonts! For example, an ASCII space (U+0020)
1836 might be "<0001>" in one font, and "<0003>" in another font (even one
1837 closely related!). A U+00A0 required blank (non-breaking space) may be
1838 output as a regular ASCII space U+0020. Take care if you need to find a
1839 particular glyph in the array, especially if the number of elements
1840 doesn't match. Consider making a text string of "marker" characters
1841 (space, nbsp, hyphen, soft hyphen, etc.) and processing it through
1842 HarfBuzz::Shaper to get the corresponding glyph numbers. You may have
1843 to count spaces, say, to see where you could break a glyph array to fit
1844 a line.
1845
1846 The "advancewidthHS()" method uses the same inputs as does "textHS()".
1847 Like "advancewidth()", it returns the chunk length in points. Unlike
1848 "advancewidth()", you cannot override the glyph array's font, font
1849 size, etc.
1850
1851 Once you have your (possibly modified) array of glyphs, you feed it to
1852 the "textHS()" method to render it to the page. Remember that this
1853 method handles only a single line of text; it does not do line
1854 splitting or fitting -- that you currently need to do manually. For
1855 Western scripts (e.g., Latin), that might not be too difficult, but for
1856 other scripts that involve extensive modification of the raw
1857 characters, it may be quite difficult to split words, but you still may
1858 be able to split at inter-word spaces.
1859
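A manual split at inter-word spaces could be sketched as below, for left-to-right text only. It assumes each glyph hash carries its glyph ID in 'g' and its x advance in 'ax', and that the space glyph's ID has already been looked up for the font in use; split_glyphs is a hypothetical helper and the glyph data is made up for the demonstration.

```perl
use strict;
use warnings;

# Split a glyph array into lines no wider than $max_width points, breaking
# only at glyphs whose ID is $space_gid. Returns a list of array refs.
# Runs of spaces collapse to one; a space at a line break is dropped.
sub split_glyphs {
    my ($glyphs, $space_gid, $max_width) = @_;
    my (@lines, @line, @word, $space);
    my ($line_w, $word_w) = (0, 0);

    my $flush_word = sub {
        return unless @word;
        my $space_w = (@line && $space) ? $space->{'ax'} : 0;
        # start a new line if space + word would overflow the current one
        if (@line && $line_w + $space_w + $word_w > $max_width) {
            push @lines, [@line];
            @line = (); $line_w = 0;
        }
        if (@line && $space) { push @line, $space; $line_w += $space_w; }
        push @line, @word;
        $line_w += $word_w;
        @word = (); $word_w = 0; $space = undef;
    };

    for my $g (@$glyphs) {
        if ($g->{'g'} == $space_gid) {
            $flush_word->();
            $space = $g if @line;      # pending space before the next word
        } else {
            push @word, $g;
            $word_w += $g->{'ax'};
        }
    }
    $flush_word->();
    push @lines, [@line] if @line;
    return @lines;
}

my $space_gid = 3;    # made-up glyph ID; it differs from font to font
my @glyphs;
for my $ch (split //, 'ab cd ef') {
    push @glyphs, $ch eq ' ' ? { g => $space_gid, ax => 5 }
                             : { g => ord($ch),  ax => 10 };
}
for my $line (split_glyphs(\@glyphs, $space_gid, 50)) {
    my $width = 0;
    $width += $_->{'ax'} for @$line;
    printf "%d glyphs, %g points wide\n", scalar @$line, $width;
}
```

Each returned array could then be positioned and passed to "textHS()" as a separate line. A word wider than the limit is left on a line of its own rather than being split.
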
1860 A useful, but not exhaustive, set of functions is allowed by
1861 "textHS()" use. Support includes direction setting (top-to-bottom and
1862 bottom-to-top directions, e.g., for Far Eastern languages in
1863 traditional orientation), and explicit script names and language
1864 (depending on what support HarfBuzz itself gives). Not yet supported
1865 are features such as discretionary ligatures and manual selection of
1866 glyphs (e.g., swashes and alternate forms).
1867
1868 Currently, "textHS()" can only handle a single text string. We are
1869 looking at how fitting to a line length (splitting up an array) could
1870 be done, as well as how words might be split on hard and soft hyphens.
1871 At some point, full paragraph and page shaping could be possible.
1872
1873
1874
1875perl v5.32.1 2021-03-29 PDF::Builder::Docs(3)