PDF::Builder::Docs(3)  User Contributed Perl Documentation  PDF::Builder::Docs(3)

6 PDF::Builder::Docs - additional documentation for Builder module
7
9 Software Development Kit
10 There are four levels of involvement with PDF::Builder. Depending on
11 what you want to do, different kinds of installs are recommended.
12
13 1. Simply installing PDF::Builder as a prerequisite for running some
14 other package. All you need to do is install the CPAN package for
15 PDF::Builder, and it will load the .pm files into your Perl library. If
16 the other package prereqs PDF::Builder, its installer may download and
17 install PDF::Builder automatically.
18
19 2. You want to write a Perl program that uses PDF::Builder functions.
20 In addition to installing PDF::Builder from CPAN, you will want
21 documentation on it. Obtain a copy of the product from GitHub
22 (https://github.com/PhilterPaper/Perl-PDF-Builder) or as a gzipped tar
23 file from CPAN. This includes a utility to build (from POD) a library
24 of HTML documents, as well as examples (examples/ directory) and
25 contributed sample programs (contrib/ directory).
26
27 3. You want to modify PDF::Builder files. In addition to the CPAN and
28 GitHub distributions, you may choose to keep a local Git repository for
29 tracking your changes. Depending on whether or not your PDF::Builder
30 copy is being used for production purposes, you may want to do your
31 editing and testing in the Perl library installation (live) or in a
32 different place. The "t" tests (t/ directory) and examples provide good
33 regression tests to ensure that you haven't broken anything. If you do
34 your editing on the live code, don't forget when done to copy the
35 changes back into the master version you keep!
36
37 4. You want to contribute to the development of PDF::Builder. You will
38 need a local Git repository (and a GitHub account), so that when you've
39 got it all done, you can issue a "Pull Request" to bring it to our
40 attention. We can't guarantee that your work will be incorporated into
41 the project, but at least we will look at it. From time to time, a new
42 CPAN version will be issued.
43
44 If you want to make substantial changes for public use, and can't come
45 to a meeting of minds with us, you can even start your own GitHub
46 project and register a new CPAN project (that's what we did, forking
47 PDF::API2). Please don't just assume that we don't want your changes --
48 at least propose what you want to do in writing, so we can consider it.
49 We're always looking for people to help out and expand PDF::Builder.
50
51 Optional Libraries
52 PDF::Builder can make use of some optional libraries, which are not
53 required for a successful installation. If you want improved speed and
54 capabilities for certain functions, you may want to install and use
55 these libraries:
56
* Graphics::TIFF -- PDF::Builder inherited a rather slow, buggy, and
limited TIFF image library from PDF::API2. If Graphics::TIFF (available
on CPAN, uses libtiff.a) is installed, PDF::Builder will use that
instead, unless you specify that it is to use the old, pure Perl
library. The only time you might want to use the old library is when
you need to pass an open filehandle to "image_tiff" instead of a file
name. See resolved bug reports RT 84665 and RT 118047, as well as
"image_tiff", for more information.
65
66 * Image::PNG::Libpng -- PDF::Builder inherited a rather slow and buggy
67 pure Perl PNG image library from PDF::API2. If Image::PNG::Libpng
68 (available on CPAN, uses libpng.a) is installed, PDF::Builder will use
69 that instead, unless you specify that it is to use the old, pure Perl
70 library. Using the new library will give you improved speed, the
71 ability to use 16 bit samples, and the ability to read interlaced PNG
72 files. See resolved bug report RT 124349, as well as "image_png", for
73 more information.
74
75 * HarfBuzz::Shaper -- This library enables PDF::Builder to handle
76 complex scripts (Arabic, Devanagari, etc.) as well as non-LTR writing
77 systems. It is also useful for Latin and other simple scripts, for
78 ligatures and improved kerning. HarfBuzz::Shaper is based on a set of
79 HarfBuzz libraries, which it will attempt to build if they are not
80 found. See "textHS" for more information.
81
Note that the installation process will attempt to install these
libraries automatically. If you don't wish to use one or more of them,
you are free to uninstall them. If one or more failed to install,
there is no need to panic: you simply won't be able to use some
advanced features unless you manually install the modules (e.g., with
"cpan install").
88
89 Strings (Character Text)
90 Perl, and hence PDF::Builder, use strings that support the full range
91 of Unicode characters. When importing strings into a Perl program, for
92 example by reading text from a file, you must be aware of what their
93 character encoding is. Single-byte encodings (default is 'latin1'),
94 represented as bytes of value 0x00 through 0xFF (0..255), will produce
95 different results if you do something that depends on the encoding,
96 such as sorting, searching, or comparing any two non-ASCII characters.
97 This also applies to any characters (text) hard coded into the Perl
98 program.
99
100 You can always decode the text from external encoding (ASCII, UTF-8,
101 Latin-3, etc.) into the Perl (internal) UTF-8 multibyte encoding. This
102 uses one to four bytes to represent each character. See pragma "utf8"
103 and module "Encode" for details about decoding text. Note that only
104 TrueType fonts ("ttfont") can make direct use of UTF-8-encoded text.
105 Other font types (core, T1, etc.) can only use single-byte encoded
106 text. If your text is ASCII, Latin-1, or CP-1252, you can just leave
107 the Perl strings as the default single-byte encoding.
108
109 Then, there is the matter of encoding the output to match up with
110 available font character sets. You're not actually translating the text
111 on output, but are telling the output system (and Reader) what encoding
112 the output byte stream represents, and what character glyphs they
113 should generate.
114
115 If you confine your text to plain ASCII (0x00 .. 0x7F byte values) or
116 even Latin-1 or CP-1252 (0x00 .. 0xFF byte values), you can use default
117 (non-UTF-8) Perl strings and use the default output encoding
118 (WinAnsiEncoding), which is more-or-less Windows CP-1252 (a superset in
119 turn, of ISO-8859-1 Latin-1). If your text uses any other characters,
120 you will need to be aware of what encoding your text strings are (in
121 the Perl string and for declaring output glyph generation). See "Core
122 Fonts", "PS Fonts" and "TrueType Fonts" in "FONT METHODS" for
123 additional information.
124
125 Some Internal Details
126
127 Some of the following may be a bit scary or confusing to beginners, so
128 don't be afraid to skip over it until you're ready for it...
129
130 Perl (and PDF::Builder) internally use strings which are either single-
131 byte (ISO-8859-1/Latin-1) or multibyte UTF-8 encoded (there is an
132 internal flag marking the string as UTF-8 or not). If you work
133 strictly in ASCII or Latin-1 or CP-1252 (each a superset of the
134 previous), you should be OK in not doing anything special about your
135 string encoding. You can just use the default Perl single byte strings
136 (internally marked as not UTF-8) and the default output encoding
137 (WinAnsiEncoding).
138
139 If you intend to use input from a variety of sources, you should
140 consider decoding (converting) your text to UTF-8, which will provide
141 an internally consistent representation (and your Perl code itself
142 should be saved in UTF-8, in case you want to use any hard coded non-
143 ASCII characters). In any string, non-ASCII characters (0x80 or higher)
144 would be converted to the Perl UTF-8 internal representation, via
145 "$string = Encode::decode(MY_ENCODING, $input);". "MY_ENCODING" would
146 be a string like 'latin1', 'cp-1252', 'utf8', etc. Similar capabilities
147 are available for declaring a file to be in a certain encoding.
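
The decoding step above can be sketched with the core "Encode" module
alone. This is a minimal illustration, not part of PDF::Builder itself:
the sample bytes and the variable names are assumptions chosen for the
example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the UTF-8 byte sequence for "cafe" with an acute
# accent on the e, as it might arrive from a file read as raw bytes.
my $input = "caf\xC3\xA9";

# Decode the external bytes into a Perl character string.
my $string = decode('UTF-8', $input);

# Encode back to a single-byte encoding, as core or Type1 fonts need.
my $latin1 = encode('iso-8859-1', $string);

printf "%d bytes in, %d characters decoded, %d bytes as Latin-1\n",
    length($input), length($string), length($latin1);
```

The decoded string is what you would hand to a "ttfont" font, while the
re-encoded bytes are what a single-byte font would expect.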
148
Be aware that if you use UTF-8 encoding for your text, only TrueType
font output ("ttfont") can handle it directly. Corefont and Type1
output require the text to be converted back into a single-byte
encoding (using "Encode::encode"), which may need to be declared with
"-encode" (for "corefont" or "psfont"). If you have any characters
that are not found in the selected single-byte encoding (but are found
in the font itself), you will need to use "automap" to break up the
font glyphs into 256 character planes, map such characters to 0x00
.. 0xFF in the appropriate plane, and switch between font planes as
necessary.
159
160 Core and Type1 fonts (output) use the byte values in the string
161 (single-byte encoding only!) and provide a byte-to-glyph mapping record
162 for each plane. TrueType outputs a group of four hexadecimal digits
163 representing the "CId" (character ID) of each character. The CId does
164 not correspond to either the single-byte or UTF-8 internal
165 representations of the characters.
166
167 The bottom line is that you need to know what the internal
168 representation of your text is, so that the output routines can tell
169 the PDF reader about it (via the PDF file). The text will not be
170 translated upon output, but the PDF reader needs to know what the
171 encoding in use is, so it knows what glyph to associate with each byte
172 (or byte sequence).
173
174 Note that some operating systems and Perl flavors are reputed to be
175 strict about encoding names. For example, latin1 (an alias) may be
176 rejected as invalid, while iso-8859-1 (a canonical value) will work.
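
If you are unsure whether an encoding name is accepted on your system,
one possible check is to ask the core "Encode" module to resolve it
before using it. The list of names below is only an illustrative
assumption.

```perl
use strict;
use warnings;
use Encode ();

# Report the canonical name for each alias, or flag it as unknown
# on this installation.
for my $name ('latin1', 'iso-8859-1', 'cp1252') {
    my $canonical = Encode::resolve_alias($name);
    printf "%-12s => %s\n", $name, $canonical || '(not recognized)';
}
```

Using the canonical name that is reported should sidestep any
strictness about aliases.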
177
178 By the way, it is recommended that you be using at least Perl 5.10 if
179 you are going to be using any non-ASCII characters. Perl 5.8 may be a
180 little unpredictable in handling such text.
181
182 Rendering Order
For better or worse, for compatibility purposes, PDF::Builder continues
the same rendering model as used by PDF::API2 (and possibly its
predecessors). That is, all graphics for one graphics object are put
into one record, and all text output for one text object goes into
another record. Whichever object is declared first is output first.
This can lead to unexpected results, where items are rendered in
(apparently) the wrong order: text and graphics items are not
necessarily output (rendered) in the same order as they were created in
code. Two items in the same object (e.g., $text) will be rendered in
the same order as they were coded, but items from different objects may
not be rendered in the expected order. The following example (source
code and annotated PDF excerpts) illustrates the issue:
195
196 use strict;
197 use warnings;
198 use PDF::Builder;
199
200 # demonstrate text and graphics object order
201 #
202 my $fname = "objorder";
203
204 my $paper_size = "Letter";
205
206 # see the text and graphics stream contents
207 my $pdf = PDF::Builder->new(-compress => 'none');
208 $pdf->mediabox($paper_size);
209 my $page = $pdf->page();
210 # adjust path for your operating system
211 my $fontTR = $pdf->ttfont('C:\\Windows\\Fonts\\timesbd.ttf');
212
For the first group, you might expect the "under" line to be output,
then the filled circle (disc) partly covering it, then the "over" line
covering the disc, and finally a filled rectangle (bar) over both
lines. What actually happens is that the $grfx1 graphics object was
declared first, so everything in that object (the disc and bar) is
output first, and the text object $text1 (both lines) comes afterwards.
The result is that the text lines are on top of the graphics drawings.
220
221 # ----------------------------
222 # 1. text, orange ball over, text over, bar over
223
224 my $grfx1 = $page->gfx();
225 my $text1 = $page->text();
226 $text1->font($fontTR, 20); # 20 pt Times Roman bold
227
228 $text1->fillcolor('black');
229 $grfx1->strokecolor('blue');
230 $grfx1->fillcolor('orange');
231
232 $text1->translate(50,700);
233 $text1->text_left("This text should be under everything.");
234
235 $grfx1->circle(100,690, 30);
236 $grfx1->fillstroke();
237
238 $text1->translate(50,670);
239 $text1->text_left("This text should be over the ball and under the bar.");
240
241 $grfx1->rect(160,660, 20,70);
242 $grfx1->fillstroke();
243
244 % ---------------- group 1: define graphics object first, then text
245 11 0 obj << /Length 690 >> stream % obj 11 is graphics for (1)
246 0 0 1 RG % stroke blue
247 1 0.647059 0 rg % fill orange
248 130 690 m ... c h B % draw and fill circle
249 160 660 20 70 re B % draw and fill bar
250 endstream endobj
251
252 12 0 obj << /Length 438 >> stream % obj 12 is text for (1)
253 BT
254 /TiCBA 20 Tf % Times Roman Bold 20pt
255 0 0 0 rg % fill black
256 1 0 0 1 50 700 Tm % position text
257 <0037 ... 0011> Tj % "under" line
258 1 0 0 1 50 670 Tm % position text
259 <0037 ... 0011> Tj % "over" line
260 ET
261 endstream endobj
262
263 The second group is the same as the first, with the only difference
264 being that the text object was declared first, and then the graphics
265 object. The result is that the two text lines are rendered first, and
266 then the disc and bar are drawn over them.
267
268 # ----------------------------
269 # 2. (1) again, with graphics and text order reversed
270
271 my $text2 = $page->text();
272 my $grfx2 = $page->gfx();
273 $text2->font($fontTR, 20); # 20 pt Times Roman bold
274
275 $text2->fillcolor('black');
276 $grfx2->strokecolor('blue');
277 $grfx2->fillcolor('orange');
278
279 $text2->translate(50,600);
280 $text2->text_left("This text should be under everything.");
281
282 $grfx2->circle(100,590, 30);
283 $grfx2->fillstroke();
284
285 $text2->translate(50,570);
286 $text2->text_left("This text should be over the ball and under the bar.");
287
288 $grfx2->rect(160,560, 20,70);
289 $grfx2->fillstroke();
290
291 % ---------------- group 2: define text object first, then graphics
292 13 0 obj << /Length 438 >> stream % obj 13 is text for (2)
293 BT
294 /TiCBA 20 Tf % Times Roman Bold 20pt
295 0 0 0 rg % fill black
296 1 0 0 1 50 600 Tm % position text
297 <0037 ... 0011> Tj % "under" line
298 1 0 0 1 50 570 Tm % position text
299 <0037 ... 0011> Tj % "over" line
300 ET
301 endstream endobj
302
303 14 0 obj << /Length 690 >> stream % obj 14 is graphics for (2)
304 0 0 1 RG % stroke blue
305 1 0.647059 0 rg % fill orange
306 130 590 m ... h B % draw and fill circle
307 160 560 20 70 re B % draw and fill bar
308 endstream endobj
309
The third group defines two text and two graphics objects, in the order
in which they are expected to render. The "under" text line is output
first, then the orange disc graphics is output, partly covering the
text. The "over" text line is now output -- it's actually over the
disc, but is orange because the previous object stream (first graphics
object) left the fill color (also used for text) as orange, and we
didn't explicitly set the fill color before outputting the second text
line. This is not "inheritance" so much as a carry-over: whatever
graphics (drawing) state (used for both "graphics" and "text") is in
effect at the end of one object is the state at the beginning of the
next object. If you wish to control this, consider surrounding the
graphics or text calls with "save()" and "restore()" calls to save and
restore (push and pop) the graphics state to what it was at the
"save()". Finally, the bar is drawn over everything.
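
The "save()" and "restore()" suggestion might look something like the
following sketch. It assumes a core font ("Helvetica-Bold") rather
than the TrueType file used in the example, so that no font path is
needed, and the variable names are made up for illustration. It
brackets only the graphics object.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf = PDF::Builder->new(-compress => 'none');
$pdf->mediabox('Letter');
my $page = $pdf->page();
# Assumption: a core font, so the sketch does not depend on a font file.
my $font = $pdf->corefont('Helvetica-Bold');

# Declaration order is rendering order: text, disc, text.
my $text_under = $page->text();
my $grfx_disc  = $page->gfx();
my $text_over  = $page->text();

$text_under->font($font, 20);
$text_under->fillcolor('black');
$text_under->translate(50, 500);
$text_under->text("This text should be under everything.");

# Bracket the color changes so they do not leak into later objects.
$grfx_disc->save();
$grfx_disc->strokecolor('blue');
$grfx_disc->fillcolor('orange');
$grfx_disc->circle(100, 490, 30);
$grfx_disc->fillstroke();
$grfx_disc->restore();

# No fillcolor() call here: the fill color is whatever the previous
# object left in effect, which after restore() is no longer orange.
$text_over->font($font, 20);
$text_over->translate(50, 470);
$text_over->text("This text should be over the ball.");

my $pdf_string = $pdf->stringify();
```

Setting the fill color explicitly in each object, as the fourth group
does, is the other way to avoid depending on carried-over state.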
324
325 # ----------------------------
326 # 3. (2) again, with two graphics and two text objects
327
328 my $text3 = $page->text();
329 my $grfx3 = $page->gfx();
330 $text3->font($fontTR, 20); # 20 pt Times Roman bold
331 my $text4 = $page->text();
332 my $grfx4 = $page->gfx();
333 $text4->font($fontTR, 20); # 20 pt Times Roman bold
334
335 $text3->fillcolor('black');
336 $grfx3->strokecolor('blue');
337 $grfx3->fillcolor('orange');
338 # $text4->fillcolor('yellow');
339 # $grfx4->strokecolor('red');
340 # $grfx4->fillcolor('purple');
341
342 $text3->translate(50,500);
343 $text3->text_left("This text should be under everything.");
344
345 $grfx3->circle(100,490, 30);
346 $grfx3->fillstroke();
347
348 $text4->translate(50,470);
349 $text4->text_left("This text should be over the ball and under the bar.");
350
351 $grfx4->rect(160,460, 20,70);
352 $grfx4->fillstroke();
353
354 % ---------------- group 3: define text1, graphics1, text2, graphics2
355 15 0 obj << /Length 206 >> stream % obj 15 is text1 for (3)
356 BT
357 /TiCBA 20 Tf % Times Roman Bold 20pt
358 0 0 0 rg % fill black
359 1 0 0 1 50 500 Tm % position text
360 <0037 ... 0011> Tj % "under" line
361 ET
362 endstream endobj
363
364 16 0 obj << /Length 671 >> stream % obj 16 is graphics1 for (3) circle
365 0 0 1 RG % stroke blue
366 1 0.647059 0 rg % fill orange
367 130 490 m ... h B % draw and fill circle
368 endstream endobj
369
370 17 0 obj << /Length 257 >> stream % obj 17 is text2 for (3)
371 BT
372 /TiCBA 20 Tf % Times Roman Bold 20pt
373 1 0 0 1 50 470 Tm % position text
374 <0037 ... 0011> Tj % "over" line
375 ET
376 endstream endobj
377
378 18 0 obj << /Length 20 >> stream % obj 18 is graphics for (3) bar
379 160 460 20 70 re B % draw and fill bar
380 endstream endobj
381
382 The fourth group is the same as the third, except that we define the
383 fill color for the text in the second line. This makes it clear that
384 the "over" line (in yellow) was written after the orange disc, and
385 still before the bar.
386
387 # ----------------------------
388 # 4. (3) again, a new set of colors for second group
389
# new variable names, so that "my" does not mask the group 3 objects
my $text5 = $page->text();
my $grfx5 = $page->gfx();
$text5->font($fontTR, 20); # 20 pt Times Roman bold
my $text6 = $page->text();
my $grfx6 = $page->gfx();
$text6->font($fontTR, 20); # 20 pt Times Roman bold

$text5->fillcolor('black');
$grfx5->strokecolor('blue');
$grfx5->fillcolor('orange');
$text6->fillcolor('yellow');
$grfx6->strokecolor('red');
$grfx6->fillcolor('purple');

$text5->translate(50,400);
$text5->text_left("This text should be under everything.");

$grfx5->circle(100,390, 30);
$grfx5->fillstroke();

$text6->translate(50,370);
$text6->text_left("This text should be over the ball and under the bar.");

$grfx6->rect(160,360, 20,70);
$grfx6->fillstroke();
415
416 % ---------------- group 4: define text1, graphics1, text2, graphics2 with colors for 2
417 19 0 obj << /Length 206 >> stream % obj 19 is text1 for (4)
418 BT
419 /TiCBA 20 Tf % Times Roman Bold 20pt
420 0 0 0 rg % fill black
421 1 0 0 1 50 400 Tm % position text
422 <0037 ... 0011> Tj % "under" line
423 ET
424 endstream endobj
425
426 20 0 obj << /Length 671 >> stream % obj 20 is graphics1 for (4) circle
427 0 0 1 RG % stroke blue
428 1 0.647059 0 rg % fill orange
429 130 390 m ... h B % draw and fill circle
430 endstream endobj
431
432 21 0 obj << /Length 266 >> stream % obj 21 is text2 for (4)
433 BT
434 /TiCBA 20 Tf % Times Roman Bold 20pt
435 1 1 0 rg % fill yellow
436 1 0 0 1 50 370 Tm % position text
437 <0037 ... 0011> Tj % "over" line
438 ET
439 endstream endobj
440
441 22 0 obj << /Length 52 >> stream % obj 22 is graphics for (4) bar
442 1 0 0 RG % stroke red
443 0.498039 0 0.498039 rg % fill purple
444 160 360 20 70 re B % draw and fill rectangle (bar)
445 endstream endobj
446
447 # ----------------------------
448 $pdf->saveas("$fname.pdf");
449
450 The separation of text and graphics means that only some text methods
451 are available in a graphics object, and only some graphics methods are
452 available in a text object. There is much overlap, but they differ.
453 There's really no reason the code couldn't have been written (in
454 PDF::API2, or earlier) as outputting to a single object, which would
455 keep everything in the same order as the method calls. An advantage
456 would be less object and stream overhead in the PDF file. The only
457 drawback might be that an object might more easily overflow and require
458 splitting into multiple objects, but that should be rare.
459
460 You should always be able to manually split an object by simply ending
461 output to the first object, and picking up with output to the second
462 object, so long as it was created immediately after the first object.
463 The graphics state at the end of the first object should be the initial
464 state at the beginning of the second object. However, use caution when
465 dealing with text objects -- the PDF specification states that the Text
466 matrices are not carried over from one object to the next (BT resets
467 them), so you may need to reset some settings.
468
469 $grfx1 = $page->gfx();
470 $grfx2 = $page->gfx();
471 # write a huge amount of stuff to $grfx1
472 # write a huge amount of stuff to $grfx2, picking up where $grfx1 left off
473
474 In any case, now that you understand the rendering order and how the
475 order of object declarations affects it, how text and graphics are
476 drawn can now be completely controlled as desired. There is really no
477 need to add another "both" type object that will handle all graphics
478 and text objects, as that would probably be a major code bloat for very
479 little benefit. However, it could be considered in the future if there
480 is a demonstrated need for it, such as serious PDF file size bloat due
481 to the extra object overhead when interleaving text and graphics
482 output.
483
484 PDF Versions Supported
485 When creating a PDF file using the functions in PDF::Builder, the
486 output is marked as PDF 1.4. This does not mean that all PDF
487 functionality up through 1.4 is supported! There are almost surely
488 features missing as far back as the PDF 1.0 standard.
489
490 The big problem is when a PDF of version 1.5 or higher is imported or
491 opened in PDF::Builder. If it contains content that is actually
492 unsupported by this software, there is a chance that something will
493 break. This does not guarantee that a PDF marked as "1.7" will go down
494 in flames when read by PDF::Builder, or that a PDF written back out
495 will break in a Reader, but the possibility is there. Much PDF writer
496 software simply marks its output as the highest version of PDF at the
497 time (usually 1.7), even if there is no content beyond, say, 1.2.
498 There is some handling of PDF 1.5 items in PDF::Builder, such as cross
499 reference streams, but support beyond 1.4 is very limited. All we can
500 say is to be careful when handling PDFs whose version is above 1.4, and
501 test thoroughly, as they may break at some point.
502
503 PDF::Builder includes a simple version control mechanism, where the
504 initial PDF version to be output (default 1.4) can be set by the
505 programmer. Input PDFs greater than 1.4 (current output level) will
506 receive a warning (can be suppressed) that the output level will be
507 raised to that level. The use of PDF features greater than the current
508 output level will likewise trigger a warning that the output level is
509 to be raised to the necessary level. If this is not desired, you should
510 avoid using those PDF features which are higher than the desired PDF
511 output level.
512
513 History
514 PDF::API2 was originally written by Alfred Reibenschuh, derived from
515 Martin Hosken's Text::PDF via the Text::PDF::API wrapper. In 2009,
516 Otto Hirr started the PDF::API3 fork, but it never went anywhere. In
517 2011, PDF::API2 maintenance was taken over by Steve Simms. In 2017,
518 PDF::Builder was forked by Phil M. Perry, who desired a more aggressive
519 schedule of new features and bug fixes than Simms was providing.
520
521 At Simms's request, the name of the new offering was changed from
522 PDF::API4 to PDF::Builder, to reduce the chance of confusion due to
523 parallel development. Perry's intent is to keep all internal methods
524 as upwardly compatible with PDF::API2 as possible, although it is
525 likely that there will be some drift (incompatibilities) over time. At
526 least initially, any program written based on PDF::API2 should be
527 convertible to PDF::Builder simply by changing "API2" anywhere it
528 occurs to "Builder". See the INFO/KNOWN_INCOMP known incompatibilities
529 file for further information.
530
531 Thanks...
532
533 Many users have helped out by reporting bugs and requesting
534 enhancements. A special shout out goes to those who have contributed
535 code and tests, or coordinated their package development with the needs
536 of PDF::Builder: Ben Bullock, Cary Gravel, Gregor Herrmann, Petr Pisar,
537 Jeffrey Ratcliffe, Steve Simms (via PDF::API2 fixes), and Johan
538 Vromans. Drop me a line if I've overlooked your contribution!
539
541 After saving a file...
Note that a PDF object such as $pdf cannot continue to be used after
saving an output PDF file or string with "$pdf->save()",
"$pdf->saveas()", or "$pdf->stringify()". Cleanup and other operations
are done internally which make the object unusable for further
operations. If you try to keep using the PDF object, you will likely
receive an error message such as: Can't call method "new_obj" on an
undefined value.
548
549 IntegrityCheck
The PDF::Builder methods that open an existing PDF file pass it through
the integrity checker method, "$self->IntegrityCheck(level, content)".
This method serves two purposes: 1) to find any "/Version" settings
that override the PDF version found in the PDF header, and 2) to
perform some basic validations on the contents of the PDF.
555
556 The "level" parameter accepts the following values:
557
558 0 = Do not output any diagnostic messages; just return any version
559 override.
560 1 = Output error-level (serious) diagnostic messages, as well as
561 returning any version override.
Errors include: the /Root object was not specified anywhere, or it
was specified but the indicated object was not found; an object
claims another object as its child (/Kids list), but another object
has already claimed that child; an object claims a child, but that
child does not list a Parent, or the child lists a different
Parent.
568
569 2 = Output error- (serious) and warning- (less serious) level
570 diagnostic messages, as well as returning any version override. This is
571 the default.
572 3 = Output error- (serious), warning- (less serious), and note-
573 (informational) level diagnostic messages, as well as returning any
574 version override.
Notes include: the (optional) /Info object was not specified
anywhere, or it was specified but the indicated object was not
found; an object was referenced, but no entry for it was found among
the objects (this may be OK if the object is not defined, or is on
the free list, as the reference will then be ignored); an object is
defined, but it appears that no other object is referencing it.
581
582 4 = Output error-, warning-, and note-level diagnostic messages, as
583 well as returning any version override. Also dump the diagnostic data
584 structure.
585 5 = Output error-, warning-, and note-level diagnostic messages, as
586 well as returning any version override. Also dump the diagnostic data
587 structure and the $self data structure (generally useful only if you
588 have already read in the PDF file).
589
590 The version is a string (e.g., '1.5') if found, otherwise "undef"
591 (undefined value) is returned.
592
593 For controlling the "automatic" call to IntegrityCheck (via opens), the
594 level may be given with the option (flag) "-diaglevel => n", where "n"
595 is between 0 and 5.
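
As a hedged sketch of passing the "-diaglevel" option when opening a
file: the temporary file name and the one-page document below are
assumptions made only so that the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use PDF::Builder;

# Write a small one-page PDF to a temporary directory (illustrative only).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sample.pdf";

my $pdf = PDF::Builder->new();
$pdf->page();
$pdf->saveas($file);    # $pdf must not be used after this point

# Reopen the file, asking the integrity check to stay silent (level 0).
my $reopened = PDF::Builder->open($file, -diaglevel => 0);
my $count    = $reopened->pages();
print "pages: $count\n";
```

A higher level, such as 3, would additionally print warning- and
note-level messages while the file is being opened.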
596
597 Preferences - set user display preferences
598 $pdf->preferences(%options)
599 Controls viewing preferences for the PDF.
600
601 Page Mode Options
602
603 -fullscreen
604 Full-screen mode, with no menu bar, window controls, or any
605 other window visible.
606
607 -thumbs
608 Thumbnail images visible.
609
610 -outlines
611 Document outline visible.
612
613 Page Layout Options
614
615 -singlepage
616 Display one page at a time.
617
618 -onecolumn
619 Display the pages in one column.
620
621 -twocolumnleft
Display the pages in two columns, with odd-numbered pages on the
left.
624
625 -twocolumnright
Display the pages in two columns, with odd-numbered pages on the
right.
628
629 Viewer Options
630
631 -hidetoolbar
632 Specifying whether to hide tool bars.
633
634 -hidemenubar
635 Specifying whether to hide menu bars.
636
637 -hidewindowui
638 Specifying whether to hide user interface elements.
639
640 -fitwindow
641 Specifying whether to resize the document's window to the size
642 of the displayed page.
643
644 -centerwindow
645 Specifying whether to position the document's window in the
646 center of the screen.
647
648 -displaytitle
649 Specifying whether the window's title bar should display the
650 document title taken from the Title entry of the document
651 information dictionary.
652
653 -afterfullscreenthumbs
654 Thumbnail images visible after Full-screen mode.
655
656 -afterfullscreenoutlines
657 Document outline visible after Full-screen mode.
658
659 -printscalingnone
660 Set the default print setting for page scaling to none.
661
662 -simplex
663 Print single-sided by default.
664
665 -duplexflipshortedge
666 Print duplex by default and flip on the short edge of the
667 sheet.
668
669 -duplexfliplongedge
670 Print duplex by default and flip on the long edge of the sheet.
671
672 Initial Page Options
673
674 -firstpage => [ $page, %options ]
675 Specifying the page (either a page number or a page object) to be
676 displayed, plus one of the following options:
677
678 -fit => 1
679 Display the page designated by page, with its contents
680 magnified just enough to fit the entire page within the window
681 both horizontally and vertically. If the required horizontal
682 and vertical magnification factors are different, use the
683 smaller of the two, centering the page within the window in the
684 other dimension.
685
686 -fith => $top
687 Display the page designated by page, with the vertical
688 coordinate top positioned at the top edge of the window and the
689 contents of the page magnified just enough to fit the entire
690 width of the page within the window.
691
692 -fitv => $left
693 Display the page designated by page, with the horizontal
694 coordinate left positioned at the left edge of the window and
695 the contents of the page magnified just enough to fit the
696 entire height of the page within the window.
697
698 -fitr => [ $left, $bottom, $right, $top ]
699 Display the page designated by page, with its contents
700 magnified just enough to fit the rectangle specified by the
701 coordinates left, bottom, right, and top entirely within the
702 window both horizontally and vertically. If the required
703 horizontal and vertical magnification factors are different,
704 use the smaller of the two, centering the rectangle within the
705 window in the other dimension.
706
707 -fitb => 1
708 Display the page designated by page, with its contents
709 magnified just enough to fit its bounding box entirely within
710 the window both horizontally and vertically. If the required
711 horizontal and vertical magnification factors are different,
712 use the smaller of the two, centering the bounding box within
713 the window in the other dimension.
714
715 -fitbh => $top
716 Display the page designated by page, with the vertical
717 coordinate top positioned at the top edge of the window and the
718 contents of the page magnified just enough to fit the entire
719 width of its bounding box within the window.
720
721 -fitbv => $left
722 Display the page designated by page, with the horizontal
723 coordinate left positioned at the left edge of the window and
724 the contents of the page magnified just enough to fit the
725 entire height of its bounding box within the window.
726
727 -xyz => [ $left, $top, $zoom ]
728 Display the page designated by page, with the coordinates
729 (left, top) positioned at the top-left corner of the window and
730 the contents of the page magnified by the factor zoom. A zero
731 (0) value for any of the parameters left, top, or zoom
732 specifies that the current value of that parameter is to be
733 retained unchanged.
734
735 Example
736
737 $pdf->preferences(
738 -fullscreen => 1,
739 -onecolumn => 1,
740 -afterfullscreenoutlines => 1,
741 -firstpage => [$page, -fit => 1],
742 );
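
The "fit" rule used by -fit, -fitr, and -fitb (take the smaller of the two magnification factors, then center in the other dimension) can be sketched in plain Perl. This is only an illustration of the arithmetic a reader is expected to perform; "fit_zoom" is a hypothetical helper, not a PDF::Builder method.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Builder): given a window size and
# the size of the rectangle to be shown, return the zoom factor for a "fit"
# and the padding left over on each axis once the rectangle is centered.
sub fit_zoom {
    my ($win_w, $win_h, $rect_w, $rect_h) = @_;
    my $zx   = $win_w / $rect_w;            # horizontal magnification needed
    my $zy   = $win_h / $rect_h;            # vertical magnification needed
    my $zoom = $zx < $zy ? $zx : $zy;       # use the smaller of the two
    my $pad_x = ($win_w - $rect_w * $zoom) / 2;
    my $pad_y = ($win_h - $rect_h * $zoom) / 2;
    return ($zoom, $pad_x, $pad_y);
}

my ($zoom, $pad_x, $pad_y) = fit_zoom(1000, 800, 500, 200);
printf "zoom %.2f, padding %.0f horizontally and %.0f vertically\n",
    $zoom, $pad_x, $pad_y;
```

Here the horizontal factor (2) is the smaller one, so the rectangle fills the window width and is centered vertically.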
743
744 info Example
745 %h = $pdf->info(
746 'Author' => "Alfred Reibenschuh",
747 'CreationDate' => "D:20020911000000+01'00'",
748 'ModDate' => "D:YYYYMMDDhhmmssOHH'mm'",
749 'Creator' => "fredos-script.pl",
750 'Producer' => "PDF::Builder",
751 'Title' => "some Publication",
752 'Subject' => "perl ?",
753 'Keywords' => "all good things are pdf"
754 );
755 print "Author: $h{'Author'}\n";
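
The date strings above follow the pattern D:YYYYMMDDhhmmssOHH'mm'. As a minimal sketch, such a string could be built from an epoch time and a UTC offset as shown below; "pdf_date" is a hypothetical helper and is not supplied by PDF::Builder.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format an epoch time as D:YYYYMMDDhhmmssOHH'mm',
# where $offset_min is the local offset from UTC in minutes.
sub pdf_date {
    my ($epoch, $offset_min) = @_;
    my $sign = $offset_min < 0 ? '-' : '+';
    my $abs  = abs($offset_min);
    # Shift to local wall-clock time, then format it as if it were UTC.
    my $stamp = strftime('%Y%m%d%H%M%S', gmtime($epoch + $offset_min * 60));
    return sprintf("D:%s%s%02d'%02d'", $stamp, $sign, int($abs / 60), $abs % 60);
}

print pdf_date(0, 0), "\n";              # the epoch itself, at UTC
print pdf_date(1031698800, 60), "\n";    # one hour east of UTC
```

The resulting string could then be passed as the 'CreationDate' or 'ModDate' value in the "info" call.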
756
757 XMP XML example
758 $xml = $pdf->xmpMetadata();
759 print "PDF metadata reads: $xml\n";
760 $xml=<<EOT;
761 <?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
762 <?adobe-xap-filters esc="CRLF"?>
763 <x:xmpmeta
764 xmlns:x='adobe:ns:meta/'
765 x:xmptk='XMP toolkit 2.9.1-14, framework 1.6'>
766 <rdf:RDF
767 xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
768 xmlns:iX='http://ns.adobe.com/iX/1.0/'>
769 <rdf:Description
770 rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
771 xmlns:pdf='http://ns.adobe.com/pdf/1.3/'
772 pdf:Producer='Acrobat Distiller 6.0.1 for Macintosh'></rdf:Description>
773 <rdf:Description
774 rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
775 xmlns:xap='http://ns.adobe.com/xap/1.0/'
776 xap:CreateDate='2004-11-14T08:41:16Z'
777 xap:ModifyDate='2004-11-14T16:38:50-08:00'
778 xap:CreatorTool='FrameMaker 7.0'
779 xap:MetadataDate='2004-11-14T16:38:50-08:00'></rdf:Description>
780 <rdf:Description
781 rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
782 xmlns:xapMM='http://ns.adobe.com/xap/1.0/mm/'
783 xapMM:DocumentID='uuid:919b9378-369c-11d9-a2b5-000393c97fd8'></rdf:Description>
784 <rdf:Description
785 rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
786 xmlns:dc='http://purl.org/dc/elements/1.1/'
787 dc:format='application/pdf'>
788 <dc:description>
789 <rdf:Alt>
790 <rdf:li xml:lang='x-default'>Adobe Portable Document Format (PDF)</rdf:li>
791 </rdf:Alt>
792 </dc:description>
793 <dc:creator>
794 <rdf:Seq>
795 <rdf:li>Adobe Systems Incorporated</rdf:li>
796 </rdf:Seq>
797 </dc:creator>
798 <dc:title>
799 <rdf:Alt>
800 <rdf:li xml:lang='x-default'>PDF Reference, version 1.6</rdf:li>
801 </rdf:Alt>
802 </dc:title>
803 </rdf:Description>
804 </rdf:RDF>
805 </x:xmpmeta>
806 <?xpacket end='w'?>
807 EOT
808
809 $xml = $pdf->xmpMetadata($xml);
810 print "PDF metadata now reads: $xml\n";
811
812 "BOX" METHODS
813 A general note: if you give a page a Media Box (or other "box") that
814 differs from the global "box" setting, take care to define the whole
815 "chain" of boxes on that page, to avoid surprises. For example,
816 defining a global Media Box (paper size) and a global Crop Box, and then
817 defining a new page-level Media Box without also defining a new
818 page-level Crop Box, may give odd results in the resulting cropping.
819 Such combinations are not well defined.
820
821 All dimensions in boxes default to the default User Unit, which is
822 points (1/72 inch). Note that the PDF specification limits sizes and
823 coordinates to 14400 User Units (200 inches, for the default User Unit
824 of one point), and Adobe products (so far) follow this limit for
825 Acrobat and Distiller. It is worth noting that other PDF writers and
826 readers may choose to ignore the 14400 unit limit, with or without the
827 use of a specified User Unit. Therefore, PDF::Builder does not enforce
828 any limits on coordinates -- it's your responsibility to consider what
829 readers and other PDF tools may be used with a PDF you produce! Also
830 note that earlier Acrobat readers had coordinate limits as small as
831 3240 User Units (45 inches), and minimum media size of 72 or 3 User
832 Units.
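
Since PDF::Builder does not enforce any coordinate limit, a program that targets readers with the 14400 unit limit may want its own advisory check. The following is a minimal sketch; "oversize_coords" and the constant name are hypothetical, and the limit is simply the figure quoted above.

```perl
use strict;
use warnings;

# The 14400 figure is the limit quoted in the text; nothing in PDF::Builder
# enforces it, so this check is purely advisory.
use constant MAX_USER_UNITS => 14400;

# Hypothetical helper: return the coordinates whose magnitude exceeds the limit.
sub oversize_coords {
    my @coords = @_;
    return grep { abs($_) > MAX_USER_UNITS } @coords;
}

my @bad = oversize_coords(0, 0, 20000, 842);
print "coordinates over the limit: @bad\n" if @bad;
```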
833
834 User Units
835
836 $pdf->userunit($number)
837 The default User Unit in the PDF coordinate system is one point
838 (1/72 inch). You can think of it as a scale factor to enable larger
839 (or even, smaller) documents. This method may be used (for PDF 1.6
840 and higher) to set the User Unit to some number of points. For
841 example, "userunit(72)" will set the scale multiplier to 72.0
842 points per User Unit, or 1 inch to the User Unit. Any number
843 greater than zero is acceptable, although some readers and tools
844 may not handle User Units of less than 1.0 very well.
845
846 Not all readers respect the User Unit, if you give one, or handle
847 it in exactly the same way. Adobe Distiller, for one, does not use
848 it. How User Units are handled may vary from reader to reader.
849 Adobe Acrobat, at this writing, respects User Unit in version 7.0
850 and up, but limits it to 75000 (giving a maximum document size of
851 15 million inches or 236.7 miles or 381 km). Other readers and PDF
852 tools may allow a larger (or smaller) limit.
853
854 Your Mileage May Vary: Some readers ignore a global User Unit
855 setting and do not have pages inherit it (PDF::Builder duplicates
856 it on each page to simulate inheritance). Some readers may give
857 spurious warnings about truncated content when a Media Box is
858 changed while User Units are being used. Some readers do strange
859 things with Crop Boxes when a User Unit is in effect.
860
861 Depending on the reader used, the effect of a larger User Unit
862 (greater than 1) may mean lower resolution (chunkier or coarser
863 appearance) in the rendered document. If you're printing something
864 the size of a highway billboard, this may not matter to you, but
865 you should be aware of the possibility (even with fractional
866 coordinates). Conversely, a User Unit of less than 1.0 (if
867 permitted) reduces the allowable size of your document, but may
868 result in greater resolution.
869
870 A global (PDF level) User Unit setting is inherited by each page
871 (an action by PDF::Builder, not necessarily automatically done by
872 the reader), or can be overridden by calling userunit in the page.
873 Do not give more than one global userunit setting, as only the last
874 one will be used. Setting a page's User Unit (if "$page->"
875 instead) is permitted (overriding the global setting for this
876 page). However, many sources recommend against doing this, as
877 results may not be as expected (once again, depending on the quirks
878 of the reader).
879
880 Remember to call "userunit" before calling anything having to do
881 with page or box sizes, or coordinates. Especially when setting
882 'named' box sizes, the methods need to know the current User Unit
883 so that named page sizes (in points) may be scaled down to the
884 current User Unit.
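
The scaling described above is simple division by the User Unit. The sketch below works through it in plain Perl; "points_to_units" is a hypothetical helper used only for illustration, and the figures in the last lines are the Acrobat limits quoted in the text.

```perl
use strict;
use warnings;

# Hypothetical helper: express a length given in points in User Units,
# where $userunit is the number of points per User Unit.
sub points_to_units {
    my ($points, $userunit) = @_;
    die "User Unit must be greater than zero\n" unless $userunit > 0;
    return $points / $userunit;
}

# US Letter is 612 x 792 points; with a User Unit of 72 points (one inch)
# the same sheet measures 8.5 x 11 User Units.
my @letter = map { points_to_units($_, 72) } (612, 792);
print "US Letter in inch-sized User Units: @letter\n";

# Largest side under the quoted limits: 14400 units of 75000 points each,
# converted to inches at 72 points per inch.
my $max_inches = 14400 * 75000 / 72;
print "maximum side: $max_inches inches\n";
```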
885
886 Media Box
887
888 $pdf->mediabox($name)
889 $pdf->mediabox($name, -orient => 'orientation' )
890 $pdf->mediabox($w,$h)
891 $pdf->mediabox($llx,$lly, $urx,$ury)
892 ($llx,$lly, $urx,$ury) = $pdf->mediabox()
893 Sets the global Media Box (or page's Media Box, if "$page->"
894 instead). This defines the width and height (or by corner
895 coordinates, or by standard name) of the output page itself, such
896 as the physical paper size. This is normally the largest of the
897 "boxes". If any subsidiary box (within it) exceeds the media box,
898 the portion of the material or boxes outside of the Media Box will
899 be ignored. That is, the Media Box is the One Box to Rule Them All,
900 and is the overall limit for other boxes (some documentation refers
901 to the Media Box as "clipping" other boxes). In addition, the Media
902 Box defines the overall coordinate system for text and graphics
903 operations.
904
905 If no arguments are given, the current Media Box (global or page)
906 coordinates are returned instead. The former "get_mediabox" (page
907 only) function is deprecated and will likely be removed some time
908 in the future. In addition, when setting the Media Box, the
909 resulting coordinates are returned. This permits you to specify the
910 page size by a name (alias) and get the dimensions back, all in one
911 call.
912
913 Note that many printers cannot print all the way to the physical
914 edge of the paper, so you should plan to leave some blank margin,
915 even outside of any crop marks and bleeds. Printers and on-screen
916 readers are free to discard any content found outside the Media
917 Box, and printers may discard some material just inside the Media
918 Box.
919
920 A global Media Box is required by the PDF spec; if not explicitly
921 given, PDF::Builder will set the global Media Box to US Letter size
922 (8.5in x 11in). This is the media size that will be used for all
923 pages if you do not specify a "mediabox" call on a page. That is, a
924 global (PDF level) mediabox setting is inherited by each page, or
925 can be overridden by setting mediabox in the page. Do not give more
926 than one global mediabox setting, as only the last one will be
927 used.
928
929 If you give a single string name (e.g., 'A4'), you may optionally
930 add an orientation to turn the page 90 degrees into Landscape mode:
931 "-orient => 'L'" or "-orient => 'l'". "-orient" is the only option
932 recognized, and a string beginning with an 'L' or 'l' (for
933 Landscape) is the only value of interest (anything else is treated
934 as Portrait mode). The y axis still runs from 0 at the bottom of
935 the page to what used to be the page width (now, height) at the
936 top, and likewise for the x axis: 0 at left to (former) height at
937 the right. That is, the coordinate system is the same as before,
938 except that the height and width are different.
939
940 The lower left corner does not have to be 0,0. It can be any values
941 you want, including negative values (so long as the resulting
942 media's sides are at least one point long). "mediabox" sets the
943 coordinate system (including the origin) of the graphics and text
944 that will be drawn, as well as for subsequent "boxes". It's even
945 possible to give any two opposite corners (such as upper left and
946 lower right). The coordinate system will be rearranged (by the
947 Reader) to still be the conventional minimum "x" and "y" in the
948 lower left (i.e., you can't make "y" increase from top to bottom!).
949
950 Example:
951
952 $pdf = PDF::Builder->new();
953 $pdf->mediabox('A4'); # A4 size (595 Pt wide by 842 Pt high)
954 ...
955 $pdf->saveas('our/new.pdf');
956
957 $pdf = PDF::Builder->new();
958 $pdf->mediabox(595, 842); # A4 size, with implicit 0,0 LL corner
959 ...
960 $pdf->saveas('our/new.pdf');
961
962 $pdf = PDF::Builder->new();
963 $pdf->mediabox(0, 0, 595, 842); # A4 size, with explicit 0,0 LL corner
964 ...
965 $pdf->saveas('our/new.pdf');
966
967 See the PDF::Builder::Resource::PaperSizes source code for the full
968 list of supported names (aliases) and their dimensions in points.
969 You are free to add additional paper sizes to this file, if you
970 wish. You might want to do this if you frequently use a standard
971 page size in rotated (Landscape) mode. See also the "getPaperSizes"
972 call in PDF::Builder::Util. These names (aliases) are also usable
973 in other "box" calls, although useful only if the "box" is the same
974 size as the full media (Media Box), and you don't mind their
975 starting at 0,0.
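
Two of the rules above, the "-orient" handling and the acceptance of any two opposite corners, can be illustrated with a short sketch. Both helpers are hypothetical and only mimic the described behavior; they are not PDF::Builder functions.

```perl
use strict;
use warnings;

# Hypothetical helpers; neither is part of PDF::Builder.

# -orient rule: a value starting with 'L' or 'l' selects Landscape (width
# and height swap); anything else is treated as Portrait.
sub oriented_size {
    my ($w, $h, $orient) = @_;
    return ($h, $w) if defined $orient && $orient =~ /^l/i;
    return ($w, $h);
}

# Any two opposite corners describe the same box; reorder them so that the
# first pair is the lower left corner and the second the upper right.
sub normalize_box {
    my ($x1, $y1, $x2, $y2) = @_;
    my ($llx, $urx) = $x1 <= $x2 ? ($x1, $x2) : ($x2, $x1);
    my ($lly, $ury) = $y1 <= $y2 ? ($y1, $y2) : ($y2, $y1);
    return ($llx, $lly, $urx, $ury);
}

my @a4_landscape = oriented_size(595, 842, 'L');
my @box = normalize_box(0, 842, 595, 0);    # upper left and lower right given
print "landscape size: @a4_landscape; normalized box: @box\n";
```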
976
977 Crop Box
978
979 $pdf->cropbox($name)
980 $pdf->cropbox($name, -orient => 'orientation')
981 $pdf->cropbox($w,$h)
982 $pdf->cropbox($llx,$lly, $urx,$ury)
983 ($llx,$lly, $urx,$ury) = $pdf->cropbox()
984 Sets the global Crop Box (or page's Crop Box, if "$page->"
985 instead). This will define the media size to which the output will
986 later be clipped. Note that this does not itself output any crop
987 marks to guide cutting of the paper! PDF Readers should consider
988 this to be the visible portion of the page, and anything found
989 outside it may be clipped (invisible). By default, it is equal to
990 the Media Box, but may be defined to be smaller, in the coordinate
991 system set by the Media Box. A global setting will be inherited by
992 each page, but can be overridden on a per-page basis.
993
994 A Reader or Printer may choose to discard any clipped (invisible)
995 part of the page, and show only the area within the Crop Box. For
996 example, if your page Media Box is A4 (0,0 to 595,842 Points), and
997 your Crop Box is (100,100 to 495,742), a reader such as Adobe
998 Acrobat Reader may show you a page 395 by 642 Points in size (i.e.,
999 just the visible area of your page). Other Readers may show you the
1000 full media size (Media Box) and a 100 Point wide blank area (in
1001 this example) around the visible content.
1002
1003 If no arguments are given, the current Crop Box (global or page)
1004 coordinates are returned instead. The former "get_cropbox" (page
1005 only) function is deprecated and will likely be removed some time
1006 in the future. If a Crop Box has not been defined, the Media Box
1007 coordinates (which always exist) will be returned instead. In
1008 addition, when setting the Crop Box, the resulting coordinates are
1009 returned. This permits you to specify the crop box by a name
1010 (alias) and get the dimensions back, all in one call.
1011
1012 Do not confuse the Crop Box with the "Trim Box", which shows where
1013 printed paper is expected to actually be cut. Some PDF Readers may
1014 reduce the visible "paper" background to the size of the crop box;
1015 others may simply omit any content outside it. Either way, you
1016 would lose any trim or crop marks, printer instructions, color
1017 alignment dots, or other content outside the Crop Box. A good use
1018 of the Crop Box would be to limit printing to the area where a printer
1019 can reliably put down ink, and leave white the edge areas where
1020 paper-handling mechanisms prevent ink or toner from being applied.
1021 This would keep you from accidentally putting valuable content in
1022 an area where a printer will refuse to print, yet permit you to
1023 include a bleed area and space for printer's marks and
1024 instructions. Needless to say, if your printer cannot print to the
1025 very edge of the paper, you will need to trim (cut) the printed
1026 sheets to get true bleeds.
1027
1028 A global (PDF level) cropbox setting is inherited by each page, or
1029 can be overridden by setting cropbox in the page. As with
1030 "mediabox", only one crop box may be set at this (PDF) level. As
1031 with "mediabox", a named media size may have an orientation (l or
1032 L) for Landscape mode. Note that the PDF level global Crop Box
1033 will be used even if the page gets its own Media Box. That is, the
1034 page's Crop Box inherits the global Crop Box, not the page Media
1035 Box, even if the page has its own media size! If you set the page's
1036 own Media Box, you should consider also explicitly setting the page
1037 Crop Box (and other boxes).
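
The A4 example above (Media Box 0,0 to 595,842 with a Crop Box of 100,100 to 495,742) can be checked with a little arithmetic. This is a sketch in plain Perl; "box_size" and "box_within" are hypothetical helpers, not PDF::Builder calls.

```perl
use strict;
use warnings;

# Hypothetical helper: width and height of a box given as (llx, lly, urx, ury).
sub box_size {
    my ($llx, $lly, $urx, $ury) = @_;
    return ($urx - $llx, $ury - $lly);
}

# Hypothetical helper: true if the inner box lies entirely inside the outer one.
sub box_within {
    my ($inner, $outer) = @_;
    return $inner->[0] >= $outer->[0] && $inner->[1] >= $outer->[1]
        && $inner->[2] <= $outer->[2] && $inner->[3] <= $outer->[3];
}

my @media = (0, 0, 595, 842);        # A4
my @crop  = (100, 100, 495, 742);
my ($w, $h) = box_size(@crop);
print "visible area: $w by $h points\n";
print "Crop Box fits inside the Media Box\n" if box_within(\@crop, \@media);
```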
1038
1039 Bleed Box
1040
1041 $pdf->bleedbox($name)
1042 $pdf->bleedbox($name, -orient => 'orientation')
1043 $pdf->bleedbox($w,$h)
1044 $pdf->bleedbox($llx,$lly, $urx,$ury)
1045 ($llx,$lly, $urx,$ury) = $pdf->bleedbox()
1046 Sets the global Bleed Box (or page's Bleed Box, if "$page->"
1047 instead). This is typically used in printing on paper, where you
1048 want ink or color (such as thumb tabs) to be printed a bit beyond
1049 the final paper size, to ensure that the cut paper bleeds (the cut
1050 goes through the ink), rather than accidentally leaving some white
1051 paper visible outside. Allow enough "bleed" over the expected trim
1052 line to account for minor variations in paper handling, folding,
1053 and cutting, to avoid showing white paper at the edge. The Bleed
1054 Box is where printing could actually extend to; the Trim Box is
1055 normally within it, where the paper would actually be cut. The
1056 default value is equal to the Crop Box, but is often a bit smaller.
1057 The space between the Bleed Box and the Crop Box is available for
1058 printer instructions, color alignment dots, etc., while crop marks
1059 (trim guides) are at least partly within the bleed area (and should
1060 be printed after content is printed).
1061
1062 If no arguments are given, the current Bleed Box (global or page)
1063 coordinates are returned instead. The former "get_bleedbox" (page
1064 only) function is deprecated and will likely be removed some time
1065 in the future. If a Bleed Box has not been defined, the Crop Box
1066 coordinates (if defined) will be returned, otherwise the Media Box
1067 coordinates (which always exist) will be returned. In addition,
1068 when setting the Bleed Box, the resulting coordinates are returned.
1069 This permits you to specify the bleed box by a name (alias) and get
1070 the dimensions back, all in one call.
1071
1072 A global (PDF level) bleedbox setting is inherited by each page, or
1073 can be overridden by setting bleedbox in the page. As with
1074 "mediabox", only one bleed box may be set at this (PDF) level. As
1075 with "mediabox", a named media size may have an orientation (l or
1076 L) for Landscape mode. Note that the PDF level global Bleed Box
1077 will be used even if the page gets its own Crop Box. That is, the
1078 page's Bleed Box inherits the global Bleed Box, not the page Crop
1079 Box, even if the page has its own media size! If you set the page's
1080 own Media Box or Crop Box, you should consider also explicitly
1081 setting the page Bleed Box (and other boxes).
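
The fallback order used when querying an undefined box (the box itself, then the Crop Box, then the Media Box) can be sketched over a plain hash. This models the documented behavior only; the hash layout and "effective_box" are assumptions made for the example and do not reflect PDF::Builder's internals.

```perl
use strict;
use warnings;

# Hypothetical lookup over a plain hash of box settings: return the named
# box if defined, else the Crop Box, else the Media Box.
sub effective_box {
    my ($boxes, $name) = @_;
    for my $key ($name, 'crop', 'media') {
        return @{ $boxes->{$key} } if defined $boxes->{$key};
    }
    die "a Media Box must always be defined\n";
}

my %boxes = (media => [0, 0, 612, 792]);        # only the Media Box is set
my @bleed = effective_box(\%boxes, 'bleed');
print "bleed falls back to media: @bleed\n";

$boxes{crop} = [36, 36, 576, 756];              # now a Crop Box exists
@bleed = effective_box(\%boxes, 'bleed');
print "bleed falls back to crop:  @bleed\n";
```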
1082
1083 Trim Box
1084
1085 $pdf->trimbox($name)
1086 $pdf->trimbox($name, -orient => 'orientation')
1087 $pdf->trimbox($w,$h)
1088 $pdf->trimbox($llx,$lly, $urx,$ury)
1089 ($llx,$lly, $urx,$ury) = $pdf->trimbox()
1090 Sets the global Trim Box (or page's Trim Box, if "$page->"
1091 instead). This is supposed to be the actual dimensions of the
1092 finished page (after trimming of the paper). In some production
1093 environments, it is useful to have printer's instructions, cut
1094 marks, and so on outside of the trim box. The default value is
1095 equal to Crop Box, but is often a bit smaller than any Bleed Box,
1096 to allow the desired "bleed" effect.
1097
1098 If no arguments are given, the current Trim Box (global or page)
1099 coordinates are returned instead. The former "get_trimbox" (page
1100 only) function is deprecated and will likely be removed some time
1101 in the future. If a Trim Box has not been defined, the Crop Box
1102 coordinates (if defined) will be returned, otherwise the Media Box
1103 coordinates (which always exist) will be returned. In addition,
1104 when setting the Trim Box, the resulting coordinates are returned.
1105 This permits you to specify the trim box by a name (alias) and get
1106 the dimensions back, all in one call.
1107
1108 A global (PDF level) trimbox setting is inherited by each page, or
1109 can be overridden by setting trimbox in the page. As with
1110 "mediabox", only one trim box may be set at this (PDF) level. As
1111 with "mediabox", a named media size may have an orientation (l or
1112 L) for Landscape mode. Note that the PDF level global Trim Box
1113 will be used even if the page gets its own Crop Box. That is, the
1114 page's Trim Box inherits the global Trim Box, not the page Crop
1115 Box, even if the page has its own media size! If you set the page's
1116 own Media Box or Crop Box, you should consider also explicitly
1117 setting the page Trim Box (and other boxes).
1118
1119 Art Box
1120
1121 $pdf->artbox($name)
1122 $pdf->artbox($name, -orient => 'orientation')
1123 $pdf->artbox($w,$h)
1124 $pdf->artbox($llx,$lly, $urx,$ury)
1125 ($llx,$lly, $urx,$ury) = $pdf->artbox()
1126 Sets the global Art Box (or page's Art Box, if "$page->" instead).
1127 This is supposed to define "the extent of the page's meaningful
1128 content (including [margins])". It might exclude some content, such
1129 as headlines or headings. Any binding or punched-holes margin would
1130 typically be outside of the Art Box, as would be page numbers and
1131 running headers and footers. The default value is equal to the Crop
1132 Box, although normally it would be no larger than any Trim Box. The
1133 Art Box may often be used for defining "important" content (e.g.,
1134 excluding advertisements) that may or may not be brought over to
1135 another page (e.g., N-up printing).
1136
1137 If no arguments are given, the current Art Box (global or page)
1138 coordinates are returned instead. The former "get_artbox" (page
1139 only) function is deprecated and will likely be removed some time
1140 in the future. If an Art Box has not been defined, the Crop Box
1141 coordinates (if defined) will be returned, otherwise the Media Box
1142 coordinates (which always exist) will be returned. In addition,
1143 when setting the Art Box, the resulting coordinates are returned.
1144 This permits you to specify the art box by a name (alias) and get
1145 the dimensions back, all in one call.
1146
1147 A global (PDF level) artbox setting is inherited by each page, or
1148 can be overridden by setting artbox in the page. As with
1149 "mediabox", only one art box may be set at this (PDF) level. As
1150 with "mediabox", a named media size may have an orientation (l or
1151 L) for Landscape mode. Note that the PDF level global Art Box will
1152 be used even if the page gets its own Crop Box. That is, the page's
1153 Art Box inherits the global Art Box, not the page Crop Box, even if
1154 the page has its own media size! If you set the page's own Media
1155 Box or Crop Box, you should consider also explicitly setting the
1156 page Art Box (and other boxes).
1157
1158 Suggested Box Usage
1159
1160 See "examples/Boxes.pl" for an example of using boxes.
1161
1162 How you define your boxes (or let them default) is up to you, depending
1163 on whether you're duplex printing US Letter or A4 on your laser
1164 printer, to be spiral bound on the bind margin, or engaging a
1165 professional printer. In the latter case, discuss in advance with the
1166 print firm what capabilities (and limitations) they have and what
1167 information they need from a PDF file. For instance, they may not want
1168 a Crop Box defined, and may call for very specific box sizes. For large
1169 press runs, they may print multiple pages (N-up) duplexed on large web
1170 roll "signatures", which are then intricately folded and guillotined
1171 (trimmed) and bound together into books or magazines. You would usually
1172 just supply a PDF with all the pages; they would take care of the
1173 signature layout (which includes offsets and 180 degree rotations).
1174
1175 (As an aside, don't count on a printer having any particular font
1176 available, so be sure to ask. Usually they will want you to embed all
1177 fonts used, but ask first, and double-check before handing over the
1178 print job! TTF/OTF fonts ("ttfont()") are embedded by default, but
1179 other fonts (core, ps, bdf, cjk) are not! A printer may have a core
1180 font collection, but they are free to substitute a "workalike" font for
1181 any given core font, and the results may not match what you saw on your
1182 PC!)
1183
1184 On the assumption that you're using a single sheet (US Letter or A4)
1185 laser or inkjet printer, are you planning to trim each sheet down to a
1186 smaller final size? If so, you can do true bleeds by defining a Trim
1187 Box and a slightly larger Bleed Box. You would print bleeds (all the
1188 way to the finished edge) out to the Bleed Box, but nothing is enforced
1189 about the Bleed Box. At the other end of the spectrum, you would define
1190 the Media Box to be the physical paper size being printed on. Most
1191 printers reserve a little space on the sides (and possibly top and
1192 bottom) for paper handling, so it is often good to define your Crop Box
1193 as the printable area. Remember that the Media Box sets the coordinate
1194 system used, so you still need to avoid going outside the Crop Box with
1195 content (most readers and printers will not show any ink outside of the
1196 Crop Box). Whether or not you define a Crop Box, you're going to almost
1197 always end up with white paper on at least the sides.
1198
1199 For small in-house jobs, you probably won't need color alignment dots
1200 and other such professional instructions and information between the
1201 Bleed Box and the Crop Box, but crop marks for trimming (if used)
1202 should go just outside the Trim Box (partly or wholly within the Bleed
1203 Box), and be drawn after all content. If you're not trimming the paper,
1204 don't try to do any bleed effects (including solid background color
1205 pages/covers), as you will usually have a white edge around the sheet
1206 anyway. Don't count on a PDF document only being displayed (where you
1207 can do things like bleed all the way to the media edge) and never being
1208 physically printed. Finally, for single sheet printing, an Art Box is
1209 probably unnecessary, but if you're combining pages into N-up prints,
1210 or doing other manipulations, it may be useful.
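
As a rough sketch of the single sheet case discussed above, the nested boxes can be derived from the Media Box by successive insets. The margin values here (1/4 inch unprintable edge, 1/4 inch for marks, 1/8 inch bleed) are assumptions chosen for illustration, and "inset_box" is a hypothetical helper; check your own printer's limits before relying on any of them.

```perl
use strict;
use warnings;

# Hypothetical helper: shrink a box by the same margin on all four sides.
sub inset_box {
    my ($margin, $llx, $lly, $urx, $ury) = @_;
    return ($llx + $margin, $lly + $margin, $urx - $margin, $ury - $margin);
}

my @media = (0, 0, 612, 792);         # US Letter sheet, in points
my @crop  = inset_box(18, @media);    # assumed 1/4 inch unprintable edge
my @bleed = inset_box(18, @crop);     # assumed room for marks outside the bleed
my @trim  = inset_box(9,  @bleed);    # assumed 1/8 inch bleed past the trim line

printf "%-5s %s\n", $_->[0], join ',', @{ $_->[1] }
    for [media => \@media], [crop => \@crop], [bleed => \@bleed], [trim => \@trim];
```

The resulting coordinate lists could then be handed to "mediabox", "cropbox", "bleedbox", and "trimbox" in that order.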
1211
1212 Box Inheritance
1213
1214 What Media, Crop, Bleed, Trim, and Art Boxes a page gets can be a
1215 little complicated. Note that usually, only the Media and Crop Boxes
1216 will have a clear visual effect. The visual effect of the other boxes
1217 (if any) may be very subtle.
1218
1219 First, everything is set at the global (PDF) level. The Media Box is
1220 always defined, and defaults to US Letter (8.5 inches wide by 11 inches
1221 high). The global Crop Box inherits the Media Box, unless explicitly
1222 defined. The Bleed, Trim, and Art Boxes inherit the Crop Box, unless
1223 explicitly defined. A global box should only be defined once, as the
1224 last one defined is the one that will be written to the PDF!
1225
1226 Second, a page inherits the global boxes, for its initial settings. You
1227 may call any of the box set methods ("cropbox", "trimbox", etc.) to
1228 explicitly set (override) any box for this page. Note that setting a
1229 new Media Box for the page does not reset the page's Crop Box -- it
1230 still uses whatever it inherited from the global Crop Box. You would
1231 need to explicitly set the page's Crop Box if you want a different
1232 setting. Likewise, the page's Bleed, Trim, and Art Boxes will not be
1233 reset by a new page Crop Box -- they will still inherit from the global
1234 (PDF) settings.
1235
1236 Third, the page Media Box (the one actually used for output pages),
1237 clips or limits all the other boxes to extend no larger than its size.
1238 For example, if the Media Box is US Letter, and you set a Crop Box of
1239 A4 size, the smaller of the two heights (11 inches) would be effective,
1240 and the smaller of the two widths (8.26 inches, 595 Points) would be
1241 effective. The given dimensions of a box are returned on query (get),
1242 not the effective dimensions clipped by the Media Box.
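
The clipping in the third step amounts to intersecting a box with the page Media Box. The following sketch reproduces the US Letter and A4 example; "clip_box" is a hypothetical helper that models the effective result, which is not what a query of the box returns.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper: intersect a box with the Media Box, giving the
# effective (clipped) extent rather than the dimensions as set.
sub clip_box {
    my ($box, $media) = @_;
    my ($llx, $lly, $urx, $ury)     = @$box;
    my ($mllx, $mlly, $murx, $mury) = @$media;
    return (max($llx, $mllx), max($lly, $mlly),
            min($urx, $murx), min($ury, $mury));
}

my @letter  = (0, 0, 612, 792);     # page Media Box, US Letter
my @a4_crop = (0, 0, 595, 842);     # Crop Box given as A4
my @effective = clip_box(\@a4_crop, \@letter);
print "effective Crop Box: @effective\n";
```

The width comes from A4 (595 points) and the height from US Letter (792 points, or 11 inches), as described above.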
1243
1244 FONT METHODS
1245 Core Fonts
1246
1247 Core fonts are limited to single byte encodings. You cannot use UTF-8
1248 or other multibyte encodings with core fonts. The default encoding for
1249 the core fonts is WinAnsiEncoding (roughly the CP-1252 superset of
1250 ISO-8859-1). See the "-encode" option below to change this encoding.
1251 See the "font automap" method in PDF::Builder::Resource::Font for
1252 information on accessing more than 256 glyphs in a font, using planes,
1253 although there is no guarantee that future changes to font files will
1254 permit consistent results.
1255
1256 Note that core fonts use fixed lists of expected glyphs, along with
1257 metrics such as their widths. This may not exactly match up with
1258 whatever local font file is used by the PDF reader. It's usually pretty
1259 close, but many cases have been found where the list of glyphs is
1260 different between the core fonts and various local font files, so be
1261 aware of this.
1262
1263 To allow UTF-8 text and extended glyph counts, you should consider
1264 replacing your use of core fonts with TrueType (.ttf) and OpenType
1265 (.otf) fonts. There are tools, such as FontForge, which can do a fairly
1266 good (though, not perfect) job of converting a Type1 font library to
1267 OTF.
1268
1269 Examples:
1270
1271 $font1 = $pdf->corefont('Times-Roman', -encode => 'latin2');
1272 $font2 = $pdf->corefont('Times-Bold');
1273 $font3 = $pdf->corefont('Helvetica');
1274 $font4 = $pdf->corefont('ZapfDingbats');
1275
1276 Valid %options are:
1277
1278 -encode
1279 Changes the encoding of the font from its default. Notice that the
1280 encoding (not the entire font's glyph list) is shown in a PDF
1281 object (record), listing 256 glyphs associated with this encoding
1282 (and that are available in this font).
1283
1284 -dokern
1285 Enables kerning if data is available.
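
Because core fonts are limited to single byte encodings, it can be useful to test text before using it with one. The sketch below uses the core Encode module with CP-1252 standing in for WinAnsiEncoding (which the text above describes as only roughly the same); "fits_single_byte" is a hypothetical helper, not a PDF::Builder method.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: true if every character of $text exists in the given
# single byte encoding (cp1252 by default, approximating WinAnsiEncoding).
sub fits_single_byte {
    my ($text, $encoding) = @_;
    $encoding //= 'cp1252';
    return eval {
        # FB_CROAK makes encode() die on an unmappable character;
        # LEAVE_SRC keeps $text itself unmodified.
        Encode::encode($encoding, $text,
            Encode::FB_CROAK() | Encode::LEAVE_SRC());
        1;
    } ? 1 : 0;
}

for my $text ("caf\x{e9}", "\x{4e2d}\x{6587}") {
    printf "sample is %s with a core font\n",
        fits_single_byte($text) ? 'usable' : 'not usable';
}
```

Text that fails such a check is a candidate for a TrueType or OpenType font instead.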
1286
1287 Notes:
1288
1289 Even though these are called "core" fonts, they are not shipped with
1290 PDF::Builder, but are expected to be found on the machine with the PDF
1291 reader. Most core fonts are installed with a PDF reader, and thus are
1292 not coordinated with PDF::Builder. PDF::Builder does ship with core
1293 font metrics files (width, glyph names, etc.), but these cannot be
1294 guaranteed to be in sync with what the PDF reader has installed!
1295
1296 There are some 14 core fonts (regular, italic, bold, and bold-italic
1297 for Times [serif], Helvetica [sans serif], Courier [fixed pitch]; plus
1298 two symbol fonts) that are supposed to be available on any PDF reader,
1299 although other fonts with very similar metrics are often substituted.
1300 You should not count on any of the 15 Windows core fonts (Bank Gothic,
1301 Georgia, Trebuchet, Verdana, and two more symbol fonts) being present,
1302 especially on Linux, Mac, or other non-Windows platforms. Be aware if
1303 you are producing PDFs to be read on a variety of different systems!
1304
1305 If you want to ensure the widest portability for a PDF document you
1306 produce, you should consider using TTF fonts (instead of core fonts)
1307 and embedding them in the document. This ensures that there will be no
1308 substitutions, that all metrics are known and match the glyphs, UTF-8
1309 encoding can be used, and that the glyphs will be available on the
1310 reader's machine. At least on Windows platforms, most of the fonts are
1311 TTF anyway, which are used behind the scenes for "core" fonts, while
1312 missing most of the capabilities of TTF (now or possibly later in
1313 PDF::Builder) such as embedding, ligatures, UTF-8, etc. The downside
1314 is, obviously, that the resulting PDF file will be larger because it
1315 includes the font(s). There might also be copyright or licensing issues
1316 with the redistribution of font files in this manner (you might want to
1317 check, before widely distributing a PDF document with embedded fonts,
1318 although many do permit the part of the font that is used to be embedded).
1319
1320 See also PDF::Builder::Resource::Font::CoreFont.
1321
1322 PS Fonts
1323
1324 PS (T1) fonts are limited to single byte encodings. You cannot use
1325 UTF-8 or other multibyte encodings with T1 fonts. The default encoding
1326 for the T1 fonts is WinAnsiEncoding (roughly the CP-1252 superset of
1327 ISO-8859-1). See the "-encode" option below to change this encoding.
1328 See the "font automap" method in PDF::Builder::Resource::Font for
1329 information on accessing more than 256 glyphs in a font, using planes,
1330 although there is no guarantee that future changes to font files will
1331 permit consistent results. Note: many Type1 fonts are limited to 256
1332 glyphs, but some are available with more than 256 glyphs. Still, a
1333 maximum of 256 at a time are usable.
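
Because T1 fonts accept only single-byte encodings, it can help to check text before passing it to a "psfont" font. The following is a minimal sketch using only the core Encode module. The helper name unencodable_chars is hypothetical (it is not part of PDF::Builder), and it assumes that Encode's 'cp1252' is a close enough stand-in for the default WinAnsiEncoding.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper (not part of PDF::Builder): return the characters in
# $text that have no code point in the given single-byte encoding.
sub unencodable_chars {
    my ($text, $encoding) = @_;
    my @missing;
    for my $char (split //, $text) {
        my $rest = $char;    # encode() with FB_QUIET modifies its argument
        Encode::encode($encoding, $rest, Encode::FB_QUIET);
        # whatever is left in $rest could not be encoded
        push @missing, $char if length $rest;
    }
    return @missing;
}

# e-acute and the euro sign exist in CP-1252; Greek capital omega does not
my @missing = unencodable_chars("caf\x{e9} \x{20ac}5 \x{3a9}", 'cp1252');
printf "U+%04X cannot be used with this encoding\n", ord($_) for @missing;
```

Characters reported by such a check would need a different encoding (see "-encode"), an automap plane, or a TrueType font with UTF-8 text.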
1334
1335 "psfont" accepts both ASCII (.pfa) and binary (.pfb) Type1 glyph files.
1336 Font metrics can be supplied in either ASCII (.afm) or binary (.pfm)
1337 format, as can be seen in the examples given below. It is possible to
1338 use .pfa with .pfm and .pfb with .afm if that's what's available. The
1339 ASCII and binary files have the same content, just in different
1340 formats.
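
Since any glyph file may be paired with either kind of metrics file, the choice between "-afmfile" and "-pfmfile" can be made from the file extension. This is only a sketch: psfont_args is a hypothetical helper, not a PDF::Builder function, and it assumes the files carry the conventional extensions.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for psfont() from a glyph
# file and a metrics file, choosing the option name by file extension.
sub psfont_args {
    my ($glyph_file, $metrics_file) = @_;
    die "glyph file must be .pfa or .pfb: $glyph_file\n"
        unless $glyph_file =~ /\.pf[ab]\z/i;
    my $option = $metrics_file =~ /\.afm\z/i ? '-afmfile'
               : $metrics_file =~ /\.pfm\z/i ? '-pfmfile'
               : die "metrics file must be .afm or .pfm: $metrics_file\n";
    return ($glyph_file, $option => $metrics_file);
}

# a binary glyph file combined with ASCII metrics
my @args = psfont_args('/fonts/Synest-FB.pfb', '/fonts/Synest-FB.afm');
print join(' ', @args), "\n";
# with a PDF::Builder object in $pdf:  my $font = $pdf->psfont(@args);
```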
1341
1342 To allow UTF-8 text and extended glyph counts in one font, you should
1343 consider replacing your use of Type1 fonts with TrueType (.ttf) and
1344 OpenType (.otf) fonts. There are tools, such as FontForge, which can do
1345 a fairly good (though not perfect) job of converting your font library
1346 to OTF.
1347
1348 Examples:
1349
1350 $font1 = $pdf->psfont('Times-Book.pfa', -afmfile => 'Times-Book.afm');
1351 $font2 = $pdf->psfont('/fonts/Synest-FB.pfb', -pfmfile => '/fonts/Synest-FB.pfm');
1352
1353 Valid %options are:
1354
1355 -encode
1356 Changes the encoding of the font from its default. Notice that the
1357 encoding (not the entire font's glyph list) is shown in a PDF
1358 object (record), listing 256 glyphs associated with this encoding
1359 (and that are available in this font).
1360
1361 -afmfile
1362 Specifies the location of the ASCII font metrics file (.afm). It
1363 may be used with either an ASCII (.pfa) or binary (.pfb) glyph
1364 file.
1365
1366 -pfmfile
1367 Specifies the location of the binary font metrics file (.pfm). It
1368 may be used with either an ASCII (.pfa) or binary (.pfb) glyph
1369 file.
1370
1371 -dokern
1372 Enables kerning if data is available.
1373
1374 Note: these T1 (Type1) fonts are not shipped with PDF::Builder, but are
1375 expected to be found on the machine with the PDF reader. Most PDF
1376 readers do not install T1 fonts, and it is up to the user of the PDF
1377 reader to install the needed fonts. Unlike TrueType fonts, PS (T1)
1378 fonts are not embedded in the PDF, and must be supplied on the Reader
1379 end.
1380
1381 See also PDF::Builder::Resource::Font::Postscript.
1382
1383 TrueType Fonts
1384
1385 Warning: BaseEncoding is not set by default for TrueType fonts, so text
1386 in the PDF isn't searchable (by the PDF reader) unless a ToUnicode CMap
1387 is included. PDF::Builder includes a ToUnicode CMap by default
1388 (-unicodemap set to 1), but allows it to be disabled (for performance
1389 and file size reasons) by setting -unicodemap to 0. This will produce
1390 non-searchable text, which, besides being annoying to users, may
1391 prevent screen readers and other aids to disabled users from working
1392 correctly!
1393
1394 Examples:
1395
1396 $font1 = $pdf->ttfont('Times.ttf');
1397 $font2 = $pdf->ttfont('Georgia.otf');
1398
1399 Valid %options are:
1400
1401 -encode
1402 Changes the encoding of the font from its default
1403 (WinAnsiEncoding).
1404
1405 Note that for a single byte encoding (e.g., 'latin1'), you are
1406 limited to 256 characters defined for that encoding. 'automap' does
1407 not work with TrueType. If you want more characters than that, use
1408 'utf8' encoding with a UTF-8 encoded text string.
1409
1410 -isocmap
1411 Use the ISO Unicode Map instead of the default MS Unicode Map.
1412
1413 -unicodemap
1414 If 1 (default), output ToUnicode CMap to permit text searches and
1415 screen readers. Set to 0 to save space by not including the
1416 ToUnicode CMap, but text searching and screen reading will not be
1417 possible.
1418
1419 -dokern
1420 Enables kerning if data is available.
1421
1422 -noembed
1423 Disables embedding of the font file. Note that this is potentially
1424 hazardous, as the glyphs provided on the PDF reader machine may not
1425 match what was used on the PDF writer machine (the one running
1426 PDF::Builder)! If you know for sure that all PDF readers will be
1427 using the same TTF or OTF file you're using with PDF::Builder, not
1428 embedding the font may be acceptable, in return for a smaller PDF
1429 file size. Note that the Reader needs to know where to find the
1430 font file -- it can't be in any random place, but typically needs
1431 to be listed in a path that the Reader follows. Otherwise, it will
1432 be unable to render the text!
1433
1434 The only value for the "-noembed" flag currently checked for is 1,
1435 which means to not embed the font file in the PDF. Any other value
1436 currently results in the font file being embedded (by default),
1437 although in the future, other values might be given significance
1438 (such as checking permission bits).
1439
1440 Some additional comments on embedding font file(s) into the PDF:
1441 besides substantially increasing the size of the PDF (even if the
1442 font is subsetted, by default), PDF::Builder does not check the
1443 font file for any flags indicating font licensing issues and
1444 limitations on use. A font foundry may not permit embedding at all,
1445 may permit a subset of the font to be embedded, may permit a full
1446 font to be embedded, and may specify what can be done with an
1447 embedded font (e.g., may or may not be extracted for further use
1448 beyond displaying this one PDF). When you choose to use (and embed)
1449 a font, you should be aware of any such licensing issues.
1450
1451 -nosubset
1452 Disables subsetting of a TTF/OTF font, when embedded. By default,
1453 only the glyphs used by a document are included in the file, and
1454 not the entire font. This can result in a tremendous savings in
1455 PDF file size. If you intend to allow the PDF to be edited by
1456 users, not having the entire font glyph set available may cause
1457 problems, so be aware of that (and consider using "-nosubset => 1").
1458 Setting this flag to any value results in the entire font glyph set
1459 being embedded in the file. It might be a good idea to use only the
1460 value 1, in case other values are assigned roles in the future.
1461
1462 -debug
1463 If set to 1 (default is 0), diagnostic information is output about
1464 the CMap processing.
1465
1466 -usecmf
1467 If set to 1 (default is 0), the first priority is to make use of
1468 one of the four ".cmap" files for CJK fonts. This is the old way of
1469 processing TTF files. If, after all is said and done, a working
1470 internal CMap hasn't been found (for -usecmf=>0), "ttfont()" will
1471 fall back to using a ".cmap" file if possible.
1472
1473 -cmaps
1474 This flag may be set to a string listing the Platform/Encoding
1475 pairs of any internal CMaps to look for in the font file, in the
1476 desired order (highest priority first). If one list (comma and/or
1477 space-separated pairs) is given, it is used for both Windows and
1478 non-Windows platforms (on which PDF::Builder is running, not the
1479 PDF reader's). Two lists, separated by a semicolon ; may be given,
1480 with the first being used for a Windows platform and the second for
1481 non-Windows. The default list is "0/6 3/10 0/4 3/1 0/3; 0/6 0/4
1482 3/10 0/3 3/1". Finally, instead of a P/E list, a string "find_ms"
1483 may be given to tell it to simply call the Font::TTF "find_ms()"
1484 method to find a (preferably Windows) internal CMap. "-cmaps" set
1485 to 'find_ms' would emulate the old way of looking for CMaps. Symbol
1486 fonts (3/0) always use find_ms(), and the new default lookup is (if
1487 ".cmap" isn't used, see "-usecmf") to try to get a match with the
1488 default list for the appropriate OS. If none can be found,
1489 find_ms() is tried, and as a last resort the ".cmap" file is used (if
1490 available), even if "-usecmf" is not 1.
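
To make the "-cmaps" string format concrete, here is a small sketch that picks the priority list for the current platform. It only illustrates the format described above; cmap_priority is a hypothetical name, and PDF::Builder's own parsing may differ in detail. The 'find_ms' string is not handled here.

```perl
use strict;
use warnings;

# Illustrative parser for a -cmaps string: returns the Platform/Encoding
# pairs, highest priority first, for Windows or non-Windows.
sub cmap_priority {
    my ($spec, $is_windows) = @_;
    my @lists = split /;/, $spec;
    # one list serves both cases; with two, the second is for non-Windows
    my $list = (@lists > 1 && !$is_windows) ? $lists[1] : $lists[0];
    $list = '' unless defined $list;
    # pairs are separated by commas and/or spaces
    return grep { m{\A\d+/\d+\z} } split /[,\s]+/, $list;
}

my $default = '0/6 3/10 0/4 3/1 0/3; 0/6 0/4 3/10 0/3 3/1';
my @pairs   = cmap_priority($default, $^O eq 'MSWin32');
print "will try internal CMaps in this order: @pairs\n";
```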
1491
1492 CJK Fonts
1493
1494 Examples:
1495
1496 $font = $pdf->cjkfont('korean');
1497 $font = $pdf->cjkfont('traditional');
1498
1499 Valid %options are:
1500
1501 -encode
1502 Changes the encoding of the font from its default.
1503
1504 Warning: Unlike "ttfont", the font file is not embedded in the output
1505 PDF file. This is evidently behavior left over from the early days of
1506 CJK fonts, where the "Cmap" and "Data" were always external files,
1507 rather than internal tables. If you need a CJK-using PDF file to embed
1508 the font, for portability, you can create a PDF using "cjkfont", and
1509 then use an external utility (e.g., "pdfcairo") to embed the font in
1510 the PDF. It may also be possible to use "ttfont" instead, to produce
1511 the PDF, provided you can deduce the correct font file name from
1512 examining the PDF file (e.g., on my Windows system, the "Ming" font
1513 would be "$font = $pdf->ttfont("C:/Program Files (x86)/Adobe/Acrobat
1514 Reader DC/Resource/CIDFont/AdobeMingStd-Light.otf")"). Of course, the
1515 font file used would have to be ".ttf" or ".otf". It may act a little
1516 differently than "cjkfont" (due to a different CMap), but you should be
1517 able to embed the font file into the PDF.
1518
1519 See also PDF::Builder::Resource::CIDFont::CJKFont.
1520
1521 Synthetic Fonts
1522
1523 Warning: BaseEncoding is not set by default for these fonts, so text in
1524 the PDF isn't searchable (by the PDF reader) unless a ToUnicode CMap is
1525 included. PDF::Builder includes a ToUnicode CMap by default (-unicodemap
1526 set to 1), but allows it to be disabled (for performance and
1527 file size reasons) by setting -unicodemap to 0. This will produce non-
1528 searchable text, which, besides being annoying to users, may prevent
1529 screen readers and other aids to disabled users from working correctly!
1530
1531 Examples:
1532
1533 $cf = $pdf->corefont('Times-Roman', -encode => 'latin1');
1534 $sf = $pdf->synfont($cf, -condense => 0.85); # condensed to 85% width
1535 $sfb = $pdf->synfont($cf, -bold => 1); # embolden by 10em
1536 $sfi = $pdf->synfont($cf, -oblique => -12); # italic at -12 degrees
1537
1538 Valid %options are:
1539
1540 -condense
1541 Character width condense/expand factor (0.1-0.9 = condense, 1 =
1542 normal/default, 1.1+ = expand). It is the multiplier to apply to
1543 the width of each character.
1544
1545 -oblique
1546 Italic angle (+/- degrees, default 0), sets skew of character box.
1547
1548 -bold
1549 Emboldening factor (0.1+, bold = 1, heavy = 2, ...), additional
1550 thickness to draw outline of character (with a heavier line width)
1551 before filling.
1552
1553 -space
1554 Additional character spacing in milliems (0-1000).
1555
1556 -caps
1557 0 for normal text, 1 for small caps. Implemented by asking the
1558 font what the uppercased translation (single character) is for a
1559 given character, and outputting it at 80% height and 88% width
1560 (heavier vertical stems are better looking than a straight 80%
1561 scale).
1562
1563 Note that only lower case letters which appear in the "standard"
1564 font (plane 0 for core fonts and PS fonts) will be small-capped.
1565 This may include eszett (German sharp s), which becomes SS, and
1566 dotless i and j which become I and J respectively. There are many
1567 other accented Latin alphabet letters which may show up in planes 1
1568 and higher. Ligatures (e.g., ij and ffl) do not have uppercase
1569 equivalents, nor does a long s. If you have text which includes
1570 such characters, you may want to consider preprocessing it to
1571 replace them with Latin character expansions (e.g., i+j and f+f+l)
1572 before small-capping.
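
The preprocessing suggested above might look like the following sketch. The expansion table is illustrative and not exhaustive, and expand_for_small_caps is a hypothetical helper, not part of PDF::Builder.

```perl
use strict;
use warnings;

# Characters without a single-character uppercase form in the font,
# mapped to plain-letter expansions (illustrative table only).
my %expansion = (
    "\x{0133}" => 'ij',     # ij ligature
    "\x{017F}" => 's',      # long s
    "\x{FB00}" => 'ff',
    "\x{FB01}" => 'fi',
    "\x{FB02}" => 'fl',
    "\x{FB03}" => 'ffi',
    "\x{FB04}" => 'ffl',
);

# Replace each listed character with its expansion, so that every letter
# handed to a -caps synthetic font can be small-capped.
sub expand_for_small_caps {
    my ($text) = @_;
    $text =~ s/([\x{0133}\x{017F}\x{FB00}-\x{FB04}])/$expansion{$1}/g;
    return $text;
}

my $plain = expand_for_small_caps("wa\x{FB04}es");
print "$plain\n";
```

The resulting string would then be output with the small caps font in the usual way.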
1573
1574 Note that CJK fonts (created with the "cjkfont" method) do not work
1575 properly with "synfont". This is due to a different internal structure
1576 of the CJK fonts, as compared to corefont, ttfont, and psfont base
1577 fonts. If you require a synthesized (modified) CJK font, you might try
1578 finding the TTF or OTF original, using "ttfont" to create the base font,
1579 and running "synfont" against that, in the manner described for
1580 embedding "CJK Fonts".
1581
1582 See also PDF::Builder::Resource::Font::SynFont.
1583
1584 IMAGE METHODS
1585 This is additional information on enhanced libraries available for TIFF
1586 and PNG images. See specific information listings for GD, GIF, JPEG,
1587 and PNM image formats. In addition, see "examples/Content.pl" for an
1588 example of placing an image on a page, as well as using one in a "Form".
1589
1590 Why is my image flipped or rotated?
1591
1592 Something not uncommonly seen when using JPEG photos in a PDF is that
1593 the images will be rotated and/or mirrored (flipped). This may happen
1594 when using TIFF images too. What happens is that the camera stores an
1595 image just as it comes off the CCD sensor, regardless of the camera
1596 orientation, and does not rotate it to the correct orientation! It does
1597 store a separate "orientation" flag to suggest how the image might be
1598 corrected, but not all image processing obeys this flag (PDF::Builder
1599 does not). For example, if you take a "portrait" (tall) photo of a
1600 tree (with the phone held vertically), and then use it in a PDF, the
1601 tree may appear to have been cut down (it appears in landscape mode)!
1602
1603 I have found some code that should allow the "image_jpeg" or "image"
1604 routine to auto-rotate to (supposedly) the correct orientation, by
1605 looking for the Exif metadata "Orientation" tag in the file. However,
1606 three problems arise: 1) if a photo has been edited, and rotated or
1607 flipped in the process, there is no guarantee that the Orientation tag
1608 has been corrected. 2) more than one Orientation tag may exist (e.g.,
1609 in the binary APP1/Exif header, and in XML data), and they may not
1610 agree with each other -- which should be used? 3) the code would need
1611 to uncompress the raster data, swap and/or transpose rows and/or
1612 columns, and recompress the raster data for inclusion into the PDF.
1613 This is costly and error-prone. In any case, the user would need to be
1614 able to override any auto-rotate function.
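
For reference, the eight Exif Orientation values and the correction each implies can be tabulated as in the sketch below. Reading the tag from a file is not shown (it needs an Exif reader, which PDF::Builder does not provide), and orientation_warning is a hypothetical helper illustrating the kind of check a caller might make.

```perl
use strict;
use warnings;

# Exif Orientation values: the name says where the stored row 0 / column 0
# end up visually; 'fix' is the correction needed for upright display.
my %orientation = (
    1 => { name => 'Top-left',     fix => 'none (normal)' },
    2 => { name => 'Top-right',    fix => 'mirror horizontal' },
    3 => { name => 'Bottom-right', fix => 'rotate 180' },
    4 => { name => 'Bottom-left',  fix => 'mirror vertical' },
    5 => { name => 'Left-top',     fix => 'mirror horizontal and rotate 270 CW' },
    6 => { name => 'Right-top',    fix => 'rotate 90 CW' },
    7 => { name => 'Right-bottom', fix => 'mirror horizontal and rotate 90 CW' },
    8 => { name => 'Left-bottom',  fix => 'rotate 270 CW' },
);

# Return an empty string for a normally oriented image, else a message.
sub orientation_warning {
    my ($value) = @_;
    my $entry = $orientation{$value}
        or return "unknown Orientation value '$value'";
    return '' if $value == 1;
    return "image is not normally oriented ($entry->{name}); "
         . "suggested fix: $entry->{fix}";
}

print orientation_warning(3), "\n";
```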
1615
1616 For the time being, PDF::Builder will simply leave it up to the user of
1617 the library to take care of rotating and/or flipping an image which
1618 displays incorrectly. It is possible that we will consider adding some
1619 sort of query or warning that the image appears to not be "normally"
1620 oriented (Orientation value 1 or "Top-left"), according to the
1621 Orientation flag. You can consider either (re-)saving the photo in an
1622 editor such as Photoshop or GIMP, or using PDF::Builder code similar to
1623 the following (for images rotated 180 degrees):
1624
1625 $pW = 612; $pH = 792; # page dimensions (US Letter)
1626 my $img = $pdf->image_jpeg("AliceLake.jpeg");
1627 # raw size WxH 4032x3024, scaled down to 504x378
1628 $sW = 4032/8; $sH = 3024/8;
1629 # intent is to center on US Letter sized page (LL at 54,207)
1630 # Orientation flag on this image is 3 (rotated 180 degrees).
1631 # if naively displayed (just $gfx->image call), it will be upside down
1632
1633 $gfx->save();
1634
1635 ## method 0: simple display, is rotated 180 degrees!
1636 #$gfx->image($img, ($pW-$sW)/2,($pH-$sH)/2, $sW,$sH);
1637
1638 ## method 1: translate, then rotate
1639 #$gfx->translate($pW,$pH); # to new origin (media UR corner)
1640 #$gfx->rotate(180); # rotate around new origin
1641 #$gfx->image($img, ($pW-$sW)/2,($pH-$sH)/2, $sW,$sH);
1642 # image's UR corner, not LL
1643
1644 # method 2: rotate, then translate
1645 $gfx->rotate(180); # rotate around current origin
1646 $gfx->translate(-$sW,-$sH); # translate in rotated coordinates
1647 $gfx->image($img, -($pW-$sW)/2,-($pH-$sH)/2, $sW,$sH);
1648 # image's UR corner, not LL
1649
1650 ## method 3: flip (mirror) twice
1651 #$scale = 1; # not rescaling here
1652 #$size_page = $pH/$scale;
1653 #$invScale = 1.0/$scale;
1654 #$gfx->add("-$invScale 0 0 -$invScale 0 $size_page cm");
1655 #$gfx->image($img, -($pW-$sW)/2-$sW,($pH-$sH)/2, $sW,$sH);
1656
1657 $gfx->restore();
1658
1659 If your image is also mirrored (flipped about an axis), simple rotation
1660 will not suffice. You could do something with a reversal of the
1661 coordinate system, as in "method 3" above (see "Advanced Methods" in
1662 PDF::Builder::Content). To mirror only left/right, the second $invScale
1663 would be positive; to mirror only top/bottom, the first would be
1664 positive. If all else fails, you could save a mirrored copy in a photo
1665 editor. 90 or 270 degree rotations will require a "rotate" call,
1666 possibly with "cm" usage to reverse mirroring. Incidentally, do not
1667 confuse this issue with the coordinate flipping performed by some
1668 Chrome browsers when printing a page to PDF.
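
When experimenting with "cm" matrices, it can be useful to work out on paper (or in a few lines of Perl) where the image corners will land before rendering anything. The sketch below applies the standard PDF transformation arithmetic to the "method 3" matrix from the example above. The left/right mirror matrix at the end is an assumption chosen for illustration (a mirror about the page's vertical centerline) and does not come from PDF::Builder.

```perl
use strict;
use warnings;

# Map a point through a PDF "cm" matrix [a b c d e f]:
#   x' = a*x + c*y + e      y' = b*x + d*y + f
sub transform_point {
    my ($matrix, $x, $y) = @_;
    my ($ma, $mb, $mc, $md, $me, $mf) = @$matrix;
    return ($ma * $x + $mc * $y + $me, $mb * $x + $md * $y + $mf);
}

my ($pW, $pH) = (612, 792);          # US Letter page
my ($sW, $sH) = (4032/8, 3024/8);    # scaled image, 504 x 378

# "method 3": both axes reversed, origin moved up by the page height
my @flip_both = (-1, 0, 0, -1, 0, $pH);
my ($x, $y)   = (-($pW-$sW)/2 - $sW, ($pH-$sH)/2);   # as passed to image()
my @corner1   = transform_point(\@flip_both, $x, $y);
my @corner2   = transform_point(\@flip_both, $x + $sW, $y + $sH);
print "flip both:  (@corner1) to (@corner2)\n";

# left/right mirror only, about the page's vertical centerline (assumed)
my @mirror_lr = (-1, 0, 0, 1, $pW, 0);
my @corner3   = transform_point(\@mirror_lr, ($pW-$sW)/2, ($pH-$sH)/2);
my @corner4   = transform_point(\@mirror_lr, ($pW+$sW)/2, ($pH+$sH)/2);
print "mirror l/r: (@corner3) to (@corner4)\n";
```

In both cases the transformed corners span 54..558 horizontally and 207..585 vertically, that is, the image stays centered on the page while its contents are turned or mirrored.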
1669
1670 Note that TIFF images may have the same rotation/mirroring problems as
1671 JPEG, which is not surprising, as the Exif format was lifted from TIFF
1672 for use in JPEG. The cure will be similar to JPEG's.
1673
1674 TIFF Images
1675
1676 Note that the Graphics::TIFF support library does not currently permit
1677 a filehandle for $file.
1678
1679 PDF::Builder will use the Graphics::TIFF support library for TIFF
1680 functions, if it is available, unless explicitly told not to. Your code
1681 can test whether Graphics::TIFF is available by examining
1682 "$tiff->usesLib()" or "$pdf->LA_GT()".
1683
1684 = -1
1685 Graphics::TIFF is installed, but your code has specified
1686 "-nouseGT", to not use it. The old, pure Perl, code (buggy!) will
1687 be used instead, as if Graphics::TIFF was not installed.
1688
1689 = 0 Graphics::TIFF is not installed. Not all systems are able to
1690 successfully install this package, as it requires libtiff.a.
1691
1692 = 1 Graphics::TIFF is installed and is being used.
1693
1694 Options:
1695
1696 -nouseGT => 1
1697 Do not use the Graphics::TIFF library, even if it's available.
1698 Normally you would want to use this library, but there may be cases
1699 where you don't, such as when you want to use a file handle instead
1700 of a name.
1701
1702 -silent => 1
1703 Do not give the message that Graphics::TIFF is not installed. This
1704 message will be given only once, but you may want to suppress it,
1705 such as during t-tests.
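
Since Graphics::TIFF cannot read from a filehandle, code that accepts either a file name or a filehandle may want to add "-nouseGT" automatically. The following is a sketch under that assumption; tiff_options is a hypothetical helper, not a PDF::Builder method.

```perl
use strict;
use warnings;

# Hypothetical helper: build image_tiff() options for a given source,
# which is either a file name (string) or an open filehandle.
sub tiff_options {
    my ($source, %opts) = @_;
    # a reference (e.g., a lexical filehandle) or a bare glob is treated
    # as a filehandle; fall back to the pure Perl TIFF code for those
    if (ref($source) || ref(\$source) eq 'GLOB') {
        $opts{'-nouseGT'} = 1;
    }
    return %opts;
}

my %by_name   = tiff_options('scan.tif', -silent => 1);
my %by_handle = tiff_options(\*STDIN,    -silent => 1);
print "by name:   ", join(' ', sort keys %by_name), "\n";
print "by handle: ", join(' ', sort keys %by_handle), "\n";
# with a PDF::Builder object in $pdf:
#   my $tiff = $pdf->image_tiff($source, tiff_options($source, -silent => 1));
```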
1706
1707 PNG Images
1708
1709 PDF::Builder will use the Image::PNG::Libpng support library for PNG
1710 functions, if it is available, unless explicitly told not to. Your code
1711 can test whether Image::PNG::Libpng is available by examining
1712 "$png->usesLib()" or "$pdf->LA_IPL()".
1713
1714 = -1
1715 Image::PNG::Libpng is installed, but your code has specified
1716 "-nouseIPL", to not use it. The old, pure Perl, code (slower and
1717 less capable) will be used instead, as if Image::PNG::Libpng was
1718 not installed.
1719
1720 = 0 Image::PNG::Libpng is not installed. Not all systems are able to
1721 successfully install this package, as it requires libpng.a.
1722
1723 = 1 Image::PNG::Libpng is installed and is being used.
1724
1725 Options:
1726
1727 -nouseIPL => 1
1728 Do not use the Image::PNG::Libpng library, even if it's available.
1729 Normally you would want to use this library, when available, but
1730 there may be cases where you don't.
1731
1732 -silent => 1
1733 Do not give the message that Image::PNG::Libpng is not installed.
1734 This message will be given only once, but you may want to suppress
1735 it, such as during t-tests.
1736
1737 -notrans => 1
1738 No transparency -- ignore tRNS chunk if provided, ignore Alpha
1739 channel if provided.
1740
1741 USING SHAPER (HarfBuzz::Shaper library)
1742 # if HarfBuzz::Shaper is not installed, either bail out, or try to
1743 # use regular TTF calls instead
1744 my $rc;
1745 $rc = eval {
1746 require HarfBuzz::Shaper;
1747 1;
1748 };
1749 if (!defined $rc) { $rc = 0; }
1750 if ($rc == 0) {
1751 # bail out in some manner
1752 } else {
1753 # can use Shaper
1754 }
1755
1756 my $fontfile = '/WINDOWS/Fonts/times.ttf'; # used by both Shaper and textHS
1757 my $fontsize = 15; # used by both Shaper and textHS
1758 my $font = $pdf->ttfont($fontfile);
1759 $text->font($font, $fontsize);
1760
1761 my $hb = HarfBuzz::Shaper->new(); # only need to set up once
1762 my %settings; # for textHS(), not Shaper
1763 $settings{'dump'} = 1; # see the diagnostics
1764 $settings{'script'} = 'Latn';
1765 $settings{'dir'} = 'L'; # LTR
1766 $settings{'features'} = []; # required (array ref of feature strings)
1767
1768 # -- set language (override automatic setting)
1769 #$settings{'language'} = 'en';
1770 #$hb->set_language( 'en_US' );
1771 # -- turn OFF ligatures
1772 #push @{ $settings{'features'} }, '-liga';
1773 #$hb->add_features( '-liga' );
1774 # -- turn OFF kerning
1775 #push @{ $settings{'features'} }, '-kern';
1776 #$hb->add_features( '-kern' );
1777 $hb->set_font($fontfile);
1778 $hb->set_size($fontsize);
1779 $hb->set_text("Let's eat waffles in the field for brunch.");
1780 # expect ffl and fi ligatures, and perhaps some kerning
1781
1782 my $info = $hb->shaper();
1783 $text->textHS($info, \%settings); # -strikethru, -underline allowed
1784
1785 The package HarfBuzz::Shaper may be optionally installed in order to
1786 use the text-shaping capabilities of the HarfBuzz library. These
1787 include kerning and ligatures in Western scripts (such as the Latin
1788 alphabet). More complex scripts can be handled, such as Arabic family
1789 and Indic scripts, where multiple forms of a character may be
1790 automatically selected, characters may be reordered, and other
1791 modifications made. The examples/HarfBuzz.pl script gives some examples
1792 of what may be done.
1793
1794 Keep in mind that HarfBuzz works only with TrueType (.ttf) and OpenType
1795 (.otf) font files. It will not work with PostScript (Type1), core,
1796 bitmapped, or CJK fonts. Not all .ttf fonts have the instructions
1797 necessary to guide HarfBuzz, but most proper .otf fonts do. In other
1798 words, there are no guarantees that a particular font file will work
1799 with Shaper!
1800
1801 The basic idea is to break up text into "chunks" which are of the same
1802 script (alphabet), language, direction, font face, font size, and
1803 variant (italic, bold, etc.). These could range from a single character
1804 to paragraph-length strings of text. These are fed to HarfBuzz::Shaper,
1805 along with flags, the font file to be used, and other supporting
1806 information, to create an array of output glyphs. Each element is a
1807 hash describing the glyph to be output, including its name (if
1808 available), its glyph ID (number) in the selected font, its x and y
1809 displacement (usually 0), and its "advance" x and y values, all in
1810 points. For horizontal languages (LTR and RTL), the y advance is
1811 normally 0 and the x advance is the font's character width, less any
1812 kerning amount.
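
As an illustration of the structure just described, the sketch below walks a glyph array and totals the x advances. The sample data is invented (glyph names, IDs, and metrics do not come from a real font), while the hash keys follow the HarfBuzz::Shaper result layout. For real use, "advancewidthHS()" does this job.

```perl
use strict;
use warnings;

# Invented sample: the three characters "fit" shaped into two glyphs,
# an fi ligature followed by t.
my $text = 'fit';
my $info = [
    { name => 'fi', g => 192, dx => 0, dy => 0, ax => 8.25, ay => 0 },
    { name => 't',  g => 87,  dx => 0, dy => 0, ax => 4.25, ay => 0 },
];

# Total horizontal advance of a chunk, in points.
sub chunk_width {
    my ($glyphs) = @_;
    my $width = 0;
    $width += $_->{'ax'} for @$glyphs;
    return $width;
}

printf "%d characters became %d glyphs, %.2f points wide\n",
    length($text), scalar(@$info), chunk_width($info);
```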
1813
1814 Shaper will attempt to figure out the script used and the text
1815 direction, based on the Unicode range, and to make a reasonable guess at
1816 the language used. The language can be overridden, but currently the
1817 script and text direction cannot.
1818
1819 An important note: the number of glyphs (array elements) may not be
1820 equal to the number of Unicode points (characters) given in the chunk's
1821 text string! Sometimes a character will be decomposed into several
1822 pieces (multiple glyphs); sometimes multiple characters may be combined
1823 into a single ligature glyph; and characters may be reordered
1824 (especially in Indic and Southeast Asian languages). As well, for
1825 Right-to-Left (bidirectional) scripts such as Hebrew and Arabic
1826 families, the text is output in Left-to-Right order (reversed from the
1827 input).
1828
1829 With due care, a Shaper array can be manipulated in code. The elements
1830 are more or less independent of each other, so elements can be
1831 modified, rearranged, inserted, or deleted. You might adjust the
1832 position of a glyph with 'dx' and 'dy' hash elements. The 'ax' value
1833 should be left alone, so that the wrong kerning isn't calculated, but
1834 you might need to adjust the "advance x" value by means of one of the
1835 following:
1836
1837 axs is a value to be substituted for 'ax' (points)
1838 axsp is a substituted value (percentage) of the original 'ax'
1839 axr reduces 'ax' by the value (points). If negative, increase 'ax'
1840 axrp reduces 'ax' by the given percentage. Again, negative increases
1841 'ax'
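
The arithmetic implied by these four keys can be sketched as follows. This only illustrates the descriptions above and is not PDF::Builder's code; effective_ax is a hypothetical name, and the order of precedence when more than one key is present is an assumption.

```perl
use strict;
use warnings;

# Effective x advance (points) of one glyph hash, after any adjustment key.
# Assumed precedence if several keys are given: axs, axsp, axr, axrp.
sub effective_ax {
    my ($glyph) = @_;
    my $ax = $glyph->{'ax'};
    return $glyph->{'axs'}                    if defined $glyph->{'axs'};
    return $ax * $glyph->{'axsp'} / 100       if defined $glyph->{'axsp'};
    return $ax - $glyph->{'axr'}              if defined $glyph->{'axr'};
    return $ax * (1 - $glyph->{'axrp'} / 100) if defined $glyph->{'axrp'};
    return $ax;                               # no adjustment requested
}

# tighten one glyph by a quarter of its advance; 'ax' itself is untouched
my %glyph = (name => 'T', g => 55, dx => 0, dy => 0, ax => 10, ay => 0);
$glyph{'axrp'} = 25;
printf "advance %.2f instead of %.2f\n", effective_ax(\%glyph), $glyph{'ax'};
```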
1842
1843 Caution: a given character's glyph ID is not necessarily going to be
1844 the same between any two fonts! For example, an ASCII space (U+0020)
1845 might be "<0001>" in one font, and "<0003>" in another font (even one
1846 closely related!). A U+00A0 required blank (non-breaking space) may be
1847 output as a regular ASCII space U+0020. Take care if you need to find a
1848 particular glyph in the array, especially if the number of elements
1849 doesn't match. Consider making a text string of "marker" characters
1850 (space, nbsp, hyphen, soft hyphen, etc.) and processing it through
1851 HarfBuzz::Shaper to get the corresponding glyph numbers. You may have
1852 to count spaces, say, to see where you could break a glyph array to fit
1853 a line.
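
One way to break a glyph array at a space is sketched below. It assumes left-to-right text, that the space glyph's ID has already been found (for example, by shaping a marker string as suggested above), and that the glyph hashes use the 'g' and 'ax' keys. split_at_space is a hypothetical helper, and the sample glyph IDs and widths are invented.

```perl
use strict;
use warnings;

# Split a glyph array at the last space that lets the first part fit in
# $max_width points. Returns two array refs: the line and the remainder
# (the space at the break is dropped).
sub split_at_space {
    my ($glyphs, $space_gid, $max_width) = @_;
    my $width = 0;
    my $break;                       # index of last usable space glyph
    for my $i (0 .. $#$glyphs) {
        my $glyph = $glyphs->[$i];
        # a space is a usable break if everything before it fits
        $break = $i if $glyph->{'g'} == $space_gid && $width <= $max_width;
        $width += $glyph->{'ax'};
        last if $width > $max_width;
    }
    # the whole chunk fits, or there is nowhere to break: return it whole
    return ([@$glyphs], []) if $width <= $max_width || !defined $break;
    return ([ @{$glyphs}[0 .. $break - 1] ],
            [ @{$glyphs}[$break + 1 .. $#$glyphs] ]);
}

# invented data for "ab cd ef": letters 5 points wide, space (ID 1) 2 points
my @ids    = (10, 11, 1, 12, 13, 1, 14, 15);
my @glyphs = map { { g => $_, ax => ($_ == 1 ? 2 : 5) } } @ids;
my ($line, $rest) = split_at_space(\@glyphs, 1, 25);
printf "%d glyphs on this line, %d carried over\n",
    scalar(@$line), scalar(@$rest);
```

Each returned piece could then be passed to "textHS()" separately, one per output line.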
1854
1855 The "advancewidthHS()" method uses the same inputs as does "textHS()".
1856 Like "advancewidth()", it returns the chunk length in points. Unlike
1857 "advancewidth()", you cannot override the glyph array's font, font
1858 size, etc.
1859
1860 Once you have your (possibly modified) array of glyphs, you feed it to
1861 the "textHS()" method to render it to the page. Remember that this
1862 method handles only a single line of text; it does not do line
1863 splitting or fitting -- that you currently need to do manually. For
1864 Western scripts (e.g., Latin), that might not be too difficult, but for
1865 other scripts that involve extensive modification of the raw
1866 characters, it may be quite difficult to split words, but you still may
1867 be able to split at inter-word spaces.
1868
1869 A useful, but not exhaustive, set of functions is supported by
1870 "textHS()". Support includes direction setting (top-to-bottom and
1871 bottom-to-top directions, e.g., for Far Eastern languages in
1872 traditional orientation), and explicit script names and language
1873 (depending on what support HarfBuzz itself gives). Not yet supported
1874 are features such as discretionary ligatures and manual selection of
1875 glyphs (e.g., swashes and alternate forms).
1876
1877 Currently, "textHS()" can only handle a single text string. We are
1878 looking at how fitting to a line length (splitting up an array) could
1879 be done, as well as how words might be split on hard and soft hyphens.
1880 At some point, full paragraph and page shaping could be possible.
1881
1882
1883
1884perl v5.34.0 2021-07-22 PDF::Builder::Docs(3)