PDF::Builder::Docs(3) User Contributed Perl Documentation PDF::Builder::Docs(3)

NAME

       PDF::Builder::Docs - additional documentation for the PDF::Builder module

SOME SPECIAL NOTES

   Software Development Kit
       There are four levels of involvement with PDF::Builder. Depending on
       what you want to do, different kinds of installs are recommended.

       1. Simply installing PDF::Builder as a prerequisite for running some
       other package. All you need to do is install the CPAN package for
       PDF::Builder, and it will load the .pm files into your Perl library. If
       the other package lists PDF::Builder as a prerequisite, its installer
       may download and install PDF::Builder automatically.
18
19       2. You want to write a Perl program that uses PDF::Builder functions.
20       In addition to installing PDF::Builder from CPAN, you will want
21       documentation on it. Obtain a copy of the product from GitHub
22       (https://github.com/PhilterPaper/Perl-PDF-Builder) or as a gzipped tar
23       file from CPAN.  This includes a utility to build (from POD) a library
24       of HTML documents, as well as examples (examples/ directory) and
25       contributed sample programs (contrib/ directory).
26
27       3. You want to modify PDF::Builder files. In addition to the CPAN and
28       GitHub distributions, you may choose to keep a local Git repository for
29       tracking your changes. Depending on whether or not your PDF::Builder
30       copy is being used for production purposes, you may want to do your
31       editing and testing in the Perl library installation (live) or in a
32       different place. The "t" tests (t/ directory) and examples provide good
33       regression tests to ensure that you haven't broken anything. If you do
34       your editing on the live code, don't forget when done to copy the
35       changes back into the master version you keep!
36
37       4. You want to contribute to the development of PDF::Builder. You will
38       need a local Git repository (and a GitHub account), so that when you've
39       got it all done, you can issue a "Pull Request" to bring it to our
40       attention. We can't guarantee that your work will be incorporated into
41       the project, but at least we will look at it. From time to time, a new
42       CPAN version will be issued.
43
44       If you want to make substantial changes for public use, and can't come
45       to a meeting of minds with us, you can even start your own GitHub
46       project and register a new CPAN project (that's what we did, forking
47       PDF::API2). Please don't just assume that we don't want your changes --
48       at least propose what you want to do in writing, so we can consider it.
49       We're always looking for people to help out and expand PDF::Builder.
50
51   Optional Libraries
52       PDF::Builder can make use of some optional libraries, which are not
53       required for a successful installation. If you want improved speed and
54       capabilities for certain functions, you may want to install and use
55       these libraries:
56
       * Graphics::TIFF -- PDF::Builder inherited a rather slow, buggy, and
       limited TIFF image library from PDF::API2. If Graphics::TIFF (available
       on CPAN, uses libtiff.a) is installed, PDF::Builder will use that
       instead, unless you specify that it is to use the old, pure Perl
       library. The only time you might want to fall back to the old library
       is when you need to pass an open filehandle to "image_tiff" instead of
       a file name. See resolved bug reports RT 84665 and RT 118047, as well
       as "image_tiff", for more information.
65
66       * Image::PNG::Libpng -- PDF::Builder inherited a rather slow and buggy
67       pure Perl PNG image library from PDF::API2. If Image::PNG::Libpng
68       (available on CPAN, uses libpng.a) is installed, PDF::Builder will use
69       that instead, unless you specify that it is to use the old, pure Perl
70       library. Using the new library will give you improved speed, the
71       ability to use 16 bit samples, and the ability to read interlaced PNG
72       files. See resolved bug report RT 124349, as well as "image_png", for
73       more information.
74
75       * HarfBuzz::Shaper -- This library enables PDF::Builder to handle
76       complex scripts (Arabic, Devanagari, etc.) as well as non-LTR writing
77       systems. It is also useful for Latin and other simple scripts, for
78       ligatures and improved kerning. HarfBuzz::Shaper is based on a set of
79       HarfBuzz libraries, which it will attempt to build if they are not
80       found. See "textHS" for more information.
81
       Note that the installation process will attempt to install these
       libraries automatically. If you don't wish to use one or more of them,
       you are free to uninstall them. If one or more failed to install,
       there is no need to panic -- you simply won't be able to use some
       advanced features, unless you are able to manually install the
       modules (e.g., with "cpan install").
88
89   Strings (Character Text)
       Perl, and hence PDF::Builder, use strings that support the full range
       of Unicode characters. When importing strings into a Perl program, for
       example by reading text from a file, you must be aware of what their
       character encoding is. Text in a single-byte encoding (the default is
       'latin1') is represented as bytes of value 0x00 through 0xFF (0..255),
       and the same bytes will produce different results under different
       encodings if you do something that depends on the encoding, such as
       sorting, searching, or comparing any two non-ASCII characters. This
       also applies to any characters (text) hard coded into the Perl
       program.
99
100       You can always decode the text from external encoding (ASCII, UTF-8,
101       Latin-3, etc.) into the Perl (internal) UTF-8 multibyte encoding. This
102       uses one to four bytes to represent each character. See pragma "utf8"
103       and module "Encode" for details about decoding text. Note that only
104       TrueType fonts ("ttfont") can make direct use of UTF-8-encoded text.
105       Other font types (core, T1, etc.) can only use single-byte encoded
106       text. If your text is ASCII, Latin-1, or CP-1252, you can just leave
107       the Perl strings as the default single-byte encoding.
108
109       Then, there is the matter of encoding the output to match up with
110       available font character sets. You're not actually translating the text
111       on output, but are telling the output system (and Reader) what encoding
112       the output byte stream represents, and what character glyphs they
113       should generate.
114
115       If you confine your text to plain ASCII (0x00 .. 0x7F byte values) or
116       even Latin-1 or CP-1252 (0x00 .. 0xFF byte values), you can use default
117       (non-UTF-8) Perl strings and use the default output encoding
118       (WinAnsiEncoding), which is more-or-less Windows CP-1252 (a superset in
119       turn, of ISO-8859-1 Latin-1). If your text uses any other characters,
120       you will need to be aware of what encoding your text strings are (in
121       the Perl string and for declaring output glyph generation).  See "Core
122       Fonts", "PS Fonts" and "TrueType Fonts" in "FONT METHODS" for
123       additional information.
124
125       Some Internal Details
126
127       Some of the following may be a bit scary or confusing to beginners, so
128       don't be afraid to skip over it until you're ready for it...
129
130       Perl (and PDF::Builder) internally use strings which are either single-
131       byte (ISO-8859-1/Latin-1) or multibyte UTF-8 encoded (there is an
132       internal flag marking the string as UTF-8 or not).  If you work
133       strictly in ASCII or Latin-1 or CP-1252 (each a superset of the
134       previous), you should be OK in not doing anything special about your
135       string encoding. You can just use the default Perl single byte strings
136       (internally marked as not UTF-8) and the default output encoding
137       (WinAnsiEncoding).
138
139       If you intend to use input from a variety of sources, you should
140       consider decoding (converting) your text to UTF-8, which will provide
141       an internally consistent representation (and your Perl code itself
142       should be saved in UTF-8, in case you want to use any hard coded non-
143       ASCII characters). In any string, non-ASCII characters (0x80 or higher)
144       would be converted to the Perl UTF-8 internal representation, via
145       "$string = Encode::decode(MY_ENCODING, $input);".  "MY_ENCODING" would
146       be a string like 'latin1', 'cp-1252', 'utf8', etc. Similar capabilities
147       are available for declaring a file to be in a certain encoding.
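
       The decoding step described above can be sketched as follows. This is
       a minimal illustration using only the core "Encode" module; the sample
       bytes and the in-memory file are assumptions made for the example, not
       part of PDF::Builder.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Bytes as they might arrive from a file or socket: "café" encoded as UTF-8.
my $bytes = "caf\xC3\xA9";

# Decode the external bytes into Perl's internal character representation.
my $string = decode('UTF-8', $bytes);

# The byte string has 5 bytes; the decoded string has 4 characters.
printf "bytes: %d, characters: %d\n", length($bytes), length($string);

# Alternatively, declare the encoding when opening a file, so that each
# line read is decoded automatically. An in-memory file stands in for a
# real one here.
open(my $fh, '<:encoding(UTF-8)', \$bytes) or die "cannot open: $!";
my $line = <$fh>;
close($fh);

printf "characters read through the layer: %d\n", length($line);
```

       Either route should leave you with a character string that can be
       passed to text output methods when a TrueType font is in use.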
148
       Be aware that if you use UTF-8 encoding for your text, only TrueType
       font output ("ttfont") can handle it directly. Corefont and Type1
       output require that the text be converted back into a single-byte
       encoding (using "Encode::encode"), which may need to be declared with
       "encode" (for "corefont" or "psfont"). If you have any characters not
       found in the selected single-byte encoding (but found in the font
       itself), you will need to use "automap" to break up the font glyphs
       into planes of 256 characters, map such characters to 0x00 .. 0xFF in
       the appropriate plane, and switch between font planes as necessary.
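
       As a rough sketch of the conversion back to a single-byte encoding,
       the following uses only the core "Encode" module. The sample string is
       an assumption for illustration; it shows one way to detect characters
       that the chosen encoding cannot represent before handing the bytes to
       a core or Type1 font.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A character string: "naïve €5" (U+00EF and U+20AC are non-ASCII).
my $string = "na\x{EF}ve \x{20AC}5";

# CP-1252 contains the euro sign, so every character maps to one byte.
my $cp1252 = encode('cp1252', $string);
printf "cp1252 bytes: %s\n",
    join ' ', map { sprintf '%02X', ord } split //, $cp1252;

# Latin-1 has no euro sign. With FB_CROAK, encode() dies instead of
# silently substituting a character; LEAVE_SRC keeps $string intact.
my $latin1 = eval {
    encode('iso-8859-1', $string, Encode::FB_CROAK | Encode::LEAVE_SRC);
};
print defined $latin1
    ? "fits in Latin-1\n"
    : "not representable in Latin-1; choose another encoding\n";
```

       When the check fails, the options described above apply: pick a
       single-byte encoding that does contain the character, use "automap",
       or switch to a TrueType font.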
159
160       Core and Type1 fonts (output) use the byte values in the string
161       (single-byte encoding only!) and provide a byte-to-glyph mapping record
162       for each plane.  TrueType outputs a group of four hexadecimal digits
163       representing the "CId" (character ID) of each character. The CId does
164       not correspond to either the single-byte or UTF-8 internal
165       representations of the characters.
166
167       The bottom line is that you need to know what the internal
168       representation of your text is, so that the output routines can tell
169       the PDF reader about it (via the PDF file). The text will not be
170       translated upon output, but the PDF reader needs to know what the
171       encoding in use is, so it knows what glyph to associate with each byte
172       (or byte sequence).
173
174       Note that some operating systems and Perl flavors are reputed to be
175       strict about encoding names. For example, latin1 (an alias) may be
176       rejected as invalid, while iso-8859-1 (a canonical value) will work.
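
       If you are unsure whether an encoding name will be accepted, one
       possible approach is to ask the core "Encode" module for its canonical
       name first. The helper name "canonical_encoding" below is hypothetical
       and exists only for this sketch.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Return the canonical name Encode uses for an encoding, or undef if the
# name is not recognized on this installation.
sub canonical_encoding {
    my ($name) = @_;
    my $enc = find_encoding($name);
    return $enc ? $enc->name : undef;
}

for my $name ('latin1', 'iso-8859-1', 'cp1252', 'no-such-encoding') {
    my $canonical = canonical_encoding($name);
    printf "%-18s => %s\n", $name,
        defined $canonical ? $canonical : '(not recognized)';
}
```

       Passing the canonical name on, rather than an alias, may sidestep the
       strictness described above.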
177
178       By the way, it is recommended that you be using at least Perl 5.10 if
179       you are going to be using any non-ASCII characters. Perl 5.8 may be a
180       little unpredictable in handling such text.
181
182   Rendering Order
       For better or worse, for compatibility purposes, PDF::Builder continues
       the same rendering model as used by PDF::API2 (and possibly its
       predecessors). That is, all graphics for one graphics object are put
       into one record, and all text output for one text object goes into
       another record. Whichever object is declared first is output first.
       This can lead to unexpected results, where items are rendered in
       (apparently) the wrong order. That is, text and graphics items are not
       necessarily output (rendered) in the same order as they were created in
       code. Two items in the same object (e.g., $text) will be rendered in
       the same order as they were coded, but items from different objects may
       not be rendered in the expected order. The following example (source
       code and annotated PDF excerpts) will hopefully illustrate the issue:
195
196        use strict;
197        use warnings;
198        use PDF::Builder;
199
200        # demonstrate text and graphics object order
201        #
202        my $fname = "objorder";
203
204        my $paper_size = "Letter";
205
206        # see the text and graphics stream contents
207        my $pdf = PDF::Builder->new(compress => 'none');
208        $pdf->mediabox($paper_size);
209        my $page = $pdf->page();
210        # adjust path for your operating system
211        my $fontTR = $pdf->ttfont('C:\\Windows\\Fonts\\timesbd.ttf');
212
       For the first group, you might expect the "under" line to be output,
       then the filled circle (disc) partly covering it, then the "over" line
       covering the disc, and finally a filled rectangle (bar) over both
       lines. What actually happened is that the $grfx1 graphics object was
       declared first, so everything in that object (the disc and bar) is
       output first, and the text object $text1 (both lines) comes afterwards.
       The result is that the text lines are on top of the graphics drawings.
220
221        # ----------------------------
222        # 1. text, orange ball over, text over, bar over
223
224        my $grfx1 = $page->gfx();
225        my $text1 = $page->text();
226        $text1->font($fontTR, 20);  # 20 pt Times Roman bold
227
228        $text1->fillcolor('black');
229        $grfx1->strokecolor('blue');
230        $grfx1->fillcolor('orange');
231
232        $text1->translate(50,700);
233        $text1->text_left("This text should be under everything.");
234
235        $grfx1->circle(100,690, 30);
236        $grfx1->fillstroke();
237
238        $text1->translate(50,670);
239        $text1->text_left("This text should be over the ball and under the bar.");
240
241        $grfx1->rect(160,660, 20,70);
242        $grfx1->fillstroke();
243
244        % ---------------- group 1: define graphics object first, then text
245        11 0 obj << /Length 690 >> stream   % obj 11 is graphics for (1)
246         0 0 1 RG    % stroke blue
247        1 0.647059 0 rg   % fill orange
248        130 690 m ... c h B   % draw and fill circle
249        160 660 20 70 re B   % draw and fill bar
250        endstream endobj
251
252        12 0 obj << /Length 438 >> stream   % obj 12 is text for (1)
253          BT
254        /TiCBA 20 Tf   % Times Roman Bold 20pt
255        0 0 0 rg   % fill black
256        1 0 0 1 50 700 Tm   % position text
257        <0037 ... 0011> Tj   % "under" line
258        1 0 0 1 50 670 Tm   % position text
259        <0037 ... 0011> Tj   % "over" line
260          ET
261        endstream endobj
262
263       The second group is the same as the first, with the only difference
264       being that the text object was declared first, and then the graphics
265       object. The result is that the two text lines are rendered first, and
266       then the disc and bar are drawn over them.
267
268        # ----------------------------
269        # 2. (1) again, with graphics and text order reversed
270
271        my $text2 = $page->text();
272        my $grfx2 = $page->gfx();
273        $text2->font($fontTR, 20);  # 20 pt Times Roman bold
274
275        $text2->fillcolor('black');
276        $grfx2->strokecolor('blue');
277        $grfx2->fillcolor('orange');
278
279        $text2->translate(50,600);
280        $text2->text_left("This text should be under everything.");
281
282        $grfx2->circle(100,590, 30);
283        $grfx2->fillstroke();
284
285        $text2->translate(50,570);
286        $text2->text_left("This text should be over the ball and under the bar.");
287
288        $grfx2->rect(160,560, 20,70);
289        $grfx2->fillstroke();
290
291        % ---------------- group 2: define text object first, then graphics
292        13 0 obj << /Length 438 >> stream    % obj 13 is text for (2)
293          BT
294        /TiCBA 20 Tf   % Times Roman Bold 20pt
295        0 0 0 rg   % fill black
296        1 0 0 1 50 600 Tm   % position text
297        <0037 ... 0011> Tj   % "under" line
298        1 0 0 1 50 570 Tm   % position text
299        <0037 ... 0011> Tj   % "over" line
300          ET
301        endstream endobj
302
303        14 0 obj << /Length 690 >> stream   % obj 14 is graphics for (2)
304         0 0 1 RG   % stroke blue
305        1 0.647059 0 rg   % fill orange
306        130 590 m ... h B   % draw and fill circle
307        160 560 20 70 re B   % draw and fill bar
308        endstream endobj
309
       The third group defines two text and two graphics objects, in the order
       that they are expected in. The "under" text line is output first, then
       the orange disc graphics is output, partly covering the text. The
       "over" text line is now output -- it's actually over the disc, but is
       orange because the previous object stream (first graphics object) left
       the fill color (also used for text) as orange, and we didn't
       explicitly set the fill color before outputting the second text line.
       This is not "inheritance" so much as a carry-over: whatever the
       graphics (drawing) state (used for both "graphics" and "text") is left
       as at the end of one object is the state at the beginning of the next
       object. If you wish to control this, consider surrounding the graphics
       or text calls with "save()" and "restore()" calls, which save and
       restore (push and pop) the graphics state to what it was at the
       "save()". Finally, the bar is drawn over everything.
324
325        # ----------------------------
326        # 3. (2) again, with two graphics and two text objects
327
328        my $text3 = $page->text();
329        my $grfx3 = $page->gfx();
330        $text3->font($fontTR, 20);  # 20 pt Times Roman bold
331        my $text4 = $page->text();
332        my $grfx4 = $page->gfx();
333        $text4->font($fontTR, 20);  # 20 pt Times Roman bold
334
335        $text3->fillcolor('black');
336        $grfx3->strokecolor('blue');
337        $grfx3->fillcolor('orange');
338        # $text4->fillcolor('yellow');
339        # $grfx4->strokecolor('red');
340        # $grfx4->fillcolor('purple');
341
342        $text3->translate(50,500);
343        $text3->text_left("This text should be under everything.");
344
345        $grfx3->circle(100,490, 30);
346        $grfx3->fillstroke();
347
348        $text4->translate(50,470);
349        $text4->text_left("This text should be over the ball and under the bar.");
350
351        $grfx4->rect(160,460, 20,70);
352        $grfx4->fillstroke();
353
354        % ---------------- group 3: define text1, graphics1, text2, graphics2
355        15 0 obj << /Length 206 >> stream   % obj 15 is text1 for (3)
356          BT
357        /TiCBA 20 Tf   % Times Roman Bold 20pt
358        0 0 0 rg  % fill black
359        1 0 0 1 50 500 Tm   % position text
360        <0037 ... 0011> Tj   % "under" line
361          ET
362        endstream endobj
363
364        16 0 obj << /Length 671 >> stream   % obj 16 is graphics1 for (3) circle
365         0 0 1 RG   % stroke blue
366        1 0.647059 0 rg   % fill orange
367        130 490 m ... h B   % draw and fill circle
368        endstream endobj
369
370        17 0 obj << /Length 257 >> stream   % obj 17 is text2 for (3)
371          BT
372        /TiCBA 20 Tf   % Times Roman Bold 20pt
373        1 0 0 1 50 470 Tm   % position text
374        <0037 ... 0011> Tj   % "over" line
375          ET
376        endstream endobj
377
378        18 0 obj << /Length 20 >> stream   % obj 18 is graphics for (3) bar
379         160 460 20 70 re B   % draw and fill bar
380        endstream endobj
381
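
       The "save()" and "restore()" suggestion above might look like the
       following sketch. It assumes PDF::Builder is installed, substitutes a
       core font ("Helvetica-Bold") for the TrueType font so that no
       OS-specific path is needed, and uses hypothetical variable and file
       names. Only the graphics object is bracketed here.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf = PDF::Builder->new(compress => 'none');
$pdf->mediabox('Letter');
my $page = $pdf->page();
my $font = $pdf->corefont('Helvetica-Bold');  # assumption: core font stand-in

# Declaration order sets rendering order: text, graphics, text.
my $text_a = $page->text();
my $grfx_a = $page->gfx();
my $text_b = $page->text();

$text_a->font($font, 20);
$text_a->fillcolor('black');
$text_a->translate(50, 500);
$text_a->text("This text should be under everything.");

# Bracket the color changes so they do not leak into the next object.
$grfx_a->save();
$grfx_a->strokecolor('blue');
$grfx_a->fillcolor('orange');
$grfx_a->circle(100, 490, 30);
$grfx_a->fillstroke();
$grfx_a->restore();

# No fillcolor() call here: the state restored above is what applies.
$text_b->font($font, 20);
$text_b->translate(50, 470);
$text_b->text("This text should be over the ball.");

$pdf->saveas('save_restore_demo.pdf');  # hypothetical output name
```

       With the restore in place, the second text line should no longer pick
       up the orange fill left behind by the disc.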
382       The fourth group is the same as the third, except that we define the
383       fill color for the text in the second line. This makes it clear that
384       the "over" line (in yellow) was written after the orange disc, and
385       still before the bar.
386
        # ----------------------------
        # 4. (3) again, a new set of colors for second group
        #    (new variable names, so the earlier "my" declarations are not masked)

        my $text5 = $page->text();
        my $grfx5 = $page->gfx();
        $text5->font($fontTR, 20);  # 20 pt Times Roman bold
        my $text6 = $page->text();
        my $grfx6 = $page->gfx();
        $text6->font($fontTR, 20);  # 20 pt Times Roman bold

        $text5->fillcolor('black');
        $grfx5->strokecolor('blue');
        $grfx5->fillcolor('orange');
        $text6->fillcolor('yellow');
        $grfx6->strokecolor('red');
        $grfx6->fillcolor('purple');

        $text5->translate(50,400);
        $text5->text_left("This text should be under everything.");

        $grfx5->circle(100,390, 30);
        $grfx5->fillstroke();

        $text6->translate(50,370);
        $text6->text_left("This text should be over the ball and under the bar.");

        $grfx6->rect(160,360, 20,70);
        $grfx6->fillstroke();
415
416        % ---------------- group 4: define text1, graphics1, text2, graphics2 with colors for 2
417        19 0 obj << /Length 206 >> stream   % obj 19 is text1 for (4)
418          BT
419        /TiCBA 20 Tf   % Times Roman Bold 20pt
420        0 0 0 rg  % fill black
421        1 0 0 1 50 400 Tm   % position text
422        <0037 ... 0011> Tj   % "under" line
423          ET
424        endstream endobj
425
426        20 0 obj << /Length 671 >> stream   % obj 20 is graphics1 for (4) circle
427         0 0 1 RG   % stroke blue
428        1 0.647059 0 rg   % fill orange
429        130 390 m ... h B   % draw and fill circle
430        endstream endobj
431
432        21 0 obj << /Length 266 >> stream   % obj 21 is text2 for (4)
433          BT
434        /TiCBA 20 Tf   % Times Roman Bold 20pt
435        1 1 0 rg   % fill yellow
436        1 0 0 1 50 370 Tm   % position text
437        <0037 ... 0011> Tj   % "over" line
438          ET
439        endstream endobj
440
441        22 0 obj << /Length 52 >> stream   % obj 22 is graphics for (4) bar
442         1 0 0 RG   % stroke red
443        0.498039 0 0.498039 rg   % fill purple
444        160 360 20 70 re B   % draw and fill rectangle (bar)
445        endstream endobj
446
447        # ----------------------------
448        $pdf->saveas("$fname.pdf");
449
450       The separation of text and graphics means that only some text methods
451       are available in a graphics object, and only some graphics methods are
452       available in a text object. There is much overlap, but they differ.
453       There's really no reason the code couldn't have been written (in
454       PDF::API2, or earlier) as outputting to a single object, which would
455       keep everything in the same order as the method calls. An advantage
456       would be less object and stream overhead in the PDF file. The only
457       drawback might be that an object might more easily overflow and require
458       splitting into multiple objects, but that should be rare.
459
460       You should always be able to manually split an object by simply ending
461       output to the first object, and picking up with output to the second
462       object, so long as it was created immediately after the first object.
463       The graphics state at the end of the first object should be the initial
464       state at the beginning of the second object. However, use caution when
465       dealing with text objects -- the PDF specification states that the Text
466       matrices are not carried over from one object to the next (BT resets
467       them), so you may need to reset some settings.
468
469        $grfx1 = $page->gfx();
470        $grfx2 = $page->gfx();
471        # write a huge amount of stuff to $grfx1
472        # write a huge amount of stuff to $grfx2, picking up where $grfx1 left off
473
       In any case, now that you understand the rendering order and how the
       order of object declarations affects it, you can completely control
       how text and graphics are drawn. There is really no need to add
       another "both" type object that will handle all graphics and text
       objects, as that would probably be a major code bloat for very little
       benefit. However, it could be considered in the future if there is a
       demonstrated need for it, such as serious PDF file size bloat due to
       the extra object overhead when interleaving text and graphics output.
483
484   PDF Versions Supported
485       When creating a PDF file using the functions in PDF::Builder, the
486       output is marked as PDF 1.4. This does not mean that all PDF
487       functionality up through 1.4 is supported! There are almost surely
488       features missing as far back as the PDF 1.0 standard.
489
490       The big problem is when a PDF of version 1.5 or higher is imported or
491       opened in PDF::Builder. If it contains content that is actually
492       unsupported by this software, there is a chance that something will
493       break. This does not guarantee that a PDF marked as "1.7" will go down
494       in flames when read by PDF::Builder, or that a PDF written back out
495       will break in a Reader, but the possibility is there. Much PDF writer
496       software simply marks its output as the highest version of PDF at the
497       time (usually 1.7), even if there is no content beyond, say, 1.2.
498       There is some handling of PDF 1.5 items in PDF::Builder, such as cross
499       reference streams, but support beyond 1.4 is very limited. All we can
500       say is to be careful when handling PDFs whose version is above 1.4, and
501       test thoroughly, as they may break at some point.
502
503       PDF::Builder includes a simple version control mechanism, where the
504       initial PDF version to be output (default 1.4) can be set by the
505       programmer. Input PDFs greater than 1.4 (current output level) will
506       receive a warning (can be suppressed) that the output level will be
507       raised to that level. The use of PDF features greater than the current
508       output level will likewise trigger a warning that the output level is
509       to be raised to the necessary level. If this is not desired, you should
510       avoid using those PDF features which are higher than the desired PDF
511       output level.
512
513   History
514       PDF::API2 was originally written by Alfred Reibenschuh, derived from
515       Martin Hosken's Text::PDF via the Text::PDF::API wrapper.  In 2009,
516       Otto Hirr started the PDF::API3 fork, but it never went anywhere.  In
517       2011, PDF::API2 maintenance was taken over by Steve Simms.  In 2017,
518       PDF::Builder was forked by Phil M. Perry, who desired a more aggressive
519       schedule of new features and bug fixes than Simms was providing,
520       although some of Simms's work has been ported from PDF::API2.
521
522       According to "pdfapi2_for_fun_and_profit_APW2005.pdf" (on
523       http://pdfapi2.sourceforge.net, an unmaintained site), the history of
524       PDF::API2 (the predecessor to PDF::Builder) goes as such:
525
           •  First Code implemented based on PDFlib-0.6 (AFPL)
           •  Changed to Text::PDF with a total rewrite as Text::PDF::API
              (procedural)
           •  Unmaintainable Code triggered rewrite into new Namespace
              PDF::API2 (object-oriented, LGPL)
           •  Object-Structure streamlined in 0.4x
532
533       At Simms's request, the name of the new offering was changed from
534       PDF::API4 to PDF::Builder, to reduce the chance of confusion due to
535       parallel development.  Perry's intent is to keep all internal methods
536       as upwardly compatible with PDF::API2 as possible, although it is
537       likely that there will be some drift (incompatibilities) over time. At
538       least initially, any program written based on PDF::API2 should be
539       convertible to PDF::Builder simply by changing "API2" anywhere it
540       occurs to "Builder". See the INFO/KNOWN_INCOMP known incompatibilities
541       file for further information.
542
543       Thanks...
544
545       Many users have helped out by reporting bugs and requesting
546       enhancements. A special shout out goes to those who have contributed
547       code and tests, or coordinated their package development with the needs
548       of PDF::Builder: Ben Bullock, Cary Gravel, Gregor Herrmann, Petr Pisar,
549       Jeffrey Ratcliffe, Steve Simms (via PDF::API2 fixes), and Johan
550       Vromans.  Drop me a line if I've overlooked your contribution!
551

DETAILED NOTES ON METHODS

553       Note: older versions of this package named various (hash element)
554       options with leading dashes (hyphens) in the name, e.g., '-encode'. The
555       use of a dash is now optional, and options are documented with names
556       not using dashes. At some point in the future, it is possible that
557       support for dashed names will be deprecated (and eventually withdrawn),
558       so it would be good practice to start using undashed names in new and
559       revised code.
560
561   After saving a file...
       Note that a PDF object such as $pdf cannot continue to be used after
       saving an output PDF file or string with "$pdf->save()", "saveas()", or
       "stringify()". There is some cleanup and other operations done
       internally which make the object unusable for further operations. You
       will likely receive an error message such as 'Can't call method
       "new_obj" on an undefined value' if you try to keep using a PDF object.
568
569   IntegrityCheck
       The PDF::Builder methods that open an existing PDF file pass it through
       the integrity checker method, "$self->IntegrityCheck(level, content)".
       This method serves two purposes: 1) to find any "/Version" settings
       that override the PDF version found in the PDF heading, and 2) to
       perform some basic validations on the contents of the PDF.
575
576       The "level" parameter accepts the following values:
577
578       0 = Do not output any diagnostic messages; just return any version
579       override.
580       1 = Output error-level (serious) diagnostic messages, as well as
581       returning any version override.
           Errors include: the /Root object was not specified anywhere, or
           it was specified but the indicated object was not found; an
           object claims another object as its child (/Kids list), but
           another object has already claimed that child; an object claims
           a child, but that child does not list a Parent, or the child
           lists a different Parent.
588
589       2 = Output error- (serious) and warning- (less serious) level
590       diagnostic messages, as well as returning any version override. This is
591       the default.
592       3 = Output error- (serious), warning- (less serious), and note-
593       (informational) level diagnostic messages, as well as returning any
594       version override.
           Notes include: the (optional) /Info object was not specified
           anywhere, or it was specified but the indicated object was not
           found; an object was referenced, but no entry for it was found
           among the objects (this may be OK if the object is not defined,
           or is on the free list, as the reference will then be ignored);
           an object is defined, but it appears that no other object is
           referencing it.
601
602       4 = Output error-, warning-, and note-level diagnostic messages, as
603       well as returning any version override. Also dump the diagnostic data
604       structure.
605       5 = Output error-, warning-, and note-level diagnostic messages, as
606       well as returning any version override. Also dump the diagnostic data
607       structure and the $self data structure (generally useful only if you
608       have already read in the PDF file).
609
610       The version override is returned as a string (e.g., '1.5') if one is
611       found; otherwise "undef" (undefined value) is returned.
612
613       For controlling the "automatic" call to IntegrityCheck (via opens), the
614       level may be given with the option (flag) "diaglevel => n", where "n"
615       is between 0 and 5.
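
       As a quick reference, the level descriptions above can be tabulated
       with a few lines of plain Perl. This is only a sketch of what each
       level reports; "reported_at_level" is a hypothetical helper, not part
       of PDF::Builder, and every level also returns any version override.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Builder): list what each
# diagnostic level reports, following the level descriptions above.
sub reported_at_level {
    my ($level) = @_;
    die "level must be an integer from 0 to 5\n"
        unless defined $level && $level =~ /^[0-5]$/;

    my @out;
    push @out, 'errors'               if $level >= 1;
    push @out, 'warnings'             if $level >= 2;
    push @out, 'notes'                if $level >= 3;
    push @out, 'diagnostic data dump' if $level >= 4;
    push @out, '$self dump'           if $level >= 5;
    return @out;
}

for my $level (0 .. 5) {
    my @out = reported_at_level($level);
    printf "diaglevel => %d: %s\n", $level,
        @out ? join(', ', @out) : 'version override only';
}
```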
616
617   Preferences - set user display preferences
618       $pdf->preferences(%options)
619           Controls viewing preferences for the PDF.
620
621       Page Mode Options
622
623           fullscreen
624               Full-screen mode, with no menu bar, window controls, or any
625               other window visible.
626
627           thumbs
628               Thumbnail images visible.
629
630           outlines
631               Document outline visible.
632
633       Page Layout Options
634
635           singlepage
636               Display one page at a time.
637
638           onecolumn
639               Display the pages in one column.
640
641           twocolumnleft
642               Display the pages in two columns, with odd-numbered pages on the
643               left.
644
645           twocolumnright
646               Display the pages in two columns, with odd-numbered pages on the
647               right.
648
649       Viewer Options
650
651           hidetoolbar
652               Specifies whether to hide tool bars.
653
654           hidemenubar
655               Specifies whether to hide menu bars.
656
657           hidewindowui
658               Specifies whether to hide user interface elements.
659
660           fitwindow
661               Specifies whether to resize the document's window to the size
662               of the displayed page.
663
664           centerwindow
665               Specifies whether to position the document's window in the
666               center of the screen.
667
668           displaytitle
669               Specifies whether the window's title bar should display the
670               document title taken from the Title entry of the document
671               information dictionary.
672
673           afterfullscreenthumbs
674               Thumbnail images visible after Full-screen mode.
675
676           afterfullscreenoutlines
677               Document outline visible after Full-screen mode.
678
679           printscalingnone
680               Set the default print setting for page scaling to none.
681
682           simplex
683               Print single-sided by default.
684
685           duplexflipshortedge
686               Print duplex by default and flip on the short edge of the
687               sheet.
688
689           duplexfliplongedge
690               Print duplex by default and flip on the long edge of the sheet.
691
692       Page Fit Options
693
694       These options are used for the "firstpage" layout, as well as for
695       Annotations, Named Destinations and Outlines.
696
697       'fit' => 1
698           Display the page designated by $page, with its contents magnified
699           just enough to fit the entire page within the window both
700           horizontally and vertically. If the required horizontal and
701           vertical magnification factors are different, use the smaller of
702           the two, centering the page within the window in the other
703           dimension.
704
705       'fith' => $top
706           Display the page designated by $page, with the vertical coordinate
707           $top positioned at the top edge of the window and the contents of
708           the page magnified just enough to fit the entire width of the page
709           within the window.
710
711       'fitv' => $left
712           Display the page designated by $page, with the horizontal
713           coordinate $left positioned at the left edge of the window and the
714           contents of the page magnified just enough to fit the entire height
715           of the page within the window.
716
717       'fitr' => [ $left, $bottom, $right, $top ]
718           Display the page designated by $page, with its contents magnified
719           just enough to fit the rectangle specified by the coordinates
720           $left, $bottom, $right, and $top entirely within the window both
721           horizontally and vertically. If the required horizontal and
722           vertical magnification factors are different, use the smaller of
723           the two, centering the rectangle within the window in the other
724           dimension.
725
726       'fitb' => 1
727           Display the page designated by $page, with its contents magnified
728           just enough to fit its bounding box entirely within the window both
729           horizontally and vertically. If the required horizontal and
730           vertical magnification factors are different, use the smaller of
731           the two, centering the bounding box within the window in the other
732           dimension.
733
734       'fitbh' => $top
735           Display the page designated by $page, with the vertical coordinate
736           $top positioned at the top edge of the window and the contents of
737           the page magnified just enough to fit the entire width of its
738           bounding box within the window.
739
740       'fitbv' => $left
741           Display the page designated by $page, with the horizontal
742           coordinate $left positioned at the left edge of the window and the
743           contents of the page magnified just enough to fit the entire height
744           of its bounding box within the window.
745
746       'xyz' => [ $left, $top, $zoom ]
747           Display the page designated by $page, with the coordinates
748           "[$left, $top]" positioned at the top-left corner of the window
749           and the contents of the page magnified by the factor $zoom. A zero
750           (0) value for any of the parameters $left, $top, or $zoom specifies
751           that the current value of that parameter is to be retained
752           unchanged.
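
       The "use the smaller of the two [factors], centering the page" rule
       used by 'fit', 'fitr', and 'fitb' is applied by the PDF viewer, not by
       PDF::Builder. The arithmetic might look like the following sketch;
       "fit_page" and the sample page and window sizes are assumptions made
       for illustration only.

```perl
use strict;
use warnings;

# Hypothetical helper: how a viewer might apply the 'fit' rule.
# Returns the magnification plus the offsets that center the page.
sub fit_page {
    my ($page_w, $page_h, $win_w, $win_h) = @_;
    my $scale_x = $win_w / $page_w;    # factor needed to fill the width
    my $scale_y = $win_h / $page_h;    # factor needed to fill the height
    my $scale   = $scale_x < $scale_y ? $scale_x : $scale_y;  # the smaller one
    my $off_x   = ($win_w - $page_w * $scale) / 2;  # center horizontally
    my $off_y   = ($win_h - $page_h * $scale) / 2;  # center vertically
    return ($scale, $off_x, $off_y);
}

# A 200 x 100 page in a 400 x 400 window
my ($scale, $off_x, $off_y) = fit_page(200, 100, 400, 400);
printf "scale %.2f, offset %.1f,%.1f\n", $scale, $off_x, $off_y;
```

       Here the horizontal factor (2) is smaller than the vertical one (4),
       so the page fills the window width and is centered vertically.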
753
754       Initial Page Options
755
756       firstpage => [ $page, %options ]
757           Specifies the page (either a page number or a page object) to be
758           displayed, plus one of the location options listed above in "Page
759           Fit Options".
760
761       Example
762
763           $pdf->preferences(
764               fullscreen => 1,
765               onecolumn => 1,
766               afterfullscreenoutlines => 1,
767               firstpage => [$page, fit => 1],
768           );
769
770   info Example
771           %h = $pdf->info(
772               'Author'       => "Alfred Reibenschuh",
773               'CreationDate' => "D:20020911000000+01'00'",
774               'ModDate'      => "D:YYYYMMDDhhmmssOHH'mm'",
775               'Creator'      => "fredos-script.pl",
776               'Producer'     => "PDF::Builder",
777               'Title'        => "some Publication",
778               'Subject'      => "perl ?",
779               'Keywords'     => "all good things are pdf"
780           );
781           print "Author: $h{'Author'}\n";
782
783   XMP XML example
784           $xml = $pdf->xmpMetadata();
785           print "PDF metadata reads: $xml\n";
786           $xml=<<EOT;
787           <?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
788           <?adobe-xap-filters esc="CRLF"?>
789           <x:xmpmeta
790             xmlns:x='adobe:ns:meta/'
791             x:xmptk='XMP toolkit 2.9.1-14, framework 1.6'>
792               <rdf:RDF
793                 xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
794                 xmlns:iX='http://ns.adobe.com/iX/1.0/'>
795                   <rdf:Description
796                     rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
797                     xmlns:pdf='http://ns.adobe.com/pdf/1.3/'
798                     pdf:Producer='Acrobat Distiller 6.0.1 for Macintosh'></rdf:Description>
799                   <rdf:Description
800                     rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
801                     xmlns:xap='http://ns.adobe.com/xap/1.0/'
802                     xap:CreateDate='2004-11-14T08:41:16Z'
803                     xap:ModifyDate='2004-11-14T16:38:50-08:00'
804                     xap:CreatorTool='FrameMaker 7.0'
805                     xap:MetadataDate='2004-11-14T16:38:50-08:00'></rdf:Description>
806                   <rdf:Description
807                     rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
808                     xmlns:xapMM='http://ns.adobe.com/xap/1.0/mm/'
809                     xapMM:DocumentID='uuid:919b9378-369c-11d9-a2b5-000393c97fd8'></rdf:Description>
810                   <rdf:Description
811                     rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
812                     xmlns:dc='http://purl.org/dc/elements/1.1/'
813                     dc:format='application/pdf'>
814                       <dc:description>
815                         <rdf:Alt>
816                           <rdf:li xml:lang='x-default'>Adobe Portable Document Format (PDF)</rdf:li>
817                         </rdf:Alt>
818                       </dc:description>
819                       <dc:creator>
820                         <rdf:Seq>
821                           <rdf:li>Adobe Systems Incorporated</rdf:li>
822                         </rdf:Seq>
823                       </dc:creator>
824                       <dc:title>
825                         <rdf:Alt>
826                           <rdf:li xml:lang='x-default'>PDF Reference, version 1.6</rdf:li>
827                         </rdf:Alt>
828                       </dc:title>
829                   </rdf:Description>
830               </rdf:RDF>
831           </x:xmpmeta>
832           <?xpacket end='w'?>
833           EOT
834
835           $xml = $pdf->xmpMetadata($xml);
836           print "PDF metadata now reads: $xml\n";
837
838   "BOX" METHODS
839       A general note: If you specify a Media Box (or other "box") for a
840       page that differs from the global "box" setting, take care to define
841       the whole "chain" of boxes on that page, to avoid surprises. For
842       example, defining a global Media Box (paper size) and a global Crop
843       Box, and then defining a new page-level Media Box without a new
844       page-level Crop Box, may give odd results in the resultant cropping.
845       Such combinations are not well defined.
846
847       All dimensions in boxes default to the default User Unit, which is
848       points (1/72 inch). Note that the PDF specification limits sizes and
849       coordinates to 14400 User Units (200 inches, for the default User Unit
850       of one point), and Adobe products (so far) follow this limit for
851       Acrobat and Distiller. It is worth noting that other PDF writers and
852       readers may choose to ignore the 14400 unit limit, with or without the
853       use of a specified User Unit. Therefore, PDF::Builder does not enforce
854       any limits on coordinates -- it's your responsibility to consider what
855       readers and other PDF tools may be used with a PDF you produce!  Also
856       note that earlier Acrobat readers had coordinate limits as small as
857       3240 User Units (45 inches), and minimum media size of 72 or 3 User
858       Units.
859
860       User Units
861
862       $pdf->userunit($number)
863           The default User Unit in the PDF coordinate system is one point
864           (1/72 inch). You can think of it as a scale factor to enable larger
865           (or even, smaller) documents.  This method may be used (for PDF 1.6
866           and higher) to set the User Unit to some number of points. For
867           example, "userunit(72)" will set the scale multiplier to 72.0
868           points per User Unit, or 1 inch to the User Unit. Any number
869           greater than zero is acceptable, although some readers and tools
870           may not handle User Units of less than 1.0 very well.
871
872           Not all readers respect the User Unit, if you give one, or handle
873           it in exactly the same way. Adobe Distiller, for one, does not use
874           it. How User Units are handled may vary from reader to reader.
875           Adobe Acrobat, at this writing, respects User Unit in version 7.0
876           and up, but limits it to 75000 (giving a maximum document size of
877           15 million inches or 236.7 miles or 381 km). Other readers and PDF
878           tools may allow a larger (or smaller) limit.
879
880           Your Mileage May Vary: Some readers ignore a global User Unit
881           setting and do not have pages inherit it (PDF::Builder duplicates
882           it on each page to simulate inheritance). Some readers may give
883           spurious warnings about truncated content when a Media Box is
884           changed while User Units are being used. Some readers do strange
885           things with Crop Boxes when a User Unit is in effect.
886
887           Depending on the reader used, the effect of a larger User Unit
888           (greater than 1) may mean lower resolution (chunkier or coarser
889           appearance) in the rendered document. If you're printing something
890           the size of a highway billboard, this may not matter to you, but
891           you should be aware of the possibility (even with fractional
892           coordinates). Conversely, a User Unit of less than 1.0 (if
893           permitted) reduces the allowable size of your document, but may
894           result in greater resolution.
895
896           A global (PDF level) User Unit setting is inherited by each page
897           (an action by PDF::Builder, not necessarily automatically done by
898           the reader), or can be overridden by calling userunit in the page.
899           Do not give more than one global userunit setting, as only the last
900           one will be used.  Setting a page's User Unit (if "$page->"
901           instead) is permitted (overriding the global setting for this
902           page). However, many sources recommend against doing this, as
903           results may not be as expected (once again, depending on the quirks
904           of the reader).
905
906           Remember to call "userunit" before calling anything having to do
907           with page or box sizes, or coordinates. Especially when setting
908           'named' box sizes, the methods need to know the current User Unit
909           so that named page sizes (in points) may be scaled down to the
910           current User Unit.
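
           The size figures quoted above follow from simple arithmetic, which
           the plain-Perl sketch below reproduces. It does not call
           PDF::Builder; "max_inches" and "points_to_units" are hypothetical
           helpers, and the 14400 limit is the one cited in this section.

```perl
use strict;
use warnings;

use constant POINTS_PER_INCH => 72;
use constant MAX_USER_UNITS  => 14400;   # coordinate limit cited above

# Largest dimension (in inches) reachable with a given User Unit.
sub max_inches {
    my ($userunit) = @_;
    return MAX_USER_UNITS * $userunit / POINTS_PER_INCH;
}

# Scale a length in points down to the current User Unit.
sub points_to_units {
    my ($points, $userunit) = @_;
    return $points / $userunit;
}

printf "User Unit 1:     %d inches\n", max_inches(1);
my $inches = max_inches(75000);
printf "User Unit 75000: %d inches = %.1f miles = %.0f km\n",
    $inches, $inches / 63360, $inches * 0.0254 / 1000;

# A4 is 595 x 842 points; with a User Unit of 72 (1 inch per unit):
printf "A4 at User Unit 72: %.2f x %.2f units\n",
    points_to_units(595, 72), points_to_units(842, 72);
```

           The last line shows why "userunit" should be called first: a named
           page size given in points has to be divided by the User Unit to
           get coordinates in the scaled system.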
911
912       Media Box
913
914       $pdf->mediabox($name)
915       $pdf->mediabox($name, orient => 'orientation' )
916       $pdf->mediabox($w,$h)
917       $pdf->mediabox($llx,$lly, $urx,$ury)
918       ($llx,$lly, $urx,$ury) = $pdf->mediabox()
919           Sets the global Media Box (or page's Media Box, if "$page->"
920           instead).  This defines the width and height (or by corner
921           coordinates, or by standard name) of the output page itself, such
922           as the physical paper size. This is normally the largest of the
923           "boxes". If any subsidiary box (within it) exceeds the media box,
924           the portion of the material or boxes outside of the Media Box will
925           be ignored. That is, the Media Box is the One Box to Rule Them All,
926           and is the overall limit for other boxes (some documentation refers
927           to the Media Box as "clipping" other boxes). In addition, the Media
928           Box defines the overall coordinate system for text and graphics
929           operations.
930
931           If no arguments are given, the current Media Box (global or page)
932           coordinates are returned instead. The former "get_mediabox" (page
933           only) function is deprecated and will likely be removed some time
934           in the future. In addition, when setting the Media Box, the
935           resulting coordinates are returned. This permits you to specify the
936           page size by a name (alias) and get the dimensions back, all in one
937           call.
938
939           Note that many printers cannot print all the way to the physical
940           edge of the paper, so you should plan to leave some blank margin,
941           even outside of any crop marks and bleeds. Printers and on-screen
942           readers are free to discard any content found outside the Media
943           Box, and printers may discard some material just inside the Media
944           Box.
945
946           A global Media Box is required by the PDF spec; if not explicitly
947           given, PDF::Builder will set the global Media Box to US Letter size
948           (8.5in x 11in).  This is the media size that will be used for all
949           pages if you do not specify a "mediabox" call on a page. That is, a
950           global (PDF level) mediabox setting is inherited by each page, or
951           can be overridden by setting mediabox in the page. Do not give more
952           than one global mediabox setting, as only the last one will be
953           used.
954
955           If you give a single string name (e.g., 'A4'), you may optionally
956           add an orientation to turn the page 90 degrees into Landscape mode:
957           "orient => 'L'" or "orient => 'l'". "orient" is the only option
958           recognized, and a string beginning with an 'L' or 'l' (for
959           Landscape) is the only value of interest (anything else is treated
960           as Portrait mode). The y axis still runs from 0 at the bottom of
961           the page to what used to be the page width (now, height) at the
962           top, and likewise for the x axis: 0 at left to (former) height at
963           the right. That is, the coordinate system is the same as before,
964           except that the height and width are different.
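
           The effect of "orient" on a named size amounts to swapping width
           and height. The sketch below illustrates that rule in plain Perl;
           "oriented_size" is a hypothetical helper and is not necessarily
           how PDF::Builder implements it internally.

```perl
use strict;
use warnings;

# Hypothetical helper: apply an "orient" value to a width and height.
# Any value starting with 'L' or 'l' means Landscape; anything else
# (including no value at all) is treated as Portrait.
sub oriented_size {
    my ($w, $h, $orient) = @_;
    return ($h, $w) if defined $orient && $orient =~ /^l/i;
    return ($w, $h);
}

my @portrait  = oriented_size(595, 842);               # A4 as-is
my @landscape = oriented_size(595, 842, 'Landscape');  # A4 turned 90 degrees
print "portrait:  @portrait\n";
print "landscape: @landscape\n";
```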
965
966           The lower left corner does not have to be 0,0. It can be any values
967           you want, including negative values (so long as the resulting
968           media's sides are at least one point long). "mediabox" sets the
969           coordinate system (including the origin) of the graphics and text
970           that will be drawn, as well as for subsequent "boxes".  It's even
971           possible to give any two opposite corners (such as upper left and
972           lower right). The coordinate system will be rearranged (by the
973           Reader) to still be the conventional minimum "x" and "y" in the
974           lower left (i.e., you can't make "y" increase from top to bottom!).
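
           The rearrangement of two arbitrary opposite corners into the
           conventional lower left and upper right is a matter of taking
           minimums and maximums. A minimal sketch of the equivalent
           arithmetic, with "normalize_corners" as a hypothetical helper:

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Hypothetical helper: turn any two opposite corners into the
# conventional (lower-left x, lower-left y, upper-right x, upper-right y).
sub normalize_corners {
    my ($x1, $y1, $x2, $y2) = @_;
    return (min($x1, $x2), min($y1, $y2), max($x1, $x2), max($y1, $y2));
}

# Upper-left and lower-right corners of an A4 page
my @box = normalize_corners(0, 842, 595, 0);
print "@box\n";   # 0 0 595 842
```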
975
976           Example:
977
978               $pdf = PDF::Builder->new();
979               $pdf->mediabox('A4'); # A4 size (595 Pt wide by 842 Pt high)
980               ...
981               $pdf->saveas('our/new.pdf');
982
983               $pdf = PDF::Builder->new();
984               $pdf->mediabox(595, 842); # A4 size, with implicit 0,0 LL corner
985               ...
986               $pdf->saveas('our/new.pdf');
987
988               $pdf = PDF::Builder->new;
989               $pdf->mediabox(0, 0, 595, 842); # A4 size, with explicit 0,0 LL corner
990               ...
991               $pdf->saveas('our/new.pdf');
992
993           See the PDF::Builder::Resource::PaperSizes source code for the full
994           list of supported names (aliases) and their dimensions in points.
995           You are free to add additional paper sizes to this file, if you
996           wish. You might want to do this if you frequently use a standard
997           page size in rotated (Landscape) mode. See also the "getPaperSizes"
998           call in PDF::Builder::Util. These names (aliases) are also usable
999           in other "box" calls, although useful only if the "box" is the same
1000           size as the full media (Media Box), and you don't mind their
1001           starting at 0,0.
1002
1003       Crop Box
1004
1005       $pdf->cropbox($name)
1006       $pdf->cropbox($name, orient => 'orientation')
1007       $pdf->cropbox($w,$h)
1008       $pdf->cropbox($llx,$lly, $urx,$ury)
1009       ($llx,$lly, $urx,$ury) = $pdf->cropbox()
1010           Sets the global Crop Box (or page's Crop Box, if "$page->"
1011           instead).  This will define the media size to which the output will
1012           later be clipped. Note that this does not itself output any crop
1013           marks to guide cutting of the paper! PDF Readers should consider
1014           this to be the visible portion of the page, and anything found
1015           outside it may be clipped (invisible). By default, it is equal to
1016           the Media Box, but may be defined to be smaller, in the coordinate
1017           system set by the Media Box. A global setting will be inherited by
1018           each page, but can be overridden on a per-page basis.
1019
1020           A Reader or Printer may choose to discard any clipped (invisible)
1021           part of the page, and show only the area within the Crop Box. For
1022           example, if your page Media Box is A4 (0,0 to 595,842 Points), and
1023           your Crop Box is (100,100 to 495,742), a reader such as Adobe
1024           Acrobat Reader may show you a page 395 by 642 Points in size (i.e.,
1025           just the visible area of your page). Other Readers may show you the
1026           full media size (Media Box) and a 100 Point wide blank area (in
1027           this example) around the visible content.
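
           The numbers in this example can be checked with a few lines of
           plain Perl ("box_size" is a hypothetical helper, used only for
           illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: width and height of a box given its corners.
sub box_size {
    my ($llx, $lly, $urx, $ury) = @_;
    return ($urx - $llx, $ury - $lly);
}

my @media = (0, 0, 595, 842);       # A4 Media Box
my @crop  = (100, 100, 495, 742);   # Crop Box from the example above

my ($w, $h) = box_size(@crop);
print "visible area: $w x $h points\n";

# Width of the blank band between the Media Box and Crop Box edges
printf "blank margin: %d points\n", $crop[0] - $media[0];
```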
1028
1029           If no arguments are given, the current Crop Box (global or page)
1030           coordinates are returned instead. The former "get_cropbox" (page
1031           only) function is deprecated and will likely be removed some time
1032           in the future. If a Crop Box has not been defined, the Media Box
1033           coordinates (which always exist) will be returned instead. In
1034           addition, when setting the Crop Box, the resulting coordinates are
1035           returned. This permits you to specify the crop box by a name
1036           (alias) and get the dimensions back, all in one call.
1037
1038           Do not confuse the Crop Box with the "Trim Box", which shows where
1039           printed paper is expected to actually be cut. Some PDF Readers may
1040           reduce the visible "paper" background to the size of the crop box;
1041           others may simply omit any content outside it. Either way, you
1042           would lose any trim or crop marks, printer instructions, color
1043           alignment dots, or other content outside the Crop Box. A good use
1044           of the Crop Box would be to limit printing to the area where a printer
1045           can reliably put down ink, and leave white the edge areas where
1046           paper-handling mechanisms prevent ink or toner from being applied.
1047           This would keep you from accidentally putting valuable content in
1048           an area where a printer will refuse to print, yet permit you to
1049           include a bleed area and space for printer's marks and
1050           instructions. Needless to say, if your printer cannot print to the
1051           very edge of the paper, you will need to trim (cut) the printed
1052           sheets to get true bleeds.
1053
1054           A global (PDF level) cropbox setting is inherited by each page, or
1055           can be overridden by setting cropbox in the page.  As with
1056           "mediabox", only one crop box may be set at this (PDF) level.  As
1057           with "mediabox", a named media size may have an orientation (l or
1058           L) for Landscape mode.  Note that the PDF level global Crop Box
1059           will be used even if the page gets its own Media Box. That is, the
1060           page's Crop Box inherits the global Crop Box, not the page Media
1061           Box, even if the page has its own media size! If you set the page's
1062           own Media Box, you should consider also explicitly setting the page
1063           Crop Box (and other boxes).
1064
1065       Bleed Box
1066
1067       $pdf->bleedbox($name)
1068       $pdf->bleedbox($name, orient => 'orientation')
1069       $pdf->bleedbox($w,$h)
1070       $pdf->bleedbox($llx,$lly, $urx,$ury)
1071       ($llx,$lly, $urx,$ury) = $pdf->bleedbox()
1072           Sets the global Bleed Box (or page's Bleed Box, if "$page->"
1073           instead).  This is typically used in printing on paper, where you
1074           want ink or color (such as thumb tabs) to be printed a bit beyond
1075           the final paper size, to ensure that the cut paper bleeds (the cut
1076           goes through the ink), rather than accidentally leaving some white
1077           paper visible outside.  Allow enough "bleed" over the expected trim
1078           line to account for minor variations in paper handling, folding,
1079           and cutting, to avoid showing white paper at the edge.  The Bleed
1080           Box is where printing could actually extend to; the Trim Box is
1081           normally within it, where the paper would actually be cut. The
1082           default value is equal to the Crop Box, but is often a bit smaller.
1083           The space between the Bleed Box and the Crop Box is available for
1084           printer instructions, color alignment dots, etc., while crop marks
1085           (trim guides) are at least partly within the bleed area (and should
1086           be printed after content is printed).
1087
1088           If no arguments are given, the current Bleed Box (global or page)
1089           coordinates are returned instead. The former "get_bleedbox" (page
1090           only) function is deprecated and will likely be removed some time
1091           in the future. If a Bleed Box has not been defined, the Crop Box
1092           coordinates (if defined) will be returned, otherwise the Media Box
1093           coordinates (which always exist) will be returned.  In addition,
1094           when setting the Bleed Box, the resulting coordinates are returned.
1095           This permits you to specify the bleed box by a name (alias) and get
1096           the dimensions back, all in one call.
1097
1098           A global (PDF level) bleedbox setting is inherited by each page, or
1099           can be overridden by setting bleedbox in the page.  As with
1100           "mediabox", only one bleed box may be set at this (PDF) level.  As
1101           with "mediabox", a named media size may have an orientation (l or
1102           L) for Landscape mode.  Note that the PDF level global Bleed Box
1103           will be used even if the page gets its own Crop Box. That is, the
1104           page's Bleed Box inherits the global Bleed Box, not the page Crop
1105           Box, even if the page has its own media size! If you set the page's
1106           own Media Box or Crop Box, you should consider also explicitly
1107           setting the page Bleed Box (and other boxes).
1108
1109       Trim Box
1110
1111       $pdf->trimbox($name)
1112       $pdf->trimbox($name, orient => 'orientation')
1113       $pdf->trimbox($w,$h)
1114       $pdf->trimbox($llx,$lly, $urx,$ury)
1115       ($llx,$lly, $urx,$ury) = $pdf->trimbox()
1116           Sets the global Trim Box (or page's Trim Box, if "$page->"
1117           instead).  This is supposed to be the actual dimensions of the
1118           finished page (after trimming of the paper). In some production
1119           environments, it is useful to have printer's instructions, cut
1120           marks, and so on outside of the trim box. The default value is
1121           equal to the Crop Box, but is often a bit smaller than any Bleed
1122           Box, to allow the desired "bleed" effect.
1123
1124           If no arguments are given, the current Trim Box (global or page)
1125           coordinates are returned instead. The former "get_trimbox" (page
1126           only) function is deprecated and will likely be removed some time
1127           in the future. If a Trim Box has not been defined, the Crop Box
1128           coordinates (if defined) will be returned, otherwise the Media Box
1129           coordinates (which always exist) will be returned.  In addition,
1130           when setting the Trim Box, the resulting coordinates are returned.
1131           This permits you to specify the trim box by a name (alias) and get
1132           the dimensions back, all in one call.
1133
1134           A global (PDF level) trimbox setting is inherited by each page, or
1135           can be overridden by setting trimbox in the page.  As with
1136           "mediabox", only one trim box may be set at this (PDF) level.  As
1137           with "mediabox", a named media size may have an orientation (l or
1138           L) for Landscape mode.  Note that the PDF level global Trim Box
1139           will be used even if the page gets its own Crop Box. That is, the
1140           page's Trim Box inherits the global Trim Box, not the page Crop
1141           Box, even if the page has its own media size! If you set the page's
1142           own Media Box or Crop Box, you should consider also explicitly
1143           setting the page Trim Box (and other boxes).
1144
1145       Art Box
1146
1147       $pdf->artbox($name)
1148       $pdf->artbox($name, orient => 'orientation')
1149       $pdf->artbox($w,$h)
1150       $pdf->artbox($llx,$lly, $urx,$ury)
1151       ($llx,$lly, $urx,$ury) = $pdf->artbox()
1152           Sets the global Art Box (or page's Art Box, if "$page->" instead).
1153           This is supposed to define "the extent of the page's meaningful
1154           content (including [margins])". It might exclude some content, such
1155           as headlines or headings. Any binding or punched-holes margin would
1156           typically be outside of the Art Box, as would be page numbers and
1157           running headers and footers. The default value is equal to the Crop
1158           Box, although normally it would be no larger than any Trim Box. The
1159           Art Box may often be used for defining "important" content (e.g.,
1160           excluding advertisements) that may or may not be brought over to
1161           another page (e.g., N-up printing).
1162
1163           If no arguments are given, the current Art Box (global or page)
1164           coordinates are returned instead. The former "get_artbox" (page
1165           only) function is deprecated and will likely be removed some time
1166           in the future. If an Art Box has not been defined, the Crop Box
1167           coordinates (if defined) will be returned, otherwise the Media Box
1168           coordinates (which always exist) will be returned.  In addition,
1169           when setting the Art Box, the resulting coordinates are returned.
1170           This permits you to specify the art box by a name (alias) and get
1171           the dimensions back, all in one call.
1172
1173           A global (PDF level) artbox setting is inherited by each page, or
1174           can be overridden by setting artbox in the page.  As with
1175           "mediabox", only one art box may be set at this (PDF) level.  As
1176           with "mediabox", a named media size may have an orientation (l or
1177           L) for Landscape mode.  Note that the PDF level global Art Box will
1178           be used even if the page gets its own Crop Box. That is, the page's
1179           Art Box inherits the global Art Box, not the page Crop Box, even if
1180           the page has its own media size! If you set the page's own Media
1181           Box or Crop Box, you should consider also explicitly setting the
1182           page Art Box (and other boxes).
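
       The fallback described for reading the Bleed, Trim, and Art Boxes (the
       box itself if defined, otherwise the Crop Box, otherwise the Media
       Box) can be pictured with this plain-Perl sketch. It models only the
       lookup order stated above; "effective_box" and the hash layout are
       assumptions for illustration, not PDF::Builder internals.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the coordinates reported for a box, given a
# hash of the boxes that have been defined. The Media Box always exists.
sub effective_box {
    my ($name, %boxes) = @_;
    for my $candidate ($name, 'crop', 'media') {
        return @{ $boxes{$candidate} } if defined $boxes{$candidate};
    }
    die "no Media Box defined\n";
}

my %boxes = (
    media => [0, 0, 595, 842],
    crop  => [20, 20, 575, 822],
);

print "trim: @{[ effective_box('trim', %boxes) ]}\n";   # falls back to crop
delete $boxes{crop};
print "trim: @{[ effective_box('trim', %boxes) ]}\n";   # falls back to media
```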
1183
1184       Suggested Box Usage
1185
1186       See "examples/Boxes.pl" for an example of using boxes.
1187
1188       How you define your boxes (or let them default) is up to you, depending
1189       on whether you're duplex printing US Letter or A4 on your laser
1190       printer, to be spiral bound on the bind margin, or engaging a
1191       professional printer. In the latter case, discuss in advance with the
1192       print firm what capabilities (and limitations) they have and what
1193       information they need from a PDF file. For instance, they may not want
1194       a Crop Box defined, and may call for very specific box sizes. For large
1195       press runs, they may print multiple pages (N-up) duplexed on large web
1196       roll "signatures", which are then intricately folded and guillotined
1197       (trimmed) and bound together into books or magazines. You would usually
1198       just supply a PDF with all the pages; they would take care of the
1199       signature layout (which includes offsets and 180 degree rotations).
1200
1201       (As an aside, don't count on a printer having any particular font
1202       available, so be sure to ask. Usually they will want you to embed all
1203       fonts used, but ask first, and double-check before handing over the
1204       print job! TTF/OTF fonts ("ttfont()") are embedded by default, but
1205       other fonts (core, ps, bdf, cjk) are not! A printer may have a core
1206       font collection, but they are free to substitute a "workalike" font for
1207       any given core font, and the results may not match what you saw on your
1208       PC!)
1209
1210       On the assumption that you're using a single sheet (US Letter or A4)
1211       laser or inkjet printer, are you planning to trim each sheet down to a
1212       smaller final size? If so, you can do true bleeds by defining a Trim
1213       Box and a slightly larger Bleed Box. You would print bleeds (all the
1214       way to the finished edge) out to the Bleed Box, but nothing is enforced
1215       about the Bleed Box. At the other end of the spectrum, you would define
1216       the Media Box to be the physical paper size being printed on. Most
1217       printers reserve a little space on the sides (and possibly top and
1218       bottom) for paper handling, so it is often good to define your Crop Box
1219       as the printable area. Remember that the Media Box sets the coordinate
1220       system used, so you still need to avoid going outside the Crop Box with
1221       content (most readers and printers will not show any ink outside of the
1222       Crop Box). Whether or not you define a Crop Box, you're going to almost
1223       always end up with white paper on at least the sides.
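       As a rough illustration of the Trim Box and Bleed Box geometry
       described above, the following sketch computes both boxes for a
       finished piece centered on a US Letter sheet. The 6 by 9 inch trim
       size, the 1/8 inch (9 point) bleed, and the helper names
       "centered_box" and "grow_box" are assumptions made for this example;
       they are not part of PDF::Builder.

```perl
use strict;
use warnings;

# Return [llx, lly, urx, ury] for a box of the given size centered on the media.
sub centered_box {
    my ($media_w, $media_h, $w, $h) = @_;
    my $llx = ($media_w - $w) / 2;
    my $lly = ($media_h - $h) / 2;
    return [$llx, $lly, $llx + $w, $lly + $h];
}

# Grow a box outward by $margin points on every side.
sub grow_box {
    my ($box, $margin) = @_;
    return [$box->[0] - $margin, $box->[1] - $margin,
            $box->[2] + $margin, $box->[3] + $margin];
}

my ($media_w, $media_h) = (612, 792);     # US Letter, in points
# assumed finished (trimmed) size: 6in x 9in
my $trim  = centered_box($media_w, $media_h, 6 * 72, 9 * 72);
my $bleed = grow_box($trim, 9);           # assumed 1/8 inch bleed allowance

print "Trim Box:  @$trim\n";
print "Bleed Box: @$bleed\n";
```

       The resulting coordinates could then be handed to the page's box
       methods (for instance "$page->trimbox(@$trim)"), with bleed content
       drawn out to the edges of the larger box.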
1224
       For small in-house jobs, you probably won't need color alignment dots
       and other such professional instructions and information between the
       Bleed Box and the Crop Box, but crop marks for trimming (if used)
       should go just outside the Trim Box (partly or wholly within the Bleed
       Box), and be drawn after all content. If you're not trimming the paper,
       don't try to do any bleed effects (including solid background color
       pages/covers), as you will usually have a white edge around the sheet
       anyway. Don't assume that a PDF document will only ever be displayed
       (where you can do things like bleed all the way to the media edge) and
       never physically printed. Finally, for single sheet printing, an Art
       Box is probably unnecessary, but if you're combining pages into N-up
       prints, or doing other manipulations, it may be useful.
1237
1238       Box Inheritance
1239
1240       What Media, Crop, Bleed, Trim, and Art Boxes a page gets can be a
1241       little complicated. Note that usually, only the Media and Crop Boxes
1242       will have a clear visual effect. The visual effect of the other boxes
1243       (if any) may be very subtle.
1244
1245       First, everything is set at the global (PDF) level. The Media Box is
1246       always defined, and defaults to US Letter (8.5 inches wide by 11 inches
1247       high). The global Crop Box inherits the Media Box, unless explicitly
1248       defined. The Bleed, Trim, and Art Boxes inherit the Crop Box, unless
1249       explicitly defined. A global box should only be defined once, as the
1250       last one defined is the one that will be written to the PDF!
1251
1252       Second, a page inherits the global boxes, for its initial settings. You
1253       may call any of the box set methods ("cropbox", "trimbox", etc.) to
1254       explicitly set (override) any box for this page. Note that setting a
1255       new Media Box for the page does not reset the page's Crop Box -- it
1256       still uses whatever it inherited from the global Crop Box. You would
1257       need to explicitly set the page's Crop Box if you want a different
1258       setting. Likewise, the page's Bleed, Trim, and Art Boxes will not be
1259       reset by a new page Crop Box -- they will still inherit from the global
1260       (PDF) settings.
1261
       Third, the page Media Box (the one actually used for output pages)
       clips or limits all the other boxes to extend no farther than its own
       bounds. For example, if the Media Box is US Letter, and you set a Crop
       Box of A4 size, the smaller of the two heights (11 inches, 792 points)
       and the smaller of the two widths (8.26 inches, 595 points) would be
       effective. The given dimensions of a box are returned on query (get),
       not the effective dimensions clipped by the Media Box.
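       The inheritance and clipping rules in this section can be modeled in
       a few lines of plain Perl. This is only a sketch of the behavior as
       described here, not PDF::Builder's internal code; the names
       "resolve_box" and "clip_to_media" and the sample coordinates are
       assumptions made for illustration.

```perl
use strict;
use warnings;

# Box type to fall back on when a box type is not defined anywhere.
my %fallback = (art => 'crop', trim => 'crop', bleed => 'crop', crop => 'media');

# Look up a box for a page: page setting first, then the global (PDF)
# setting, then the fallback box type. Boxes are [llx, lly, urx, ury].
sub resolve_box {
    my ($name, $page, $global) = @_;
    while (defined $name) {
        return $page->{$name}   if $page->{$name};
        return $global->{$name} if $global->{$name};
        $name = $fallback{$name};
    }
    return [0, 0, 612, 792];    # default Media Box: US Letter
}

# Clip a box so that it extends no farther than the Media Box.
sub clip_to_media {
    my ($box, $media) = @_;
    return [
        ($box->[0] > $media->[0] ? $box->[0] : $media->[0]),
        ($box->[1] > $media->[1] ? $box->[1] : $media->[1]),
        ($box->[2] < $media->[2] ? $box->[2] : $media->[2]),
        ($box->[3] < $media->[3] ? $box->[3] : $media->[3]),
    ];
}

my %global = (media => [0, 0, 612, 792], art => [72, 72, 540, 720]);
my %page   = (crop  => [0, 0, 595, 842]);   # A4-sized Crop Box on this page

my $art   = resolve_box('art',   \%page, \%global);  # global Art Box wins over page Crop Box
my $crop  = resolve_box('crop',  \%page, \%global);
my $media = resolve_box('media', \%page, \%global);
my $effective_crop = clip_to_media($crop, $media);

print "Art Box:            @$art\n";
print "Crop Box (as set):  @$crop\n";
print "Crop Box (clipped): @$effective_crop\n";
```

       Note how the query result for the Crop Box keeps the A4 height of 842
       points, while the effective (clipped) box stops at the US Letter
       height of 792 points.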
1269
1270   FONT METHODS
1271       Core Fonts
1272
       Core fonts are limited to single byte encodings. You cannot use UTF-8
       or other multibyte encodings with core fonts. The default encoding for
       the core fonts is WinAnsiEncoding (roughly the CP-1252 superset of
       ISO-8859-1). See the "encode" option below to change this encoding.
       See the "font automap" method in PDF::Builder::Resource::Font for
       information on accessing more than 256 glyphs in a font, using planes,
       although there is no guarantee that future changes to font files will
       permit consistent results.
1281
1282       Note that core fonts use fixed lists of expected glyphs, along with
1283       metrics such as their widths. This may not exactly match up with
1284       whatever local font file is used by the PDF reader. It's usually pretty
1285       close, but many cases have been found where the list of glyphs is
1286       different between the core fonts and various local font files, so be
1287       aware of this.
1288
       To allow UTF-8 text and extended glyph counts, you should consider
       replacing your use of core fonts with TrueType (.ttf) and OpenType
       (.otf) fonts. There are tools, such as FontForge, which can do a fairly
       good (though not perfect) job of converting a Type1 font library to
       OTF.
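       One way to decide whether a given text can stay with a core font is
       to test whether it fits a single byte encoding at all. The sketch
       below uses Perl's standard Encode module; the function name
       "fits_single_byte" is hypothetical, and "cp1252" is used as a
       stand-in for WinAnsiEncoding on the assumption that the two are close
       enough for this check.

```perl
use strict;
use warnings;
use Encode qw(encode);

# True if every character of $text can be represented in the single byte
# encoding $encoding (default cp1252, assumed close to WinAnsiEncoding).
sub fits_single_byte {
    my ($text, $encoding) = @_;
    $encoding = 'cp1252' unless defined $encoding;
    my $ok = eval {
        # FB_CROAK dies on an unmappable character; LEAVE_SRC keeps $text intact
        encode($encoding, $text, Encode::FB_CROAK | Encode::LEAVE_SRC);
        1;
    };
    return $ok ? 1 : 0;
}

for my $sample ("na\x{ef}ve caf\x{e9}", "\x{4e2d}\x{6587}") {
    print fits_single_byte($sample)
        ? "single byte encoding is enough\n"
        : "needs UTF-8 (consider ttfont)\n";
}
```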
1294
1295       Examples:
1296
           $font1 = $pdf->corefont('Times-Roman', encode => 'latin2');
           $font2 = $pdf->corefont('Times-Bold');
           $font3 = $pdf->corefont('Helvetica');
           $font4 = $pdf->corefont('ZapfDingbats');
1301
1302       Valid %options are:
1303
1304       encode
1305           Changes the encoding of the font from its default. Notice that the
1306           encoding (not the entire font's glyph list) is shown in a PDF
1307           object (record), listing 256 glyphs associated with this encoding
1308           (and that are available in this font).
1309
1310       dokern
1311           Enables kerning if data is available.
1312
1313       Notes:
1314
1315       Even though these are called "core" fonts, they are not shipped with
1316       PDF::Builder, but are expected to be found on the machine with the PDF
1317       reader. Most core fonts are installed with a PDF reader, and thus are
1318       not coordinated with PDF::Builder. PDF::Builder does ship with core
1319       font metrics files (width, glyph names, etc.), but these cannot be
1320       guaranteed to be in sync with what the PDF reader has installed!
1321
1322       There are some 14 core fonts (regular, italic, bold, and bold-italic
1323       for Times [serif], Helvetica [sans serif], Courier [fixed pitch]; plus
1324       two symbol fonts) that are supposed to be available on any PDF reader,
1325       although other fonts with very similar metrics are often substituted.
1326       You should not count on any of the 15 Windows core fonts (Bank Gothic,
1327       Georgia, Trebuchet, Verdana, and two more symbol fonts) being present,
1328       especially on Linux, Mac, or other non-Windows platforms. Be aware if
1329       you are producing PDFs to be read on a variety of different systems!
1330
       If you want to ensure the widest portability for a PDF document you
       produce, you should consider using TTF fonts (instead of core fonts)
       and embedding them in the document. This ensures that there will be no
       substitutions, that all metrics are known and match the glyphs, that
       UTF-8 encoding can be used, and that the glyphs will be available on
       the reader's machine. At least on Windows platforms, most of the fonts
       are TTF anyway, which are used behind the scenes for "core" fonts,
       while missing most of the capabilities of TTF (now or possibly later
       in PDF::Builder) such as embedding, ligatures, UTF-8, etc. The
       downside is, obviously, that the resulting PDF file will be larger
       because it includes the font(s). There might also be copyright or
       licensing issues with the redistribution of font files in this manner.
       You might want to check before widely distributing a PDF document with
       embedded fonts, although many licenses do permit the part of the font
       that is used to be embedded.
1345
1346       See also PDF::Builder::Resource::Font::CoreFont.
1347
1348       PS Fonts
1349
       PS (T1) fonts are limited to single byte encodings. You cannot use
       UTF-8 or other multibyte encodings with T1 fonts. The default encoding
       for the T1 fonts is WinAnsiEncoding (roughly the CP-1252 superset of
       ISO-8859-1). See the "encode" option below to change this encoding.
       See the "font automap" method in PDF::Builder::Resource::Font for
       information on accessing more than 256 glyphs in a font, using planes,
       although there is no guarantee that future changes to font files will
       permit consistent results. Note: many Type1 fonts are limited to 256
       glyphs, but some are available with more than 256 glyphs. Still, a
       maximum of 256 at a time are usable.
1360
1361       "psfont" accepts both ASCII (.pfa) and binary (.pfb) Type1 glyph files.
1362       Font metrics can be supplied in either ASCII (.afm) or binary (.pfm)
1363       format, as can be seen in the examples given below. It is possible to
1364       use .pfa with .pfm and .pfb with .afm if that's what's available. The
1365       ASCII and binary files have the same content, just in different
1366       formats.
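       Since any glyph file format may be paired with either metrics format,
       the option name depends only on the metrics file extension. A small
       helper along the following lines could pick it; "psfont_args" is a
       hypothetical name, and the file names are placeholders.

```perl
use strict;
use warnings;

# Build the argument list for psfont() from a glyph file and a metrics
# file, choosing the option name from the metrics file extension.
sub psfont_args {
    my ($glyph_file, $metrics_file) = @_;
    die "glyph file must be .pfa or .pfb\n"
        unless $glyph_file =~ /\.pf[ab]$/i;
    my $option = $metrics_file =~ /\.afm$/i ? 'afmfile'
               : $metrics_file =~ /\.pfm$/i ? 'pfmfile'
               : die "metrics file must be .afm or .pfm\n";
    return ($glyph_file, $option => $metrics_file);
}

# a binary glyph file paired with ASCII metrics
my @args = psfont_args('Times-Book.pfb', 'Times-Book.afm');
print join(', ', @args), "\n";
# with a PDF::Builder object in $pdf:  my $font = $pdf->psfont(@args);
```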
1367
       To allow UTF-8 text and extended glyph counts in one font, you should
       consider replacing your use of Type1 fonts with TrueType (.ttf) and
       OpenType (.otf) fonts. There are tools, such as FontForge, which can do
       a fairly good (though not perfect) job of converting your font library
       to OTF.
1373
1374       Examples:
1375
           $font1 = $pdf->psfont('Times-Book.pfa', afmfile => 'Times-Book.afm');
           $font2 = $pdf->psfont('/fonts/Synest-FB.pfb', pfmfile => '/fonts/Synest-FB.pfm');
1378
1379       Valid %options are:
1380
1381       encode
1382           Changes the encoding of the font from its default. Notice that the
1383           encoding (not the entire font's glyph list) is shown in a PDF
1384           object (record), listing 256 glyphs associated with this encoding
1385           (and that are available in this font).
1386
1387       afmfile
1388           Specifies the location of the ASCII font metrics file (.afm). It
1389           may be used with either an ASCII (.pfa) or binary (.pfb) glyph
1390           file.
1391
1392       pfmfile
1393           Specifies the location of the binary font metrics file (.pfm). It
1394           may be used with either an ASCII (.pfa) or binary (.pfb) glyph
1395           file.
1396
1397       dokern
1398           Enables kerning if data is available.
1399
1400       Note: these T1 (Type1) fonts are not shipped with PDF::Builder, but are
1401       expected to be found on the machine with the PDF reader. Most PDF
1402       readers do not install T1 fonts, and it is up to the user of the PDF
1403       reader to install the needed fonts. Unlike TrueType fonts, PS (T1)
1404       fonts are not embedded in the PDF, and must be supplied on the Reader
1405       end.
1406
1407       See also PDF::Builder::Resource::Font::Postscript.
1408
1409       TrueType Fonts
1410
       Warning: BaseEncoding is not set by default for TrueType fonts, so text
       in the PDF isn't searchable (by the PDF reader) unless a ToUnicode CMap
       is included. PDF::Builder includes a ToUnicode CMap by default
       (unicodemap set to 1), but allows it to be disabled (for performance
       and file size reasons) by setting unicodemap to 0. Disabling it will
       produce non-searchable text, which, besides being annoying to users,
       may prevent screen readers and other aids to disabled users from
       working correctly!
1418
1419       Examples:
1420
           $font1 = $pdf->ttfont('Times.ttf');
           $font2 = $pdf->ttfont('Georgia.otf');
1423
1424       Valid %options are:
1425
1426       encode
1427           Changes the encoding of the font from its default
1428           (WinAnsiEncoding).
1429
1430           Note that for a single byte encoding (e.g., 'latin1'), you are
1431           limited to 256 characters defined for that encoding. 'automap' does
1432           not work with TrueType.  If you want more characters than that, use
1433           'utf8' encoding with a UTF-8 encoded text string.
1434
1435       isocmap
1436           Use the ISO Unicode Map instead of the default MS Unicode Map.
1437
1438       unicodemap
1439           If 1 (default), output ToUnicode CMap to permit text searches and
1440           screen readers. Set to 0 to save space by not including the
1441           ToUnicode CMap, but text searching and screen reading will not be
1442           possible.
1443
1444       dokern
1445           Enables kerning if data is available.
1446
1447       noembed
           Disables embedding of the font file. Note that this is potentially
           hazardous, as the glyphs provided on the PDF reader machine may not
           match what was used on the PDF writer machine (the one running
           PDF::Builder)! If you know for sure that all PDF readers will be
           using the same TTF or OTF file you're using with PDF::Builder, not
           embedding the font may be acceptable, in return for a smaller PDF
           file size. Note that the Reader needs to know where to find the
           font file -- it can't be in any random place, but typically needs
           to be listed in a path that the Reader follows. Otherwise, it will
           be unable to render the text!
1458
1459           The only value for the "noembed" flag currently checked for is 1,
1460           which means to not embed the font file in the PDF. Any other value
1461           currently results in the font file being embedded (by default),
1462           although in the future, other values might be given significance
1463           (such as checking permission bits).
1464
1465           Some additional comments on embedding font file(s) into the PDF:
1466           besides substantially increasing the size of the PDF (even if the
1467           font is subsetted, by default), PDF::Builder does not check the
1468           font file for any flags indicating font licensing issues and
1469           limitations on use. A font foundry may not permit embedding at all,
1470           may permit a subset of the font to be embedded, may permit a full
1471           font to be embedded, and may specify what can be done with an
1472           embedded font (e.g., may or may not be extracted for further use
1473           beyond displaying this one PDF). When you choose to use (and embed)
1474           a font, you should be aware of any such licensing issues.
1475
1476       nosubset
           Disables subsetting of a TTF/OTF font, when embedded. By default,
           only the glyphs used by a document are included in the file, and
           not the entire font. This can result in a tremendous savings in
           PDF file size. If you intend to allow the PDF to be edited by
           users, not having the entire font glyph set available may cause
           problems, so be aware of that (and consider using "nosubset =>
           1"). Setting this flag to any value results in the entire font
           glyph set being embedded in the file. It might be a good idea to
           use only the value 1, in case other values are assigned roles in
           the future.
1486
1487       debug
1488           If set to 1 (default is 0), diagnostic information is output about
1489           the CMap processing.
1490
1491       usecmf
1492           If set to 1 (default is 0), the first priority is to make use of
1493           one of the four ".cmap" files for CJK fonts. This is the old way of
1494           processing TTF files. If, after all is said and done, a working
1495           internal CMap hasn't been found (for usecmf=>0), "ttfont()" will
1496           fall back to using a ".cmap" file if possible.
1497
1498       cmaps
           This flag may be set to a string listing the Platform/Encoding
           pairs of the internal CMaps to look for in the font file, in the
           desired order (highest priority first). If one list (comma and/or
           space-separated pairs) is given, it is used for both Windows and
           non-Windows platforms (the platform on which PDF::Builder is
           running, not the PDF reader's). Two lists, separated by a
           semicolon (;), may be given, with the first being used for a
           Windows platform and the second for non-Windows. The default list
           is "0/6 3/10 0/4 3/1 0/3; 0/6 0/4 3/10 0/3 3/1". Finally, instead
           of a P/E list, the string "find_ms" may be given to simply call the
           Font::TTF "find_ms()" method to find a (preferably Windows)
           internal CMap. Setting "cmaps" to 'find_ms' emulates the old way of
           looking for CMaps. Symbol fonts (3/0) always use find_ms(). The new
           default lookup (if ".cmap" isn't used, see "usecmf") is to try to
           get a match with the default list for the appropriate OS. If none
           can be found, find_ms() is tried, and as a last resort the ".cmap"
           file is used (if available), even if "usecmf" is not 1.
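       To make the format of the "cmaps" string concrete, the following
       sketch splits such a string into an ordered list of Platform/Encoding
       pairs for the current platform. It is a model of the format as
       described above, not the parsing code used inside PDF::Builder, and
       "cmaps_for_platform" is a hypothetical name.

```perl
use strict;
use warnings;

# Split a "cmaps" option string into an ordered list of [platform, encoding]
# pairs for the platform the script runs on. The keyword 'find_ms' is
# returned unchanged when it is given instead of a list.
sub cmaps_for_platform {
    my ($cmaps, $is_windows) = @_;
    return 'find_ms' if $cmaps =~ /^\s*find_ms\s*$/;
    my @lists = split /;/, $cmaps;
    # one list serves both platforms; with two, the second is for non-Windows
    my $list = (@lists > 1 && !$is_windows) ? $lists[1] : $lists[0];
    my @pairs;
    for my $pair (grep { length } split /[\s,]+/, $list) {
        my ($platform, $encoding) = split m{/}, $pair;
        push @pairs, [$platform, $encoding];
    }
    return \@pairs;
}

my $default = '0/6 3/10 0/4 3/1 0/3; 0/6 0/4 3/10 0/3 3/1';
my $pairs   = cmaps_for_platform($default, $^O eq 'MSWin32');
print join(' ', map { "$_->[0]/$_->[1]" } @$pairs), "\n";
```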
1516
1517       CJK Fonts
1518
1519       Examples:
1520
           $font = $pdf->cjkfont('korean');
           $font = $pdf->cjkfont('traditional');
1523
1524       Valid %options are:
1525
1526       encode
1527           Changes the encoding of the font from its default.
1528
       Warning: Unlike "ttfont", the font file is not embedded in the output
       PDF file. This is evidently behavior left over from the early days of
       CJK fonts, where the "Cmap" and "Data" were always external files,
       rather than internal tables. If you need a CJK-using PDF file to embed
       the font, for portability, you can create a PDF using "cjkfont", and
       then use an external utility (e.g., "pdfcairo") to embed the font in
       the PDF. It may also be possible to use "ttfont" instead, to produce
       the PDF, provided you can deduce the correct font file name from
       examining the PDF file (e.g., on my Windows system, the "Ming" font
       would be "$font = $pdf->ttfont("C:/Program Files/Adobe/Acrobat
       DC/Resource/CIDFont/AdobeMingStd-Light.otf")"). Of course, the font
       file used would have to be ".ttf" or ".otf". It may act a little
       differently than "cjkfont" (due to a different Cmap), but you should be
       able to embed the font file into the PDF.
1543
       See also PDF::Builder::Resource::CIDFont::CJKFont.
1545
1546       Synthetic Fonts
1547
       Warning: BaseEncoding is not set by default for these fonts, so text in
       the PDF isn't searchable (by the PDF reader) unless a ToUnicode CMap is
       included. PDF::Builder includes a ToUnicode CMap by default
       (unicodemap set to 1), but allows it to be disabled (for performance
       and file size reasons) by setting unicodemap to 0. Disabling it will
       produce non-searchable text, which, besides being annoying to users,
       may prevent screen readers and other aids to disabled users from
       working correctly!
1555
1556       Examples:
1557
           $cf  = $pdf->corefont('Times-Roman', encode => 'latin1');
           $sf  = $pdf->synfont($cf, condense => 0.85);   # condensed to 85% width
           $sfb = $pdf->synfont($cf, bold => 1);          # embolden by 10em
           $sfi = $pdf->synfont($cf, oblique => -12);     # italic at -12 degrees
1562
1563       Valid %options are:
1564
1565       condense
1566           Character width condense/expand factor (0.1-0.9 = condense, 1 =
1567           normal/default, 1.1+ = expand). It is the multiplier to apply to
1568           the width of each character.
1569
1570       oblique
1571           Italic angle (+/- degrees, default 0), sets skew of character box.
1572
1573       bold
1574           Emboldening factor (0.1+, bold = 1, heavy = 2, ...), additional
1575           thickness to draw outline of character (with a heavier line width)
1576           before filling.
1577
1578       space
           Additional character spacing in milliems (0-1000).
1580
1581       caps
1582           0 for normal text, 1 for small caps.  Implemented by asking the
1583           font what the uppercased translation (single character) is for a
1584           given character, and outputting it at 80% height and 88% width
1585           (heavier vertical stems are better looking than a straight 80%
1586           scale).
1587
1588           Note that only lower case letters which appear in the "standard"
1589           font (plane 0 for core fonts and PS fonts) will be small-capped.
1590           This may include eszett (German sharp s), which becomes SS, and
1591           dotless i and j which become I and J respectively. There are many
1592           other accented Latin alphabet letters which may show up in planes 1
1593           and higher. Ligatures (e.g., ij and ffl) do not have uppercase
1594           equivalents, nor does a long s. If you have text which includes
1595           such characters, you may want to consider preprocessing it to
1596           replace them with Latin character expansions (e.g., i+j and f+f+l)
1597           before small-capping.
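       The preprocessing suggested for the "caps" option could look roughly
       like the following. It assumes the text is a decoded Perl character
       string, and it covers only a handful of common ligatures; the mapping
       table and the name "expand_for_small_caps" are illustrative and not
       part of PDF::Builder.

```perl
use strict;
use warnings;

# Replacements for characters that have no single uppercase equivalent.
my %expansion = (
    "\x{FB00}" => 'ff',  "\x{FB01}" => 'fi',  "\x{FB02}" => 'fl',
    "\x{FB03}" => 'ffi', "\x{FB04}" => 'ffl', "\x{FB06}" => 'st',
    "\x{0133}" => 'ij',  "\x{017F}" => 's',   # ij ligature, long s
);

# Replace each listed ligature (or long s) with its plain letter expansion.
sub expand_for_small_caps {
    my ($text) = @_;
    my $class = join '', keys %expansion;
    $text =~ s/([$class])/$expansion{$1}/g;
    return $text;
}

my $word = "o\x{FB03}ce";     # "office" written with the ffi ligature
print expand_for_small_caps($word), "\n";
```

       The expanded string can then be passed to a font created with
       "caps => 1", so that each letter has an uppercase form to draw from.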
1598
       Note that CJK fonts (created with the "cjkfont" method) do not work
       properly with "synfont". This is due to a different internal structure
       of the CJK fonts, as compared to corefont, ttfont, and psfont base
       fonts. If you require a synthesized (modified) CJK font, you might try
       finding the TTF or OTF original, using "ttfont" to create the base
       font, and running "synfont" against that, in the manner described for
       embedding "CJK Fonts".
1606
       See also PDF::Builder::Resource::Font::SynFont.
1608
1609   IMAGE METHODS
       This is additional information on enhanced libraries available for TIFF
       and PNG images. See specific information listings for GD, GIF, JPEG,
       and PNM image formats. In addition, see "examples/Content.pl" for an
       example of placing an image on a page, as well as using it in a "Form".
1614
1615       Why is my image flipped or rotated?
1616
       Something not uncommonly seen when using JPEG photos in a PDF is that
       the images will be rotated and/or mirrored (flipped). This may happen
       when using TIFF images too. What happens is that the camera stores an
       image just as it comes off the CCD sensor, regardless of the camera
       orientation, and does not rotate it to the correct orientation! It does
       store a separate "orientation" flag to suggest how the image might be
       corrected, but not all image processing obeys this flag (PDF::Builder
       does not). For example, if you take a "portrait" (tall) photo of a
       tree (with the phone held vertically), and then use it in a PDF, the
       tree may appear to have been cut down (it appears in landscape mode)!
1627
1628       I have found some code that should allow the "image_jpeg" or "image"
1629       routine to auto-rotate to (supposedly) the correct orientation, by
1630       looking for the Exif metadata "Orientation" tag in the file. However,
1631       three problems arise: 1) if a photo has been edited, and rotated or
1632       flipped in the process, there is no guarantee that the Orientation tag
1633       has been corrected.  2) more than one Orientation tag may exist (e.g.,
1634       in the binary APP1/Exif header, and in XML data), and they may not
1635       agree with each other -- which should be used?  3) the code would need
1636       to uncompress the raster data, swap and/or transpose rows and/or
1637       columns, and recompress the raster data for inclusion into the PDF.
1638       This is costly and error-prone.  In any case, the user would need to be
1639       able to override any auto-rotate function.
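       For reference, the eight Exif Orientation values and the correction
       each one calls for can be kept in a simple lookup table. The sketch
       below only interprets a value that has already been read from the
       file by some other means; it does not read Exif data, and the names
       "%fix_for", "orientation_fix", and "needs_fix" are hypothetical. The
       wording of the corrections follows the descriptions commonly used by
       Exif tools.

```perl
use strict;
use warnings;

# Exif Orientation value => correction to apply for upright display.
my %fix_for = (
    1 => 'none',                                  # "Top-left", normal
    2 => 'mirror horizontal',
    3 => 'rotate 180',
    4 => 'mirror vertical',
    5 => 'mirror horizontal and rotate 270 CW',
    6 => 'rotate 90 CW',
    7 => 'mirror horizontal and rotate 90 CW',
    8 => 'rotate 270 CW',
);

# Describe the correction for an Orientation value; unknown or missing
# values are treated as normal (1).
sub orientation_fix {
    my ($value) = @_;
    $value = 1 unless defined $value && exists $fix_for{$value};
    return $fix_for{$value};
}

# True if the image needs rotating and/or mirroring before display.
sub needs_fix {
    my ($value) = @_;
    return orientation_fix($value) ne 'none' ? 1 : 0;
}

printf "Orientation 3: %s\n", orientation_fix(3);
```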
1640
1641       For the time being, PDF::Builder will simply leave it up to the user of
1642       the library to take care of rotating and/or flipping an image which
1643       displays incorrectly. It is possible that we will consider adding some
1644       sort of query or warning that the image appears to not be "normally"
1645       oriented (Orientation value 1 or "Top-left"), according to the
1646       Orientation flag. You can consider either (re-)saving the photo in an
1647       editor such as PhotoShop or GIMP, or using PDF::Builder code similar to
1648       the following (for images rotated 180 degrees):
1649
           my $pW = 612; my $pH = 792;  # page dimensions (US Letter)
           my $img = $pdf->image_jpeg("AliceLake.jpeg");
           # raw size WxH 4032x3024, scaled down to 504x378
           my $sW = 4032/8; my $sH = 3024/8;
           # intent is to center on US Letter sized page (LL at 54,207)
           # Orientation flag on this image is 3 (rotated 180 degrees).
           # if naively displayed (just $gfx->image call), it will be upside down

           $gfx->save();

           ## method 0: simple display, is rotated 180 degrees!
           #$gfx->image($img, ($pW-$sW)/2,($pH-$sH)/2, $sW,$sH);

           ## method 1: translate, then rotate
           #$gfx->translate($pW,$pH);             # to new origin (media UR corner)
           #$gfx->rotate(180);                    # rotate around new origin
           #$gfx->image($img, ($pW-$sW)/2,($pH-$sH)/2, $sW,$sH);
                                                  # image's UR corner, not LL

           # method 2: rotate, then translate
           $gfx->rotate(180);                     # rotate around current origin
           $gfx->translate(-$sW,-$sH);            # translate in rotated coordinates
           $gfx->image($img, -($pW-$sW)/2,-($pH-$sH)/2, $sW,$sH);
                                                  # image's UR corner, not LL

           ## method 3: flip (mirror) twice
           #my $scale = 1;  # not rescaling here
           #my $size_page = $pH/$scale;
           #my $invScale = 1.0/$scale;
           #$gfx->add("-$invScale 0 0 -$invScale 0 $size_page cm");
           #$gfx->image($img, -($pW-$sW)/2-$sW,($pH-$sH)/2, $sW,$sH);

           $gfx->restore();
1683
1684       If your image is also mirrored (flipped about an axis), simple rotation
1685       will not suffice. You could do something with a reversal of the
1686       coordinate system, as in "method 3" above (see "Advanced Methods" in
1687       PDF::Builder::Content). To mirror only left/right, the second $invScale
1688       would be positive; to mirror only top/bottom, the first would be
1689       positive. If all else fails, you could save a mirrored copy in a photo
1690       editor.  90 or 270 degree rotations will require a "rotate" call,
1691       possibly with "cm" usage to reverse mirroring.  Incidentally, do not
1692       confuse this issue with the coordinate flipping performed by some
1693       Chrome browsers when printing a page to PDF.
1694
1695       Note that TIFF images may have the same rotation/mirroring problems as
1696       JPEG, which is not surprising, as the Exif format was lifted from TIFF
1697       for use in JPEG. The cure will be similar to JPEG's.
1698
1699       TIFF Images
1700
1701       Note that the Graphics::TIFF support library does not currently permit
1702       a filehandle for $file.
1703
1704       PDF::Builder will use the Graphics::TIFF support library for TIFF
1705       functions, if it is available, unless explicitly told not to. Your code
1706       can test whether Graphics::TIFF is available by examining
1707       "$tiff->usesLib()" or "$pdf->LA_GT()".
1708
1709       = -1
1710           Graphics::TIFF is installed, but your code has specified "nouseGT",
1711           to not use it. The old, pure Perl, code (buggy!) will be used
1712           instead, as if Graphics::TIFF was not installed.
1713
1714       = 0 Graphics::TIFF is not installed. Not all systems are able to
1715           successfully install this package, as it requires libtiff.a.
1716
1717       = 1 Graphics::TIFF is installed and is being used.
1718
1719       Options:
1720
1721       nouseGT => 1
1722           Do not use the Graphics::TIFF library, even if it's available.
1723           Normally you would want to use this library, but there may be cases
1724           where you don't, such as when you want to use a file handle instead
1725           of a name.
1726
1727       silent => 1
1728           Do not give the message that Graphics::TIFF is not installed. This
1729           message will be given only once, but you may want to suppress it,
1730           such as during t-tests.
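       The three status values listed above can be modeled with an ordinary
       "require" test. This is a sketch of the meaning of the codes only,
       and does not reproduce how PDF::Builder computes them;
       "library_status" is a hypothetical helper, and its second argument
       plays the role of the "nouseGT" option.

```perl
use strict;
use warnings;

# Model of the status codes: 1 = library installed and used,
# 0 = not installed, -1 = installed but the caller asked not to use it.
sub library_status {
    my ($module, $nouse) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;     # Some::Module -> Some/Module.pm
    my $installed = eval { require $file; 1 } ? 1 : 0;
    return 0 unless $installed;
    return $nouse ? -1 : 1;
}

printf "Graphics::TIFF status: %d\n", library_status('Graphics::TIFF', 0);
```

       The same pattern applies to any optional support library, and it is
       the approach used for HarfBuzz::Shaper later in this document.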
1731
1732       PNG Images
1733
1734       PDF::Builder will use the Image::PNG::Libpng support library for PNG
1735       functions, if it is available, unless explicitly told not to. Your code
1736       can test whether Image::PNG::Libpng is available by examining
1737       "$png->usesLib()" or "$pdf->LA_IPL()".
1738
1739       = -1
1740           Image::PNG::Libpng is installed, but your code has specified
1741           "nouseIPL", to not use it. The old, pure Perl, code (slower and
1742           less capable) will be used instead, as if Image::PNG::Libpng was
1743           not installed.
1744
1745       = 0 Image::PNG::Libpng is not installed. Not all systems are able to
1746           successfully install this package, as it requires libpng.a.
1747
1748       = 1 Image::PNG::Libpng is installed and is being used.
1749
1750       Options:
1751
1752       nouseIPL => 1
1753           Do not use the Image::PNG::Libpng library, even if it's available.
1754           Normally you would want to use this library, when available, but
1755           there may be cases where you don't.
1756
1757       silent => 1
1758           Do not give the message that Image::PNG::Libpng is not installed.
1759           This message will be given only once, but you may want to suppress
1760           it, such as during t-tests.
1761
1762       notrans => 1
1763           No transparency -- ignore tRNS chunk if provided, ignore Alpha
1764           channel if provided.
1765
1766   USING SHAPER (HarfBuzz::Shaper library)
1767           # if HarfBuzz::Shaper is not installed, either bail out, or try to
1768           # use regular TTF calls instead
1769           my $rc;
1770           $rc = eval {
1771               require HarfBuzz::Shaper;
1772               1;
1773           };
1774           if (!defined $rc) { $rc = 0; }
1775           if ($rc == 0) {
1776               # bail out in some manner
1777           } else {
1778               # can use Shaper
1779           }
1780
1781           my $fontfile = '/WINDOWS/Fonts/times.ttf'; # used by both Shaper and textHS
1782           my $fontsize = 15;                         # used by both Shaper and textHS
1783           my $font = $pdf->ttfont($fontfile);
1784           $text->font($font, $fontsize);
1785
1786           my $hb = HarfBuzz::Shaper->new(); # only need to set up once
1787           my %settings; # for textHS(), not Shaper
1788           $settings{'dump'} = 1; # see the diagnostics
1789           $settings{'script'} = 'Latn';
1790           $settings{'dir'} = 'L';  # LTR
1791           $settings{'features'} = [];  # required (array ref, may be empty)
1792
1793           # -- set language (override automatic setting)
1794           #$settings{'language'} = 'en';
1795           #$hb->set_language( 'en_US' );
1796           # -- turn OFF ligatures
1797           #push @{ $settings{'features'} }, 'liga';
1798           #$hb->add_features( 'liga' );
1799           # -- turn OFF kerning
1800           #push @{ $settings{'features'} }, 'kern';
1801           #$hb->add_features( 'kern' );
1802           $hb->set_font($fontfile);
1803           $hb->set_size($fontsize);
1804           $hb->set_text("Let's eat waffles in the field for brunch.");
1805             # expect ffl and fi ligatures, and perhaps some kerning
1806
1807           my $info = $hb->shaper();
1808           $text->textHS($info, \%settings); # strikethru, underline allowed
1809
1810       The package HarfBuzz::Shaper may be optionally installed in order to
1811       use the text-shaping capabilities of the HarfBuzz library. These
1812       include kerning and ligatures in Western scripts (such as the Latin
1813       alphabet). More complex scripts, such as the Arabic family and the
1814       Indic scripts, can also be handled; there, multiple forms of a
1815       character may be automatically selected, characters may be reordered,
1816       and other modifications made. The examples/HarfBuzz.pl script shows
1817       some of what may be done.
1818
1819       Keep in mind that HarfBuzz works only with TrueType (.ttf) and OpenType
1820       (.otf) font files. It will not work with PostScript (Type1), core,
1821       bitmapped, or CJK fonts. Not all .ttf fonts have the instructions
1822       necessary to guide HarfBuzz, but most proper .otf fonts do. In other
1823       words, there are no guarantees that a particular font file will work
1824       with Shaper!
1825
1826       The basic idea is to break up text into "chunks" which are of the same
1827       script (alphabet), language, direction, font face, font size, and
1828       variant (italic, bold, etc.). These could range from a single character
1829       to paragraph-length strings of text. These are fed to HarfBuzz::Shaper,
1830       along with flags, the font file to be used, and other supporting
1831       information, to create an array of output glyphs. Each element is a
1832       hash describing the glyph to be output, including its name (if
1833       available), its glyph ID (number) in the selected font, its x and y
1834       displacement (usually 0), and its "advance" x and y values, all in
1835       points. For horizontal languages (LTR and RTL), the y advance is
1836       normally 0 and the x advance is the font's character width, less any
1837       kerning amount.
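
       To make the shape of that array concrete, here is a small sketch
       that uses hand-written data in place of real Shaper output. All of
       the numbers are invented. The 'ax', 'dx', and 'dy' keys are the ones
       named in this document; 'ay', 'g' (glyph ID), and 'name' are
       assumptions about how the remaining fields are keyed, and
       "chunk_width()" is a hypothetical helper.

```perl
use strict;
use warnings;

# Invented sample in the general shape of a Shaper result: an array of
# hashes, one per output glyph. Key names 'g', 'name' and 'ay' are
# assumptions; all values are made up for illustration.
my $info = [
    { name => 'L', g => 47, dx => 0, dy => 0, ax => 9.0, ay => 0 },
    { name => 'e', g => 72, dx => 0, dy => 0, ax => 6.5, ay => 0 },
    { name => 't', g => 87, dx => 0, dy => 0, ax => 4.0, ay => 0 },
];

# Hypothetical helper: for horizontal text, the chunk's width is the sum
# of the x advances of its glyphs.
sub chunk_width {
    my ($glyphs) = @_;
    my $width = 0;
    $width += $_->{'ax'} for @$glyphs;
    return $width;
}

printf "%d glyphs, %.2f points wide\n", scalar @$info, chunk_width($info);
# prints "3 glyphs, 19.50 points wide"
```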
1838
1839       Shaper will attempt to determine the script and the text direction
1840       from the Unicode range of the text, and will make a reasonable guess
1841       at the language. The language can be overridden, but currently the
1842       script and text direction cannot.
1843
1844       An important note: the number of glyphs (array elements) may not be
1845       equal to the number of Unicode points (characters) given in the chunk's
1846       text string!  Sometimes a character will be decomposed into several
1847       pieces (multiple glyphs); sometimes multiple characters may be combined
1848       into a single ligature glyph; and characters may be reordered
1849       (especially in Indic and Southeast Asian languages).  As well, for
1850       Right-to-Left (bidirectional) scripts such as Hebrew and Arabic
1851       families, the text is output in Left-to-Right order (reversed from the
1852       input).
1853
1854       With due care, a Shaper array can be manipulated in code. The elements
1855       are more or less independent of each other, so elements can be
1856       modified, rearranged, inserted, or deleted. You might adjust the
1857       position of a glyph with 'dx' and 'dy' hash elements. The 'ax' value
1858       should be left alone, so that the wrong kerning isn't calculated, but
1859       you might need to adjust the "advance x" value by means of one of the
1860       following:
1861
1862       axs   value (points) to be substituted for 'ax'
1863       axsp  percentage of the original 'ax' to be substituted for it
1864       axr   reduces 'ax' by the value (points); negative increases 'ax'
1865       axrp  reduces 'ax' by the given percentage; again, a negative value
1866             increases 'ax'
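
       One way to read the four adjustment keys is sketched below. This is
       only an interpretation of the descriptions above, not PDF::Builder's
       own code: "effective_ax()" is a hypothetical helper, and the order
       in which the keys are checked when more than one is present is an
       assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: compute the advance that would result from a
# glyph's 'ax' and at most one of the adjustment keys. The precedence
# (axs, then axsp, then axr, then axrp) is an assumption.
sub effective_ax {
    my ($glyph) = @_;
    my $ax = $glyph->{'ax'};
    return $glyph->{'axs'}                    if defined $glyph->{'axs'};   # substitute
    return $ax * $glyph->{'axsp'} / 100       if defined $glyph->{'axsp'};  # % of original
    return $ax - $glyph->{'axr'}              if defined $glyph->{'axr'};   # reduce by points
    return $ax * (1 - $glyph->{'axrp'} / 100) if defined $glyph->{'axrp'};  # reduce by %
    return $ax;                                                             # unchanged
}

print effective_ax({ ax => 10, axr  => -2 }), "\n";  # negative 'axr' widens: 12
print effective_ax({ ax => 10, axsp => 50 }), "\n";  # half the original: 5
```

       Leaving 'ax' itself untouched, as advised above, means the original
       value is still available if the adjustment later needs to be undone.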
1867
1868       Caution: a given character's glyph ID is not necessarily going to be
1869       the same between any two fonts! For example, an ASCII space (U+0020)
1870       might be "<0001>" in one font, and "<0003>" in another font (even one
1871       closely related!). A U+00A0 required blank (non-breaking space) may be
1872       output as a regular ASCII space U+0020. Take care if you need to find a
1873       particular glyph in the array, especially if the number of elements
1874       doesn't match. Consider making a text string of "marker" characters
1875       (space, nbsp, hyphen, soft hyphen, etc.) and processing it through
1876       HarfBuzz::Shaper to get the corresponding glyph numbers. You may have
1877       to count spaces, say, to see where you could break a glyph array to fit
1878       a line.
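
       The idea of counting spaces to break a glyph array can be sketched
       as follows, for left-to-right text only. This is a simplified
       illustration, not a PDF::Builder facility: "split_glyphs()" is a
       hypothetical helper, the 'g' key for the glyph ID is an assumption,
       and the sample glyph IDs and widths are invented. In practice the
       space glyph's ID would be found by shaping a marker string with the
       same font, as suggested above.

```perl
use strict;
use warnings;

# Hypothetical helper: split a glyph array into lines no wider than
# $max_width points, breaking only at glyphs whose ID is $space_gid.
# A single word wider than $max_width is left to overflow its line.
sub split_glyphs {
    my ($glyphs, $space_gid, $max_width) = @_;

    # Group glyphs into words, remembering each separating space glyph.
    my @words = ([]);
    my @spaces;
    for my $glyph (@$glyphs) {
        if ($glyph->{'g'} == $space_gid) {
            push @spaces, $glyph;
            push @words, [];
        } else {
            push @{ $words[-1] }, $glyph;
        }
    }

    # Greedily fill lines; the space at a line break is dropped.
    my @lines = ([]);
    my $width = 0;
    for my $i (0 .. $#words) {
        my $word_width = 0;
        $word_width += $_->{'ax'} for @{ $words[$i] };
        if ($i > 0) {
            my $space = $spaces[$i - 1];
            if ($width + $space->{'ax'} + $word_width <= $max_width) {
                push @{ $lines[-1] }, $space;
                $width += $space->{'ax'};
            } else {
                push @lines, [];
                $width = 0;
            }
        }
        push @{ $lines[-1] }, @{ $words[$i] };
        $width += $word_width;
    }
    return @lines;
}

# Invented data: every letter is 5 points wide, the space (ID 3) is 2.
my @glyphs = map { +{ g => ($_ eq ' ' ? 3 : ord($_)), ax => ($_ eq ' ' ? 2 : 5) } }
             split //, 'ab cd ef';

my @lines = split_glyphs(\@glyphs, 3, 25);
printf "line %d: %d glyphs\n", $_ + 1, scalar @{ $lines[$_] } for 0 .. $#lines;
# prints "line 1: 5 glyphs" and "line 2: 2 glyphs"
```

       Each resulting sub-array could then be handed to "textHS()" as one
       line. Scripts in which glyphs are reordered or combined would need
       more care than this sketch takes.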
1879
1880       The "advancewidthHS()" method uses the same inputs as does "textHS()".
1881       Like "advancewidth()", it returns the chunk length in points. Unlike
1882       "advancewidth()", you cannot override the glyph array's font, font
1883       size, etc.
1884
1885       Once you have your (possibly modified) array of glyphs, you feed it to
1886       the "textHS()" method to render it to the page. Remember that this
1887       method handles only a single line of text; it does not do line
1888       splitting or fitting -- that you currently need to do manually. For
1889       Western scripts (e.g., Latin), that might not be too difficult. For
1890       other scripts that involve extensive modification of the raw
1891       characters, it may be quite difficult to split words, although you
1892       may still be able to split at inter-word spaces.
1893
1894       A useful, but not exhaustive, set of functions is supported by
1895       "textHS()".  Support includes direction setting (top-to-bottom and
1896       bottom-to-top directions, e.g., for Far Eastern languages in
1897       traditional orientation), and explicit script names and language
1898       (depending on what support HarfBuzz itself gives).  Not yet supported
1899       are features such as discretionary ligatures and manual selection of
1900       glyphs (e.g., swashes and alternate forms).
1901
1902       Currently, "textHS()" can only handle a single text string. We are
1903       looking at how fitting to a line length (splitting up an array) could
1904       be done, as well as how words might be split on hard and soft hyphens.
1905       At some point, full paragraph and page shaping could be possible.
1906
1907
1908
1909perl v5.36.0                      2022-09-13             PDF::Builder::Docs(3)