PDF::Builder::Docs(3)  User Contributed Perl Documentation  PDF::Builder::Docs(3)

NAME

       PDF::Builder::Docs - additional documentation for the PDF::Builder module


SOME SPECIAL NOTES

9   Software Development Kit
10       There are four levels of involvement with PDF::Builder. Depending on
11       what you want to do, different kinds of installs are recommended.
12
13       1. Simply installing PDF::Builder as a prerequisite for running some
14       other package. All you need to do is install the CPAN package for
15       PDF::Builder, and it will load the .pm files into your Perl library. If
16       the other package prereqs PDF::Builder, its installer may download and
17       install PDF::Builder automatically.
18
19       2. You want to write a Perl program that uses PDF::Builder functions.
20       In addition to installing PDF::Builder from CPAN, you will want
21       documentation on it. Obtain a copy of the product from GitHub
22       (https://github.com/PhilterPaper/Perl-PDF-Builder) or as a gzipped tar
23       file from CPAN.  This includes a utility to build (from POD) a library
24       of HTML documents, as well as examples (examples/ directory) and
25       contributed sample programs (contrib/ directory).
26
27       3. You want to modify PDF::Builder files. In addition to the CPAN and
28       GitHub distributions, you may choose to keep a local Git repository for
29       tracking your changes. Depending on whether or not your PDF::Builder
30       copy is being used for production purposes, you may want to do your
31       editing and testing in the Perl library installation (live) or in a
32       different place. The "t" tests (t/ directory) and examples provide good
33       regression tests to ensure that you haven't broken anything. If you do
34       your editing on the live code, don't forget when done to copy the
35       changes back into the master version you keep!
36
37       4. You want to contribute to the development of PDF::Builder. You will
38       need a local Git repository (and a GitHub account), so that when you've
39       got it all done, you can issue a "Pull Request" to bring it to our
40       attention. We can't guarantee that your work will be incorporated into
41       the project, but at least we will look at it. From time to time, a new
42       CPAN version will be issued.
43
44       If you want to make substantial changes for public use, and can't come
45       to a meeting of minds with us, you can even start your own GitHub
46       project and register a new CPAN project (that's what we did, forking
47       PDF::API2). Please don't just assume that we don't want your changes --
48       at least propose what you want to do in writing, so we can consider it.
49       We're always looking for people to help out and expand PDF::Builder.
50
51   Optional Libraries
52       PDF::Builder can make use of some optional libraries, which are not
53       required for a successful installation. If you want improved speed and
54       capabilities for certain functions, you may want to install and use
55       these libraries:
56
57       * Graphics::TIFF -- PDF::Builder inherited a rather slow, buggy, and
58       limited TIFF image library from PDF::API2. If Graphics::TIFF (available
59       on CPAN, uses libtiff.a) is installed, PDF::Builder will use that
60       instead, unless you specify that it is to use the old, pure Perl
61       library. The only time you might want to consider this is when you need
62       to pass an open filehandle to "image_tiff" instead of a file name. See
63       resolved bug reports RT 84665 and RT 118047, as well as "image_tiff",
64       for more information.
65
66       * Image::PNG::Libpng -- PDF::Builder inherited a rather slow and buggy
67       pure Perl PNG image library from PDF::API2. If Image::PNG::Libpng
68       (available on CPAN, uses libpng.a) is installed, PDF::Builder will use
69       that instead, unless you specify that it is to use the old, pure Perl
70       library. Using the new library will give you improved speed, the
71       ability to use 16 bit samples, and the ability to read interlaced PNG
72       files. See resolved bug report RT 124349, as well as "image_png", for
73       more information.
74
75       * HarfBuzz::Shaper -- This library enables PDF::Builder to handle
76       complex scripts (Arabic, Devanagari, etc.) as well as non-LTR writing
77       systems. It is also useful for Latin and other simple scripts, for
78       ligatures and improved kerning. HarfBuzz::Shaper is based on a set of
79       HarfBuzz libraries, which it will attempt to build if they are not
80       found. See "textHS" for more information.
81
82       * Text::Markdown -- This library is used if you want to format
83       "Markdown" style code in PDF::Builder, via the column() method. It
84       translates a certain dialect of Markdown into HTML, which is then
85       further processed.
86
87       * HTML::TreeBuilder -- This library is used to format HTML input into a
88       data structure which PDF::Builder can interpret, via the column()
89       method.  Note that if Markdown input is used, it will also need
90       HTML::TreeBuilder to handle the HTML the Markdown is translated to.
91
       Note that the installation process will not attempt to install these
       libraries automatically. If you don't wish to use one or more of them,
       you are free not to install them. If you think you may want to use one
       or more, consider installing them before installing PDF::Builder, so
       that any t-tests and examples that make use of these libraries can be
       run during installation and checkout of PDF::Builder. Remember, you
       can always install an optional library later.
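
       As a minimal sketch of checking which of the optional libraries listed
       above are present, the following uses only core Perl. The helper name
       "module_available" is hypothetical and is not part of PDF::Builder; it
       only reports whether each module can be loaded.

```perl
use strict;
use warnings;

# Optional libraries named in this section
my @optional = qw(
    Graphics::TIFF
    Image::PNG::Libpng
    HarfBuzz::Shaper
    Text::Markdown
    HTML::TreeBuilder
);

# Hypothetical helper: true if the named module can be loaded
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

my %have = map { $_ => module_available($_) } @optional;
for my $module (@optional) {
    printf "%-20s %s\n", $module,
        $have{$module} ? 'installed' : 'not installed';
}
```

       Running this before installing PDF::Builder shows which optional
       t-tests and examples are likely to be exercised.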
100
101   Strings (Character Text)
102       Perl, and hence PDF::Builder, use strings that support the full range
103       of Unicode characters. When importing strings into a Perl program, for
104       example by reading text from a file, you must be aware of what their
105       character encoding is. Single-byte encodings (default is 'latin1'),
106       represented as bytes of value 0x00 through 0xFF (0..255), will produce
107       different results if you do something that depends on the encoding,
108       such as sorting, searching, or comparing any two non-ASCII characters.
109       This also applies to any characters (text) hard coded into the Perl
110       program.
111
112       You can always decode the text from external encoding (ASCII, UTF-8,
113       Latin-3, etc.) into the Perl (internal) UTF-8 multibyte encoding. This
114       uses one to four bytes to represent each character. See pragma "utf8"
115       and module "Encode" for details about decoding text. Note that only
116       TrueType fonts ("ttfont") can make direct use of UTF-8-encoded text.
117       Other font types (core, T1, etc.) can only use single-byte encoded
118       text. If your text is ASCII, Latin-1, or CP-1252, you can just leave
119       the Perl strings as the default single-byte encoding.
120
121       Then, there is the matter of encoding the output to match up with
122       available font character sets. You're not actually translating the text
123       on output, but are telling the output system (and Reader) what encoding
124       the output byte stream represents, and what character glyphs they
125       should generate.
126
127       If you confine your text to plain ASCII (0x00 .. 0x7F byte values) or
128       even Latin-1 or CP-1252 (0x00 .. 0xFF byte values), you can use default
129       (non-UTF-8) Perl strings and use the default output encoding
130       (WinAnsiEncoding), which is more-or-less Windows CP-1252 (a superset in
131       turn, of ISO-8859-1 Latin-1). If your text uses any other characters,
132       you will need to be aware of what encoding your text strings are (in
133       the Perl string and for declaring output glyph generation).  See "Core
134       Fonts", "PS Fonts" and "TrueType Fonts" in "FONT METHODS" for
135       additional information.
136
137       Some Internal Details
138
139       Some of the following may be a bit scary or confusing to beginners, so
140       don't be afraid to skip over it until you're ready for it...
141
142       Perl (and PDF::Builder) internally use strings which are either single-
143       byte (ISO-8859-1/Latin-1) or multibyte UTF-8 encoded (there is an
144       internal flag marking the string as UTF-8 or not).  If you work
145       strictly in ASCII or Latin-1 or CP-1252 (each a superset of the
146       previous), you should be OK in not doing anything special about your
147       string encoding. You can just use the default Perl single byte strings
148       (internally marked as not UTF-8) and the default output encoding
149       (WinAnsiEncoding).
150
151       If you intend to use input from a variety of sources, you should
152       consider decoding (converting) your text to UTF-8, which will provide
153       an internally consistent representation (and your Perl code itself
154       should be saved in UTF-8, in case you want to use any hard coded non-
155       ASCII characters). In any string, non-ASCII characters (0x80 or higher)
156       would be converted to the Perl UTF-8 internal representation, via
157       "$string = Encode::decode(MY_ENCODING, $input);".  "MY_ENCODING" would
158       be a string like 'latin1', 'cp-1252', 'utf8', etc. Similar capabilities
159       are available for declaring a file to be in a certain encoding.
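
       A small sketch of the decoding step described above, using the core
       "Encode" module. The sample byte strings are made up for illustration;
       both represent the same four-character word in different external
       encodings.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: "cafe" with an acute accent on the e, as raw bytes
my $latin1_bytes = "caf\xE9";        # ISO-8859-1: one byte for e-acute
my $utf8_bytes   = "caf\xC3\xA9";    # UTF-8: two bytes for e-acute

# Decode each from its external encoding into Perl's internal form
my $from_latin1 = decode('iso-8859-1', $latin1_bytes);
my $from_utf8   = decode('UTF-8',      $utf8_bytes);

printf "%d characters, same text: %s\n",
    length($from_latin1),
    ($from_latin1 eq $from_utf8 ? 'yes' : 'no');
```

       Once decoded, strings from different sources compare and sort
       consistently, whatever their original encoding was.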
160
       Be aware that if you use UTF-8 encoding for your text, only TrueType
       font output ("ttfont") can handle it directly. Corefont and Type1
       output require the text to be converted back into a single-byte
       encoding (using "Encode::encode"), which may need to be declared with
       "encode" (for "corefont" or "psfont"). If you have any characters that
       are not found in the selected single-byte encoding (but are found in
       the font itself), you will need to use "automap" to break up the font
       glyphs into 256 character planes, map such characters to 0x00 .. 0xFF
       in the appropriate plane, and switch between font planes as necessary.
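
       The conversion back to a single-byte encoding might look like the
       following sketch, which uses only the core "Encode" module. The helper
       "fits_encoding" is hypothetical; it simply reports whether every
       character of a string can be represented in a given encoding, which is
       one way to spot text that would need "automap" or a different font.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text  = "Price: \x{20AC}5";         # contains U+20AC EURO SIGN
my $bytes = encode('cp1252', $text);    # the euro becomes the single byte 0x80

# Hypothetical helper: true if $string is fully representable in $encoding
sub fits_encoding {
    my ($encoding, $string) = @_;
    return eval {
        # FB_CROAK dies on an unmappable character; LEAVE_SRC keeps $string intact
        encode($encoding, $string, Encode::FB_CROAK | Encode::LEAVE_SRC);
        1;
    } ? 1 : 0;
}

for my $encoding (qw(cp1252 iso-8859-1)) {
    printf "%-10s %s\n", $encoding,
        fits_encoding($encoding, $text) ? 'ok' : 'has unmappable characters';
}
```

       The resulting byte string is what would be handed to a core or Type1
       font whose encoding was declared to match.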
171
172       Core and Type1 fonts (output) use the byte values in the string
173       (single-byte encoding only!) and provide a byte-to-glyph mapping record
174       for each plane.  TrueType outputs a group of four hexadecimal digits
175       representing the "CId" (character ID) of each character. The CId does
176       not correspond to either the single-byte or UTF-8 internal
177       representations of the characters.
178
179       The bottom line is that you need to know what the internal
180       representation of your text is, so that the output routines can tell
181       the PDF reader about it (via the PDF file). The text will not be
182       translated upon output, but the PDF reader needs to know what the
183       encoding in use is, so it knows what glyph to associate with each byte
184       (or byte sequence).
185
186       Note that some operating systems and Perl flavors are reputed to be
187       strict about encoding names. For example, latin1 (an alias) may be
188       rejected as invalid, while iso-8859-1 (a canonical value) will work.
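
       If a particular installation rejects an alias, one option is to look
       up the canonical name first. The sketch below uses
       "Encode::resolve_alias" from the core "Encode" module; the wrapper
       name "canonical_encoding" is hypothetical.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical wrapper: canonical encoding name, or undef if not recognized
sub canonical_encoding {
    my ($name) = @_;
    my $canonical = Encode::resolve_alias($name);
    return $canonical ? $canonical : undef;
}

for my $name (qw(latin1 iso-8859-1 no-such-encoding-xyz)) {
    my $canonical = canonical_encoding($name);
    printf "%-22s => %s\n", $name,
        defined $canonical ? $canonical : '(not recognized)';
}
```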
189
190       By the way, it is recommended that you be using at least Perl 5.10 if
191       you are going to be using any non-ASCII characters. Perl 5.8 may be a
192       little unpredictable in handling such text.
193
194   Rendering Order
       For better or worse, for compatibility purposes, PDF::Builder continues
       the same rendering model as used by PDF::API2 (and possibly its
       predecessors). That is, all graphics for one graphics object are put
       into one record, and all text output for one text object goes into
       another record. Whichever object is declared first is output first.
       This can lead to unexpected results, where items are rendered in
       (apparently) the wrong order. That is, text and graphics items are not
       necessarily output (rendered) in the same order as they were created in
       code. Two items in the same object (e.g., $text) will be rendered in
       the same order as they were coded, but items from different objects may
       not be rendered in the expected order. The following example (source
       code and annotated PDF excerpts) illustrates the issue:
207
        use strict;
        use warnings;
        use PDF::Builder;

        # demonstrate text and graphics object order
        #
        my $fname = "objorder";

        my $paper_size = "Letter";

        # see the text and graphics stream contents
        my $pdf = PDF::Builder->new(compress => 'none');
        $pdf->mediabox($paper_size);
        my $page = $pdf->page();
        # adjust path for your operating system
        my $fontTR = $pdf->ttfont('C:\\Windows\\Fonts\\timesbd.ttf');

224
       For the first group, you might expect the "under" line to be output,
       then the filled circle (disc) partly covering it, then the "over" line
       covering the disc, and finally a filled rectangle (bar) over both
       lines. What actually happens is that the $grfx1 graphics object was
       declared first, so everything in that object (the disc and bar) is
       output first, and the text object $text1 (both lines) comes afterwards.
       The result is that the text lines are on top of the graphics drawings.
232
        # ----------------------------
        # 1. text, orange ball over, text over, bar over

        my $grfx1 = $page->gfx();
        my $text1 = $page->text();
        $text1->font($fontTR, 20);  # 20 pt Times Roman bold

        $text1->fillcolor('black');
        $grfx1->strokecolor('blue');
        $grfx1->fillcolor('orange');

        $text1->translate(50,700);
        $text1->text_left("This text should be under everything.");

        $grfx1->circle(100,690, 30);
        $grfx1->fillstroke();

        $text1->translate(50,670);
        $text1->text_left("This text should be over the ball and under the bar.");

        $grfx1->rect(160,660, 20,70);
        $grfx1->fillstroke();

        % ---------------- group 1: define graphics object first, then text
        11 0 obj << /Length 690 >> stream   % obj 11 is graphics for (1)
         0 0 1 RG    % stroke blue
        1 0.647059 0 rg   % fill orange
        130 690 m ... c h B   % draw and fill circle
        160 660 20 70 re B   % draw and fill bar
        endstream endobj

        12 0 obj << /Length 438 >> stream   % obj 12 is text for (1)
          BT
        /TiCBA 20 Tf   % Times Roman Bold 20pt
        0 0 0 rg   % fill black
        1 0 0 1 50 700 Tm   % position text
        <0037 ... 0011> Tj   % "under" line
        1 0 0 1 50 670 Tm   % position text
        <0037 ... 0011> Tj   % "over" line
          ET
        endstream endobj
274
275       The second group is the same as the first, with the only difference
276       being that the text object was declared first, and then the graphics
277       object. The result is that the two text lines are rendered first, and
278       then the disc and bar are drawn over them.
279
        # ----------------------------
        # 2. (1) again, with graphics and text order reversed

        my $text2 = $page->text();
        my $grfx2 = $page->gfx();
        $text2->font($fontTR, 20);  # 20 pt Times Roman bold

        $text2->fillcolor('black');
        $grfx2->strokecolor('blue');
        $grfx2->fillcolor('orange');

        $text2->translate(50,600);
        $text2->text_left("This text should be under everything.");

        $grfx2->circle(100,590, 30);
        $grfx2->fillstroke();

        $text2->translate(50,570);
        $text2->text_left("This text should be over the ball and under the bar.");

        $grfx2->rect(160,560, 20,70);
        $grfx2->fillstroke();

        % ---------------- group 2: define text object first, then graphics
        13 0 obj << /Length 438 >> stream    % obj 13 is text for (2)
          BT
        /TiCBA 20 Tf   % Times Roman Bold 20pt
        0 0 0 rg   % fill black
        1 0 0 1 50 600 Tm   % position text
        <0037 ... 0011> Tj   % "under" line
        1 0 0 1 50 570 Tm   % position text
        <0037 ... 0011> Tj   % "over" line
          ET
        endstream endobj

        14 0 obj << /Length 690 >> stream   % obj 14 is graphics for (2)
         0 0 1 RG   % stroke blue
        1 0.647059 0 rg   % fill orange
        130 590 m ... h B   % draw and fill circle
        160 560 20 70 re B   % draw and fill bar
        endstream endobj
321
       The third group defines two text and two graphics objects, in the order
       in which they are expected to render. The "under" text line is output
       first, then the orange disc is output, partly covering the text. The
       "over" text line is now output. It is actually over the disc, but is
       orange, because the previous object stream (the first graphics object)
       left the fill color (also used for text) as orange, and we did not
       explicitly set the fill color before outputting the second text line.
       This is not "inheritance" so much as a shared graphics (drawing) state,
       used for both "graphics" and "text": whatever state is left at the end
       of one object is the state at the beginning of the next object. If you
       wish to control this, consider surrounding the graphics or text calls
       with save() and restore() calls to save and restore (push and pop) the
       graphics state to what it was at the save(). Finally, the bar is drawn
       over everything.
336
        # ----------------------------
        # 3. (2) again, with two graphics and two text objects

        my $text3 = $page->text();
        my $grfx3 = $page->gfx();
        $text3->font($fontTR, 20);  # 20 pt Times Roman bold
        my $text4 = $page->text();
        my $grfx4 = $page->gfx();
        $text4->font($fontTR, 20);  # 20 pt Times Roman bold

        $text3->fillcolor('black');
        $grfx3->strokecolor('blue');
        $grfx3->fillcolor('orange');
        # $text4->fillcolor('yellow');
        # $grfx4->strokecolor('red');
        # $grfx4->fillcolor('purple');

        $text3->translate(50,500);
        $text3->text_left("This text should be under everything.");

        $grfx3->circle(100,490, 30);
        $grfx3->fillstroke();

        $text4->translate(50,470);
        $text4->text_left("This text should be over the ball and under the bar.");

        $grfx4->rect(160,460, 20,70);
        $grfx4->fillstroke();

        % ---------------- group 3: define text1, graphics1, text2, graphics2
        15 0 obj << /Length 206 >> stream   % obj 15 is text1 for (3)
          BT
        /TiCBA 20 Tf   % Times Roman Bold 20pt
        0 0 0 rg  % fill black
        1 0 0 1 50 500 Tm   % position text
        <0037 ... 0011> Tj   % "under" line
          ET
        endstream endobj

        16 0 obj << /Length 671 >> stream   % obj 16 is graphics1 for (3) circle
         0 0 1 RG   % stroke blue
        1 0.647059 0 rg   % fill orange
        130 490 m ... h B   % draw and fill circle
        endstream endobj

        17 0 obj << /Length 257 >> stream   % obj 17 is text2 for (3)
          BT
        /TiCBA 20 Tf   % Times Roman Bold 20pt
        1 0 0 1 50 470 Tm   % position text
        <0037 ... 0011> Tj   % "over" line
          ET
        endstream endobj

        18 0 obj << /Length 20 >> stream   % obj 18 is graphics for (3) bar
         160 460 20 70 re B   % draw and fill bar
        endstream endobj
393
394       The fourth group is the same as the third, except that we define the
395       fill color for the text in the second line. This makes it clear that
396       the "over" line (in yellow) was written after the orange disc, and
397       still before the bar.
398
        # ----------------------------
        # 4. (3) again, a new set of colors for second group
        # (new variable names, so that "my" does not mask the group 3 objects)

        my $text5 = $page->text();
        my $grfx5 = $page->gfx();
        $text5->font($fontTR, 20);  # 20 pt Times Roman bold
        my $text6 = $page->text();
        my $grfx6 = $page->gfx();
        $text6->font($fontTR, 20);  # 20 pt Times Roman bold

        $text5->fillcolor('black');
        $grfx5->strokecolor('blue');
        $grfx5->fillcolor('orange');
        $text6->fillcolor('yellow');
        $grfx6->strokecolor('red');
        $grfx6->fillcolor('purple');

        $text5->translate(50,400);
        $text5->text_left("This text should be under everything.");

        $grfx5->circle(100,390, 30);
        $grfx5->fillstroke();

        $text6->translate(50,370);
        $text6->text_left("This text should be over the ball and under the bar.");

        $grfx6->rect(160,360, 20,70);
        $grfx6->fillstroke();

        % ---------------- group 4: define text1, graphics1, text2, graphics2 with colors for 2
        19 0 obj << /Length 206 >> stream   % obj 19 is text1 for (4)
          BT
        /TiCBA 20 Tf   % Times Roman Bold 20pt
        0 0 0 rg  % fill black
        1 0 0 1 50 400 Tm   % position text
        <0037 ... 0011> Tj   % "under" line
          ET
        endstream endobj

        20 0 obj << /Length 671 >> stream   % obj 20 is graphics1 for (4) circle
         0 0 1 RG   % stroke blue
        1 0.647059 0 rg   % fill orange
        130 390 m ... h B   % draw and fill circle
        endstream endobj

        21 0 obj << /Length 266 >> stream   % obj 21 is text2 for (4)
          BT
        /TiCBA 20 Tf   % Times Roman Bold 20pt
        1 1 0 rg   % fill yellow
        1 0 0 1 50 370 Tm   % position text
        <0037 ... 0011> Tj   % "over" line
          ET
        endstream endobj

        22 0 obj << /Length 52 >> stream   % obj 22 is graphics for (4) bar
         1 0 0 RG   % stroke red
        0.498039 0 0.498039 rg   % fill purple
        160 360 20 70 re B   % draw and fill rectangle (bar)
        endstream endobj

        # ----------------------------
        $pdf->saveas("$fname.pdf");
461
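       The save() and restore() suggestion made for the third group can be
       sketched as follows. This is an illustration, not part of the original
       example: it uses a core font ("Helvetica") so that no font file path
       is needed, and the variable names are arbitrary. The graphics object
       wraps its color change in save()/restore(), so the orange fill is not
       expected to leak into the text object that follows it.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf  = PDF::Builder->new(compress => 'none');
my $page = $pdf->page();
my $font = $pdf->corefont('Helvetica');

# declaration order is rendering order: text, graphics, text
my $text_a = $page->text();
my $grfx   = $page->gfx();
my $text_b = $page->text();

$text_a->font($font, 20);
$text_a->fillcolor('black');
$text_a->translate(50, 500);
$text_a->text('This text is under the disc.');

$grfx->save();                 # push the graphics state (black fill)
$grfx->fillcolor('orange');
$grfx->circle(100, 490, 30);
$grfx->fill();
$grfx->restore();              # pop: the fill color is black again

$text_b->font($font, 20);      # no fillcolor() call needed here
$text_b->translate(50, 470);
$text_b->text('This text is over the disc.');

my $output = $pdf->stringify();   # $pdf must not be used after this
```
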
462       The separation of text and graphics means that only some text methods
463       are available in a graphics object, and only some graphics methods are
464       available in a text object. There is much overlap, but they differ.
465       There's really no reason the code couldn't have been written (in
466       PDF::API2, or earlier) as outputting to a single object, which would
467       keep everything in the same order as the method calls. An advantage
468       would be less object and stream overhead in the PDF file. The only
469       drawback might be that an object might more easily overflow and require
470       splitting into multiple objects, but that should be rare.
471
472       You should always be able to manually split an object by simply ending
473       output to the first object, and picking up with output to the second
474       object, so long as it was created immediately after the first object.
475       The graphics state at the end of the first object should be the initial
476       state at the beginning of the second object. However, use caution when
477       dealing with text objects -- the PDF specification states that the Text
478       matrices are not carried over from one object to the next (BT resets
479       them), so you may need to reset some settings.
480
        my $grfx1 = $page->gfx();
        my $grfx2 = $page->gfx();
        # write a huge amount of stuff to $grfx1
        # write a huge amount of stuff to $grfx2, picking up where $grfx1 left off
485
       In any case, now that you understand the rendering order and how the
       order of object declarations affects it, you can completely control
       how text and graphics are drawn. There is really no need to add
       another "both" type object that will handle all graphics and text
       objects, as that would probably be a major code bloat for very little
       benefit. However, it could be considered in the future if there is a
       demonstrated need for it, such as serious PDF file size bloat due to
       the extra object overhead when interleaving text and graphics output.
495
       There is not currently a general facility for mixed-use objects, but a
       limited example is the current implementation of underline,
       line-through, and overline text (within column() markup), which are
       performed within the text object, temporarily exiting (ET) to graphics
       mode to draw the lines, and then returning (BT) to text mode. This was
       done so that baseline coordinate adjustments could be easily made.
       Since "BT" resets some text settings, this needs to be done with care!
503
504   PDF Versions Supported
505       When creating a PDF file using the functions in PDF::Builder, the
506       output is marked as PDF 1.4. This does not mean that all PDF
507       functionality up through 1.4 is supported! There are almost surely
508       features missing as far back as the PDF 1.0 standard.
509
       The big problem is when a PDF of version 1.5 or higher is imported or
       opened in PDF::Builder. If it contains content that is actually
       unsupported by this software, there is a chance that something will
       break. This does not mean that a PDF marked as "1.7" will necessarily
       go down in flames when read by PDF::Builder, or that a PDF written
       back out will break in a Reader, but the possibility is there. Much
       PDF writer software simply marks its output as the highest version of
       PDF at the time (usually 1.7), even if there is no content beyond,
       say, 1.2. There is some handling of PDF 1.5 items in PDF::Builder,
       such as cross reference streams, but support beyond 1.4 is very
       limited. All we can say is to be careful when handling PDFs whose
       version is above 1.4, and test thoroughly, as they may break at some
       point.
522
523       PDF::Builder includes a simple version control mechanism, where the
524       initial PDF version to be output (default 1.4) can be set by the
525       programmer. Input PDFs greater than 1.4 (current output level) will
526       receive a warning (can be suppressed) that the output level will be
527       raised to that level. The use of PDF features greater than the current
528       output level will likewise trigger a warning that the output level is
529       to be raised to the necessary level. If this is not desired, you should
530       avoid using those PDF features which are higher than the desired PDF
531       output level.
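
       As a hedged sketch of setting the initial output version: the
       constructor option "outver" used below is an assumption about the
       option name, as is the behavior of version() when called without
       arguments; check the new() and version() documentation of your
       installed release before relying on either.

```perl
use strict;
use warnings;
use PDF::Builder;

# ASSUMPTION: 'outver' sets the initial PDF output version (default 1.4)
my $pdf = PDF::Builder->new(outver => 1.5);
$pdf->page();

# ASSUMPTION: version() with no arguments returns the current output version
my $version = $pdf->version();
print "output will be marked as PDF $version\n";
```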
532
533   History
534       PDF::API2 was originally written by Alfred Reibenschuh, derived from
535       Martin Hosken's Text::PDF via the Text::PDF::API wrapper.  In 2009,
536       Otto Hirr started the PDF::API3 fork, but it never went anywhere.  In
537       2011, PDF::API2 maintenance was taken over by Steve Simms.  In 2017,
538       PDF::Builder was forked by Phil M. Perry, who desired a more aggressive
539       schedule of new features and bug fixes than Simms was providing,
540       although some of Simms's work has been ported from PDF::API2.
541
542       According to "pdfapi2_for_fun_and_profit_APW2005.pdf" (on
543       http://pdfapi2.sourceforge.net, an unmaintained site), the history of
544       PDF::API2 (the predecessor to PDF::Builder) goes as such:
545
           •  First Code implemented based on PDFlib-0.6 (AFPL)
           •  Changed to Text::PDF with a total rewrite as Text::PDF::API
              (procedural)
           •  Unmaintainable Code triggered rewrite into new Namespace
              PDF::API2 (object-oriented, LGPL)
           •  Object-Structure streamlined in 0.4x
552
553       At Simms's request, the name of the new offering was changed from
554       PDF::API4 to PDF::Builder, to reduce the chance of confusion due to
555       parallel development.  Perry's intent is to keep all internal methods
556       as upwardly compatible with PDF::API2 as possible, although it is
557       likely that there will be some drift (incompatibilities) over time. At
558       least initially, any program written based on PDF::API2 should be
559       convertible to PDF::Builder simply by changing "API2" anywhere it
560       occurs to "Builder". See the INFO/KNOWN_INCOMP known incompatibilities
561       file for further information.
562
563       Thanks...
564
565       Many users have helped out by reporting bugs and requesting
566       enhancements. A special shout out goes to those who have contributed
567       code and tests, or coordinated their package development with the needs
568       of PDF::Builder: Ben Bullock, Cary Gravel, Gregor Herrmann, Petr Pisar,
569       Jeffrey Ratcliffe, Steve Simms (via PDF::API2 fixes), and Johan
570       Vromans.  Drop me a line if I've overlooked your contribution!
571

DETAILED NOTES ON METHODS

573       Note: older versions of this package named various (hash element)
574       options with leading dashes (hyphens) in the name, e.g., '-encode'. The
575       use of a dash is now optional, and options are documented with names
576       not using dashes. At some point in the future, it is possible that
577       support for dashed names will be deprecated (and eventually withdrawn),
578       so it would be good practice to start using undashed names in new and
579       revised code.
580
581   After saving a file...
       Note that a PDF object such as $pdf cannot continue to be used after
       saving an output PDF file or string with $pdf->save(), saveas(), or
       stringify(). There is some cleanup and other operations done internally
       which make the object unusable for further operations. If you try to
       keep using a PDF object, you will likely receive an error message such
       as: Can't call method "new_obj" on an undefined value.
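
       One way to live with this restriction is to build a fresh object for
       each output, as in the sketch below. The helper "build_report" and its
       contents are hypothetical; the point is only that stringify() is
       called once per object.

```perl
use strict;
use warnings;
use PDF::Builder;

# Hypothetical helper: returns a new, unsaved PDF object on every call
sub build_report {
    my ($title) = @_;
    my $pdf  = PDF::Builder->new();
    my $page = $pdf->page();
    my $text = $page->text();
    $text->font($pdf->corefont('Helvetica'), 12);
    $text->translate(72, 720);
    $text->text($title);
    return $pdf;
}

# each object is used for exactly one output, then discarded
my $first  = build_report('First copy')->stringify();
my $second = build_report('Second copy')->stringify();
```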
588
589   IntegrityCheck
       The PDF::Builder methods that open an existing PDF file pass it through
       the integrity checker method, "$self->IntegrityCheck(level, content)".
       This method serves two purposes: 1) to find any "/Version" settings
       that override the PDF version found in the PDF heading, and 2) to
       perform some basic validations on the contents of the PDF.
595
596       The "level" parameter accepts the following values:
597
598       0 = Do not output any diagnostic messages; just return any version
599       override.
600       1 = Output error-level (serious) diagnostic messages, as well as
601       returning any version override.
           Errors include: the /Root object was not specified anywhere, or
           it was specified but the indicated object was not found; an
           object claims another object as its child (/Kids list), but
           another object has already claimed that child; an object claims
           a child, but that child does not list a Parent, or lists a
           different Parent.
608
609       2 = Output error- (serious) and warning- (less serious) level
610       diagnostic messages, as well as returning any version override. This is
611       the default.
612       3 = Output error- (serious), warning- (less serious), and note-
613       (informational) level diagnostic messages, as well as returning any
614       version override.
           Notes include: the (optional) /Info object was not specified
           anywhere, or it was specified but the indicated object was not
           found; an object was referenced, but no entry for it was found
           among the objects (this may be OK if the object is not defined,
           or is on the free list, as the reference will then be ignored);
           an object is defined, but no other object appears to reference it.
621
622       4 = Output error-, warning-, and note-level diagnostic messages, as
623       well as returning any version override. Also dump the diagnostic data
624       structure.
625       5 = Output error-, warning-, and note-level diagnostic messages, as
626       well as returning any version override. Also dump the diagnostic data
627       structure and the $self data structure (generally useful only if you
628       have already read in the PDF file).
629
630       The version is a string (e.g., '1.5') if found, otherwise "undef"
631       (undefined value) is returned.
632
633       For controlling the "automatic" call to IntegrityCheck (via opens), the
634       level may be given with the option (flag) "diaglevel => n", where "n"
635       is between 0 and 5.
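
       The sketch below passes "diaglevel" when reading a PDF back in. To
       stay self-contained it builds a small PDF in memory first and uses
       "open_scalar"; the same option is assumed to be accepted by the
       file-based open call. Older releases may only recognize the dashed
       form of the option name.

```perl
use strict;
use warnings;
use PDF::Builder;

# Build a one-page PDF in memory so there is something to open.
my $source = PDF::Builder->new();
$source->page();
my $bytes = $source->stringify();

# diaglevel => 0: no diagnostic messages from the automatic integrity check
my $quiet = PDF::Builder->open_scalar($bytes, diaglevel => 0);

# diaglevel => 3: error-, warning-, and note-level messages
my $chatty = PDF::Builder->open_scalar($bytes, diaglevel => 3);
```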
636
637   Preferences - set user display preferences
638       $pdf->preferences(%options)
639           Controls viewing preferences for the PDF.
640
641       Page Mode Options
642
643           fullscreen
644               Full-screen mode, with no menu bar, window controls, or any
645               other window visible.
646
647           thumbs
648               Thumbnail images visible.
649
650           outlines
651               Document outline visible.
652
653       Page Layout Options
654
655           singlepage
656               Display one page at a time.
657
658           onecolumn
659               Display the pages in one column.
660
661           twocolumnleft
               Display the pages in two columns, with odd-numbered pages on the
               left.
664
665           twocolumnright
               Display the pages in two columns, with odd-numbered pages on the
               right.
668
669       Viewer Options
670
671           hidetoolbar
672               Specifying whether to hide tool bars.
673
674           hidemenubar
675               Specifying whether to hide menu bars.
676
677           hidewindowui
678               Specifying whether to hide user interface elements.
679
680           fitwindow
681               Specifying whether to resize the document's window to the size
682               of the displayed page.
683
684           centerwindow
685               Specifying whether to position the document's window in the
686               center of the screen.
687
688           displaytitle
689               Specifying whether the window's title bar should display the
690               document title taken from the Title entry of the document
691               information dictionary.
692
693           afterfullscreenthumbs
694               Thumbnail images visible after Full-screen mode.
695
696           afterfullscreenoutlines
697               Document outline visible after Full-screen mode.
698
699           printscalingnone
700               Set the default print setting for page scaling to none.
701
702           simplex
703               Print single-sided by default.
704
705           duplexflipshortedge
706               Print duplex by default and flip on the short edge of the
707               sheet.
708
709           duplexfliplongedge
710               Print duplex by default and flip on the long edge of the sheet.
711
712       Page Fit Options
713
714       These options are used for the "firstpage" layout, as well as for
715       Annotations, Named Destinations and Outlines.
716
717       'fit' => 1
718           Display the page designated by $page, with its contents magnified
719           just enough to fit the entire page within the window both
720           horizontally and vertically. If the required horizontal and
721           vertical magnification factors are different, use the smaller of
722           the two, centering the page within the window in the other
723           dimension.
724
725       'fith' => $top
726           Display the page designated by $page, with the vertical coordinate
727           $top positioned at the top edge of the window and the contents of
728           the page magnified just enough to fit the entire width of the page
729           within the window.
730
731       'fitv' => $left
732           Display the page designated by $page, with the horizontal
733           coordinate $left positioned at the left edge of the window and the
734           contents of the page magnified just enough to fit the entire height
735           of the page within the window.
736
737       'fitr' => [ $left, $bottom, $right, $top ]
738           Display the page designated by $page, with its contents magnified
739           just enough to fit the rectangle specified by the coordinates
740           $left, $bottom, $right, and $top entirely within the window both
741           horizontally and vertically. If the required horizontal and
742           vertical magnification factors are different, use the smaller of
743           the two, centering the rectangle within the window in the other
744           dimension.
745
746       'fitb' => 1
747           Display the page designated by $page, with its contents magnified
748           just enough to fit its bounding box entirely within the window both
749           horizontally and vertically. If the required horizontal and
750           vertical magnification factors are different, use the smaller of
751           the two, centering the bounding box within the window in the other
752           dimension.
753
754       'fitbh' => $top
755           Display the page designated by $page, with the vertical coordinate
756           $top positioned at the top edge of the window and the contents of
757           the page magnified just enough to fit the entire width of its
758           bounding box within the window.
759
760       'fitbv' => $left
761           Display the page designated by $page, with the horizontal
762           coordinate $left positioned at the left edge of the window and the
763           contents of the page magnified just enough to fit the entire height
764           of its bounding box within the window.
765
766       'xyz' => [ $left, $top, $zoom ]
           Display the page designated by $page, with the coordinates
           "[$left, $top]" positioned at the top-left corner of the window
           and the contents of the page magnified by the factor $zoom. A zero
           (0) value for any of the parameters $left, $top, or $zoom specifies
           that the current value of that parameter is to be retained
           unchanged.
773
774       Initial Page Options
775
776       firstpage => [ $page, %options ]
777           Specifying the page (either a page number or a page object) to be
778           displayed, plus one of the location options listed above in "Page
779           Fit Options".
780
781       Example
782
           $pdf->preferences(
               fullscreen => 1,
               onecolumn => 1,
               afterfullscreenoutlines => 1,
               firstpage => [$page, fit => 1],
           );
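
           A second sketch combines a page layout, some viewer options, and
           an "xyz" location for the first page. It uses only option names
           listed above, in their undashed form; the coordinates and zoom
           factor are arbitrary sample values, and older releases may need
           the dashed names.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf  = PDF::Builder->new();
my $page = $pdf->page();

$pdf->preferences(
    twocolumnright     => 1,   # two columns, odd-numbered pages on the right
    hidetoolbar        => 1,   # hide tool bars
    displaytitle       => 1,   # show the document Title in the title bar
    duplexfliplongedge => 1,   # default to duplex printing, long-edge flip
    # open at $page with point (36, 792) at the top-left corner, zoom 1.5
    firstpage          => [$page, xyz => [36, 792, 1.5]],
);

my $bytes = $pdf->stringify();
```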
789
790   info Example
           my %h = $pdf->info(
               'Author'       => "Alfred Reibenschuh",
               'CreationDate' => "D:20020911000000+01'00'",
               'ModDate'      => "D:YYYYMMDDhhmmssOHH'mm'",
               'Creator'      => "fredos-script.pl",
               'Producer'     => "PDF::Builder",
               'Title'        => "some Publication",
               'Subject'      => "perl ?",
               'Keywords'     => "all good things are pdf"
           );
           print "Author: $h{'Author'}\n";
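
       The 'ModDate' value above is a pattern rather than a real date. As a
       minimal sketch, a date string in that pattern could be built from an
       epoch time as follows. The helper "pdf_date" is hypothetical (not
       part of PDF::Builder) and always reports UTC.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper (not part of PDF::Builder): format an epoch time as a
# PDF date string in UTC, following the D:YYYYMMDDhhmmssOHH'mm' pattern.
sub pdf_date {
    my ($epoch) = @_;
    return strftime("D:%Y%m%d%H%M%S", gmtime($epoch)) . "+00'00'";
}

print pdf_date(0), "\n";        # prints: D:19700101000000+00'00'
print pdf_date(time()), "\n";   # the current time, e.g. for 'ModDate'
```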
802
803   XMP XML example
           $xml = $pdf->xmpMetadata();
           print "PDF metadata reads: $xml\n";
           $xml=<<EOT;
           <?xpacket begin='' id='W5M0MpCehiHzreSzNTczkc9d'?>
           <?adobe-xap-filters esc="CRLF"?>
           <x:xmpmeta
             xmlns:x='adobe:ns:meta/'
             x:xmptk='XMP toolkit 2.9.1-14, framework 1.6'>
               <rdf:RDF
                 xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
                 xmlns:iX='http://ns.adobe.com/iX/1.0/'>
                   <rdf:Description
                     rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
                     xmlns:pdf='http://ns.adobe.com/pdf/1.3/'
                     pdf:Producer='Acrobat Distiller 6.0.1 for Macintosh'></rdf:Description>
                   <rdf:Description
                     rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
                     xmlns:xap='http://ns.adobe.com/xap/1.0/'
                     xap:CreateDate='2004-11-14T08:41:16Z'
                     xap:ModifyDate='2004-11-14T16:38:50-08:00'
                     xap:CreatorTool='FrameMaker 7.0'
                     xap:MetadataDate='2004-11-14T16:38:50-08:00'></rdf:Description>
                   <rdf:Description
                     rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
                     xmlns:xapMM='http://ns.adobe.com/xap/1.0/mm/'
                     xapMM:DocumentID='uuid:919b9378-369c-11d9-a2b5-000393c97fd8'></rdf:Description>
                   <rdf:Description
                     rdf:about='uuid:b8659d3a-369e-11d9-b951-000393c97fd8'
                     xmlns:dc='http://purl.org/dc/elements/1.1/'
                     dc:format='application/pdf'>
                       <dc:description>
                         <rdf:Alt>
                           <rdf:li xml:lang='x-default'>Adobe Portable Document Format (PDF)</rdf:li>
                         </rdf:Alt>
                       </dc:description>
                       <dc:creator>
                         <rdf:Seq>
                           <rdf:li>Adobe Systems Incorporated</rdf:li>
                         </rdf:Seq>
                       </dc:creator>
                       <dc:title>
                         <rdf:Alt>
                           <rdf:li xml:lang='x-default'>PDF Reference, version 1.6</rdf:li>
                         </rdf:Alt>
                       </dc:title>
                   </rdf:Description>
               </rdf:RDF>
           </x:xmpmeta>
           <?xpacket end='w'?>
           EOT

           $xml = $pdf->xmpMetadata($xml);
           print "PDF metadata now reads: $xml\n";
857
858   "BOX" METHODS
       A general note: if you give a page a Media Box (or other "box") that
       differs from the global "box" setting, take care to define the whole
       "chain" of boxes on that page, to avoid surprises. For example, if you
       define a global Media Box (paper size) and a global Crop Box, and then
       define a new page-level Media Box without defining a new page-level
       Crop Box, you may get odd results in the resultant cropping. Such
       combinations are not well defined.
866
867       All dimensions in boxes default to the default User Unit, which is
868       points (1/72 inch). Note that the PDF specification limits sizes and
869       coordinates to 14400 User Units (200 inches, for the default User Unit
870       of one point), and Adobe products (so far) follow this limit for
871       Acrobat and Distiller. It is worth noting that other PDF writers and
872       readers may choose to ignore the 14400 unit limit, with or without the
873       use of a specified User Unit. Therefore, PDF::Builder does not enforce
874       any limits on coordinates -- it's your responsibility to consider what
875       readers and other PDF tools may be used with a PDF you produce!  Also
876       note that earlier Acrobat readers had coordinate limits as small as
877       3240 User Units (45 inches), and minimum media size of 72 or 3 User
878       Units.
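
       The arithmetic behind these limits is simple enough to sketch. The
       function name below is made up for illustration; it assumes only the
       14400-unit limit and 72 points per inch stated above, plus the User
       Unit scale factor described under "User Units".

```perl
use strict;
use warnings;

# Largest coordinate span in inches for a given User Unit (points per
# unit), assuming the 14400-unit limit and 72 points per inch.
sub max_extent_inches {
    my ($userunit) = @_;
    return 14400 * $userunit / 72;
}

printf "%d inches\n",  max_extent_inches(1);              # prints: 200 inches
printf "%d inches\n",  max_extent_inches(72);             # prints: 14400 inches
printf "%.1f miles\n", max_extent_inches(75000) / 63360;  # prints: 236.7 miles
```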
879
880       User Units
881
882       $pdf->userunit($number)
883           The default User Unit in the PDF coordinate system is one point
884           (1/72 inch). You can think of it as a scale factor to enable larger
885           (or even, smaller) documents.  This method may be used (for PDF 1.6
886           and higher) to set the User Unit to some number of points. For
887           example, userunit(72) will set the scale multiplier to 72.0 points
888           per User Unit, or 1 inch to the User Unit. Any number greater than
889           zero is acceptable, although some readers and tools may not handle
890           User Units of less than 1.0 very well.
891
892           Not all readers respect the User Unit, if you give one, or handle
893           it in exactly the same way. Adobe Distiller, for one, does not use
894           it. How User Units are handled may vary from reader to reader.
895           Adobe Acrobat, at this writing, respects User Unit in version 7.0
896           and up, but limits it to 75000 (giving a maximum document size of
897           15 million inches or 236.7 miles or 381 km). Other readers and PDF
898           tools may allow a larger (or smaller) limit.
899
900           Your Mileage May Vary: Some readers ignore a global User Unit
901           setting and do not have pages inherit it (PDF::Builder duplicates
902           it on each page to simulate inheritance). Some readers may give
903           spurious warnings about truncated content when a Media Box is
904           changed while User Units are being used. Some readers do strange
905           things with Crop Boxes when a User Unit is in effect.
906
907           Depending on the reader used, the effect of a larger User Unit
908           (greater than 1) may mean lower resolution (chunkier or coarser
909           appearance) in the rendered document. If you're printing something
910           the size of a highway billboard, this may not matter to you, but
911           you should be aware of the possibility (even with fractional
912           coordinates). Conversely, a User Unit of less than 1.0 (if
913           permitted) reduces the allowable size of your document, but may
914           result in greater resolution.
915
916           A global (PDF level) User Unit setting is inherited by each page
917           (an action by PDF::Builder, not necessarily automatically done by
918           the reader), or can be overridden by calling userunit in the page.
919           Do not give more than one global userunit setting, as only the last
920           one will be used.  Setting a page's User Unit (if "$page->"
921           instead) is permitted (overriding the global setting for this
922           page). However, many sources recommend against doing this, as
923           results may not be as expected (once again, depending on the quirks
924           of the reader).
925
926           Remember to call "userunit" before calling anything having to do
927           with page or box sizes, or coordinates. Especially when setting
928           'named' box sizes, the methods need to know the current User Unit
929           so that named page sizes (in points) may be scaled down to the
930           current User Unit.
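
           The ordering described above might look like the following
           sketch, which sets a global User Unit of one inch before giving
           a page size. The sizes are sample values, and the returned
           coordinates are assumed to come back in User Units as given.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf = PDF::Builder->new();

# First the User Unit: 72 points per unit, i.e. 1 unit = 1 inch (PDF 1.6+).
$pdf->userunit(72);

# Then the boxes: width and height are now in User Units (inches).
my @media = $pdf->mediabox(8.5, 11);
print "media box: @media\n";

my $page = $pdf->page();   # pages pick up the global settings
```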
931
932       Media Box
933
934       $pdf->mediabox($name)
935       $pdf->mediabox($name, orient => 'orientation' )
936       $pdf->mediabox($w,$h)
937       $pdf->mediabox($llx,$lly, $urx,$ury)
938       ($llx,$lly, $urx,$ury) = $pdf->mediabox()
939           Sets the global Media Box (or page's Media Box, if "$page->"
940           instead).  This defines the width and height (or by corner
941           coordinates, or by standard name) of the output page itself, such
942           as the physical paper size. This is normally the largest of the
943           "boxes". If any subsidiary box (within it) exceeds the media box,
944           the portion of the material or boxes outside of the Media Box will
945           be ignored. That is, the Media Box is the One Box to Rule Them All,
946           and is the overall limit for other boxes (some documentation refers
947           to the Media Box as "clipping" other boxes). In addition, the Media
948           Box defines the overall coordinate system for text and graphics
949           operations.
950
951           If no arguments are given, the current Media Box (global or page)
952           coordinates are returned instead. The former "get_mediabox" (page
953           only) function is deprecated and will likely be removed some time
954           in the future. In addition, when setting the Media Box, the
955           resulting coordinates are returned. This permits you to specify the
956           page size by a name (alias) and get the dimensions back, all in one
957           call.
958
           Note that many printers cannot print all the way to the physical
           edge of the paper, so you should plan to leave some blank margin,
           even outside of any crop marks and bleeds. Printers and on-screen
           readers are free to discard any content found outside the Media
           Box, and printers may discard some material just inside the Media
           Box.
965
966           A global Media Box is required by the PDF spec; if not explicitly
967           given, PDF::Builder will set the global Media Box to US Letter size
968           (8.5in x 11in).  This is the media size that will be used for all
969           pages if you do not specify a "mediabox" call on a page. That is, a
970           global (PDF level) mediabox setting is inherited by each page, or
971           can be overridden by setting mediabox in the page. Do not give more
972           than one global mediabox setting, as only the last one will be
973           used.
974
975           If you give a single string name (e.g., 'A4'), you may optionally
976           add an orientation to turn the page 90 degrees into Landscape mode:
977           "orient => 'L'" or "orient => 'l'". "orient" is the only option
978           recognized, and a string beginning with an 'L' or 'l' (for
979           Landscape) is the only value of interest (anything else is treated
980           as Portrait mode). The y axis still runs from 0 at the bottom of
981           the page to what used to be the page width (now, height) at the
982           top, and likewise for the x axis: 0 at left to (former) height at
983           the right. That is, the coordinate system is the same as before,
984           except that the height and width are different.
985
986           The lower left corner does not have to be 0,0. It can be any values
987           you want, including negative values (so long as the resulting
988           media's sides are at least one point long). "mediabox" sets the
989           coordinate system (including the origin) of the graphics and text
990           that will be drawn, as well as for subsequent "boxes".  It's even
991           possible to give any two opposite corners (such as upper left and
992           lower right). The coordinate system will be rearranged (by the
993           Reader) to still be the conventional minimum "x" and "y" in the
994           lower left (i.e., you can't make "y" increase from top to bottom!).
995
996           Example:
997
               $pdf = PDF::Builder->new();
               $pdf->mediabox('A4'); # A4 size (595 Pt wide by 842 Pt high)
               ...
               $pdf->saveas('our/new.pdf');

               $pdf = PDF::Builder->new();
               $pdf->mediabox(595, 842); # A4 size, with implicit 0,0 LL corner
               ...
               $pdf->saveas('our/new.pdf');

               $pdf = PDF::Builder->new();
               $pdf->mediabox(0, 0, 595, 842); # A4 size, with explicit 0,0 LL corner
               ...
               $pdf->saveas('our/new.pdf');
1012
1013           See the PDF::Builder::Resource::PaperSizes source code for the full
1014           list of supported names (aliases) and their dimensions in points.
1015           You are free to add additional paper sizes to this file, if you
1016           wish. You might want to do this if you frequently use a standard
1017           page size in rotated (Landscape) mode. See also the "getPaperSizes"
1018           call in PDF::Builder::Util. These names (aliases) are also usable
1019           in other "box" calls, although useful only if the "box" is the same
1020           size as the full media (Media Box), and you don't mind their
1021           starting at 0,0.
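
           The named-size, orientation, and "get" forms described above can
           be combined as in this sketch. The box values are arbitrary, and
           the landscape result is what the text above leads one to expect
           rather than a guaranteed output.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf = PDF::Builder->new();

# Global Media Box by name, turned to Landscape; the resulting coordinates
# are returned by the same call (expected to be wider than high).
my @global = $pdf->mediabox('A4', orient => 'L');
print "global media box: @global\n";

# Page-level override with a lower left corner that is not 0,0.
my $page = $pdf->page();
$page->mediabox(-100, -100, 495, 742);

# No arguments: read the current page-level Media Box back.
my @local = $page->mediabox();
print "page media box: @local\n";
```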
1022
1023       Crop Box
1024
1025       $pdf->cropbox($name)
1026       $pdf->cropbox($name, orient => 'orientation')
1027       $pdf->cropbox($w,$h)
1028       $pdf->cropbox($llx,$lly, $urx,$ury)
1029       ($llx,$lly, $urx,$ury) = $pdf->cropbox()
1030           Sets the global Crop Box (or page's Crop Box, if "$page->"
1031           instead).  This will define the media size to which the output will
1032           later be clipped. Note that this does not itself output any crop
1033           marks to guide cutting of the paper! PDF Readers should consider
1034           this to be the visible portion of the page, and anything found
1035           outside it may be clipped (invisible). By default, it is equal to
1036           the Media Box, but may be defined to be smaller, in the coordinate
1037           system set by the Media Box. A global setting will be inherited by
1038           each page, but can be overridden on a per-page basis.
1039
1040           A Reader or Printer may choose to discard any clipped (invisible)
1041           part of the page, and show only the area within the Crop Box. For
1042           example, if your page Media Box is A4 (0,0 to 595,842 Points), and
1043           your Crop Box is (100,100 to 495,742), a reader such as Adobe
1044           Acrobat Reader may show you a page 395 by 642 Points in size (i.e.,
1045           just the visible area of your page). Other Readers may show you the
1046           full media size (Media Box) and a 100 Point wide blank area (in
1047           this example) around the visible content.
1048
1049           If no arguments are given, the current Crop Box (global or page)
1050           coordinates are returned instead. The former "get_cropbox" (page
1051           only) function is deprecated and will likely be removed some time
1052           in the future. If a Crop Box has not been defined, the Media Box
1053           coordinates (which always exist) will be returned instead. In
1054           addition, when setting the Crop Box, the resulting coordinates are
1055           returned. This permits you to specify the crop box by a name
1056           (alias) and get the dimensions back, all in one call.
1057
           Do not confuse the Crop Box with the "Trim Box", which shows where
           printed paper is expected to actually be cut. Some PDF Readers may
           reduce the visible "paper" background to the size of the crop box;
           others may simply omit any content outside it. Either way, you
           would lose any trim or crop marks, printer instructions, color
           alignment dots, or other content outside the Crop Box. A good use
           of the Crop Box would be to limit printing to the area where a
           printer can reliably put down ink, and leave white the edge areas
           where paper-handling mechanisms prevent ink or toner from being
           applied. This would keep you from accidentally putting valuable
           content in an area where a printer will refuse to print, yet permit
           you to include a bleed area and space for printer's marks and
           instructions. Needless to say, if your printer cannot print to the
           very edge of the paper, you will need to trim (cut) the printed
           sheets to get true bleeds.
1073
1074           A global (PDF level) cropbox setting is inherited by each page, or
1075           can be overridden by setting cropbox in the page.  As with
1076           "mediabox", only one crop box may be set at this (PDF) level.  As
1077           with "mediabox", a named media size may have an orientation (l or
1078           L) for Landscape mode.  Note that the PDF level global Crop Box
1079           will be used even if the page gets its own Media Box. That is, the
1080           page's Crop Box inherits the global Crop Box, not the page Media
1081           Box, even if the page has its own media size! If you set the page's
1082           own Media Box, you should consider also explicitly setting the page
1083           Crop Box (and other boxes).
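
           The A4 example above can be reproduced with a short sketch. It
           sets both boxes at the page level and computes the visible area;
           the first "get" call relies on the documented fallback to the
           Media Box when no Crop Box has been defined.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf  = PDF::Builder->new();
my $page = $pdf->page();

$page->mediabox('A4');              # 0,0 to 595,842 points
my @before = $page->cropbox();      # no Crop Box yet: Media Box is returned

$page->cropbox(100, 100, 495, 742); # visible portion of the page
my ($llx, $lly, $urx, $ury) = $page->cropbox();

printf "visible area: %d x %d points\n", $urx - $llx, $ury - $lly;
```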
1084
1085       Bleed Box
1086
1087       $pdf->bleedbox($name)
1088       $pdf->bleedbox($name, orient => 'orientation')
1089       $pdf->bleedbox($w,$h)
1090       $pdf->bleedbox($llx,$lly, $urx,$ury)
1091       ($llx,$lly, $urx,$ury) = $pdf->bleedbox()
1092           Sets the global Bleed Box (or page's Bleed Box, if "$page->"
1093           instead).  This is typically used in printing on paper, where you
1094           want ink or color (such as thumb tabs) to be printed a bit beyond
1095           the final paper size, to ensure that the cut paper bleeds (the cut
1096           goes through the ink), rather than accidentally leaving some white
1097           paper visible outside.  Allow enough "bleed" over the expected trim
1098           line to account for minor variations in paper handling, folding,
1099           and cutting; to avoid showing white paper at the edge.  The Bleed
1100           Box is where printing could actually extend to; the Trim Box is
1101           normally within it, where the paper would actually be cut. The
1102           default value is equal to the Crop Box, but is often a bit smaller.
1103           The space between the Bleed Box and the Crop Box is available for
1104           printer instructions, color alignment dots, etc., while crop marks
1105           (trim guides) are at least partly within the bleed area (and should
1106           be printed after content is printed).
1107
1108           If no arguments are given, the current Bleed Box (global or page)
1109           coordinates are returned instead. The former "get_bleedbox" (page
1110           only) function is deprecated and will likely be removed some time
1111           in the future. If a Bleed Box has not been defined, the Crop Box
1112           coordinates (if defined) will be returned, otherwise the Media Box
1113           coordinates (which always exist) will be returned.  In addition,
1114           when setting the Bleed Box, the resulting coordinates are returned.
1115           This permits you to specify the bleed box by a name (alias) and get
1116           the dimensions back, all in one call.
1117
1118           A global (PDF level) bleedbox setting is inherited by each page, or
1119           can be overridden by setting bleedbox in the page.  As with
1120           "mediabox", only one bleed box may be set at this (PDF) level.  As
1121           with "mediabox", a named media size may have an orientation (l or
1122           L) for Landscape mode.  Note that the PDF level global Bleed Box
1123           will be used even if the page gets its own Crop Box. That is, the
1124           page's Bleed Box inherits the global Bleed Box, not the page Crop
1125           Box, even if the page has its own media size! If you set the page's
1126           own Media Box or Crop Box, you should consider also explicitly
1127           setting the page Bleed Box (and other boxes).
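
           A brief sketch of the fallback and set behavior described above,
           at the global (PDF) level. All coordinates are sample values for
           a US Letter sheet.

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf = PDF::Builder->new();
$pdf->mediabox(0, 0, 612, 792);       # US Letter media
$pdf->cropbox(18, 18, 594, 774);      # visible area

# No Bleed Box defined yet, so the Crop Box coordinates come back.
my @fallback = $pdf->bleedbox();

# Define the Bleed Box a little inside the Crop Box, then read it back.
$pdf->bleedbox(27, 27, 585, 765);
my @bleed = $pdf->bleedbox();

print "before: @fallback\nafter:  @bleed\n";
```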
1128
1129       Trim Box
1130
1131       $pdf->trimbox($name)
1132       $pdf->trimbox($name, orient => 'orientation')
1133       $pdf->trimbox($w,$h)
1134       $pdf->trimbox($llx,$lly, $urx,$ury)
1135       ($llx,$lly, $urx,$ury) = $pdf->trimbox()
1136           Sets the global Trim Box (or page's Trim Box, if "$page->"
1137           instead).  This is supposed to be the actual dimensions of the
1138           finished page (after trimming of the paper). In some production
1139           environments, it is useful to have printer's instructions, cut
1140           marks, and so on outside of the trim box. The default value is
1141           equal to Crop Box, but is often a bit smaller than any Bleed Box,
1142           to allow the desired "bleed" effect.
1143
1144           If no arguments are given, the current Trim Box (global or page)
1145           coordinates are returned instead. The former "get_trimbox" (page
1146           only) function is deprecated and will likely be removed some time
1147           in the future. If a Trim Box has not been defined, the Crop Box
1148           coordinates (if defined) will be returned, otherwise the Media Box
1149           coordinates (which always exist) will be returned.  In addition,
1150           when setting the Trim Box, the resulting coordinates are returned.
1151           This permits you to specify the trim box by a name (alias) and get
1152           the dimensions back, all in one call.
1153
1154           A global (PDF level) trimbox setting is inherited by each page, or
1155           can be overridden by setting trimbox in the page.  As with
1156           "mediabox", only one trim box may be set at this (PDF) level.  As
1157           with "mediabox", a named media size may have an orientation (l or
1158           L) for Landscape mode.  Note that the PDF level global Trim Box
1159           will be used even if the page gets its own Crop Box. That is, the
1160           page's Trim Box inherits the global Trim Box, not the page Crop
1161           Box, even if the page has its own media size! If you set the page's
1162           own Media Box or Crop Box, you should consider also explicitly
1163           setting the page Trim Box (and other boxes).
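
           Since the Trim Box normally sits a fixed distance inside the
           Bleed Box, its corners can be derived rather than typed in. The
           helper below is hypothetical (not part of PDF::Builder), and the
           9 point (1/8 inch) bleed is only a sample figure; the result
           could then be passed to "trimbox".

```perl
use strict;
use warnings;

# Hypothetical helper: shrink a box (llx, lly, urx, ury) by the same
# margin on all four sides.
sub inset_box {
    my ($margin, $llx, $lly, $urx, $ury) = @_;
    return ($llx + $margin, $lly + $margin, $urx - $margin, $ury - $margin);
}

my @bleed = (27, 27, 585, 765);
my @trim  = inset_box(9, @bleed);    # 9 points of bleed on each side
print "trim box: @trim\n";           # prints: trim box: 36 36 576 756
```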
1164
1165       Art Box
1166
1167       $pdf->artbox($name)
1168       $pdf->artbox($name, orient => 'orientation')
1169       $pdf->artbox($w,$h)
1170       $pdf->artbox($llx,$lly, $urx,$ury)
1171       ($llx,$lly, $urx,$ury) = $pdf->artbox()
1172           Sets the global Art Box (or page's Art Box, if "$page->" instead).
1173           This is supposed to define "the extent of the page's meaningful
1174           content (including [margins])". It might exclude some content, such
1175           as Headlines or headings. Any binding or punched-holes margin would
1176           typically be outside of the Art Box, as would be page numbers and
1177           running headers and footers. The default value is equal to the Crop
1178           Box, although normally it would be no larger than any Trim Box. The
1179           Art Box may often be used for defining "important" content (e.g.,
1180           excluding advertisements) that may or may not be brought over to
1181           another page (e.g., N-up printing).
1182
1183           If no arguments are given, the current Art Box (global or page)
1184           coordinates are returned instead. The former "get_artbox" (page
1185           only) function is deprecated and will likely be removed some time
1186           in the future. If an Art Box has not been defined, the Crop Box
1187           coordinates (if defined) will be returned, otherwise the Media Box
1188           coordinates (which always exist) will be returned.  In addition,
1189           when setting the Art Box, the resulting coordinates are returned.
1190           This permits you to specify the art box by a name (alias) and get
1191           the dimensions back, all in one call.
1192
1193           A global (PDF level) artbox setting is inherited by each page, or
1194           can be overridden by setting artbox in the page.  As with
1195           "mediabox", only one art box may be set at this (PDF) level.  As
1196           with "mediabox", a named media size may have an orientation (l or
1197           L) for Landscape mode.  Note that the PDF level global Art Box will
1198           be used even if the page gets its own Crop Box. That is, the page's
1199           Art Box inherits the global Art Box, not the page Crop Box, even if
1200           the page has its own media size! If you set the page's own Media
1201           Box or Crop Box, you should consider also explicitly setting the
1202           page Art Box (and other boxes).
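
       As a minimal sketch of the call forms listed above (the coordinates are
       arbitrary illustration values, not recommendations), the following sets
       a global Art Box, reads it back through a page, and then overrides it
       for that page only:

```perl
use strict;
use warnings;
use PDF::Builder;

my $pdf = PDF::Builder->new();

# Global (PDF level) Art Box, given as LL and UR corner coordinates in points.
my @global_art = $pdf->artbox(72, 72, 540, 720);

# A page starts out with the global setting ...
my $page = $pdf->page();
my @inherited = $page->artbox();

# ... until it is given an Art Box of its own.
my @page_art = $page->artbox(36, 36, 576, 756);

print "global:    @global_art\n";
print "inherited: @inherited\n";
print "page:      @page_art\n";
```

       Because a set call returns the resulting coordinates, the same pattern
       works when the box is given by name instead of by coordinates.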
1203
1204       Suggested Box Usage
1205
1206       See "examples/Boxes.pl" for an example of using boxes.
1207
1208       How you define your boxes (or let them default) is up to you, depending
1209       on whether you're duplex printing US Letter or A4 on your laser
1210       printer, to be spiral bound on the bind margin, or engaging a
1211       professional printer. In the latter case, discuss in advance with the
1212       print firm what capabilities (and limitations) they have and what
1213       information they need from a PDF file. For instance, they may not want
1214       a Crop Box defined, and may call for very specific box sizes. For large
1215       press runs, they may print multiple pages (N-up) duplexed on large web
1216       roll "signatures", which are then intricately folded and guillotined
1217       (trimmed) and bound together into books or magazines. You would usually
1218       just supply a PDF with all the pages; they would take care of the
1219       signature layout (which includes offsets and 180 degree rotations).
1220
1221       (As an aside, don't count on a printer having any particular font
1222       available, so be sure to ask. Usually they will want you to embed all
1223       fonts used, but ask first, and double-check before handing over the
1224       print job! TTF/OTF fonts (ttfont()) are embedded by default, but other
1225       fonts (core, ps, bdf, cjk) are not! A printer may have a core font
1226       collection, but they are free to substitute a "workalike" font for any
1227       given core font, and the results may not match what you saw on your
1228       PC!)
1229
1230       On the assumption that you're using a single sheet (US Letter or A4)
1231       laser or inkjet printer, are you planning to trim each sheet down to a
1232       smaller final size? If so, you can do true bleeds by defining a Trim
1233       Box and a slightly larger Bleed Box. You would print bleeds (all the
1234       way to the finished edge) out to the Bleed Box, but nothing is enforced
1235       about the Bleed Box. At the other end of the spectrum, you would define
1236       the Media Box to be the physical paper size being printed on. Most
1237       printers reserve a little space on the sides (and possibly top and
1238       bottom) for paper handling, so it is often good to define your Crop Box
1239       as the printable area. Remember that the Media Box sets the coordinate
1240       system used, so you still need to avoid going outside the Crop Box with
1241       content (most readers and printers will not show any ink outside of the
1242       Crop Box). Whether or not you define a Crop Box, you're going to almost
1243       always end up with white paper on at least the sides.
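
       The nesting described above can be worked out with a little arithmetic.
       This is only a sketch: the US Letter sheet, the quarter inch unprintable
       border, the room left for printer's marks, and the eighth inch bleed are
       all assumed values, and "inset" is a hypothetical helper, not part of
       PDF::Builder.

```perl
use strict;
use warnings;

# Shrink a box (llx, lly, urx, ury) by the same amount on all four sides.
sub inset {
    my ($by, @box) = @_;
    return ($box[0] + $by, $box[1] + $by, $box[2] - $by, $box[3] - $by);
}

my @media = (0, 0, 612, 792);     # US Letter sheet, in points
my @crop  = inset(18, @media);    # assumed 1/4 inch unprintable border
my @bleed = inset(18, @crop);     # assumed room for marks outside the bleed
my @trim  = inset( 9, @bleed);    # finished size; bleeds run 1/8 inch past it

printf "%-5s %s\n", $_->[0], join(',', @{ $_->[1] })
    for ['media', \@media], ['crop', \@crop], ['bleed', \@bleed], ['trim', \@trim];
```

       Each array can then be handed to the matching four-coordinate call,
       such as "$page->cropbox(@crop)" or "$page->trimbox(@trim)".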
1244
1245       For small in-house jobs, you probably won't need color alignment dots
1246       and other such professional instructions and information between the
1247       Bleed Box and the Crop Box, but crop marks for trimming (if used)
1248       should go just outside the Trim Box (partly or wholly within the Bleed
1249       Box), and be drawn after all content. If you're not trimming the paper,
1250       don't try to do any bleed effects (including solid background color
1251       pages/covers), as you will usually have a white edge around the sheet
1252       anyway. Don't count on a PDF document only ever being displayed (where
1253       you can do things like bleed all the way to the media edge) and never
1254       being physically printed. Finally, for single sheet printing, an Art
1255       Box is probably unnecessary, but if you're combining pages into N-up
1256       prints, or doing other manipulations, it may be useful.
1257
1258       Box Inheritance
1259
1260       What Media, Crop, Bleed, Trim, and Art Boxes a page gets can be a
1261       little complicated. Note that usually, only the Media and Crop Boxes
1262       will have a clear visual effect. The visual effect of the other boxes
1263       (if any) may be very subtle.
1264
1265       First, everything is set at the global (PDF) level. The Media Box is
1266       always defined, and defaults to US Letter (8.5 inches wide by 11 inches
1267       high). The global Crop Box inherits the Media Box, unless explicitly
1268       defined. The Bleed, Trim, and Art Boxes inherit the Crop Box, unless
1269       explicitly defined. A global box should only be defined once, as the
1270       last one defined is the one that will be written to the PDF!
1271
1272       Second, a page inherits the global boxes, for its initial settings. You
1273       may call any of the box set methods ("cropbox", "trimbox", etc.) to
1274       explicitly set (override) any box for this page. Note that setting a
1275       new Media Box for the page does not reset the page's Crop Box -- it
1276       still uses whatever it inherited from the global Crop Box. You would
1277       need to explicitly set the page's Crop Box if you want a different
1278       setting. Likewise, the page's Bleed, Trim, and Art Boxes will not be
1279       reset by a new page Crop Box -- they will still inherit from the global
1280       (PDF) settings.
1281
1282       Third, the page Media Box (the one actually used for output pages),
1283       clips or limits all the other boxes to extend no larger than its size.
1284       For example, if the Media Box is US Letter, and you set a Crop Box of
1285       A4 size, the smaller of the two heights (11 inches) would be effective,
1286       and the smaller of the two widths (8.26 inches, 595 Points) would be
1287       effective.  The given dimensions of a box are returned on query (get),
1288       not the effective dimensions clipped by the Media Box.
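
       The three rules above can be modeled in a few lines of plain Perl. This
       is a simplified reading of the text, not PDF::Builder's internal code;
       the hash layout and the names "given_box" and "effective_box" are
       made up for the illustration.

```perl
use strict;
use warnings;
use List::Util qw(max min);

# An undefined box falls back to the Crop Box, and an undefined Crop Box
# falls back to the Media Box.
my %fallback = (crop => 'media', bleed => 'crop', trim => 'crop', art => 'crop');

# The box as given: the page's own setting wins, then the global setting,
# and only then the fallback box (looked up the same way).
sub given_box {
    my ($name, $page, $global) = @_;
    for (my $box = $name; defined $box; $box = $fallback{$box}) {
        return @{ $page->{$box} }   if $page->{$box};
        return @{ $global->{$box} } if $global->{$box};
    }
    die "no Media Box defined\n";
}

# The box as it takes effect: limited to the page's Media Box.
sub effective_box {
    my ($name, $page, $global) = @_;
    my @box   = given_box($name,   $page, $global);
    my @media = given_box('media', $page, $global);
    return (max($box[0], $media[0]), max($box[1], $media[1]),
            min($box[2], $media[2]), min($box[3], $media[3]));
}

my %global = (media => [0, 0, 612, 792],      # US Letter
              trim  => [36, 36, 576, 756]);
my %page   = (crop  => [0, 0, 595, 842]);     # A4-sized Crop Box on this page

print "crop, as given:  @{[ given_box('crop', \%page, \%global) ]}\n";
print "crop, effective: @{[ effective_box('crop', \%page, \%global) ]}\n";
print "trim, as given:  @{[ given_box('trim', \%page, \%global) ]}\n";
```

       The page's Trim Box comes from the global Trim Box rather than from the
       new page Crop Box, and the A4 Crop Box is cut back to the Letter height
       only in its effective form.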
1289
1290   FONT METHODS
1291       Core Fonts
1292
1293       Core fonts are limited to single byte encodings. You cannot use UTF-8
1294       or other multibyte encodings with core fonts. The default encoding for
1295       the core fonts is WinAnsiEncoding (roughly the CP-1252 superset of
1296       ISO-8859-1). See the "encode" option below to change this encoding.
1297       See the "font automap" method in PDF::Builder::Resource::Font for
1298       information on accessing more than 256 glyphs in a font, using planes,
1299       although there is no guarantee that future changes to font files will
1300       permit consistent results.
1301
1302       Note that core fonts use fixed lists of expected glyphs, along with
1303       metrics such as their widths. This may not exactly match up with
1304       whatever local font file is used by the PDF reader. It's usually pretty
1305       close, but many cases have been found where the list of glyphs is
1306       different between the core fonts and various local font files, so be
1307       aware of this.
1308
1309       To allow UTF-8 text and extended glyph counts, you should consider
1310       replacing your use of core fonts with TrueType (.ttf) and OpenType
1311       (.otf) fonts. There are tools, such as FontForge, which can do a fairly
1312       good (though, not perfect) job of converting a Type1 font library to
1313       OTF.
1314
1315       Examples:
1316
1317           $font1 = $pdf->corefont('Times-Roman', encode => 'latin2');
1318           $font2 = $pdf->corefont('Times-Bold');
1319           $font3 = $pdf->corefont('Helvetica');
1320           $font4 = $pdf->corefont('ZapfDingbats');
1321
1322       Valid %options are:
1323
1324       encode
1325           Changes the encoding of the font from its default. Notice that the
1326           encoding (not the entire font's glyph list) is shown in a PDF
1327           object (record), listing 256 glyphs associated with this encoding
1328           (and that are available in this font).
1329
1330       dokern
1331           Enables kerning if data is available.
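
       Since a core font can only show characters that exist in its single
       byte encoding, it can help to test the text before choosing a font. The
       sketch below uses the core Encode module; "fits_encoding" is a
       hypothetical helper, and the encoding names are Encode's own (whether
       the "encode" option accepts exactly the same names is not checked
       here).

```perl
use strict;
use warnings;
use Encode qw(encode);

# True if every character of $text can be represented in $encoding.
sub fits_encoding {
    my ($text, $encoding) = @_;
    my $copy = $text;    # encode() may alter its argument when a CHECK is given
    return eval { encode($encoding, $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

my $western = "caf\x{E9}";                 # "cafe" with an acute accent
my $polish  = "\x{141}\x{F3}d\x{17A}";     # the city name Lodz, with diacritics

for my $case ([$western, 'cp1252'], [$polish, 'cp1252'], [$polish, 'iso-8859-2']) {
    my ($text, $encoding) = @$case;
    printf "%-10s %s\n", $encoding,
        fits_encoding($text, $encoding) ? 'fits' : 'needs another encoding or font';
}
```

       Text that fits no single byte encoding is a sign that a TrueType or
       OpenType font with 'utf8' encoding is the better choice.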
1332
1333       Notes:
1334
1335       Even though these are called "core" fonts, they are not shipped with
1336       PDF::Builder, but are expected to be found on the machine with the PDF
1337       reader. Most core fonts are installed with a PDF reader, and thus are
1338       not coordinated with PDF::Builder. PDF::Builder does ship with core
1339       font metrics files (width, glyph names, etc.), but these cannot be
1340       guaranteed to be in sync with what the PDF reader has installed!
1341
1342       There are some 14 core fonts (regular, italic, bold, and bold-italic
1343       for Times [serif], Helvetica [sans serif], Courier [fixed pitch]; plus
1344       two symbol fonts) that are supposed to be available on any PDF reader,
1345       although other fonts with very similar metrics are often substituted.
1346       You should not count on any of the 15 Windows core fonts (Bank Gothic,
1347       Georgia, Trebuchet, Verdana, and two more symbol fonts) being present,
1348       especially on Linux, Mac, or other non-Windows platforms. Be aware if
1349       you are producing PDFs to be read on a variety of different systems!
1350
1351       If you want to ensure the widest portability for a PDF document you
1352       produce, you should consider using TTF fonts (instead of core fonts)
1353       and embedding them in the document. This ensures that there will be no
1354       substitutions, that all metrics are known and match the glyphs, that
1355       UTF-8 encoding can be used, and that the glyphs will be available on the
1356       reader's machine. At least on Windows platforms, most of the fonts are
1357       TTF anyway, which are used behind the scenes for "core" fonts, while
1358       missing most of the capabilities of TTF (now or possibly later in
1359       PDF::Builder) such as embedding, ligatures, UTF-8, etc.  The downside
1360       is, obviously, that the resulting PDF file will be larger because it
1361       includes the font(s). There might also be copyright or licensing issues
1362       with the redistribution of font files in this manner. You might want to
1363       check before widely distributing a PDF document with embedded fonts,
1364       although many licenses permit embedding the part of the font used.
1365
1366       See also PDF::Builder::Resource::Font::CoreFont.
1367
1368       PS Fonts
1369
1370       PS (T1) fonts are limited to single byte encodings. You cannot use
1371       UTF-8 or other multibyte encodings with T1 fonts.  The default encoding
1372       for the T1 fonts is WinAnsiEncoding (roughly the CP-1252 superset of
1373       ISO-8859-1). See the "encode" option below to change this encoding.
1374       See the "font automap" method in PDF::Builder::Resource::Font for
1375       information on accessing more than 256 glyphs in a font, using planes,
1376       although there is no guarantee that future changes to font files will
1377       permit consistent results.  Note: many Type1 fonts are limited to 256
1378       glyphs, but some are available with more than 256 glyphs. Still, a
1379       maximum of 256 at a time are usable.
1380
1381       "psfont" accepts both ASCII (.pfa) and binary (.pfb) Type1 glyph files.
1382       Font metrics can be supplied in either ASCII (.afm) or binary (.pfm)
1383       format, as can be seen in the examples given below. It is possible to
1384       use .pfa with .pfm and .pfb with .afm if that's what's available. The
1385       ASCII and binary files have the same content, just in different
1386       formats.
1387
1388       To allow UTF-8 text and extended glyph counts in one font, you should
1389       consider replacing your use of Type1 fonts with TrueType (.ttf) and
1390       OpenType (.otf) fonts. There are tools, such as FontForge, which can do
1391       a fairly good (though, not perfect) job of converting your font library
1392       to OTF.
1393
1394       Examples:
1395
1396           $font1 = $pdf->psfont('Times-Book.pfa', afmfile => 'Times-Book.afm');
1397           $font2 = $pdf->psfont('/fonts/Synest-FB.pfb', pfmfile => '/fonts/Synest-FB.pfm');
1398
1399       Valid %options are:
1400
1401       encode
1402           Changes the encoding of the font from its default. Notice that the
1403           encoding (not the entire font's glyph list) is shown in a PDF
1404           object (record), listing 256 glyphs associated with this encoding
1405           (and that are available in this font).
1406
1407       afmfile
1408           Specifies the location of the ASCII font metrics file (.afm). It
1409           may be used with either an ASCII (.pfa) or binary (.pfb) glyph
1410           file.
1411
1412       pfmfile
1413           Specifies the location of the binary font metrics file (.pfm). It
1414           may be used with either an ASCII (.pfa) or binary (.pfb) glyph
1415           file.
1416
1417       dokern
1418           Enables kerning if data is available.
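
       Because either metrics format may accompany either glyph file format,
       the option name depends only on the metrics file. A small sketch, where
       "metrics_option" is a hypothetical helper that picks the option from
       the file extension:

```perl
use strict;
use warnings;

# Return the psfont option that matches the metrics file's extension.
sub metrics_option {
    my ($metrics_file) = @_;
    return (afmfile => $metrics_file) if $metrics_file =~ /\.afm\z/i;
    return (pfmfile => $metrics_file) if $metrics_file =~ /\.pfm\z/i;
    die "unrecognized metrics file type: $metrics_file\n";
}

my %ascii  = metrics_option('Times-Book.afm');
my %binary = metrics_option('/fonts/Synest-FB.pfm');

print join(' => ', %ascii),  "\n";
print join(' => ', %binary), "\n";
```

       The result can be passed straight through, as in
       "$pdf->psfont('/fonts/Synest-FB.pfb', %binary, dokern => 1)".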
1419
1420       Note: these T1 (Type1) fonts are not shipped with PDF::Builder, but are
1421       expected to be found on the machine with the PDF reader. Most PDF
1422       readers do not install T1 fonts, and it is up to the user of the PDF
1423       reader to install the needed fonts. Unlike TrueType fonts, PS (T1)
1424       fonts are not embedded in the PDF, and must be supplied on the Reader
1425       end.
1426
1427       See also PDF::Builder::Resource::Font::Postscript.
1428
1429       TrueType Fonts
1430
1431       Warning: BaseEncoding is not set by default for TrueType fonts, so text
1432       in the PDF isn't searchable (by the PDF reader) unless a ToUnicode CMap
1433       is included. PDF::Builder includes a ToUnicode CMap by default
1434       (unicodemap set to 1), but allows it to be disabled (for performance and
1435       file size reasons) by setting unicodemap to 0. Doing so produces
1436       non-searchable text, which, besides being annoying to users, may prevent
1437       screen readers and other aids to disabled users from working correctly!
1438
1439       Examples:
1440
1441           $font1 = $pdf->ttfont('Times.ttf');
1442           $font2 = $pdf->ttfont('Georgia.otf');
1443
1444       Valid %options are:
1445
1446       encode
1447           Changes the encoding of the font from its default
1448           (WinAnsiEncoding).
1449
1450           Note that for a single byte encoding (e.g., 'latin1'), you are
1451           limited to 256 characters defined for that encoding. 'automap' does
1452           not work with TrueType.  If you want more characters than that, use
1453           'utf8' encoding with a UTF-8 encoded text string.
1454
1455       isocmap
1456           Use the ISO Unicode Map instead of the default MS Unicode Map.
1457
1458       unicodemap
1459           If 1 (default), output ToUnicode CMap to permit text searches and
1460           screen readers. Set to 0 to save space by not including the
1461           ToUnicode CMap, but text searching and screen reading will not be
1462           possible.
1463
1464       dokern
1465           Enables kerning if data is available.
1466
1467       noembed
1468           Disables embedding of the font file. Note that this is potentially
1469           hazardous, as the glyphs provided on the PDF reader machine may not
1470           match what was used on the PDF writer machine (the one running
1471           PDF::Builder)! If you know for sure that all PDF readers will be
1472           using the same TTF or OTF file you're using with PDF::Builder, not
1473           embedding the font may be acceptable, in return for a smaller PDF
1474           file size. Note that the Reader needs to know where to find the
1475           font file -- it can't be in any random place, but typically needs
1476           to be listed in a path that the Reader follows. Otherwise, it will
1477           be unable to render the text!
1478
1479           The only value for the "noembed" flag currently checked for is 1,
1480           which means to not embed the font file in the PDF. Any other value
1481           currently results in the font file being embedded (by default),
1482           although in the future, other values might be given significance
1483           (such as checking permission bits).
1484
1485           Some additional comments on embedding font file(s) into the PDF:
1486           besides substantially increasing the size of the PDF (even if the
1487           font is subsetted, by default), PDF::Builder does not check the
1488           font file for any flags indicating font licensing issues and
1489           limitations on use. A font foundry may not permit embedding at all,
1490           may permit a subset of the font to be embedded, may permit a full
1491           font to be embedded, and may specify what can be done with an
1492           embedded font (e.g., may or may not be extracted for further use
1493           beyond displaying this one PDF). When you choose to use (and embed)
1494           a font, you should be aware of any such licensing issues.
1495
1496       nosubset
1497           Disables subsetting of a TTF/OTF font, when embedded. By default,
1498           only the glyphs used by a document are included in the file, and
1499           not the entire font.  This can result in a tremendous savings in
1500           PDF file size. If you intend to allow the PDF to be edited by
1501           users, not having the entire font glyph set available may cause
1502           problems, so be aware of that (and consider using "nosubset => 1").
1503           Setting this flag to any value results in the entire font glyph set
1504           being embedded in the file. It might be a good idea to use only the
1505           value 1, in case other values are assigned roles in the future.
1506
1507       debug
1508           If set to 1 (default is 0), diagnostic information is output about
1509           the CMap processing.
1510
1511       usecmf
1512           If set to 1 (default is 0), the first priority is to make use of
1513           one of the four ".cmap" files for CJK fonts. This is the old way of
1514           processing TTF files. If, after all is said and done, a working
1515           internal CMap hasn't been found (for usecmf=>0), ttfont() will fall
1516           back to using a ".cmap" file if possible.
1517
1518       cmaps
1519           This flag may be set to a string listing the Platform/Encoding
1520           pairs to look for of any internal CMaps in the font file, in the
1521           desired order (highest priority first). If one list (comma and/or
1522           space-separated pairs) is given, it is used for both Windows and
1523           non-Windows platforms (on which PDF::Builder is running, not the
1524           PDF reader's). Two lists, separated by a semicolon ; may be given,
1525           with the first being used for a Windows platform and the second for
1526           non-Windows. The default list is "0/6 3/10 0/4 3/1 0/3; 0/6 0/4
1527           3/10 0/3 3/1".  Finally, instead of a P/E list, a string "find_ms"
1528           may be given to tell it to simply call the Font::TTF find_ms()
1529           method to find a (preferably Windows) internal CMap. "cmaps" set to
1530           'find_ms' would emulate the old way of looking for CMaps. Symbol
1531           fonts (3/0) always use find_ms(), and the new default lookup is (if
1532           ".cmap" isn't used, see "usecmf") to try to get a match with the
1533           default list for the appropriate OS. If none can be found,
1534           find_ms() is tried, and as a last resort the ".cmap" file is used
1535           (if available), even if "usecmf" is not 1.
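
       To make the "cmaps" string format concrete, here is one way such a
       string could be split into Platform/Encoding pairs. This is an
       illustration of the format only, not the code ttfont() uses;
       "cmap_pairs" is a hypothetical name, and testing $^O for 'MSWin32' is
       an assumed way of detecting Windows.

```perl
use strict;
use warnings;

# Split a "cmaps" string into [platform, encoding] pairs for one kind of OS.
sub cmap_pairs {
    my ($cmaps, $is_windows) = @_;
    return () if $cmaps eq 'find_ms';     # special value, not a P/E list
    my @lists = split /;/, $cmaps;
    # One list serves both cases; with two, the first is for Windows.
    my $list = @lists > 1 && !$is_windows ? $lists[1] : $lists[0];
    return map { [ split m{/} ] } grep { length } split /[,\s]+/, $list;
}

my $default = '0/6 3/10 0/4 3/1 0/3; 0/6 0/4 3/10 0/3 3/1';
my @pairs   = cmap_pairs($default, $^O eq 'MSWin32');

print "search order: ", join(' ', map { "$_->[0]/$_->[1]" } @pairs), "\n";
```

       A shorter custom list, such as '3/10, 3/1', is returned unchanged for
       both kinds of platform because it contains no semicolon.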
1536
1537       CJK Fonts
1538
1539       Examples:
1540
1541           $font = $pdf->cjkfont('korean');
1542           $font = $pdf->cjkfont('traditional');
1543
1544       Valid %options are:
1545
1546       encode
1547           Changes the encoding of the font from its default.
1548
1549       Warning: Unlike "ttfont", the font file is not embedded in the output
1550       PDF file. This is evidently behavior left over from the early days of
1551       CJK fonts, where the "Cmap" and "Data" were always external files,
1552       rather than internal tables.  If you need a CJK-using PDF file to embed
1553       the font, for portability, you can create a PDF using "cjkfont", and
1554       then use an external utility (e.g., "pdfcairo") to embed the font in
1555       the PDF. It may also be possible to use "ttfont" instead, to produce
1556       the PDF, provided you can deduce the correct font file name from
1557       examining the PDF file (e.g., on my Windows system, the "Ming" font
1558       would be "$font = $pdf->ttfont("C:/Program Files/Adobe/Acrobat
1559       DC/Resource/CIDFont/AdobeMingStd-Light.otf")").  Of course, the font
1560       file used would have to be ".ttf" or ".otf".  It may act a little
1561       differently than "cjkfont" (due to a different Cmap), but you should be
1562       able to embed the font file into the PDF.
1563
1564       See also PDF::Builder::Resource::CIDFont::CJKFont.
1565
1566       Synthetic Fonts
1567
1568       Warning: BaseEncoding is not set by default for these fonts, so text in
1569       the PDF isn't searchable (by the PDF reader) unless a ToUnicode CMap is
1570       included. PDF::Builder includes a ToUnicode CMap by default (unicodemap
1571       set to 1), but allows it to be disabled (for performance and file size
1572       reasons) by setting unicodemap to 0. Doing so produces non-searchable
1573       text, which, besides being annoying to users, may prevent screen
1574       readers and other aids to disabled users from working correctly!
1575
1576       Examples:
1577
1578           $cf  = $pdf->corefont('Times-Roman', encode => 'latin1');
1579           $sf  = $pdf->synfont($cf, condense => 0.85);   # condensed to 85% width
1580           $sfb = $pdf->synfont($cf, bold => 1);          # embolden by 10em
1581           $sfi = $pdf->synfont($cf, oblique => -12);     # italic at -12 degrees
1582
1583       Valid %options are:
1584
1585       condense
1586           Character width condense/expand factor (0.1-0.9 = condense, 1 =
1587           normal/default, 1.1+ = expand). It is the multiplier to apply to
1588           the width of each character.
1589
1590       oblique
1591           Italic angle (+/- degrees, default 0), sets skew of character box.
1592
1593       bold
1594           Emboldening factor (0.1+, bold = 1, heavy = 2, ...), additional
1595           thickness to draw outline of character (with a heavier line width)
1596           before filling.
1597
1598       space
1599           Additional character spacing in milliems (0-1000).
1600
1601       caps
1602           0 for normal text, 1 for small caps.  Implemented by asking the
1603           font what the uppercased translation (single character) is for a
1604           given character, and outputting it at 80% height and 88% width
1605           (heavier vertical stems are better looking than a straight 80%
1606           scale).
1607
1608           Note that only lower case letters which appear in the "standard"
1609           font (plane 0 for core fonts and PS fonts) will be small-capped.
1610           This may include eszett (German sharp s), which becomes SS, and
1611           dotless i and j which become I and J respectively. There are many
1612           other accented Latin alphabet letters which may show up in planes 1
1613           and higher. Ligatures (e.g., ij and ffl) do not have uppercase
1614           equivalents, nor does a long s. If you have text which includes
1615           such characters, you may want to consider preprocessing it to
1616           replace them with Latin character expansions (e.g., i+j and f+f+l)
1617           before small-capping.
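
       The preprocessing suggested for small caps might look like the
       following sketch. The table covers only the Latin f-ligatures, the ij
       ligature, and the long s; replacing a long s with a plain "s" is an
       assumption, and "expand_ligatures" is a hypothetical helper name.

```perl
use strict;
use warnings;

# Ligature (or long s) code point => plain Latin letters to use instead.
my %expansion = (
    "\x{FB00}" => 'ff',  "\x{FB01}" => 'fi',  "\x{FB02}" => 'fl',
    "\x{FB03}" => 'ffi', "\x{FB04}" => 'ffl',
    "\x{0133}" => 'ij',                  # ij ligature
    "\x{017F}" => 's',                   # long s (assumed replacement)
);
my $class = join '', sort keys %expansion;

# Replace each listed character so that every letter has an uppercase form.
sub expand_ligatures {
    my ($text) = @_;
    $text =~ s/([$class])/$expansion{$1}/g;
    return $text;
}

print expand_ligatures("o\x{FB04}ine d\x{0133}k"), "\n";
```

       The expanded string is then written with the small caps synthetic font
       in place of the original text.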
1618
1619       Note that CJK fonts (created with the "cjkfont" method) do not work
1620       properly with "synfont". This is due to a different internal structure
1621       of the CJK fonts, as compared to corefont, ttfont, and psfont base
1622       fonts.  If you require a synthesized (modified) CJK font, you might try
1623       finding the TTF or OTF original, use "ttfont" to create the base font,
1624       and running "synfont" against that, in the manner described for
1625       embedding "CJK Fonts".
1626
1627       See also PDF::Builder::Resource::Font::SynFont.
1628
1629   IMAGE METHODS
1630       This is additional information on enhanced libraries available for TIFF
1631       and PNG images. See specific information listings for GD, GIF, JPEG,
1632       and PNM image formats. In addition, see "examples/Content.pl" for an
1633       example of placing an image on a page, as well as using it in a "Form".
1634
1635       Why is my image flipped or rotated?
1636
1637       A problem seen fairly often when using JPEG photos in a PDF is that
1638       the images will be rotated and/or mirrored (flipped). This may happen
1639       when using TIFF images too. What happens is that the camera stores an
1640       image just as it comes off the CCD sensor, regardless of the camera
1641       orientation, and does not rotate it to the correct orientation! It does
1642       store a separate "orientation" flag to suggest how the image might be
1643       corrected, but not all image processing obeys this flag (PDF::Builder
1644       does not). For example, if you take a "portrait" (tall) photo of a
1645       tree (with the phone held vertically), and then use it in a PDF, the
1646       tree may appear to have been cut down (it appears in landscape mode)!
1647
1648       I have found some code that should allow the "image_jpeg" or "image"
1649       routine to auto-rotate to (supposedly) the correct orientation, by
1650       looking for the Exif metadata "Orientation" tag in the file. However,
1651       three problems arise: 1) if a photo has been edited, and rotated or
1652       flipped in the process, there is no guarantee that the Orientation tag
1653       has been corrected.  2) more than one Orientation tag may exist (e.g.,
1654       in the binary APP1/Exif header, and in XML data), and they may not
1655       agree with each other -- which should be used?  3) the code would need
1656       to uncompress the raster data, swap and/or transpose rows and/or
1657       columns, and recompress the raster data for inclusion into the PDF.
1658       This is costly and error-prone.  In any case, the user would need to be
1659       able to override any auto-rotate function.
1660
1661       For the time being, PDF::Builder will simply leave it up to the user of
1662       the library to take care of rotating and/or flipping an image which
1663       displays incorrectly. It is possible that we will consider adding some
1664       sort of query or warning that the image appears to not be "normally"
1665       oriented (Orientation value 1 or "Top-left"), according to the
1666       Orientation flag. You can consider either (re-)saving the photo in an
1667       editor such as PhotoShop or GIMP, or using PDF::Builder code similar to
1668       the following (for images rotated 180 degrees):
1669
1670           $pW = 612; $pH = 792;  # page dimensions (US Letter)
1671           my $img = $pdf->image_jpeg("AliceLake.jpeg");
1672           # raw size WxH 4032x3024, scaled down to 504x378
1673           $sW = 4032/8; $sH = 3024/8;
1674           # intent is to center on US Letter sized page (LL at 54,207)
1675           # Orientation flag on this image is 3 (rotated 180 degrees).
1676           # if naively displayed (just $gfx->image call), it will be upside down
1677
1678           $gfx->save();
1679
1680           ## method 0: simple display, is rotated 180 degrees!
1681           #$gfx->image($img, ($pW-$sW)/2,($pH-$sH)/2, $sW,$sH);
1682
1683           ## method 1: translate, then rotate
1684           #$gfx->translate($pW,$pH);             # to new origin (media UR corner)
1685           #$gfx->rotate(180);                    # rotate around new origin
1686           #$gfx->image($img, ($pW-$sW)/2,($pH-$sH)/2, $sW,$sH);
1687                                                  # image's UR corner, not LL
1688
1689           # method 2: rotate, then translate
1690           $gfx->rotate(180);                     # rotate around current origin
1691           $gfx->translate(-$sW,-$sH);            # translate in rotated coordinates
1692           $gfx->image($img, -($pW-$sW)/2,-($pH-$sH)/2, $sW,$sH);
1693                                                  # image's UR corner, not LL
1694
1695           ## method 3: flip (mirror) twice
1696           #$scale = 1;  # not rescaling here
1697           #$size_page = $pH/$scale;
1698           #$invScale = 1.0/$scale;
1699           #$gfx->add("-$invScale 0 0 -$invScale 0 $size_page cm");
1700           #$gfx->image($img, -($pW-$sW)/2-$sW,($pH-$sH)/2, $sW,$sH);
1701
1702           $gfx->restore();
1703
1704       If your image is also mirrored (flipped about an axis), simple rotation
1705       will not suffice. You could do something with a reversal of the
1706       coordinate system, as in "method 3" above (see "Advanced Methods" in
1707       PDF::Builder::Content). To mirror only left/right, the second $invScale
1708       would be positive; to mirror only top/bottom, the first would be
1709       positive. If all else fails, you could save a mirrored copy in a photo
1710       editor.  90 or 270 degree rotations will require a "rotate" call,
1711       possibly with "cm" usage to reverse mirroring.  Incidentally, do not
1712       confuse this issue with the coordinate flipping performed by some
1713       Chrome browsers when printing a page to PDF.
1714
1715       Note that TIFF images may have the same rotation/mirroring problems as
1716       JPEG, which is not surprising, as the Exif format was lifted from TIFF
1717       for use in JPEG. The cure will be similar to JPEG's.
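
       If you already know an image's Exif Orientation value (read with some
       other tool; PDF::Builder does not supply it), a lookup table can tell
       you which kind of correction to expect. The table below lists the eight
       values defined by Exif; "orientation_advice" is a hypothetical helper,
       and it only names the kind of fix rather than applying it.

```perl
use strict;
use warnings;

# Exif Orientation values: 2, 4, 5, and 7 involve mirroring; 5 through 8
# involve a quarter turn, which swaps the image's width and height.
my %orientation = (
    1 => { name => 'Top-left',     mirrored => 0, quarter_turn => 0 },
    2 => { name => 'Top-right',    mirrored => 1, quarter_turn => 0 },
    3 => { name => 'Bottom-right', mirrored => 0, quarter_turn => 0 },
    4 => { name => 'Bottom-left',  mirrored => 1, quarter_turn => 0 },
    5 => { name => 'Left-top',     mirrored => 1, quarter_turn => 1 },
    6 => { name => 'Right-top',    mirrored => 0, quarter_turn => 1 },
    7 => { name => 'Right-bottom', mirrored => 1, quarter_turn => 1 },
    8 => { name => 'Left-bottom',  mirrored => 0, quarter_turn => 1 },
);

sub orientation_advice {
    my ($value) = @_;
    my $o = $orientation{$value}
        or return "unknown Orientation value";
    my @steps;
    push @steps, '180 degree rotation'       if $value == 3;
    push @steps, '90 or 270 degree rotation' if $o->{quarter_turn};
    push @steps, 'mirroring'                 if $o->{mirrored};
    return "$o->{name}: "
         . (@steps ? 'needs ' . join(' and ', @steps) : 'display as is');
}

print "$_ = ", orientation_advice($_), "\n" for 1 .. 8;
```

       Value 3 corresponds to the 180 degree case handled by the "rotate" and
       "translate" methods above; the mirrored values call for a "cm" reversal
       of the coordinate system in addition to any rotation.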
1718
1719       TIFF Images
1720
1721       Note that the Graphics::TIFF support library does not currently permit
1722       a filehandle for $file.
1723
1724       PDF::Builder will use the Graphics::TIFF support library for TIFF
1725       functions, if it is available, unless explicitly told not to. Your code
1726       can test whether Graphics::TIFF is available by examining
1727       "$tiff->usesLib()" or "$pdf->LA_GT()".
1728
1729       = -1
1730           Graphics::TIFF is installed, but your code has specified "nouseGT",
1731           to not use it. The old, pure Perl, code (buggy!) will be used
1732           instead, as if Graphics::TIFF was not installed.
1733
1734       = 0 Graphics::TIFF is not installed. Not all systems are able to
1735           successfully install this package, as it requires libtiff.a.
1736
1737       = 1 Graphics::TIFF is installed and is being used.
1738
1739       Options:
1740
1741       nouseGT => 1
1742           Do not use the Graphics::TIFF library, even if it's available.
1743           Normally you would want to use this library, but there may be cases
1744           where you don't, such as when you want to use a file handle instead
1745           of a name.
1746
1747       silent => 1
1748           Do not give the message that Graphics::TIFF is not installed. This
1749           message will be given only once, but you may want to suppress it,
1750           such as during t-tests.
1751
1752       PNG Images
1753
1754       PDF::Builder will use the Image::PNG::Libpng support library for PNG
1755       functions, if it is available, unless explicitly told not to. Your code
1756       can test whether Image::PNG::Libpng is available by examining
1757       "$png->usesLib()" or "$pdf->LA_IPL()".
1758
1759       = -1
1760           Image::PNG::Libpng is installed, but your code has specified
1761           "nouseIPL", to not use it. The old, pure Perl, code (slower and
1762           less capable) will be used instead, as if Image::PNG::Libpng was
1763           not installed.
1764
1765       = 0 Image::PNG::Libpng is not installed. Not all systems are able to
1766           successfully install this package, as it requires libpng.a.
1767
1768       = 1 Image::PNG::Libpng is installed and is being used.
1769
1770       Options:
1771
1772       nouseIPL => 1
1773           Do not use the Image::PNG::Libpng library, even if it's available.
1774           Normally you would want to use this library, when available, but
1775           there may be cases where you don't.
1776
1777       silent => 1
1778           Do not give the message that Image::PNG::Libpng is not installed.
1779           This message will be given only once, but you may want to suppress
1780           it, such as during t-tests.
1781
1782       notrans => 1
1783           No transparency -- ignore tRNS chunk if provided, ignore Alpha
1784           channel if provided.
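
       The status values listed for both libraries can be turned into
       readable diagnostics. The following is only a sketch: the helper
       name lib_status_text() is hypothetical (it is not part of
       PDF::Builder), and the guarded section runs only if PDF::Builder
       itself can be loaded.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Builder): translate the
# -1 / 0 / 1 status value described above into a message.
sub lib_status_text {
    my ($lib, $status) = @_;
    return "$lib is installed but was disabled by request" if $status == -1;
    return "$lib is not installed"                          if $status ==  0;
    return "$lib is installed and in use"                   if $status ==  1;
    return "$lib returned unexpected status '$status'";
}

# Query availability only if PDF::Builder can be loaded on this system.
if (eval { require PDF::Builder; 1 }) {
    my $pdf = PDF::Builder->new();
    print lib_status_text('Graphics::TIFF',     $pdf->LA_GT()),  "\n";
    print lib_status_text('Image::PNG::Libpng', $pdf->LA_IPL()), "\n";
}
```

       For a specific image object, the same helper could be given the
       value of "$tiff->usesLib()" or "$png->usesLib()", which also
       reflects a "nouseGT" or "nouseIPL" request.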
1785
1786   USING SHAPER (HarfBuzz::Shaper library)
           # assumes $pdf (PDF::Builder object) and $text (text content
           # object for a page) have already been created

           # if HarfBuzz::Shaper is not installed, either bail out, or try to
           # use regular TTF calls instead
           my $rc;
           $rc = eval {
               require HarfBuzz::Shaper;
               1;
           };
           if (!defined $rc) { $rc = 0; }
           if ($rc == 0) {
               # bail out in some manner
           } else {
               # can use Shaper
           }

           my $fontfile = '/WINDOWS/Fonts/times.ttf'; # used by both Shaper and textHS
           my $fontsize = 15;                         # used by both Shaper and textHS
           my $font = $pdf->ttfont($fontfile);
           $text->font($font, $fontsize);

           my $hb = HarfBuzz::Shaper->new(); # only need to set up once
           my %settings; # for textHS(), not Shaper
           $settings{'dump'} = 1; # see the diagnostics
           $settings{'script'} = 'Latn';
           $settings{'dir'} = 'L';  # LTR
           $settings{'features'} = ();  # required

           # -- set language (override automatic setting)
           #$settings{'language'} = 'en';
           #$hb->set_language( 'en_US' );
           # -- turn OFF ligatures (a leading '-' disables a feature in Shaper)
           #push @{ $settings{'features'} }, 'liga';
           #$hb->add_features( '-liga' );
           # -- turn OFF kerning
           #push @{ $settings{'features'} }, 'kern';
           #$hb->add_features( '-kern' );
           $hb->set_font($fontfile);
           $hb->set_size($fontsize);
           $hb->set_text("Let's eat waffles in the field for brunch.");
             # expect ffl and fi ligatures, and perhaps some kerning

           my $info = $hb->shaper();
           $text->textHS($info, \%settings); # strikethru, underline allowed
1829
1830       The package HarfBuzz::Shaper may be optionally installed in order to
1831       use the text-shaping capabilities of the HarfBuzz library. These
1832       include kerning and ligatures in Western scripts (such as the Latin
1833       alphabet). More complex scripts can be handled, such as Arabic family
1834       and Indic scripts, where multiple forms of a character may be
1835       automatically selected, characters may be reordered, and other
1836       modifications made. The examples/HarfBuzz.pl script gives some examples
1837       of what may be done.
1838
1839       Keep in mind that HarfBuzz works only with TrueType (.ttf) and OpenType
1840       (.otf) font files. It will not work with PostScript (Type1), core,
1841       bitmapped, or CJK fonts. Not all .ttf fonts have the instructions
1842       necessary to guide HarfBuzz, but most proper .otf fonts do. In other
1843       words, there are no guarantees that a particular font file will work
1844       with Shaper!
1845
1846       The basic idea is to break up text into "chunks" which are of the same
1847       script (alphabet), language, direction, font face, font size, and
1848       variant (italic, bold, etc.). These could range from a single character
1849       to paragraph-length strings of text. These are fed to HarfBuzz::Shaper,
1850       along with flags, the font file to be used, and other supporting
1851       information, to create an array of output glyphs. Each element is a
1852       hash describing the glyph to be output, including its name (if
1853       available), its glyph ID (number) in the selected font, its x and y
1854       displacement (usually 0), and its "advance" x and y values, all in
1855       points. For horizontal languages (LTR and RTL), the y advance is
1856       normally 0 and the x advance is the font's character width, less any
1857       kerning amount.
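
       To make the roles of displacement and advance concrete, here is a
       minimal sketch that walks a glyph array and computes where each
       glyph would land. The helper glyph_positions() is hypothetical, the
       data is hand-made rather than real Shaper output, and the hash keys
       'g', 'name', and 'ay' are assumed to match what HarfBuzz::Shaper
       returns.

```perl
use strict;
use warnings;

# Hypothetical helper: starting at ($x, $y), return [glyph ID, x, y]
# for every element. The displacement (dx, dy) shifts only the glyph
# being drawn; the advance (ax, ay) moves the pen for the next glyph.
sub glyph_positions {
    my ($x, $y, $glyphs) = @_;
    my @placed;
    for my $el (@{$glyphs}) {
        push @placed, [ $el->{'g'}, $x + $el->{'dx'}, $y + $el->{'dy'} ];
        $x += $el->{'ax'};
        $y += $el->{'ay'};
    }
    return @placed;
}

# Hand-made elements in the shape described above (values are made up)
my @glyphs = (
    { name => 'T', g => 55, dx => 0, dy => 0,   ax => 9,   ay => 0 },
    { name => 'o', g => 82, dx => 0, dy => 0.5, ax => 7.5, ay => 0 },
    { name => 'p', g => 83, dx => 0, dy => 0,   ax => 7.5, ay => 0 },
);

for my $p (glyph_positions(72, 700, \@glyphs)) {
    printf "glyph %d at (%.1f, %.1f)\n", @{$p};
}
```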
1858
       Shaper will attempt to determine the script and the text direction
       from the Unicode range of the text, and will make a reasonable guess
       at the language used. The language can be overridden, but currently
       the script and text direction cannot.
1863
1864       An important note: the number of glyphs (array elements) may not be
1865       equal to the number of Unicode points (characters) given in the chunk's
1866       text string!  Sometimes a character will be decomposed into several
1867       pieces (multiple glyphs); sometimes multiple characters may be combined
1868       into a single ligature glyph; and characters may be reordered
1869       (especially in Indic and Southeast Asian languages).  As well, for
1870       Right-to-Left (bidirectional) scripts such as Hebrew and Arabic
1871       families, the text is output in Left-to-Right order (reversed from the
1872       input).
1873
1874       With due care, a Shaper array can be manipulated in code. The elements
1875       are more or less independent of each other, so elements can be
1876       modified, rearranged, inserted, or deleted. You might adjust the
1877       position of a glyph with 'dx' and 'dy' hash elements. The 'ax' value
1878       should be left alone, so that the wrong kerning isn't calculated, but
1879       you might need to adjust the "advance x" value by means of one of the
1880       following:
1881
       axs
           A value (in points) to be substituted for 'ax'.

       axsp
           A substitute value, given as a percentage of the original 'ax'.

       axr
           Reduces 'ax' by the given value (in points). If negative,
           increases 'ax'.

       axrp
           Reduces 'ax' by the given percentage. Again, a negative value
           increases 'ax'.
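
       The arithmetic behind these four adjustments can be sketched as
       follows. The helper effective_ax() is hypothetical, and the order
       in which the keys are checked is an assumption made for this
       illustration; textHS() applies its own rules internally.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDF::Builder): compute the advance
# implied by one glyph element's adjustment key, leaving 'ax' untouched.
# Assumed precedence: axs, axsp, axr, axrp (the first one present wins).
sub effective_ax {
    my ($glyph) = @_;
    my $ax = $glyph->{'ax'};
    return $glyph->{'axs'}                    if defined $glyph->{'axs'};
    return $ax * $glyph->{'axsp'} / 100       if defined $glyph->{'axsp'};
    return $ax - $glyph->{'axr'}              if defined $glyph->{'axr'};
    return $ax * (1 - $glyph->{'axrp'} / 100) if defined $glyph->{'axrp'};
    return $ax;
}

# Hand-made elements, not real Shaper output
my @glyphs = (
    { g => 1, ax => 10 },              # no adjustment
    { g => 2, ax => 10, axs  => 8 },   # substitute 8 points
    { g => 3, ax => 10, axsp => 50 },  # 50% of the original advance
    { g => 4, ax => 10, axr  => -2 },  # negative reduction widens
    { g => 5, ax => 10, axrp => 25 },  # reduce by 25%
);
printf "glyph %d: advance %.2f\n", $_->{'g'}, effective_ax($_) for @glyphs;
```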
1887
       Caution: a given character's glyph ID is not necessarily going to be
       the same between any two fonts! For example, an ASCII space (U+0020)
       might be "<0001>" in one font, and "<0003>" in another font (even one
       closely related!). A U+00A0 required blank (non-breaking space) may be
       output as a regular ASCII space U+0020. Take care if you need to find a
       particular glyph in the array, especially if the number of elements
       does not match the number of characters. Consider making a text string
       of "marker" characters (space, nbsp, hyphen, soft hyphen, etc.) and
       processing it through HarfBuzz::Shaper to get the corresponding glyph
       numbers. You may have to count spaces, say, to see where you could
       break a glyph array to fit a line.
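
       One way to look for such a break point is sketched below. It
       assumes left-to-right text, a space glyph ID already found by
       shaping a marker string, and elements carrying 'g' (glyph ID) and
       'ax' keys. The helper find_break() and the sample data are
       hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the index of the last space glyph such
# that all glyphs before it fit within $max_width points, or -1 if no
# space qualifies. Intended for a chunk already known to be too wide.
sub find_break {
    my ($glyphs, $space_gid, $max_width) = @_;
    my ($width, $break) = (0, -1);
    for my $i (0 .. $#{$glyphs}) {
        my $el = $glyphs->[$i];
        # everything before this space fits, so it is a candidate
        $break = $i if $el->{'g'} == $space_gid && $width <= $max_width;
        $width += $el->{'ax'};
        last if $width > $max_width;
    }
    return $break;
}

# Hand-made array standing for "ab cd e"; glyph ID 3 plays the space
my @glyphs = map { { g => $_->[0], ax => $_->[1] } }
    ([10, 5], [11, 5], [3, 3], [12, 5], [13, 5], [3, 3], [14, 5]);

my $at = find_break(\@glyphs, 3, 20);
if ($at >= 0) {
    my @line = @glyphs[0 .. $at - 1];          # glyphs for this line
    my @rest = @glyphs[$at + 1 .. $#glyphs];   # carried to the next line
    printf "break at index %d: %d glyphs now, %d later\n",
        $at, scalar @line, scalar @rest;
}
```

       Each piece would then be passed to textHS() separately, since that
       method renders a single line.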
1899
1900       The advancewidthHS() method uses the same inputs as does textHS().
1901       Like advancewidth(), it returns the chunk length in points. Unlike
1902       advancewidth(), you cannot override the glyph array's font, font size,
1903       etc.
1904
       Once you have your (possibly modified) array of glyphs, you feed it to
       the textHS() method to render it to the page. Remember that this method
       handles only a single line of text; it does not do line splitting or
       fitting, which you currently need to do manually. For Western scripts
       (e.g., Latin), that might not be too difficult. For scripts that
       involve extensive modification of the raw characters, splitting within
       words may be quite difficult, although you may still be able to split
       at inter-word spaces.
1913
       A useful, but not exhaustive, set of functions is supported by
       textHS(). Support includes direction setting (top-to-bottom and
       bottom-to-top directions, e.g., for Far Eastern languages in
       traditional orientation), and explicit script names and language
       (depending on what support HarfBuzz itself gives). Not yet supported
       are features such as discretionary ligatures and manual selection of
       glyphs (e.g., swashes and alternate forms).
1921
1922       Currently, textHS() can only handle a single text string. We are
1923       looking at how fitting to a line length (splitting up an array) could
1924       be done, as well as how words might be split on hard and soft hyphens.
1925       At some point, full paragraph and page shaping could be possible.
1926
1927
1928
1929perl v5.38.0                      2023-07-21             PDF::Builder::Docs(3)