PERLCOMPILE(1)         Perl Programmers Reference Guide         PERLCOMPILE(1)

NAME

       perlcompile - Introduction to the Perl Compiler-Translator

DESCRIPTION

       Perl has always had a compiler: your source is compiled into an
       internal form (a parse tree) which is then optimized before being run.
       Since version 5.005, Perl has shipped with a module capable of
       inspecting the optimized parse tree ("B"), and this has been used to
       write many useful utilities, including a module that lets you turn your
       Perl into C source code that can be compiled into a native executable.

       The "B" module provides access to the parse tree, and other modules
       ("back ends") do things with the tree.  Some write it out as bytecode,
       C source code, or a semi-human-readable text.  Another traverses the
       parse tree to build a cross-reference of which subroutines, formats,
       and variables are used where.  Another checks your code for dubious
       constructs.  Yet another back end dumps the parse tree back out as Perl
       source, acting as a source code beautifier or deobfuscator.

       Because its original purpose was to be a way to produce C code
       corresponding to a Perl program, and in turn a native executable, the
       "B" module and its associated back ends are known as "the compiler",
       even though they don't really compile anything.  Different parts of the
       compiler are more accurately a "translator", or an "inspector", but
       people want Perl to have a "compiler option" not an "inspector gadget".
       What can you do?

       This document covers the use of the Perl compiler: which modules it
       comprises, how to use the most important of the back end modules, what
       problems there are, and how to work around them.

       Layout

       The compiler back ends are in the "B::" hierarchy, and the front-end
       (the module that you, the user of the compiler, will sometimes interact
       with) is the O module.  Some back ends (e.g., "B::C") have programs
       (e.g., perlcc) to hide the modules' complexity.

       Here are the important back ends to know about, with their status
       expressed as a number from 0 (outline for later implementation) to 10
       (if there's a bug in it, we're very surprised):

       B::Bytecode
           Stores the parse tree in a machine-independent format, suitable for
           later reloading through the ByteLoader module.  Status: 5 (some
           things work, some things don't, some things are untested).

       B::C
           Creates a C source file containing code to rebuild the parse tree
           and resume the interpreter.  Status: 6 (many things work
           adequately, including programs using Tk).

       B::CC
           Creates a C source file corresponding to the run time code path in
           the parse tree.  This is the closest to a Perl-to-C translator
           there is, but the code it generates is almost incomprehensible
           because it translates the parse tree into a giant switch structure
           that manipulates Perl structures.  Eventual goal is to reduce
           (given sufficient type information in the Perl program) some of the
           Perl data structure manipulations into manipulations of C-level
           ints, floats, etc.  Status: 5 (some things work, including
           uncomplicated Tk examples).

       B::Lint
           Complains if it finds dubious constructs in your source code.
           Status: 6 (it works adequately, but only has a very limited number
           of areas that it checks).

       B::Deparse
           Recreates the Perl source, making an attempt to format it
           coherently.  Status: 8 (it works nicely, but a few obscure things
           are missing).

       B::Xref
           Reports on the declaration and use of subroutines and variables.
           Status: 8 (it works nicely, but still has a few lingering bugs).

Using The Back Ends

       The following sections describe how to use the various compiler back
       ends.  They're presented roughly in order of maturity, so that the most
       stable and proven back ends are described first, and the most
       experimental and incomplete back ends are described last.

       The O module automatically enables the -c flag to Perl, which prevents
       Perl from executing your code once it has been compiled.  This is why
       all the back ends print:

         myperlprogram syntax OK

       before producing any other output.
95
       The Cross Referencing Back End

       The cross referencing back end (B::Xref) produces a report on your
       program, breaking down declarations and uses of subroutines and
       variables (and formats) by file and subroutine.  For instance, here's
       part of the report from the pod2man program that comes with Perl:

         Subroutine clear_noremap
           Package (lexical)
             $ready_to_print   i1069, 1079
           Package main
             $&                1086
             $.                1086
             $0                1086
             $1                1087
             $2                1085, 1085
             $3                1085, 1085
             $ARGV             1086
             %HTML_Escapes     1085, 1085

       This shows the variables used in the subroutine "clear_noremap".  The
       variable $ready_to_print is a my() (lexical) variable, introduced
       (first declared with my()) on line 1069, and used on line 1079.  The
       variable $& from the main package is used on 1086, and so on.

       A line number may be prefixed by a single letter:

       i   Lexical variable introduced (declared with my()) for the first
           time.

       &   Subroutine or method call.

       s   Subroutine defined.

       r   Format defined.

       The most useful option the cross referencer has is to save the report
       to a separate file.  For instance, to save the report on myperlprogram
       to the file report:

         $ perl -MO=Xref,-oreport myperlprogram
137
       The Decompiling Back End

       The Deparse back end turns your Perl source back into Perl source.  It
       can reformat along the way, making it useful as a de-obfuscator.  The
       most basic way to use it is:

         $ perl -MO=Deparse myperlprogram

       You'll notice immediately that Perl has no idea of how to paragraph
       your code.  You'll have to separate chunks of code from each other with
       newlines by hand.  However, watch what it will do with one-liners:

         $ perl -MO=Deparse -e '$op=shift||die "usage: $0
         code [...]";chomp(@ARGV=<>)unless@ARGV; for(@ARGV){$was=$_;eval$op;
         die$@ if$@; rename$was,$_ unless$was eq $_}'
         -e syntax OK
         $op = shift @ARGV || die("usage: $0 code [...]");
         chomp(@ARGV = <ARGV>) unless @ARGV;
         foreach $_ (@ARGV) {
             $was = $_;
             eval $op;
             die $@ if $@;
             rename $was, $_ unless $was eq $_;
         }

       The decompiler has several options for the code it generates.  For
       instance, you can set the size of each indent from 4 (as above) to 2
       with:

         $ perl -MO=Deparse,-si2 myperlprogram

       The -p option adds parentheses where normally they are omitted:

         $ perl -MO=Deparse -e 'print "Hello, world\n"'
         -e syntax OK
         print "Hello, world\n";
         $ perl -MO=Deparse,-p -e 'print "Hello, world\n"'
         -e syntax OK
         print("Hello, world\n");

       See B::Deparse for more information on the formatting options.
179
       The Lint Back End

       The lint back end (B::Lint) inspects programs for poor style.  One
       programmer's bad style is another programmer's useful tool, so options
       let you select what is complained about.

       To run the style checker across your source code:

         $ perl -MO=Lint myperlprogram

       To disable context checks and undefined subroutines:

         $ perl -MO=Lint,-context,-undefined-subs myperlprogram

       See B::Lint for information on the options.

       The Simple C Back End

       This module saves the internal compiled state of your Perl program to a
       C source file, which can be turned into a native executable for that
       particular platform using a C compiler.  The resulting program links
       against the Perl interpreter library, so it will not save you disk
       space (unless you build Perl with a shared library) or program size.
       It may, however, save you startup time.

       The "perlcc" tool generates such executables by default.

         perlcc myperlprogram.pl

       The Bytecode Back End

       This back end is only useful if you also have a way to load and execute
       the bytecode that it produces.  The ByteLoader module provides this
       functionality.

       To turn a Perl program into executable byte code, you can use "perlcc"
       with the "-B" switch:

         perlcc -B myperlprogram.pl

       The byte code is machine independent, so once you have a compiled
       module or program, it is as portable as Perl source (assuming that the
       user of the module or program has a modern-enough Perl interpreter to
       decode the byte code).

       See B::Bytecode for information on options to control the optimization
       and nature of the code generated by the Bytecode module.

       The Optimized C Back End

       The optimized C back end will turn your Perl program's run time
       code-path into an equivalent (but optimized) C program that manipulates
       the Perl data structures directly.  The program will still link against
       the Perl interpreter library, to allow for eval(), "s///e", "require",
       etc.

       The "perlcc" tool generates such executables when using the -O switch.
       To compile a Perl program (ending in ".pl" or ".p"):

         perlcc -O myperlprogram.pl

       To produce a shared library from a Perl module (ending in ".pm"):

         perlcc -O Myperlmodule.pm

       For more information, see perlcc and B::CC.

Module List for the Compiler Suite

       B   This module is the introspective ("reflective" in Java terms)
           module, which allows a Perl program to inspect its innards.  The
           back end modules all use this module to gain access to the compiled
           parse tree.  You, the user of a back end module, will not need to
           interact with B.

       O   This module is the front-end to the compiler's back ends.  Normally
           called something like this:

             $ perl -MO=Deparse myperlprogram

           This is like saying "use O 'Deparse'" in your Perl program.

       B::Asmdata
           This module is used by the B::Assembler module, which is in turn
           used by the B::Bytecode module, which stores a parse-tree as
           bytecode for later loading.  It's not a back end itself, but rather
           a component of a back end.

       B::Assembler
           This module turns a parse-tree into data suitable for storing and
           later decoding back into a parse-tree.  It's not a back end itself,
           but rather a component of a back end.  It's used by the assemble
           program that produces bytecode.

       B::Bblock
           This module is used by the B::CC back end.  It walks "basic
           blocks".  A basic block is a series of operations which is known to
           execute from start to finish, with no possibility of branching or
           halting.

       B::Bytecode
           This module is a back end that generates bytecode from a program's
           parse tree.  This bytecode is written to a file, from where it can
           later be reconstructed back into a parse tree.  The goal is to do
           the expensive program compilation once, save the interpreter's
           state into a file, and then restore the state from the file when
           the program is to be executed.  See "The Bytecode Back End" for
           details about usage.

       B::C
           This module writes out C code corresponding to the parse tree and
           other interpreter internal structures.  You compile the
           corresponding C file, and get an executable file that will restore
           the internal structures and the Perl interpreter will begin running
           the program.  See "The Simple C Back End" for details about usage.

       B::CC
           This module writes out C code corresponding to your program's
           operations.  Unlike the B::C module, which merely stores the
           interpreter and its state in a C program, the B::CC module makes a
           C program that does not involve the interpreter.  As a consequence,
           programs translated into C by B::CC can execute faster than normal
           interpreted programs.  See "The Optimized C Back End" for details
           about usage.

       B::Concise
           This module prints a concise (but complete) version of the Perl
           parse tree.  Its output is more customizable than that of B::Terse
           or B::Debug (and it can emulate them).  This module is useful for
           people who are writing their own back end, or who are learning
           about the Perl internals.  It's not useful to the average
           programmer.

       B::Debug
           This module dumps the Perl parse tree in verbose detail to STDOUT.
           It's useful for people who are writing their own back end, or who
           are learning about the Perl internals.  It's not useful to the
           average programmer.

       B::Deparse
           This module produces Perl source code from the compiled parse tree.
           It is useful in debugging and deconstructing other people's code,
           and also as a pretty-printer for your own source.  See "The
           Decompiling Back End" for details about usage.

       B::Disassembler
           This module turns bytecode back into a parse tree.  It's not a back
           end itself, but rather a component of a back end.  It's used by the
           disassemble program that comes with the bytecode.

       B::Lint
           This module inspects the compiled form of your source code for
           things which, while some people frown on them, aren't necessarily
           bad enough to justify a warning.  For instance, use of an array in
           scalar context without explicitly saying "scalar(@array)" is
           something that Lint can identify.  See "The Lint Back End" for
           details about usage.

       B::Showlex
           This module prints out the my() variables used in a function or a
           file.  To get a list of the my() variables used in the subroutine
           mysub() defined in the file myperlprogram:

             $ perl -MO=Showlex,mysub myperlprogram

           To get a list of the my() variables used in the file myperlprogram:

             $ perl -MO=Showlex myperlprogram

           [BROKEN]

       B::Stackobj
           This module is used by the B::CC module.  It's not a back end
           itself, but rather a component of a back end.

       B::Stash
           This module is used by the perlcc program, which compiles a module
           into an executable.  B::Stash prints the symbol tables in use by a
           program, and is used to prevent B::CC from producing C code for the
           B::* and O modules.  It's not a back end itself, but rather a
           component of a back end.

       B::Terse
           This module prints the contents of the parse tree, but without as
           much information as B::Debug.  For comparison, "print "Hello,
           world."" produced 96 lines of output from B::Debug, but only 6
           from B::Terse.

           This module is useful for people who are writing their own back
           end, or who are learning about the Perl internals.  It's not useful
           to the average programmer.

       B::Xref
           This module prints a report on where the variables, subroutines,
           and formats are defined and used within a program and the modules
           it loads.  See "The Cross Referencing Back End" for details about
           usage.

KNOWN PROBLEMS

       The simple C back end currently only saves typeglobs with alphanumeric
       names.

       The optimized C back end outputs code for more modules than it should
       (e.g., DirHandle).  It also has little hope of properly handling "goto
       LABEL" outside the running subroutine ("goto &sub" is okay).  "goto
       LABEL" currently does not work at all in this back end.  It also
       creates a huge initialization function that gives C compilers
       headaches.  Splitting the initialization function gives better results.
       Other problems include: unsigned math does not work correctly; some
       opcodes are handled incorrectly by the default opcode handling
       mechanism.

       BEGIN{} blocks are executed while compiling your code.  Any external
       state that is initialized in BEGIN{}, such as opening files, initiating
       database connections etc., does not behave properly.  To work around
       this, Perl has an INIT{} block that corresponds to code being executed
       before your program begins running but after your program has finished
       being compiled.  Execution order: BEGIN{}, (possible save of state
       through compiler back-end), INIT{}, program runs, END{}.

AUTHOR

       This document was originally written by Nathan Torkington, and is now
       maintained by the perl5-porters mailing list perl5-porters@perl.org.

perl v5.8.8                       2006-01-07                    PERLCOMPILE(1)