PERLCOMPILE(1)        Perl Programmers Reference Guide        PERLCOMPILE(1)

NAME

perlcompile - Introduction to the Perl Compiler-Translator

DESCRIPTION

Perl has always had a compiler: your source is compiled into an internal
form (a parse tree) which is then optimized before being run. Since
version 5.005, Perl has shipped with a module capable of inspecting the
optimized parse tree ("B"), and this has been used to write many useful
utilities, including a module that lets you turn your Perl into C source
code that can be compiled into a native executable.

The "B" module provides access to the parse tree, and other modules
("back ends") do things with the tree. Some write it out as bytecode, C
source code, or a semi-human-readable text. Another traverses the parse
tree to build a cross-reference of which subroutines, formats, and
variables are used where. Another checks your code for dubious
constructs. Yet another back end dumps the parse tree back out as Perl
source, acting as a source code beautifier or deobfuscator.

Because its original purpose was to be a way to produce C code
corresponding to a Perl program, and in turn a native executable, the
"B" module and its associated back ends are known as "the compiler",
even though they don't really compile anything. Different parts of the
compiler are more accurately a "translator", or an "inspector", but
people want Perl to have a "compiler option" not an "inspector gadget".
What can you do?

This document covers the use of the Perl compiler: which modules it
comprises, how to use the most important of the back end modules, what
problems there are, and how to work around them.

Layout

The compiler back ends are in the "B::" hierarchy, and the front-end
(the module that you, the user of the compiler, will sometimes interact
with) is the O module. Some back ends (e.g., "B::C") have programs
(e.g., perlcc) to hide the modules' complexity.

Here are the important back ends to know about, with their status
expressed as a number from 0 (outline for later implementation) to 10
(if there's a bug in it, we're very surprised):

B::Bytecode
Stores the parse tree in a machine-independent format, suitable for
later reloading through the ByteLoader module. Status: 5 (some things
work, some things don't, some things are untested).

B::C
Creates a C source file containing code to rebuild the parse tree and
resume the interpreter. Status: 6 (many things work adequately,
including programs using Tk).

B::CC
Creates a C source file corresponding to the run time code path in the
parse tree. This is the closest to a Perl-to-C translator there is, but
the code it generates is almost incomprehensible because it translates
the parse tree into a giant switch structure that manipulates Perl
structures. Eventual goal is to reduce (given sufficient type
information in the Perl program) some of the Perl data structure
manipulations into manipulations of C-level ints, floats, etc.
Status: 5 (some things work, including uncomplicated Tk examples).

B::Lint
Complains if it finds dubious constructs in your source code.
Status: 6 (it works adequately, but only has a very limited number of
areas that it checks).

B::Deparse
Recreates the Perl source, making an attempt to format it coherently.
Status: 8 (it works nicely, but a few obscure things are missing).

B::Xref
Reports on the declaration and use of subroutines and variables.
Status: 8 (it works nicely, but still has a few lingering bugs).

Using The Back Ends

The following sections describe how to use the various compiler back
ends. They're presented roughly in order of maturity, so that the most
stable and proven back ends are described first, and the most
experimental and incomplete back ends are described last.

The O module automatically enables the -c flag to Perl, which prevents
Perl from executing your code once it has been compiled. This is why
all the back ends print:

    myperlprogram syntax OK

before producing any other output.

The Cross Referencing Back End

The cross referencing back end (B::Xref) produces a report on your
program, breaking down declarations and uses of subroutines and
variables (and formats) by file and subroutine. For instance, here's
part of the report from the pod2man program that comes with Perl:

    Subroutine clear_noremap
      Package (lexical)
        $ready_to_print   i1069, 1079
      Package main
        $&                1086
        $.                1086
        $0                1086
        $1                1087
        $2                1085, 1085
        $3                1085, 1085
        $ARGV             1086
        %HTML_Escapes     1085, 1085

This shows the variables used in the subroutine "clear_noremap". The
variable $ready_to_print is a my() (lexical) variable, introduced
(first declared with my()) on line 1069, and used on line 1079. The
variable $& from the main package is used on 1086, and so on.

A line number may be prefixed by a single letter:

i   Lexical variable introduced (declared with my()) for the first
    time.

&   Subroutine or method call.

s   Subroutine defined.

r   Format defined.

The most useful option the cross referencer has is to save the report
to a separate file. For instance, to save the report on myperlprogram
to the file report:

    $ perl -MO=Xref,-oreport myperlprogram

The Decompiling Back End

The Deparse back end turns your Perl source back into Perl source. It
can reformat along the way, making it useful as a de-obfuscator. The
most basic way to use it is:

    $ perl -MO=Deparse myperlprogram

You'll notice immediately that Perl has no idea of how to paragraph
your code. You'll have to separate chunks of code from each other with
newlines by hand. However, watch what it will do with one-liners:

    $ perl -MO=Deparse -e '$op=shift||die "usage: $0
    code [...]";chomp(@ARGV=<>)unless@ARGV; for(@ARGV){$was=$_;eval$op;
    die$@ if$@; rename$was,$_ unless$was eq $_}'
    -e syntax OK
    $op = shift @ARGV || die("usage: $0 code [...]");
    chomp(@ARGV = <ARGV>) unless @ARGV;
    foreach $_ (@ARGV) {
        $was = $_;
        eval $op;
        die $@ if $@;
        rename $was, $_ unless $was eq $_;
    }

The decompiler has several options for the code it generates. For
instance, you can set the size of each indent from 4 (as above) to 2
with:

    $ perl -MO=Deparse,-si2 myperlprogram

The -p option adds parentheses where normally they are omitted:

    $ perl -MO=Deparse -e 'print "Hello, world\n"'
    -e syntax OK
    print "Hello, world\n";
    $ perl -MO=Deparse,-p -e 'print "Hello, world\n"'
    -e syntax OK
    print("Hello, world\n");

See B::Deparse for more information on the formatting options.

The Lint Back End

The lint back end (B::Lint) inspects programs for poor style. One
programmer's bad style is another programmer's useful tool, so options
let you select what is complained about.

To run the style checker across your source code:

    $ perl -MO=Lint myperlprogram

To disable context checks and undefined subroutines:

    $ perl -MO=Lint,-context,-undefined-subs myperlprogram

See B::Lint for information on the options.

The Simple C Back End

This module saves the internal compiled state of your Perl program to a
C source file, which can be turned into a native executable for that
particular platform using a C compiler. The resulting program links
against the Perl interpreter library, so it will not save you disk
space (unless you build Perl with a shared library) or program size.
It may, however, save you startup time.

The "perlcc" tool generates such executables by default.

    perlcc myperlprogram.pl

The Bytecode Back End

This back end is only useful if you also have a way to load and execute
the bytecode that it produces. The ByteLoader module provides this
functionality.

To turn a Perl program into executable byte code, you can use "perlcc"
with the "-B" switch:

    perlcc -B myperlprogram.pl

The byte code is machine independent, so once you have a compiled
module or program, it is as portable as Perl source (assuming that the
user of the module or program has a modern-enough Perl interpreter to
decode the byte code).

See B::Bytecode for information on options to control the optimization
and nature of the code generated by the Bytecode module.

The Optimized C Back End

The optimized C back end will turn your Perl program's run time code
path into an equivalent (but optimized) C program that manipulates the
Perl data structures directly. The program will still link against the
Perl interpreter library, to allow for eval(), "s///e", "require", etc.

The "perlcc" tool generates such executables when using the -O switch.
To compile a Perl program (ending in ".pl" or ".p"):

    perlcc -O myperlprogram.pl

To produce a shared library from a Perl module (ending in ".pm"):

    perlcc -O Myperlmodule.pm

For more information, see perlcc and B::CC.

Module List for the Compiler Suite

B
This module is the introspective ("reflective" in Java terms) module,
which allows a Perl program to inspect its innards. The back end
modules all use this module to gain access to the compiled parse tree.
You, the user of a back end module, will not need to interact with B.

O
This module is the front-end to the compiler's back ends. Normally
called something like this:

    $ perl -MO=Deparse myperlprogram

This is like saying "use O 'Deparse'" in your Perl program.

B::Asmdata
This module is used by the B::Assembler module, which is in turn used
by the B::Bytecode module, which stores a parse-tree as bytecode for
later loading. It's not a back end itself, but rather a component of a
back end.

B::Assembler
This module turns a parse-tree into data suitable for storing and later
decoding back into a parse-tree. It's not a back end itself, but rather
a component of a back end. It's used by the assemble program that
produces bytecode.

B::Bblock
This module is used by the B::CC back end. It walks "basic blocks". A
basic block is a series of operations which is known to execute from
start to finish, with no possibility of branching or halting.

B::Bytecode
This module is a back end that generates bytecode from a program's
parse tree. This bytecode is written to a file, from where it can later
be reconstructed back into a parse tree. The goal is to do the
expensive program compilation once, save the interpreter's state into a
file, and then restore the state from the file when the program is to
be executed. See "The Bytecode Back End" for details about usage.

B::C
This module writes out C code corresponding to the parse tree and other
interpreter internal structures. You compile the corresponding C file
and get an executable file that restores the internal structures, after
which the Perl interpreter begins running the program. See "The Simple
C Back End" for details about usage.

B::CC
This module writes out C code corresponding to your program's
operations. Unlike the B::C module, which merely stores the interpreter
and its state in a C program, the B::CC module makes a C program that
carries out those operations itself rather than handing the parse tree
back to the interpreter to run (the program is still linked against the
Perl interpreter library). As a consequence, programs translated into C
by B::CC can execute faster than normal interpreted programs. See "The
Optimized C Back End" for details about usage.

B::Concise
This module prints a concise (but complete) version of the Perl parse
tree. Its output is more customizable than that of B::Terse or B::Debug
(and it can emulate them). This module is useful for people who are
writing their own back end, or who are learning about the Perl
internals. It's not useful to the average programmer.

B::Debug
This module dumps the Perl parse tree in verbose detail to STDOUT. It's
useful for people who are writing their own back end, or who are
learning about the Perl internals. It's not useful to the average
programmer.

B::Deparse
This module produces Perl source code from the compiled parse tree. It
is useful in debugging and deconstructing other people's code, and also
as a pretty-printer for your own source. See "The Decompiling Back End"
for details about usage.

B::Disassembler
This module turns bytecode back into a parse tree. It's not a back end
itself, but rather a component of a back end. It's used by the
disassemble program that comes with the bytecode.

B::Lint
This module inspects the compiled form of your source code for things
which, while some people frown on them, aren't necessarily bad enough
to justify a warning. For instance, use of an array in scalar context
without explicitly saying "scalar(@array)" is something that Lint can
identify. See "The Lint Back End" for details about usage.

B::Showlex
This module prints out the my() variables used in a function or a file.
To get a list of the my() variables used in the subroutine mysub()
defined in the file myperlprogram:

    $ perl -MO=Showlex,mysub myperlprogram

To get a list of the my() variables used in the file myperlprogram:

    $ perl -MO=Showlex myperlprogram

[BROKEN]

B::Stackobj
This module is used by the B::CC module. It's not a back end itself,
but rather a component of a back end.

B::Stash
This module is used by the perlcc program, which compiles a module into
an executable. B::Stash prints the symbol tables in use by a program,
and is used to prevent B::CC from producing C code for the B::* and O
modules. It's not a back end itself, but rather a component of a back
end.

B::Terse
This module prints the contents of the parse tree, but without as much
information as B::Debug. For comparison, "print "Hello, world.""
produced 96 lines of output from B::Debug, but only 6 from B::Terse.

This module is useful for people who are writing their own back end, or
who are learning about the Perl internals. It's not useful to the
average programmer.

B::Xref
This module prints a report on where the variables, subroutines, and
formats are defined and used within a program and the modules it loads.
See "The Cross Referencing Back End" for details about usage.

KNOWN PROBLEMS

The simple C backend currently only saves typeglobs with alphanumeric
names.

The optimized C backend outputs code for more modules than it should
(e.g., DirHandle). It also has little hope of properly handling "goto
LABEL" outside the running subroutine ("goto &sub" is okay). "goto
LABEL" currently does not work at all in this backend. It also creates
a huge initialization function that gives C compilers headaches.
Splitting the initialization function gives better results. Other
problems include: unsigned math does not work correctly; some opcodes
are handled incorrectly by the default opcode handling mechanism.

BEGIN{} blocks are executed while compiling your code. Any external
state that is initialized in BEGIN{}, such as opening files, initiating
database connections etc., does not behave properly. To work around
this, Perl has an INIT{} block that corresponds to code being executed
before your program begins running but after your program has finished
being compiled. Execution order: BEGIN{}, (possible save of state
through compiler back-end), INIT{}, program runs, END{}.

AUTHOR

This document was originally written by Nathan Torkington, and is now
maintained by the perl5-porters mailing list perl5-porters@perl.org.

perl v5.8.8                       2006-01-07                  PERLCOMPILE(1)