perlmod(1)

1PERLMOD(1)             Perl Programmers Reference Guide             PERLMOD(1)
2
3
4

NAME

6       perlmod - Perl modules (packages and symbol tables)
7

DESCRIPTION

9   Is this the document you were after?
10       There are other documents which might contain the information that
11       you're looking for:
12
13       This doc
14         Perl's packages, namespaces, and some info on classes.
15
16       perlnewmod
17         Tutorial on making a new module.
18
19       perlmodstyle
20         Best practices for making a new module.
21
22   Packages
23       Unlike Perl 4, in which all the variables were dynamic and shared one
24       global name space, causing maintainability problems, Perl 5 provides
25       two mechanisms for protecting code from having its variables stomped on
26       by other code: lexically scoped variables created with "my" or "state"
27       and namespaced global variables, which are exposed via the "vars"
28       pragma, or the "our" keyword. Any global variable is considered to be
29       part of a namespace and can be accessed via a "fully qualified form".
30       Conversely, any lexically scoped variable is considered to be part of
31       that lexical-scope, and does not have a "fully qualified form".
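
       As a minimal sketch of the difference (the package name "Counter" and
       the variable names are made up for illustration), a package global can
       be reached through its fully qualified name, while a lexical cannot:

```perl
use strict;
use warnings;

package Counter;

our $total  = 10;   # package global: stored in the symbol table as $Counter::total
my  $hidden = 20;   # lexical: scoped to this file, has no qualified name

package main;

# The global is reachable from any package by its fully qualified name.
my $seen = $Counter::total;

# The lexical is still in scope (same file), but only under its short name.
# $Counter::hidden names an unrelated package variable that was never assigned.
my $lexical   = $hidden;
my $qualified = do { no warnings 'once'; $Counter::hidden };

print "total=$seen hidden=$lexical qualified=",
      (defined $qualified ? $qualified : 'undef'), "\n";
```

       Note that the "package" statement did not end the scope of $hidden;
       only the enclosing block or file does that.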
32
33       In Perl, namespaces are called "packages" and the "package" declaration
34       tells the compiler which namespace to prefix to "our" variables and
35       unqualified dynamic names.  This both protects against accidental
36       stomping and provides an interface for deliberately clobbering global
37       dynamic variables declared and used in other scopes or packages, when
38       that is what you want to do.
39
40       The scope of the "package" declaration is from the declaration itself
41       through the end of the enclosing block, "eval", or file, whichever
42       comes first (the same scope as the my(), our(), state(), and local()
43       operators, and also the effect of the experimental "reference
44       aliasing," which may change), or until the next "package" declaration.
45       Unqualified dynamic identifiers will be in this namespace, except for
46       those few identifiers that, if unqualified, default to the main package
47       instead of the current one as described below.  A "package" statement
48       affects only dynamic global symbols, including subroutine names, and
49       variables you've used local() on, but not lexical variables created
50       with my(), our() or state().
51
52       Typically, a "package" statement is the first declaration in a file
53       included in a program by one of the "do", "require", or "use"
54       operators.  You can switch into a package in more than one place:
55       "package" has no effect beyond specifying which symbol table the
56       compiler will use for dynamic symbols for the rest of that block or
57       until the next "package" statement.  You can refer to variables and
58       filehandles in other packages by prefixing the identifier with the
59       package name and a double colon: $Package::Variable.  If the package
60       name is null, the "main" package is assumed.  That is, $::sail is
61       equivalent to $main::sail.
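
       The following sketch illustrates switching into a package in more than
       one place and the empty-package shorthand.  The package name "Boat" is
       hypothetical; $sail is borrowed from the text above.

```perl
use strict;
use warnings;

our $sail = 'mainsail';        # the file starts in package main: $main::sail

{
    package Boat;
    our $sail = 'jib';         # $Boat::sail, a separate variable
}

my $in_boat;
{
    package Boat;              # entering Boat again selects the same symbol table
    our $sail;                 # short name for $Boat::sail within this block
    $in_boat = $sail;
}

# Back in main: an empty package name before :: means "main".
my $same = \$::sail == \$main::sail;

print "Boat sees $in_boat; main has ", $::sail, "\n";
```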
62
63       The old package delimiter was a single quote, but double colon is now
64       the preferred delimiter, in part because it's more readable to humans,
65       and in part because it's more readable to emacs macros.  It also makes
66       C++ programmers feel like they know what's going on--as opposed to
67       using the single quote as separator, which was there to make Ada
68       programmers feel like they knew what was going on.  Because the old-
69       fashioned syntax is still supported for backwards compatibility, if you
70       try to use a string like "This is $owner's house", you'll be accessing
71       $owner::s; that is, the $s variable in package "owner", which is
72       probably not what you meant.  Use braces to disambiguate, as in "This
73       is ${owner}'s house".
74
75       Packages may themselves contain package separators, as in
76       $OUTER::INNER::var.  This implies nothing about the order of name
77       lookups, however.  There are no relative packages: all symbols are
78       either local to the current package, or must be fully qualified from
79       the outer package name down.  For instance, there is nowhere within
80       package "OUTER" that $INNER::var refers to $OUTER::INNER::var.  "INNER"
81       refers to a totally separate global package. The custom of treating
82       package names as a hierarchy is very strong, but the language in no way
83       enforces it.
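
       A short sketch of the "no relative packages" rule, using the package
       names from the paragraph above (the assigned values are illustrative):

```perl
use strict;
use warnings;

$OUTER::INNER::var = 'nested';
$INNER::var        = 'top-level';

package OUTER;

# Even inside OUTER, INNER:: is looked up from the top level,
# not relative to the current package.
my $relative = $INNER::var;
my $absolute = $OUTER::INNER::var;

print "\$INNER::var is '$relative', \$OUTER::INNER::var is '$absolute'\n";
```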
84
85       Only identifiers starting with letters (or underscore) are stored in a
86       package's symbol table.  All other symbols are kept in package "main",
87       including all punctuation variables, like $_.  In addition, when
88       unqualified, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV,
89       INC, and SIG are forced to be in package "main", even when used for
90       other purposes than their built-in ones.  If you have a package called
91       "m", "s", or "y", then you can't use the qualified form of an
92       identifier because it would be instead interpreted as a pattern match,
93       a substitution, or a transliteration.
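
       One way to observe the identifiers that are forced into "main" is to
       compare references from inside another package.  This is only a
       sketch; the package name "Other" is an assumption for the example.

```perl
use strict;
use warnings;

package Other;

# Unqualified %ENV, @ARGV and $_ inside package Other are still main's.
my $env_is_main        = \%ENV  == \%main::ENV;
my $argv_is_main       = \@ARGV == \@main::ARGV;
my $underscore_is_main = \$_    == \$main::_;

# Only a fully qualified name reaches a package-specific variable
# that happens to share one of these names.
@Other::ARGV = ('mine');
my $distinct = \@Other::ARGV != \@main::ARGV;

print "ENV:$env_is_main ARGV:$argv_is_main _:$underscore_is_main\n";
```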
94
95       Variables beginning with underscore used to be forced into package
96       main, but we decided it was more useful for package writers to be able
97       to use leading underscore to indicate private variables and method
98       names.  However, variables and functions named with a single "_", such
99       as $_ and "sub _", are still forced into the package "main".  See also
100       "The Syntax of Variable Names" in perlvar.
101
102       "eval"ed strings are compiled in the package in which the eval() was
103       compiled.  (Assignments to $SIG{}, however, assume the signal handler
104       specified is in the "main" package.  Qualify the signal handler name if
105       you wish to have a signal handler in a package.)  For an example,
106       examine perldb.pl in the Perl library.  It initially switches to the
107       "DB" package so that the debugger doesn't interfere with variables in
108       the program you are trying to debug.  At various points, however, it
109       temporarily switches back to the "main" package to evaluate various
110       expressions in the context of the "main" package (or wherever you came
111       from).  See perldebug.
112
113       The special symbol "__PACKAGE__" contains the current package, but
114       cannot (easily) be used to construct variable names. After "my($foo)"
115       has hidden package variable $foo, it can still be accessed, without
116       knowing what package you are in, as "${__PACKAGE__.'::foo'}".
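
       A minimal sketch of that lookup follows.  The package "Shop" and the
       sub "both" are hypothetical; the symbolic reference needs "no strict
       'refs'" when "use strict" is in effect.

```perl
use strict;
use warnings;

package Shop;

our $foo = 'package value';        # $Shop::foo

sub both {
    my $foo = 'lexical value';     # hides the package variable inside this sub
    no strict 'refs';              # the lookup below is a symbolic reference
    my $global = ${ __PACKAGE__ . '::foo' };
    return ($foo, $global);
}

my ($lexical, $global) = both();
print "lexical: $lexical, package: $global\n";
```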
117
118       See perlsub for other scoping issues related to my() and local(), and
119       perlref regarding closures.
120
121   Symbol Tables
122       The symbol table for a package happens to be stored in the hash of that
123       name with two colons appended.  The main symbol table's name is thus
124       %main::, or %:: for short.  Likewise the symbol table for the nested
125       package mentioned earlier is named %OUTER::INNER::.
126
127       The value in each entry of the hash is what you are referring to when
128       you use the *name typeglob notation.
129
130           local *main::foo    = *main::bar;
131
132       You can use this to print out all the variables in a package, for
133       instance.  The standard but antiquated dumpvar.pl library and the CPAN
134       module Devel::Symdump make use of this.
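
       As a rough sketch of such a dump (not how dumpvar.pl or Devel::Symdump
       are actually implemented), the loop below walks %Zoo:: and reports
       which slots of each entry are filled.  The package "Zoo" and its
       contents are invented for the example.  Entries that are not typeglobs
       are only inspected, never modified; some perls store a subroutine as a
       plain code reference, so that case is handled separately.

```perl
use strict;
use warnings;

package Zoo;
our $keeper = 'Ann';
our @cages  = (1, 2, 3);
our %menu   = (lion => 'meat');
sub feed { return 'fed' }

package main;

my %found;
for my $name (sort keys %Zoo::) {
    my $entry = $Zoo::{$name};
    my @kinds;
    if (ref(\$entry) eq 'GLOB') {
        # *glob{SCALAR} always yields a reference, so test the value itself.
        push @kinds, 'SCALAR' if defined ${ *{$entry}{SCALAR} };
        push @kinds, grep { defined(*{$entry}{$_}) } qw(ARRAY HASH CODE);
    }
    elsif (ref($entry) eq 'CODE') {
        push @kinds, 'CODE';       # sub stored without a full typeglob
    }
    $found{$name} = join ',', @kinds;
}

print "$_: $found{$_}\n" for sort keys %found;
```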
135
136       The results of creating new symbol table entries directly or modifying
137       any entries that are not already typeglobs are undefined and subject to
138       change between releases of perl.
139
140       Assignment to a typeglob performs an aliasing operation, i.e.,
141
142           *dick = *richard;
143
144       causes variables, subroutines, formats, and file and directory handles
145       accessible via the identifier "richard" also to be accessible via the
146       identifier "dick".  If you want to alias only a particular variable or
147       subroutine, assign a reference instead:
148
149           *dick = \$richard;
150
151       This makes $richard and $dick the same variable, but leaves @richard
152       and @dick as separate arrays.  Tricky, eh?
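
       The effect can be checked with a small sketch.  The names "original"
       and "alias" are placeholders chosen for this example.

```perl
use strict;
use warnings;

our ($original, @original) = ('scalar value', 'array', 'values');
our ($alias, @alias);

*alias = \$original;               # alias only the SCALAR slot

$alias = 'changed through the alias';
push @alias, 'independent';

print "scalar: $original\n";
print "arrays: ", scalar(@original), " vs ", scalar(@alias), "\n";
```

       Writing through $alias changes $original, while the two arrays stay
       unrelated.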
153
154       There is one subtle difference between the following statements:
155
156           *foo = *bar;
157           *foo = \$bar;
158
159       "*foo = *bar" makes the typeglobs themselves synonymous while "*foo =
160       \$bar" makes the SCALAR portions of two distinct typeglobs refer to the
161       same scalar value. This means that the following code:
162
163           $bar = 1;
164           *foo = \$bar;       # Make $foo an alias for $bar
165
166           {
167               local $bar = 2; # Restrict changes to block
168               print $foo;     # Prints '1'!
169           }
170
171       would print '1', because $foo holds a reference to the original $bar,
172       the one that was stuffed away by "local()" and which will be restored
173       when the block ends. Because variables are accessed through the
174       typeglob, you can use "*foo = *bar" to create an alias which can be
175       localized. (But be aware that this means you can't have a separate @foo
176       and @bar, etc.)
177
178       What makes all of this important is that the Exporter module uses glob
179       aliasing as the import/export mechanism. Whether or not you can
180       properly localize a variable that has been exported from a module
181       depends on how it was exported:
182
183           @EXPORT = qw($FOO); # Usual form, can't be localized
184           @EXPORT = qw(*FOO); # Can be localized
185
186       You can work around the first case by using the fully qualified name
187       ($Package::FOO) where you need a local value, or by overriding it by
188       saying "*FOO = *Package::FOO" in your script.
189
190       The "*x = \$y" mechanism may be used to pass and return cheap
191       references into or from subroutines if you don't want to copy the whole
192       thing.  It only works when assigning to dynamic variables, not
193       lexicals.
194
195           %some_hash = ();                    # can't be my()
196           *some_hash = fn( \%another_hash );
197           sub fn {
198               local *hashsym = shift;
199               # now use %hashsym normally, and you
200               # will affect the caller's %another_hash
201               my %nhash = (); # do what you want
202               return \%nhash;
203           }
204
205       On return, the reference will overwrite the hash slot in the symbol
206       table specified by the *some_hash typeglob.  This is a somewhat tricky
207       way of passing around references cheaply when you don't want to have to
208       remember to dereference variables explicitly.
209
210       Another use of symbol tables is for making "constant" scalars.
211
212           *PI = \3.14159265358979;
213
214       Now you cannot alter $PI, which is probably a good thing all in all.
215       This isn't the same as a constant subroutine, which is subject to
216       optimization at compile-time.  A constant subroutine is one prototyped
217       to take no arguments and to return a constant expression.  See perlsub
218       for details on these.  The "use constant" pragma is a convenient
219       shorthand for these.
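
       A brief sketch of what "cannot alter" means in practice: reading the
       aliased scalar works, and an assignment raises a run-time error that
       can be trapped with "eval".  The variable names other than $PI are
       illustrative.

```perl
use strict;
use warnings;

our $PI;
*PI = \3.14159265358979;           # $PI now aliases a read-only literal

my $area = $PI * 2 ** 2;           # reading works as usual

# Assigning dies with "Modification of a read-only value attempted".
my $ok    = eval { $PI = 3; 1 };
my $error = $ok ? '' : $@;

print "area=$area\n";
print "assignment failed: $error" unless $ok;
```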
220
221       You can say *foo{PACKAGE} and *foo{NAME} to find out what name and
222       package the *foo symbol table entry comes from.  This may be useful in
223       a subroutine that gets passed typeglobs as arguments:
224
225           sub identify_typeglob {
226               my $glob = shift;
227               print 'You gave me ', *{$glob}{PACKAGE},
228                   '::', *{$glob}{NAME}, "\n";
229           }
230           identify_typeglob *foo;
231           identify_typeglob *bar::baz;
232
233       This prints
234
235           You gave me main::foo
236           You gave me bar::baz
237
238       The *foo{THING} notation can also be used to obtain references to the
239       individual elements of *foo.  See perlref.
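
       For instance, assuming a symbol "foo" with several slots filled, the
       references obtained this way are the same ones the backslash operator
       returns (a sketch; the variable names are illustrative):

```perl
use strict;
use warnings;

our $foo = 'scalar';
our @foo = (1, 2, 3);
our %foo = (key => 'value');
sub foo { return 'called' }

my $scalar_ref = *foo{SCALAR};
my $array_ref  = *foo{ARRAY};
my $hash_ref   = *foo{HASH};
my $code_ref   = *foo{CODE};

# Each reference points at the same data as the matching \ expression.
my $same = $scalar_ref == \$foo && $array_ref == \@foo
        && $hash_ref   == \%foo && $code_ref  == \&foo;

print "name=", *foo{NAME}, " scalar=$$scalar_ref call=", $code_ref->(), "\n";
```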
240
241       Subroutine definitions (and declarations, for that matter) need not
242       necessarily be situated in the package whose symbol table they occupy.
243       You can define a subroutine outside its package by explicitly
244       qualifying the name of the subroutine:
245
246           package main;
247           sub Some_package::foo { ... }   # &foo defined in Some_package
248
249       This is just a shorthand for a typeglob assignment at compile time:
250
251           BEGIN { *Some_package::foo = sub { ... } }
252
253       and is not the same as writing:
254
255           {
256               package Some_package;
257               sub foo { ... }
258           }
259
260       In the first two versions, the body of the subroutine is lexically in
261       the main package, not in Some_package. So something like this:
262
263           package main;
264
265           $Some_package::name = "fred";
266           $main::name = "barney";
267
268           sub Some_package::foo {
269               print "in ", __PACKAGE__, ": \$name is '$name'\n";
270           }
271
272           Some_package::foo();
273
274       prints:
275
276           in main: $name is 'barney'
277
278       rather than:
279
280           in Some_package: $name is 'fred'
281
282       This also has implications for the use of the SUPER:: qualifier (see
283       perlobj).
284
285   BEGIN, UNITCHECK, CHECK, INIT and END
286       Five specially named code blocks are executed at the beginning and at
287       the end of a running Perl program.  These are the "BEGIN", "UNITCHECK",
288       "CHECK", "INIT", and "END" blocks.
289
290       These code blocks can be prefixed with "sub" to give the appearance of
291       a subroutine (although this is not considered good style).  One should
292       note that these code blocks don't really exist as named subroutines
293       (despite their appearance). The thing that gives this away is the fact
294       that you can have more than one of these code blocks in a program, and
295       they will all get executed at the appropriate moment.  So you can't
296       execute any of these code blocks by name.
297
298       A "BEGIN" code block is executed as soon as possible, that is, the
299       moment it is completely defined, even before the rest of the containing
300       file (or string) is parsed.  You may have multiple "BEGIN" blocks
301       within a file (or eval'ed string); they will execute in order of
302       definition.  Because a "BEGIN" code block executes immediately, it can
303       pull in definitions of subroutines and such from other files in time to
304       be visible to the rest of the compile and run time.  Once a "BEGIN" has
305       run, it is immediately undefined and any code it used is returned to
306       Perl's memory pool.
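
       A small sketch of this timing: the "BEGIN" blocks below run while the
       file is still being compiled, so their entries land in @order before
       the ordinary statement that appears first in the source.  The names
       @order and "greet" are made up for the example.

```perl
use strict;
use warnings;

our @order;

push @order, 'run time';                 # executes last, when the program runs

BEGIN { push @order, 'first BEGIN' }     # runs as soon as its block is parsed
BEGIN { push @order, 'second BEGIN' }    # BEGIN blocks run in order of definition

# A sub installed by a BEGIN block is visible to the code compiled after it.
BEGIN { *greet = sub { return "hello, $_[0]" } }
my $greeting = greet('world');

print join(' -> ', @order), "\n";   # first BEGIN -> second BEGIN -> run time
```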
307
308       An "END" code block is executed as late as possible, that is, after
309       perl has finished running the program and just before the interpreter
310       exits, even if it is exiting as a result of a die() function.
311       (But not if it's morphing into another program via "exec", or being
312       blown out of the water by a signal--you have to trap that yourself (if
313       you can).)  You may have multiple "END" blocks within a file--they will
314       execute in reverse order of definition; that is: last in, first out
315       (LIFO).  "END" blocks are not executed when you run perl with the "-c"
316       switch, or if compilation fails.
317
318       Note that "END" code blocks are not executed at the end of a string
319       "eval()": if any "END" code blocks are created in a string "eval()",
320       they will be executed just as any other "END" code block of that
321       package in LIFO order just before the interpreter exits.
322
323       Inside an "END" code block, $? contains the value that the program is
324       going to pass to "exit()".  You can modify $? to change the exit value
325       of the program.  Beware of changing $? by accident (e.g. by running
326       something via "system").
327
328       Inside of an "END" block, the value of "${^GLOBAL_PHASE}" will be "END".
329
330       "UNITCHECK", "CHECK" and "INIT" code blocks are useful to catch the
331       transition between the compilation phase and the execution phase of the
332       main program.
333
334       "UNITCHECK" blocks are run just after the unit which defined them has
335       been compiled.  The main program file and each module it loads are
336       compilation units, as are string "eval"s, run-time code compiled using
337       the "(?{ })" construct in a regex, calls to "do FILE", "require FILE",
338       and code after the "-e" switch on the command line.
339
340       "BEGIN" and "UNITCHECK" blocks are not directly related to the phase of
341       the interpreter.  They can be created and executed during any phase.
342
343       "CHECK" code blocks are run just after the initial Perl compile phase
344       ends and before the run time begins, in LIFO order.  "CHECK" code
345       blocks are used in the Perl compiler suite to save the compiled state
346       of the program.
347
348       Inside of a "CHECK" block, the value of "${^GLOBAL_PHASE}" will be
349       "CHECK".
350
351       "INIT" blocks are run just before the Perl runtime begins execution, in
352       "first in, first out" (FIFO) order.
353
354       Inside of an "INIT" block, the value of "${^GLOBAL_PHASE}" will be
355       "INIT".
356
357       The "CHECK" and "INIT" blocks in code compiled by "require", string
358       "do", or string "eval" will not be executed if they occur after the end
359       of the main compilation phase; that can be a problem in mod_perl and
360       other persistent environments which use those functions to load code at
361       runtime.
362
363       When you use the -n and -p switches to Perl, "BEGIN" and "END" work
364       just as they do in awk, as a degenerate case.  Both "BEGIN" and "CHECK"
365       blocks are run when you use the -c switch for a compile-only syntax
366       check, although your main code is not.
367
368       The begincheck program makes it all clear, eventually:
369
370         #!/usr/bin/perl
371
372         # begincheck
373
374         print         "10. Ordinary code runs at runtime.\n";
375
376         END { print   "16.   So this is the end of the tale.\n" }
377         INIT { print  " 7. INIT blocks run FIFO just before runtime.\n" }
378         UNITCHECK {
379           print       " 4.   And therefore before any CHECK blocks.\n"
380         }
381         CHECK { print " 6.   So this is the sixth line.\n" }
382
383         print         "11.   It runs in order, of course.\n";
384
385         BEGIN { print " 1. BEGIN blocks run FIFO during compilation.\n" }
386         END { print   "15.   Read perlmod for the rest of the story.\n" }
387         CHECK { print " 5. CHECK blocks run LIFO after all compilation.\n" }
388         INIT { print  " 8.   Run this again, using Perl's -c switch.\n" }
389
390         print         "12.   This is anti-obfuscated code.\n";
391
392         END { print   "14. END blocks run LIFO at quitting time.\n" }
393         BEGIN { print " 2.   So this line comes out second.\n" }
394         UNITCHECK {
395          print " 3. UNITCHECK blocks run LIFO after each file is compiled.\n"
396         }
397         INIT { print  " 9.   You'll see the difference right away.\n" }
398
399         print         "13.   It only _looks_ like it should be confusing.\n";
400
401         __END__
402
403   Perl Classes
404       There is no special class syntax in Perl, but a package may act as a
405       class if it provides subroutines to act as methods.  Such a package may
406       also derive some of its methods from another class (package) by listing
407       the other package name(s) in its global @ISA array (which must be a
408       package global, not a lexical).
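
       A minimal sketch of a package acting as a class and another deriving
       from it through @ISA.  The classes "Animal" and "Dog" and their
       methods are hypothetical.

```perl
use strict;
use warnings;

package Animal;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}
sub name  { return $_[0]{name} }
sub speak { my $self = shift; return $self->name . ' makes a sound' }

package Dog;

our @ISA = ('Animal');     # package global, so method lookup can find it
sub speak { my $self = shift; return $self->name . ' barks' }

package main;

my $dog = Dog->new(name => 'Rex');   # new() and name() are inherited from Animal
print $dog->speak, "\n";
```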
409
410       For more on this, see perlootut and perlobj.
411
412   Perl Modules
413       A module is just a set of related functions in a library file, i.e., a
414       Perl package with the same name as the file.  It is specifically
415       designed to be reusable by other modules or programs.  It may do this
416       by providing a mechanism for exporting some of its symbols into the
417       symbol table of any package using it, or it may function as a class
418       definition and make its semantics available implicitly through method
419       calls on the class and its objects, without explicitly exporting
420       anything.  Or it can do a little of both.
421
422       For example, to start a traditional, non-OO module called Some::Module,
423       create a file called Some/Module.pm and start with this template:
424
425           package Some::Module;  # assumes Some/Module.pm
426
427           use strict;
428           use warnings;
429
430           # Get the import method from Exporter to export functions and
431           # variables
432           use Exporter 5.57 'import';
433
434           # set the version for version checking
435           our $VERSION     = '1.00';
436
437           # Functions and variables which are exported by default
438           our @EXPORT      = qw(func1 func2);
439
440           # Functions and variables which can be optionally exported
441           our @EXPORT_OK   = qw($Var1 %Hashit func3);
442
443           # exported package globals go here
444           our $Var1    = '';
445           our %Hashit  = ();
446
447           # non-exported package globals go here
448           # (they are still accessible as $Some::Module::stuff)
449           our @more    = ();
450           our $stuff   = '';
451
452           # file-private lexicals go here, before any functions which use them
453           my $priv_var    = '';
454           my %secret_hash = ();
455
456           # here's a file-private function as a closure,
457           # callable as $priv_func->();
458           my $priv_func = sub {
459               ...
460           };
461
462           # make all your functions, whether exported or not;
463           # remember to put something interesting in the {} stubs
464           sub func1      { ... }
465           sub func2      { ... }
466
467           # this one isn't always exported, but could be called directly
468           # as Some::Module::func3()
469           sub func3      { ... }
470
471           END { ... }       # module clean-up code here (global destructor)
472
473           1;  # don't forget to return a true value from the file
474
475       Then go on to declare and use your variables in functions without any
476       qualifications.  See Exporter and perlmodlib for details on
477       mechanics and style issues in module creation.
478
479       Perl modules are included into your program by saying
480
481           use Module;
482
483       or
484
485           use Module LIST;
486
487       This is exactly equivalent to
488
489           BEGIN { require 'Module.pm'; 'Module'->import; }
490
491       or
492
493           BEGIN { require 'Module.pm'; 'Module'->import( LIST ); }
494
495       As a special case
496
497           use Module ();
498
499       is exactly equivalent to
500
501           BEGIN { require 'Module.pm'; }
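
       The equivalences above can be tried with two core modules.  This is
       only a sketch: List::Util and Scalar::Util stand in for "Module", and
       the bareword form of "require" is used so that "::" is translated to
       a path.

```perl
use strict;
use warnings;

# Spelled-out form of:  use List::Util qw(sum max);
BEGIN { require List::Util; 'List::Util'->import(qw(sum max)); }

# Spelled-out form of:  use Scalar::Util ();
BEGIN { require Scalar::Util; }

my $total   = sum(1, 2, 3);        # imported into the current package
my $largest = max(4, 9, 2);

# Loaded but not imported: only the qualified name is available.
my $is_num   = Scalar::Util::looks_like_number('42');
my $imported = __PACKAGE__->can('looks_like_number') ? 1 : 0;

print "total=$total largest=$largest imported=$imported\n";
```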
502
503       All Perl module files have the extension .pm.  The "use" operator
504       assumes this so you don't have to spell out "Module.pm" in quotes.
505       This also helps to differentiate new modules from old .pl and .ph
506       files.  Module names are also capitalized unless they're functioning as
507       pragmas; pragmas are in effect compiler directives, and are sometimes
508       called "pragmatic modules" (or even "pragmata" if you're a classicist).
509
510       The two statements:
511
512           require SomeModule;
513           require "SomeModule.pm";
514
515       differ from each other in two ways.  In the first case, any double
516       colons in the module name, such as "Some::Module", are translated into
517       your system's directory separator, usually "/".   The second case does
518       not, and would have to be specified literally.  The other difference is
519       that seeing the first "require" clues in the compiler that uses of
520       indirect object notation involving "SomeModule", as in "$ob = purge
521       SomeModule", are method calls, not function calls.  (Yes, this really
522       can make a difference.)
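
       The first difference can be seen in %INC, which records loaded files
       by path.  In this sketch File::Spec and File::Basename are arbitrary
       core modules, and the path passed to catfile() is illustrative.

```perl
use strict;
use warnings;

require File::Spec;              # bareword: :: is translated to a path
require "File/Basename.pm";      # string: the path is spelled out literally

# Either way, %INC is keyed by the relative path that was searched for.
my @loaded = grep { exists $INC{$_} } ('File/Spec.pm', 'File/Basename.pm');

my $path = File::Spec->catfile('lib', 'Some', 'Module.pm');
my $base = File::Basename::basename($path);

print "loaded: @loaded\n";
print "basename: $base\n";
```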
523
524       Because the "use" statement implies a "BEGIN" block, the importing of
525       semantics happens as soon as the "use" statement is compiled, before
526       the rest of the file is compiled.  This is how it is able to function
527       as a pragma mechanism, and also how modules are able to declare
528       subroutines that are then visible as list or unary operators for the
529       rest of the current file.  This will not work if you use "require"
530       instead of "use".  With "require" you can get into this problem:
531
532           require Cwd;                # make Cwd:: accessible
533           $here = Cwd::getcwd();
534
535           use Cwd;                    # import names from Cwd::
536           $here = getcwd();
537
538           require Cwd;                # make Cwd:: accessible
539           $here = getcwd();           # oops! no main::getcwd()
540
541       In general, "use Module ()" is recommended over "require Module",
542       because it determines module availability at compile time, not in the
543       middle of your program's execution.  An exception would be if two
544       modules each tried to "use" each other, and each also called a function
545       from that other module.  In that case, it's easy to use "require"
546       instead.
547
548       Perl packages may be nested inside other package names, so we can have
549       package names containing "::".  But if we used that package name
550       directly as a filename it would make for unwieldy or impossible
551       filenames on some systems.  Therefore, if a module's name is, say,
552       "Text::Soundex", then its definition is actually found in the library
553       file Text/Soundex.pm.
554
555       Perl modules always have a .pm file, but there may also be dynamically
556       linked executables (often ending in .so) or autoloaded subroutine
557       definitions (often ending in .al) associated with the module.  If so,
558       these will be entirely transparent to the user of the module.  It is
559       the responsibility of the .pm file to load (or arrange to autoload) any
560       additional functionality.  For example, although the POSIX module
561       happens to do both dynamic loading and autoloading, the user can say
562       just "use POSIX" to get it all.
563
564   Making your module threadsafe
565       Perl supports a type of threads called interpreter threads (ithreads).
566       These threads can be used explicitly and implicitly.
567
568       Ithreads work by cloning the data tree so that no data is shared
569       between different threads. These threads can be used by using the
570       "threads" module or by doing fork() on win32 (fake fork() support).
571       When a thread is cloned, all Perl data is cloned; however, non-Perl
572       data cannot be cloned automatically.  Perl after 5.8.0 has support for
573       the "CLONE" special subroutine.  In "CLONE" you can do whatever you
574       need to do, such as handling the cloning of non-Perl data if necessary.
575       "CLONE" will be called once as a class method for every package that
576       has it defined (or inherits it).  It will be called in the context of
577       the new thread, so all modifications are made in the new area.
578       Currently CLONE is called with no parameters other than the invocant
579       package name, but code should not assume that this will remain
580       unchanged, as it is likely that in future extra parameters will be
581       passed in to give more information about the state of cloning.
582
583       If you want to CLONE all objects you will need to keep track of them
584       per package. This is simply done using a hash and
585       Scalar::Util::weaken().
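
       One possible shape for such tracking is sketched below.  The class
       "Tracked", its registry hash, and the helper live_count() are
       assumptions made for the example, and what CLONE should really do with
       each object depends on the module.  The sketch does not start any
       threads; it only shows the bookkeeping.

```perl
use strict;
use warnings;

package Tracked;
use Scalar::Util qw(weaken refaddr);

my %registry;    # address => weak reference to each live object

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    %$self = %args;
    # Weak, so the registry alone does not keep the object alive.
    weaken( $registry{ refaddr($self) } = $self );
    return $self;
}

sub DESTROY {
    my $self = shift;
    delete $registry{ refaddr($self) };
}

sub live_count { return scalar grep { defined $_ } values %registry }

# Called as a class method in each new thread.  Cloned objects get new
# addresses, so the registry is rebuilt under the new keys.
sub CLONE {
    my $class   = shift;
    my @objects = grep { defined $_ } values %registry;
    %registry = ();
    weaken( $registry{ refaddr($_) } = $_ ) for @objects;
}

package main;

my $keep = Tracked->new(id => 1);
my $inside;
{
    my $temp = Tracked->new(id => 2);
    $inside = Tracked::live_count();
}
my $after = Tracked::live_count();
print "live objects: $inside inside the block, $after after it\n";
```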
586
587       Perl after 5.8.7 has support for the "CLONE_SKIP" special subroutine.
588       Like "CLONE", "CLONE_SKIP" is called once per package; however, it is
589       called just before cloning starts, and in the context of the parent
590       thread. If it returns a true value, then no objects of that class will
591       be cloned; or rather, they will be copied as unblessed, undef values.
592       For example: if in the parent there are two references to a single
593       blessed hash, then in the child there will be two references to a
594       single undefined scalar value instead.  This provides a simple
595       mechanism for making a module threadsafe; just add "sub CLONE_SKIP { 1
596       }" at the top of the class, and "DESTROY()" will now only be called
597       once per object. Of course, if the child thread needs to make use of
598       the objects, then a more sophisticated approach is needed.
599
600       Like "CLONE", "CLONE_SKIP" is currently called with no parameters other
601       than the invocant package name, although that may change. Similarly, to
602       allow for future expansion, the return value should be a single 0 or 1
603       value.
604
