1 PERLSUB(1) Perl Programmers Reference Guide PERLSUB(1)
2
3
4
5 NAME
6 perlsub - Perl subroutines
7
8 SYNOPSIS
9 To declare subroutines:
10
11 sub NAME; # A "forward" declaration.
12 sub NAME(PROTO); # ditto, but with prototypes
13 sub NAME : ATTRS; # with attributes
14 sub NAME(PROTO) : ATTRS; # with attributes and prototypes
15
16 sub NAME BLOCK # A declaration and a definition.
17 sub NAME(PROTO) BLOCK # ditto, but with prototypes
18 sub NAME : ATTRS BLOCK # with attributes
19 sub NAME(PROTO) : ATTRS BLOCK # with prototypes and attributes
20
21 use feature 'signatures';
22 sub NAME(SIG) BLOCK # with signature
23 sub NAME :ATTRS (SIG) BLOCK # with signature, attributes
24 sub NAME :prototype(PROTO) (SIG) BLOCK # with signature, prototype
25
26 To define an anonymous subroutine at runtime:
27
28 $subref = sub BLOCK; # no proto
29 $subref = sub (PROTO) BLOCK; # with proto
30 $subref = sub : ATTRS BLOCK; # with attributes
31 $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes
32
33 use feature 'signatures';
34 $subref = sub (SIG) BLOCK; # with signature
35 $subref = sub : ATTRS(SIG) BLOCK; # with signature, attributes
36
37 To import subroutines:
38
39 use MODULE qw(NAME1 NAME2 NAME3);
40
41 To call subroutines:
42
43 NAME(LIST); # & is optional with parentheses.
44 NAME LIST; # Parentheses optional if predeclared/imported.
45 &NAME(LIST); # Circumvent prototypes.
46 &NAME; # Makes current @_ visible to called subroutine.
47
48 DESCRIPTION
49 Like many languages, Perl provides for user-defined subroutines. These
50 may be located anywhere in the main program, loaded in from other files
51 via the "do", "require", or "use" keywords, or generated on the fly
52 using "eval" or anonymous subroutines. You can even call a function
53 indirectly using a variable containing its name or a CODE reference.
54
55 The Perl model for function call and return values is simple: all
56 functions are passed as parameters one single flat list of scalars, and
57 all functions likewise return to their caller one single flat list of
58 scalars. Any arrays or hashes in these call and return lists will
59 collapse, losing their identities--but you may always use pass-by-
60 reference instead to avoid this. Both call and return lists may
61 contain as many or as few scalar elements as you'd like. (Often a
62 function without an explicit return statement is called a subroutine,
63 but there's really no difference from Perl's perspective.)
64
65 Any arguments passed in show up in the array @_. (They may also show
66 up in lexical variables introduced by a signature; see "Signatures"
67 below.) Therefore, if you called a function with two arguments, those
68 would be stored in $_[0] and $_[1]. The array @_ is a local array, but
69 its elements are aliases for the actual scalar parameters. In
70 particular, if an element $_[0] is updated, the corresponding argument
71 is updated (or an error occurs if it is not updatable). If an argument
72 is an array or hash element which did not exist when the function was
73 called, that element is created only when (and if) it is modified or a
74 reference to it is taken. (Some earlier versions of Perl created the
75 element whether or not the element was assigned to.) Assigning to the
76 whole array @_ removes that aliasing, and does not update any
77 arguments.
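
A minimal sketch of the aliasing rules above. The helper names
(double_in_place, reset_args, peek) are hypothetical and exist only for
illustration; the last check assumes a modern perl, since some earlier
versions created the element regardless.

```perl
use strict;
use warnings;

# Hypothetical helper: doubles its first argument in place via the $_[0] alias.
sub double_in_place { $_[0] *= 2 }

# Hypothetical helper: assigning to the whole of @_ breaks the aliasing,
# so the caller's variable is left alone.
sub reset_args { @_ = (0); return $_[0] }

my $n = 21;
double_in_place($n);      # $n is now 42
reset_args($n);           # $n is still 42

# A nonexistent hash element is not created merely by being passed.
my %h;
sub peek { return defined($_[0]) ? 1 : 0 }
peek($h{missing});
my $created = exists $h{missing} ? 1 : 0;   # 0: nothing modified the element
```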
78
79 A "return" statement may be used to exit a subroutine, optionally
80 specifying the returned value, which will be evaluated in the
81 appropriate context (list, scalar, or void) depending on the context of
82 the subroutine call. If you specify no return value, the subroutine
83 returns an empty list in list context, the undefined value in scalar
84 context, or nothing in void context. If you return one or more
85 aggregates (arrays and hashes), these will be flattened together into
86 one large indistinguishable list.
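
As a small illustration of these rules, the sketch below uses two
hypothetical subs, nothing() and both(), to show a bare "return" in list
and scalar context and the flattening of returned aggregates.

```perl
use strict;
use warnings;

# Hypothetical helper: a bare "return" adapts to the caller's context.
sub nothing { return }

my @list   = nothing();   # empty list in list context
my $scalar = nothing();   # undef in scalar context

# Hypothetical helper: the returned array and hash flatten into one list.
sub both {
    my @a = (1, 2);
    my %h = (k => 'v');
    return (@a, %h);
}
my @flat = both();        # (1, 2, 'k', 'v')
```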
87
88 If no "return" is found and if the last statement is an expression, its
89 value is returned. If the last statement is a loop control structure
90 like a "foreach" or a "while", the returned value is unspecified. The
91 empty sub returns the empty list.
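
A short sketch of the implicit return value, using hypothetical subs
area() and empty_sub():

```perl
use strict;
use warnings;

sub area      { my ($w, $h) = @_; $w * $h }   # value of the last expression
sub empty_sub { }

my $area = area(3, 4);      # 12
my @none = empty_sub();     # empty list
```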
92
93 Aside from an experimental facility (see "Signatures" below), Perl does
94 not have named formal parameters. In practice all you do is assign @_
95 to a "my()" list of variables. Variables that aren't declared to be private
96 are global variables. For gory details on creating private variables,
97 see "Private Variables via my()" and "Temporary Values via local()".
98 To create protected environments for a set of functions in a separate
99 package (and probably a separate file), see "Packages" in perlmod.
100
101 Example:
102
103 sub max {
104 my $max = shift(@_);
105 foreach my $foo (@_) {
106 $max = $foo if $max < $foo;
107 }
108 return $max;
109 }
110 $bestday = max($mon,$tue,$wed,$thu,$fri);
111
112 Example:
113
114 # get a line, combining continuation lines
115 # that start with whitespace
116
117 sub get_line {
118 $thisline = $lookahead; # global variables!
119 LINE: while (defined($lookahead = <STDIN>)) {
120 if ($lookahead =~ /^[ \t]/) {
121 $thisline .= $lookahead;
122 }
123 else {
124 last LINE;
125 }
126 }
127 return $thisline;
128 }
129
130 $lookahead = <STDIN>; # get first line
131 while (defined($line = get_line())) {
132 ...
133 }
134
135 Assigning to a list of private variables to name your arguments:
136
137 sub maybeset {
138 my($key, $value) = @_;
139 $Foo{$key} = $value unless $Foo{$key};
140 }
141
142 Because the assignment copies the values, this also has the effect of
143 turning call-by-reference into call-by-value. Otherwise a function is
144 free to do in-place modifications of @_ and change its caller's values.
145
146 upcase_in($v1, $v2); # this changes $v1 and $v2
147 sub upcase_in {
148 for (@_) { tr/a-z/A-Z/ }
149 }
150
151 You aren't allowed to modify constants in this way, of course. If an
152 argument were actually literal and you tried to change it, you'd take a
153 (presumably fatal) exception. For example, this won't work:
154
155 upcase_in("frederick");
156
157 It would be much safer if the "upcase_in()" function were written to
158 return a copy of its parameters instead of changing them in place:
159
160 ($v3, $v4) = upcase($v1, $v2); # this doesn't change $v1 and $v2
161 sub upcase {
162 return unless defined wantarray; # void context, do nothing
163 my @parms = @_;
164 for (@parms) { tr/a-z/A-Z/ }
165 return wantarray ? @parms : $parms[0];
166 }
167
168 Notice how this (unprototyped) function doesn't care whether it was
169 passed real scalars or arrays. Perl sees all arguments as one big,
170 long, flat parameter list in @_. This is one area where Perl's simple
171 argument-passing style shines. The "upcase()" function would work
172 perfectly well, with no change to its definition, even if we
173 fed it things like this:
174
175 @newlist = upcase(@list1, @list2);
176 @newlist = upcase( split /:/, $var );
177
178 Do not, however, be tempted to do this:
179
180 (@a, @b) = upcase(@list1, @list2);
181
182 Like the flattened incoming parameter list, the return list is also
183 flattened on return. So all you have managed to do here is store
184 everything in @a and make @b empty. See "Pass by Reference" for
185 alternatives.
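
One possible way around the flattening, sketched here under the
assumption that the caller is willing to pass and receive array
references. The sub upcase_lists() is a hypothetical variant of
"upcase()", not part of the original example, and it uses "uc" rather
than "tr".

```perl
use strict;
use warnings;

# Hypothetical variant of upcase() that takes and returns array references,
# so the two lists keep their identities.
sub upcase_lists {
    my @results;
    for my $list (@_) {
        push @results, [ map { uc $_ } @$list ];
    }
    return @results;
}

my @list1 = qw(a b);
my @list2 = qw(c d e);
my ($a_ref, $b_ref) = upcase_lists(\@list1, \@list2);
my @a = @$a_ref;   # A B
my @b = @$b_ref;   # C D E
```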
186
187 A subroutine may be called using an explicit "&" prefix. The "&" is
188 optional in modern Perl, as are parentheses if the subroutine has been
189 predeclared. The "&" is not optional when just naming the subroutine,
190 such as when it's used as an argument to defined() or undef(). Nor is
191 it optional when you want to do an indirect subroutine call with a
192 subroutine name or reference using the "&$subref()" or "&{$subref}()"
193 constructs, although the "$subref->()" notation solves that problem.
194 See perlref for more about all that.
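
The calling forms mentioned above can be sketched as follows; greet() is
a hypothetical sub used only for illustration.

```perl
use strict;
use warnings;

sub greet { return "hello, $_[0]" }   # hypothetical example sub

my $subref = \&greet;            # "&" is required when just naming the sub
my $r1 = $subref->("world");     # arrow notation
my $r2 = &$subref("world");      # older "&" notation
my $r3 = &{$subref}("world");    # the same, with braces

my $exists  = defined(&greet)   ? 1 : 0;   # 1
my $missing = defined(&no_such) ? 1 : 0;   # 0
```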
195
196 Subroutines may be called recursively. If a subroutine is called using
197 the "&" form, the argument list is optional, and if omitted, no @_
198 array is set up for the subroutine: the @_ array at the time of the
199 call is visible to the subroutine instead. This is an efficiency mechanism
200 that new users may wish to avoid.
201
202 &foo(1,2,3); # pass three arguments
203 foo(1,2,3); # the same
204
205 foo(); # pass a null list
206 &foo(); # the same
207
208 &foo; # foo() gets current args, like foo(@_) !!
209 use strict 'subs';
210 foo; # like foo() iff sub foo predeclared, else
211 # a compile-time error
212 no strict 'subs';
213 foo; # like foo() iff sub foo predeclared, else
214 # a literal string "foo"
215
216 Not only does the "&" form make the argument list optional, it also
217 disables any prototype checking on arguments you do provide. This is
218 partly for historical reasons, and partly for having a convenient way
219 to cheat if you know what you're doing. See "Prototypes" below.
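
A minimal sketch of the difference between "&NAME;" and "NAME()"; the
three subs are hypothetical and exist only to make the shared @_
visible.

```perl
use strict;
use warnings;

sub show_args { return join ",", @_ }       # hypothetical helper

sub with_amp    { return &show_args; }      # callee sees with_amp's own @_
sub with_parens { return show_args(); }     # callee gets a fresh, empty @_

my $shared = with_amp(1, 2, 3);      # "1,2,3"
my $fresh  = with_parens(1, 2, 3);   # ""
```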
220
221 Since Perl 5.16.0, the "__SUB__" token is available under "use feature
222 'current_sub'" and "use 5.16.0". It will evaluate to a reference to
223 the currently-running sub, which allows for recursive calls without
224 knowing your subroutine's name.
225
226 use 5.16.0;
227 my $factorial = sub {
228 my ($x) = @_;
229 return 1 if $x <= 1;
230 return($x * __SUB__->( $x - 1 ) );
231 };
232
233 The behavior of "__SUB__" within a regex code block (such as
234 "/(?{...})/") is subject to change.
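
As a usage sketch, the same idea can be enabled with the feature pragma
alone and then called through the reference. It assumes Perl 5.16 or
later.

```perl
use strict;
use warnings;
use feature 'current_sub';

# An anonymous recursive sub; it never needs to know its own name.
my $factorial = sub {
    my ($x) = @_;
    return 1 if $x <= 1;                # also terminates for 0
    return $x * __SUB__->($x - 1);
};

my $result = $factorial->(5);   # 120
```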
235
236 Subroutines whose names are in all upper case are reserved to the Perl
237 core, as are modules whose names are in all lower case. A subroutine
238 in all capitals is a loosely-held convention meaning it will be called
239 indirectly by the run-time system itself, usually due to a triggered
240 event. Subroutines whose names start with a left parenthesis are also
241 reserved the same way. The following is a list of some subroutines
242 that currently do special, pre-defined things.
243
244 documented later in this document
245 "AUTOLOAD"
246
247 documented in perlmod
248 "CLONE", "CLONE_SKIP"
249
250 documented in perlobj
251 "DESTROY", "DOES"
252
253 documented in perltie
254 "BINMODE", "CLEAR", "CLOSE", "DELETE", "DESTROY", "EOF", "EXISTS",
255 "EXTEND", "FETCH", "FETCHSIZE", "FILENO", "FIRSTKEY", "GETC",
256 "NEXTKEY", "OPEN", "POP", "PRINT", "PRINTF", "PUSH", "READ",
257 "READLINE", "SCALAR", "SEEK", "SHIFT", "SPLICE", "STORE",
258 "STORESIZE", "TELL", "TIEARRAY", "TIEHANDLE", "TIEHASH",
259 "TIESCALAR", "UNSHIFT", "UNTIE", "WRITE"
260
261 documented in PerlIO::via
262 "BINMODE", "CLEARERR", "CLOSE", "EOF", "ERROR", "FDOPEN", "FILENO",
263 "FILL", "FLUSH", "OPEN", "POPPED", "PUSHED", "READ", "SEEK",
264 "SETLINEBUF", "SYSOPEN", "TELL", "UNREAD", "UTF8", "WRITE"
265
266 documented in perlfunc
267 "import", "unimport", "INC"
268
269 documented in UNIVERSAL
270 "VERSION"
271
272 documented in perldebguts
273 "DB::DB", "DB::sub", "DB::lsub", "DB::goto", "DB::postponed"
274
275 undocumented, used internally by the overload feature
276 any starting with "("
277
278 The "BEGIN", "UNITCHECK", "CHECK", "INIT" and "END" subroutines are not
279 so much subroutines as named special code blocks, of which you can have
280 more than one in a package, and which you can not call explicitly. See
281 "BEGIN, UNITCHECK, CHECK, INIT and END" in perlmod.
282
283 Signatures
284 WARNING: Subroutine signatures are experimental. The feature may be
285 modified or removed in future versions of Perl.
286
287 Perl has an experimental facility to allow a subroutine's formal
288 parameters to be introduced by special syntax, separate from the
289 procedural code of the subroutine body. The formal parameter list is
290 known as a signature. The facility must be enabled first by a
291 pragmatic declaration, "use feature 'signatures'", and it will produce
292 a warning unless the "experimental::signatures" warnings category is
293 disabled.
294
295 The signature is part of a subroutine's body. Normally the body of a
296 subroutine is simply a braced block of code, but when using a
297 signature, the signature is a parenthesised list that goes immediately
298 before the block, after any name or attributes.
299
300 For example,
301
302 sub foo :lvalue ($a, $b = 1, @c) { .... }
303
304 The signature declares lexical variables that are in scope for the
305 block. When the subroutine is called, the signature takes control
306 first. It populates the signature variables from the list of arguments
307 that were passed. If the argument list doesn't meet the requirements
308 of the signature, then it will throw an exception. When the signature
309 processing is complete, control passes to the block.
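
A small sketch of the signature taking control before the block. It
assumes a perl with signature support (5.20 or later); the sub add() is
hypothetical, and the exact wording of the exception varies between
versions.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

sub add ($left, $right) { return $left + $right }   # hypothetical example

my $sum = add(2, 3);                 # 5

# Too few arguments: the signature throws before the body runs.
my $ok  = eval { add(2); 1 };
my $err = $@;                        # message text varies by Perl version
```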
310
311 Positional parameters are handled by simply naming scalar variables in
312 the signature. For example,
313
314 sub foo ($left, $right) {
315 return $left + $right;
316 }
317
318 takes two positional parameters, which must be filled at runtime by two
319 arguments. By default the parameters are mandatory, and it is not
320 permitted to pass more arguments than expected. So the above is
321 equivalent to
322
323 sub foo {
324 die "Too many arguments for subroutine" unless @_ <= 2;
325 die "Too few arguments for subroutine" unless @_ >= 2;
326 my $left = $_[0];
327 my $right = $_[1];
328 return $left + $right;
329 }
330
331 An argument can be ignored by omitting the main part of the name from a
332 parameter declaration, leaving just a bare "$" sigil. For example,
333
334 sub foo ($first, $, $third) {
335 return "first=$first, third=$third";
336 }
337
338 Although the ignored argument doesn't go into a variable, it is still
339 mandatory for the caller to pass it.
340
341 A positional parameter is made optional by giving a default value,
342 separated from the parameter name by "=":
343
344 sub foo ($left, $right = 0) {
345 return $left + $right;
346 }
347
348 The above subroutine may be called with either one or two arguments.
349 The default value expression is evaluated when the subroutine is
350 called, so it may provide different default values for different calls.
351 It is only evaluated if the argument was actually omitted from the
352 call. For example,
353
354 my $auto_id = 0;
355 sub foo ($thing, $id = $auto_id++) {
356 print "$thing has ID $id";
357 }
358
359 automatically assigns distinct sequential IDs to things for which no ID
360 was supplied by the caller. A default value expression may also refer
361 to parameters earlier in the signature, making the default for one
362 parameter vary according to the earlier parameters. For example,
363
364 sub foo ($first_name, $surname, $nickname = $first_name) {
365 print "$first_name $surname is known as \"$nickname\"";
366 }
367
368 An optional parameter can be nameless just like a mandatory parameter.
369 For example,
370
371 sub foo ($thing, $ = 1) {
372 print $thing;
373 }
374
375 The parameter's default value will still be evaluated if the
376 corresponding argument isn't supplied, even though the value won't be
377 stored anywhere. This is in case evaluating it has important side
378 effects. However, it will be evaluated in void context, so if it
379 doesn't have side effects and is not trivial it will generate a warning
380 if the "void" warning category is enabled. If a nameless optional
381 parameter's default value is not important, it may be omitted just as
382 the parameter's name was:
383
384 sub foo ($thing, $=) {
385 print $thing;
386 }
387
388 Optional positional parameters must come after all mandatory positional
389 parameters. (If there are no mandatory positional parameters then an
390 optional positional parameter can be the first thing in the
391 signature.) If there are multiple optional positional parameters and
392 not enough arguments are supplied to fill them all, they will be filled
393 from left to right.
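
The left-to-right filling can be sketched with a hypothetical sub
describe() that has one mandatory and two optional parameters:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical example: one mandatory parameter, two optional ones.
sub describe ($name, $colour = 'red', $size = 'small') {
    return "$name/$colour/$size";
}

my $one = describe('box');                  # "box/red/small"
my $two = describe('box', 'blue');          # "box/blue/small"
my $all = describe('box', 'blue', 'big');   # "box/blue/big"
```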
394
395 After positional parameters, additional arguments may be captured in a
396 slurpy parameter. The simplest form of this is just an array variable:
397
398 sub foo ($filter, @inputs) {
399 print $filter->($_) foreach @inputs;
400 }
401
402 With a slurpy parameter in the signature, there is no upper limit on
403 how many arguments may be passed. A slurpy array parameter may be
404 nameless just like a positional parameter, in which case its only
405 effect is to turn off the argument limit that would otherwise apply:
406
407 sub foo ($thing, @) {
408 print $thing;
409 }
410
411 A slurpy parameter may instead be a hash, in which case the arguments
412 available to it are interpreted as alternating keys and values. There
413 must be as many keys as values: if there is an odd argument then an
414 exception will be thrown. Keys will be stringified, and if there are
415 duplicates then the later instance takes precedence over the earlier,
416 as with standard hash construction.
417
418 sub foo ($filter, %inputs) {
419 print $filter->($_, $inputs{$_}) foreach sort keys %inputs;
420 }
421
422 A slurpy hash parameter may be nameless just like other kinds of
423 parameter. It still insists that the number of arguments available to
424 it be even, even though they're not being put into a variable.
425
426 sub foo ($thing, %) {
427 print $thing;
428 }
429
430 A slurpy parameter, either array or hash, must be the last thing in the
431 signature. It may follow mandatory and optional positional parameters;
432 it may also be the only thing in the signature. Slurpy parameters
433 cannot have default values: if no arguments are supplied for them then
434 you get an empty array or empty hash.
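
A sketch of a slurpy hash after a positional parameter. The sub
configure() and its option names are hypothetical.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical example: one positional parameter, then a slurpy hash.
sub configure ($name, %opts) {
    return join ",", $name, map { $_ . "=" . $opts{$_} } sort keys %opts;
}

# The later "port" wins, as with ordinary hash construction.
my $cfg = configure('srv', port => 80, host => 'a', port => 8080);
# $cfg is "srv,host=a,port=8080"

# An odd number of slurpy arguments throws an exception.
my $ok = eval { configure('srv', 'port'); 1 };

# With no slurpy arguments the hash is simply empty.
my $bare = configure('srv');         # "srv"
```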
435
436 A signature may be entirely empty, in which case all it does is check
437 that the caller passed no arguments:
438
439 sub foo () {
440 return 123;
441 }
442
443 When using a signature, the arguments are still available in the
444 special array variable @_, in addition to the lexical variables of the
445 signature. There is a difference between the two ways of accessing the
446 arguments: @_ aliases the arguments, but the signature variables get
447 copies of the arguments. So writing to a signature variable only
448 changes that variable, and has no effect on the caller's variables, but
449 writing to an element of @_ modifies whatever the caller used to supply
450 that argument.
451
452 There is a potential syntactic ambiguity between signatures and
453 prototypes (see "Prototypes"), because both start with an opening
454 parenthesis and both can appear in some of the same places, such as
455 just after the name in a subroutine declaration. For historical
456 reasons, when signatures are not enabled, any opening parenthesis in
457 such a context will trigger very forgiving prototype parsing. Most
458 signatures will be interpreted as prototypes in those circumstances,
459 but won't be valid prototypes. (A valid prototype cannot contain any
460 alphabetic character.) This will lead to somewhat confusing error
461 messages.
462
463 To avoid ambiguity, when signatures are enabled the special syntax for
464 prototypes is disabled. There is no attempt to guess whether a
465 parenthesised group was intended to be a prototype or a signature. To
466 give a subroutine a prototype under these circumstances, use a
467 prototype attribute. For example,
468
469 sub foo :prototype($) { $_[0] }
470
471 It is entirely possible for a subroutine to have both a prototype and a
472 signature. They do different jobs: the prototype affects compilation
473 of calls to the subroutine, and the signature puts argument values into
474 lexical variables at runtime. You can therefore write
475
476 sub foo :prototype($$) ($left, $right) {
477 return $left + $right;
478 }
479
480 The prototype attribute, and any other attributes, must come before the
481 signature. The signature always immediately precedes the block of the
482 subroutine's body.
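
To make the division of labour concrete, here is a sketch with a
hypothetical sub add2(). It assumes a perl where attributes precede the
signature, as described above. The prototype imposes scalar context on
each argument when the call is compiled; the signature then copies the
resulting values at runtime.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical example combining a prototype and a signature.
sub add2 :prototype($$) ($left, $right) {
    return $left + $right;
}

my @three = (7, 8, 9);
my @two   = (1, 2);

# Because of the ($$) prototype each array is evaluated in scalar
# context, so the sub receives the element counts 3 and 2.
my $sum = add2(@three, @two);   # 5
```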
483
484 Private Variables via my()
485 Synopsis:
486
487 my $foo; # declare $foo lexically local
488 my (@wid, %get); # declare list of variables local
489 my $foo = "flurp"; # declare $foo lexical, and init it
490 my @oof = @bar; # declare @oof lexical, and init it
491 my $x : Foo = $y; # similar, with an attribute applied
492
493 WARNING: The use of attribute lists on "my" declarations is still
494 evolving. The current semantics and interface are subject to change.
495 See attributes and Attribute::Handlers.
496
497 The "my" operator declares the listed variables to be lexically
498 confined to the enclosing block, conditional
499 ("if"/"unless"/"elsif"/"else"), loop
500 ("for"/"foreach"/"while"/"until"/"continue"), subroutine, "eval", or
501 "do"/"require"/"use"'d file. If more than one value is listed, the
502 list must be placed in parentheses. All listed elements must be legal
503 lvalues. Only alphanumeric identifiers may be lexically
504 scoped--magical built-ins like $/ must currently be localized with
505 "local" instead.
506
507 Unlike dynamic variables created by the "local" operator, lexical
508 variables declared with "my" are totally hidden from the outside world,
509 including any called subroutines. This holds whether the subroutine is
510 called from itself or from elsewhere--every call gets its own
511 copy.
512
513 This doesn't mean that a "my" variable declared in a statically
514 enclosing lexical scope would be invisible. Only dynamic scopes are
515 cut off. For example, the "bumpx()" function below has access to the
516 lexical $x variable because both the "my" and the "sub" occurred at the
517 same scope, presumably file scope.
518
519 my $x = 10;
520 sub bumpx { $x++ }
521
522 An "eval()", however, can see lexical variables of the scope it is
523 being evaluated in, so long as the names aren't hidden by declarations
524 within the "eval()" itself. See perlref.
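
A minimal sketch of both points, written as a file-scope script:

```perl
use strict;
use warnings;

my $x = 10;
sub bumpx { $x++ }          # same file scope as "my $x", so it sees $x

bumpx();
my $seen   = eval '$x';                 # 11: the string eval sees the lexical
my $hidden = eval 'my $x = 99; $x';     # 99: the inner declaration hides it
my $still  = $x;                        # 11: the outer $x is untouched
```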
525
526 The parameter list to my() may be assigned to if desired, which allows
527 you to initialize your variables. (If no initializer is given for a
528 particular variable, it is created with the undefined value.) Commonly
529 this is used to name input parameters to a subroutine. Examples:
530
531 $arg = "fred"; # "global" variable
532 $n = cube_root(27);
533 print "$arg thinks the root is $n\n";
534 # prints: fred thinks the root is 3
535
536 sub cube_root {
537 my $arg = shift; # name doesn't matter
538 $arg **= 1/3;
539 return $arg;
540 }
541
542 The "my" is simply a modifier on something you might assign to. So
543 when you do assign to variables in its argument list, "my" doesn't
544 change whether those variables are viewed as a scalar or an array. So
545
546 my ($foo) = <STDIN>; # WRONG?
547 my @FOO = <STDIN>;
548
549 both supply a list context to the right-hand side, while
550
551 my $foo = <STDIN>;
552
553 supplies a scalar context. But the following declares only one
554 variable:
555
556 my $foo, $bar = 1; # WRONG
557
558 That has the same effect as
559
560 my $foo;
561 $bar = 1;
562
563 The declared variable is not introduced (is not visible) until after
564 the current statement. Thus,
565
566 my $x = $x;
567
568 can be used to initialize a new $x with the value of the old $x, and
569 the expression
570
571 my $x = 123 and $x == 123
572
573 is false unless the old $x happened to have the value 123.
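
The first idiom is mostly useful inside a nested scope, as in this
sketch:

```perl
use strict;
use warnings;

my $x = 5;
my $inner;
{
    my $x = $x + 1;   # the $x on the right is still the outer one
    $inner = $x;      # 6
}
my $outer = $x;       # 5: the outer variable was never touched
```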
574
575 Lexical scopes of control structures are not bounded precisely by the
576 braces that delimit their controlled blocks; control expressions are
577 part of that scope, too. Thus in the loop
578
579 while (my $line = <>) {
580 $line = lc $line;
581 } continue {
582 print $line;
583 }
584
585 the scope of $line extends from its declaration throughout the rest of
586 the loop construct (including the "continue" clause), but not beyond
587 it. Similarly, in the conditional
588
589 if ((my $answer = <STDIN>) =~ /^yes$/i) {
590 user_agrees();
591 } elsif ($answer =~ /^no$/i) {
592 user_disagrees();
593 } else {
594 chomp $answer;
595 die "'$answer' is neither 'yes' nor 'no'";
596 }
597
598 the scope of $answer extends from its declaration through the rest of
599 that conditional, including any "elsif" and "else" clauses, but not
600 beyond it. See "Simple Statements" in perlsyn for information on the
601 scope of variables in statements with modifiers.
602
603 The "foreach" loop defaults to scoping its index variable dynamically
604 in the manner of "local". However, if the index variable is prefixed
605 with the keyword "my", or if there is already a lexical by that name in
606 scope, then a new lexical is created instead. Thus in the loop
607
608 for my $i (1, 2, 3) {
609 some_function();
610 }
611
612 the scope of $i extends to the end of the loop, but not beyond it,
613 rendering the value of $i inaccessible within "some_function()".
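
A sketch that makes this visible. The package variable $i and the @seen
array are assumptions added for illustration; some_function() can only
ever see the package variable.

```perl
use strict;
use warnings;

our $i = 'global';                       # hypothetical package variable
my @seen;
sub some_function { push @seen, $i }     # refers to $main::i

for my $i (1, 2, 3) {
    some_function();                     # the loop's lexical $i is invisible here
}
# @seen now holds 'global' three times
```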
614
615 Some users may wish to encourage the use of lexically scoped variables.
616 As an aid to catching implicit uses of package variables, which are
617 always global, if you say
618
619 use strict 'vars';
620
621 then any variable mentioned from there to the end of the enclosing
622 block must either refer to a lexical variable, be predeclared via "our"
623 or "use vars", or else must be fully qualified with the package name.
624 A compilation error results otherwise. An inner block may countermand
625 this with "no strict 'vars'".
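
Because the failure happens at compile time, the sketch below wraps each
case in a string "eval" so that the error can be observed at runtime.
The variable names are hypothetical.

```perl
use strict;
use warnings;

our $counter = 1;                       # hypothetical package variable

# Fully qualified name: allowed under strict 'vars'.
my $ok_qualified = eval q{ $main::counter + 1 };   # 2

# Undeclared, unqualified name: a compilation error inside the eval.
my $ok_bare = eval q{ $undeclared = 1; 1 };
my $err     = $@;
```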
626
627 A "my" has both a compile-time and a run-time effect. At compile time,
628 the compiler takes notice of it. The principal usefulness of this is
629 to quiet "use strict 'vars'", but it is also essential for generation
630 of closures as detailed in perlref. Actual initialization is delayed
631 until run time, though, so it gets executed at the appropriate time,
632 such as each time through a loop, for example.
633
634 Variables declared with "my" are not part of any package and are
635 therefore never fully qualified with the package name. In particular,
636 you're not allowed to try to make a package variable (or other global)
637 lexical:
638
639 my $pack::var; # ERROR! Illegal syntax
640
641 In fact, dynamic variables (also known as package or global variables)
642 are still accessible using the fully qualified "::" notation even while
643 a lexical of the same name is also visible:
644
645 package main;
646 local $x = 10;
647 my $x = 20;
648 print "$x and $::x\n";
649
650 That will print out 20 and 10.
651
652 You may declare "my" variables at the outermost scope of a file to hide
653 any such identifiers from the world outside that file. This is similar
654 in spirit to C's static variables when they are used at the file level.
655 To do this with a subroutine requires the use of a closure (an
656 anonymous function that accesses enclosing lexicals). If you want to
657 create a private subroutine that cannot be called from outside that
658 block, you can declare a lexical variable containing an anonymous sub
659 reference:
660
661 my $secret_version = '1.001-beta';
662 my $secret_sub = sub { print $secret_version };
663 &$secret_sub();
664
665 As long as the reference is never returned by any function within the
666 module, no outside module can see the subroutine, because its name is
667 not in any package's symbol table. Remember that it's not REALLY
668 called $some_pack::secret_version or anything; it's just
669 $secret_version, unqualified and unqualifiable.
670
671 This does not work with object methods, however; all object methods
672 have to be in the symbol table of some package to be found. See
673 "Function Templates" in perlref for something of a work-around to this.
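
A sketch of the private-subroutine idiom inside a package. The package
name Counter and its subs are hypothetical.

```perl
package Counter;   # hypothetical module
use strict;
use warnings;

# Private: held only in a lexical, never entered in the symbol table.
my $secret_sub = sub { my ($n) = @_; return $n * 2 };

# Public wrapper that uses the private sub.
sub public_double { return $secret_sub->($_[0]) }

package main;

my $value   = Counter::public_double(21);               # 42
my $visible = defined(&Counter::secret_sub) ? 1 : 0;    # 0
```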
674
675 Persistent Private Variables
676 There are two ways to build persistent private variables in Perl 5.10 and later.
677 First, you can simply use the "state" feature. Or, you can use
678 closures, if you want to stay compatible with releases older than 5.10.
679
680 Persistent variables via state()
681
682 Beginning with Perl 5.10.0, you can declare variables with the "state"
683 keyword in place of "my". For that to work, though, you must have
684 enabled that feature beforehand, either by using the "feature" pragma,
685 or by using "-E" on one-liners (see feature). Beginning with Perl
686 5.16, the "CORE::state" form does not require the "feature" pragma.
687
688 The "state" keyword creates a lexical variable (following the same
689 scoping rules as "my") that persists from one subroutine call to the
690 next. If a state variable resides inside an anonymous subroutine, then
691 each copy of the subroutine has its own copy of the state variable.
692 However, the value of the state variable will still persist between
693 calls to the same copy of the anonymous subroutine. (Don't forget that
694 "sub { ... }" creates a new subroutine each time it is executed.)
695
696 For example, the following code maintains a private counter,
697 incremented each time the gimme_another() function is called:
698
699 use feature 'state';
700 sub gimme_another { state $x; return ++$x }
701
702 And this example uses anonymous subroutines to create separate
703 counters:
704
705 use feature 'state';
706 sub create_counter {
707 return sub { state $x; return ++$x }
708 }
709
710 Also, since $x is lexical, it can't be reached or modified by any Perl
711 code outside.
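
A usage sketch for separate counters. This variant of create_counter()
takes a hypothetical $step argument, which is not in the example above,
so that each returned sub is visibly a distinct closure.

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical variant: each generated counter advances by its own $step.
sub create_counter {
    my ($step) = @_;
    return sub { state $x = 0; return $x += $step };
}

my $c1 = create_counter(1);
my $c2 = create_counter(10);

# Each counter keeps its own state between calls.
my @got = ($c1->(), $c1->(), $c2->(), $c1->());   # 1, 2, 10, 3
```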
712
713 When combined with variable declaration, simple assignment to "state"
714 variables (as in "state $x = 42") is executed only the first time.
715 When such statements are evaluated subsequent times, the assignment is
716 ignored. The behavior of assignment to "state" declarations where the
717 left hand side of the assignment involves any parentheses is currently
718 undefined.
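
A minimal sketch of one-time initialization, using a hypothetical sub
next_id():

```perl
use strict;
use warnings;
use feature 'state';

# The "= 100" runs only on the first call; later calls skip it.
sub next_id { state $id = 100; return $id++ }

my @ids = (next_id(), next_id(), next_id());   # 100, 101, 102
```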
719
720 Persistent variables with closures
721
722 Just because a lexical variable is lexically (also called statically)
723 scoped to its enclosing block, "eval", or "do" FILE, this doesn't mean
724 that within a function it works like a C static. It normally works
725 more like a C auto, but with implicit garbage collection.
726
727 Unlike local variables in C or C++, Perl's lexical variables don't
728 necessarily get recycled just because their scope has exited. If
729 something more permanent is still aware of the lexical, it will stick
730 around. So long as something else references a lexical, that lexical
731 won't be freed--which is as it should be. You wouldn't want memory
732 being freed until you were done using it, or kept around once you were
733 done. Automatic garbage collection takes care of this for you.
734
735 This means that you can pass back or save away references to lexical
736 variables, whereas to return a pointer to a C auto is a grave error.
737 It also gives us a way to simulate C's function statics. Here's a
738 mechanism for giving a function private variables with both lexical
739 scoping and a static lifetime. If you do want to create something like
740 C's static variables, just enclose the whole function in an extra
741 block, and put the static variable outside the function but in the
742 block.
743
744 {
745 my $secret_val = 0;
746 sub gimme_another {
747 return ++$secret_val;
748 }
749 }
750 # $secret_val now becomes unreachable by the outside
751 # world, but retains its value between calls to gimme_another
752
753 If this function is being sourced in from a separate file via "require"
754 or "use", then this is probably just fine. If it's all in the main
755 program, you'll need to arrange for the "my" to be executed early,
756 either by putting the whole block above your main program, or more
757 likely, placing merely a "BEGIN" code block around it to make sure it
758 gets executed before your program starts to run:
759
760 BEGIN {
761 my $secret_val = 0;
762 sub gimme_another {
763 return ++$secret_val;
764 }
765 }
766
767 See "BEGIN, UNITCHECK, CHECK, INIT and END" in perlmod about the
768 special triggered code blocks, "BEGIN", "UNITCHECK", "CHECK", "INIT"
769 and "END".
770
771 If declared at the outermost scope (the file scope), then lexicals work
772 somewhat like C's file statics. They are available to all functions in
773 that same file declared below them, but are inaccessible from outside
774 that file. This strategy is sometimes used in modules to create
775 private variables that the whole module can see.
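
A sketch of that strategy, with a hypothetical %registry shared by two
functions declared below it in the same file:

```perl
use strict;
use warnings;

my %registry;    # file-scoped lexical: private to this file

sub register { my ($name, $value) = @_; $registry{$name} = $value; return }
sub lookup   { my ($name) = @_; return $registry{$name} }

register(answer => 42);
my $found   = lookup('answer');    # 42
my $absent  = lookup('missing');   # undef
```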
776
777 Temporary Values via local()
778 WARNING: In general, you should be using "my" instead of "local",
779 because it's faster and safer. Exceptions to this include the global
780 punctuation variables, global filehandles and formats, and direct
781 manipulation of the Perl symbol table itself. "local" is mostly used
782 when the current value of a variable must be visible to called
783 subroutines.
784
785 Synopsis:
786
787 # localization of values
788
789 local $foo; # make $foo dynamically local
790 local (@wid, %get); # make list of variables local
791 local $foo = "flurp"; # make $foo dynamic, and init it
792 local @oof = @bar; # make @oof dynamic, and init it
793
794 local $hash{key} = "val"; # sets a local value for this hash entry
795 delete local $hash{key}; # delete this entry for the current block
796 local ($cond ? $v1 : $v2); # several types of lvalues support
797 # localization
798
799 # localization of symbols
800
801 local *FH; # localize $FH, @FH, %FH, &FH ...
802 local *merlyn = *randal; # now $merlyn is really $randal, plus
803 # @merlyn is really @randal, etc
804 local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal
805 local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc
806
807 A "local" modifies its listed variables to be "local" to the enclosing
808 block, "eval", or "do FILE"--and to any subroutine called from within
809 that block. A "local" just gives temporary values to global (meaning
810 package) variables. It does not create a local variable. This is
811 known as dynamic scoping. Lexical scoping is done with "my", which
812 works more like C's auto declarations.
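
The difference between the two kinds of scoping can be sketched as
follows. The variable and function names are hypothetical; the point
is only that a subroutine called from the block sees a "local" value
but never a "my" variable of its caller:

```perl
use strict;
use warnings;

our $level = "global";               # a package variable

sub show_level { return $level }     # sees whatever value is current

sub with_local {
    local $level = "temporary";      # saved now, restored on exit
    return show_level();
}

sub with_my {
    my $level = "lexical";           # a new variable, invisible to show_level
    return show_level();
}

my @results = (with_local(), with_my(), show_level());
print "@results\n";
```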
813
814 Some types of lvalues can be localized as well: hash and array elements
815 and slices, conditionals (provided that their result is always
816 localizable), and symbolic references. As for simple variables, this
817 creates new, dynamically scoped values.
818
819 If more than one variable or expression is given to "local", they must
820 be placed in parentheses. This operator works by saving the current
821 values of those variables in its argument list on a hidden stack and
822 restoring them upon exiting the block, subroutine, or eval. This means
823 that called subroutines also see the localized value rather than the
824 saved global one. The argument list may be assigned to if desired, which
825 allows you to initialize your local variables. (If no initializer is
826 given for a particular variable, it is created with an undefined
827 value.)
828
829 Because "local" is a run-time operator, it gets executed each time
830 through a loop. Consequently, it's more efficient to localize your
831 variables outside the loop.
832
833 Grammatical note on local()
834
835 A "local" is simply a modifier on an lvalue expression. When you
836 assign to a "local"ized variable, the "local" doesn't change whether
837 its list is viewed as a scalar or an array. So
838
839 local($foo) = <STDIN>;
840 local @FOO = <STDIN>;
841
842 both supply a list context to the right-hand side, while
843
844 local $foo = <STDIN>;
845
846 supplies a scalar context.
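
A small sketch makes the contexts visible without reading from STDIN.
It assumes a hypothetical helper, "context", that reports how it was
called by way of "wantarray":

```perl
use strict;
use warnings;

our ($foo, @FOO);

# Hypothetical helper: reports the context it was called in.
sub context { return wantarray ? "list" : "scalar" }

my @seen;
{
    local ($foo) = context();
    push @seen, $foo;            # parenthesized: list context
    local @FOO = context();
    push @seen, $FOO[0];         # array: list context
}
{
    local $foo = context();
    push @seen, $foo;            # bare scalar: scalar context
}
print "@seen\n";
```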
847
848 Localization of special variables
849
850 If you localize a special variable, you'll be giving a new value to it,
851 but its magic won't go away. That means that all side-effects related
852 to this magic still work with the localized value.
853
854 This feature allows code like this to work:
855
856 # Read the whole contents of FILE in $slurp
857 { local $/ = undef; $slurp = <FILE>; }
858
859 Note, however, that this restricts localization of some values; for
860 example, the following statement dies, as of perl 5.10.0, with an error
861 "Modification of a read-only value attempted", because the $1 variable
862 is magical and read-only:
863
864 local $1 = 2;
865
866 One exception is the default scalar variable: starting with perl 5.14
867 "local($_)" will always strip all magic from $_, to make it possible to
868 safely reuse $_ in a subroutine.
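
The slurp idiom shown earlier in this section can be tried without a
real file. The sketch below assumes an in-memory filehandle opened on
a scalar, and checks that $/ has its previous value again once the
block is left:

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";
open(my $fh, '<', \$text) or die "cannot open in-memory file: $!";

my $slurp;
{
    local $/ = undef;      # no record separator inside this block
    $slurp = <$fh>;        # reads everything that is left
}
close $fh;

# Outside the block, $/ is back to its previous value.
my $restored = defined $/ && $/ eq "\n";
print "slurped ", length($slurp), " characters\n";
```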
869
870 WARNING: Localization of tied arrays and hashes does not currently work
871 as described. This will be fixed in a future release of Perl; in the
872 meantime, avoid code that relies on any particular behavior of
873 localising tied arrays or hashes (localising individual elements is
874 still okay). See "Localising Tied Arrays and Hashes Is Broken" in
875 perl58delta for more details.
876
877 Localization of globs
878
879 The construct
880
881 local *name;
882
883 creates a whole new symbol table entry for the glob "name" in the
884 current package. That means that all variables in its glob slot
885 ($name, @name, %name, &name, and the "name" filehandle) are dynamically
886 reset.
887
888 This implies, among other things, that any magic those variables might
889 carry is locally lost. In other words, saying "local */"
890 will not have any effect on the internal value of the input record
891 separator.
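
A minimal sketch of the resetting effect, using hypothetical package
variables that share the name "name":

```perl
use strict;
use warnings;

our $name = "scalar value";
our @name = (1, 2, 3);

my ($inside_scalar, $inside_count);
{
    local *name;                       # fresh, empty glob for this block
    $inside_scalar = defined $name ? "defined" : "undefined";
    $inside_count  = scalar @name;
}

# Both slots have their old contents again here.
my $after = "$name / " . scalar(@name);
print "$inside_scalar, $inside_count, $after\n";
```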
892
893 Localization of elements of composite types
894
895 It's also worth taking a moment to explain what happens when you
896 "local"ize a member of a composite type (i.e. an array or hash
897 element). In this case, the element is "local"ized by name. This
898 means that when the scope of the "local()" ends, the saved value will
899 be restored to the hash element whose key was named in the "local()",
900 or the array element whose index was named in the "local()". If that
901 element was deleted while the "local()" was in effect (e.g. by a
902 "delete()" from a hash or a "shift()" of an array), it will spring back
903 into existence, possibly extending an array and filling in the skipped
904 elements with "undef". For instance, if you say
905
906 %hash = ( 'This' => 'is', 'a' => 'test' );
907 @ary = ( 0..5 );
908 {
909 local($ary[5]) = 6;
910 local($hash{'a'}) = 'drill';
911 while (my $e = pop(@ary)) {
912 print "$e . . .\n";
913 last unless $e > 3;
914 }
915 if (@ary) {
916 $hash{'only a'} = 'test';
917 delete $hash{'a'};
918 }
919 }
920 print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
921 print "The array has ",scalar(@ary)," elements: ",
922 join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";
923
924 Perl will print
925
926 6 . . .
927 4 . . .
928 3 . . .
929 This is a test only a test.
930 The array has 6 elements: 0, 1, 2, undef, undef, 5
931
932 The behavior of local() on non-existent members of composite types is
933 subject to change in the future. The behavior of local() on array elements
934 specified using negative indexes is particularly surprising, and is
935 very likely to change.
936
937 Localized deletion of elements of composite types
938
939 You can use the "delete local $array[$idx]" and "delete local
940 $hash{key}" constructs to delete a composite type entry for the current
941 block and restore it when it ends. They return the array/hash value
942 before the localization, which means that they are respectively
943 equivalent to
944
945 do {
946 my $val = $array[$idx];
947 local $array[$idx];
948 delete $array[$idx];
949 $val
950 }
951
952 and
953
954 do {
955 my $val = $hash{key};
956 local $hash{key};
957 delete $hash{key};
958 $val
959 }
960
961 except that for those the "local" is scoped to the "do" block. Slices
962 are also accepted.
963
964 my %hash = (
965     a => [ 7, 8, 9 ],
966     b => 1,
967 );
968
969 {
970     my $a = delete local $hash{a};
971     # $a is [ 7, 8, 9 ]
972     # %hash is (b => 1)
973
974     {
975         my @nums = delete local @$a[0, 2];
976         # @nums is (7, 9)
977         # $a is [ undef, 8 ]
978
979         $a->[0] = 999; # will be erased when the scope ends
980     }
981     # $a is back to [ 7, 8, 9 ]
982
983 }
984 # %hash is back to its original state
985
986 Lvalue subroutines
987 It is possible to return a modifiable value from a subroutine. To do
988 this, you have to declare the subroutine to return an lvalue.
989
990 my $val;
991 sub canmod : lvalue {
992 $val; # or: return $val;
993 }
994 sub nomod {
995 $val;
996 }
997
998 canmod() = 5; # assigns to $val
999 nomod() = 5; # ERROR
1000
1001 The scalar/list context for the subroutine and for the right-hand side
1002 of assignment is determined as if the subroutine call is replaced by a
1003 scalar. For example, consider:
1004
1005 data(2,3) = get_data(3,4);
1006
1007 Both subroutines here are called in a scalar context, while in:
1008
1009 (data(2,3)) = get_data(3,4);
1010
1011 and in:
1012
1013 (data(2),data(3)) = get_data(3,4);
1014
1015 all the subroutines are called in a list context.
1016
1017 Lvalue subroutines are convenient, but you have to keep in mind that,
1018 when used with objects, they may violate encapsulation. A normal
1019 mutator can check the supplied argument before setting the attribute it
1020 is protecting, an lvalue subroutine cannot. If you require any special
1021 processing when storing and retrieving the values, consider using the
1022 CPAN module Sentinel or something similar.
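
The encapsulation concern can be sketched with a hypothetical
"Counter" class that offers both an lvalue accessor and an ordinary
mutator. The class and its rule (no negative counts) are assumptions
made up for this illustration:

```perl
use strict;
use warnings;

package Counter;    # hypothetical class, for illustration only

sub new { my $class = shift; return bless { count => 0 }, $class }

# Lvalue accessor: callers assign straight into the attribute.
sub count : lvalue { $_[0]{count} }

# Ordinary mutator: can validate before storing.
sub set_count {
    my ($self, $n) = @_;
    die "count must not be negative\n" if $n < 0;
    $self->{count} = $n;
    return $self;
}

package main;

my $c = Counter->new;
$c->count = -5;                                   # accepted without any check
my $stored   = $c->count;
my $rejected = !eval { $c->set_count(-1); 1 };    # the mutator refuses
print "stored $stored, mutator rejected: ", ($rejected ? "yes" : "no"), "\n";
```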
1023
1024 Lexical Subroutines
1025 Beginning with Perl 5.18, you can declare a private subroutine with
1026 "my" or "state". As with state variables, the "state" keyword is only
1027 available under "use feature 'state'" or "use 5.010" or higher.
1028
1029 Prior to Perl 5.26, lexical subroutines were deemed experimental and
1030 were available only under the "use feature 'lexical_subs'" pragma.
1031 They also produced a warning unless the "experimental::lexical_subs"
1032 warnings category was disabled.
1033
1034 These subroutines are only visible within the block in which they are
1035 declared, and only after that declaration:
1036
1037 # Include these two lines if your code is intended to run under Perl
1038 # versions earlier than 5.26.
1039 no warnings "experimental::lexical_subs";
1040 use feature 'lexical_subs';
1041
1042 foo(); # calls the package/global subroutine
1043 state sub foo {
1044 foo(); # also calls the package subroutine
1045 }
1046 foo(); # calls "state" sub
1047 my $ref = \&foo; # take a reference to "state" sub
1048
1049 my sub bar { ... }
1050 bar(); # calls "my" sub
1051
1052 You can't (directly) write a recursive lexical subroutine:
1053
1054 # WRONG
1055 my sub baz {
1056 baz();
1057 }
1058
1059 This example fails because "baz()" refers to the package/global
1060 subroutine "baz", not the lexical subroutine currently being defined.
1061
1062 The solution is to use "__SUB__":
1063
1064 my sub baz {
1065 __SUB__->(); # calls itself
1066 }
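
As a concrete sketch, here is a recursive factorial written this way.
It assumes Perl 5.26 or later, where "use v5.26" makes lexical
subroutines available without a warning and also enables the
"current_sub" feature that "__SUB__" requires; the name "factorial"
is just an example:

```perl
use v5.26;      # strict, lexical subs, and the current_sub feature
use warnings;

my sub factorial {
    my $n = shift;
    return 1 if $n <= 1;
    return $n * __SUB__->($n - 1);    # recurse without naming the sub
}

my $result = factorial(5);
say "5! = $result";
```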
1067
1068 It is possible to predeclare a lexical subroutine. The "sub foo {...}"
1069 subroutine definition syntax respects any previous "my sub;" or "state
1070 sub;" declaration. Using this to define recursive subroutines is a bad
1071 idea, however:
1072
1073 my sub baz; # predeclaration
1074 sub baz { # define the "my" sub
1075 baz(); # WRONG: calls itself, but leaks memory
1076 }
1077
1078 Just like "my $f; $f = sub { $f->() }", this example leaks memory. The
1079 name "baz" is a reference to the subroutine, and the subroutine uses
1080 the name "baz"; they keep each other alive (see "Circular References"
1081 in perlref).
1082
1083 "state sub" vs "my sub"
1084
1085 What is the difference between "state" subs and "my" subs? Each time
1086 that execution enters a block in which "my" subs are declared, a new copy
1087 of each sub is created. "State" subroutines persist from one execution
1088 of the containing block to the next.
1089
1090 So, in general, "state" subroutines are faster. But "my" subs are
1091 necessary if you want to create closures:
1092
1093 sub whatever {
1094 my $x = shift;
1095 my sub inner {
1096 ... do something with $x ...
1097 }
1098 inner();
1099 }
1100
1101 In this example, a new $x is created when "whatever" is called, and
1102 also a new "inner", which can see the new $x. A "state" sub will only
1103 see the $x from the first call to "whatever".
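
A runnable variant of the "my" case, assuming Perl 5.26 or later so
that no feature or warning setup is needed, might look like this:

```perl
use v5.26;
use warnings;

sub whatever {
    my $x = shift;
    my sub inner { return "inner sees $x" }   # recreated on every call
    return inner();
}

my @seen = (whatever(1), whatever(2));
say for @seen;
```

Each call to "whatever" reports its own argument, because a fresh
"inner" closes over the fresh $x.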
1104
1105 "our" subroutines
1106
1107 Like "our $variable", "our sub" creates a lexical alias to the package
1108 subroutine of the same name.
1109
1110 The two main uses for this are to switch back to using the package sub
1111 inside an inner scope:
1112
1113 sub foo { ... }
1114
1115 sub bar {
1116 my sub foo { ... }
1117 {
1118 # need to use the outer foo here
1119 our sub foo;
1120 foo();
1121 }
1122 }
1123
1124 and to make a subroutine visible to other packages in the same scope:
1125
1126 package MySneakyModule;
1127
1128 our sub do_something { ... }
1129
1130 sub do_something_with_caller {
1131 package DB;
1132 () = caller 1; # sets @DB::args
1133 do_something(@args); # uses MySneakyModule::do_something
1134 }
1135
1136 Passing Symbol Table Entries (typeglobs)
1137 WARNING: The mechanism described in this section was originally the
1138 only way to simulate pass-by-reference in older versions of Perl.
1139 While it still works fine in modern versions, the new reference
1140 mechanism is generally easier to work with. See below.
1141
1142 Sometimes you don't want to pass the value of an array to a subroutine
1143 but rather the name of it, so that the subroutine can modify the global
1144 copy of it rather than working with a local copy. In Perl you can
1145 refer to all objects of a particular name by prefixing the name with a
1146 star: *foo. This is often known as a "typeglob", because the star on
1147 the front can be thought of as a wildcard match for all the funny
1148 prefix characters on variables and subroutines and such.
1149
1150 When evaluated, the typeglob produces a scalar value that represents
1151 all the objects of that name, including any filehandle, format, or
1152 subroutine. When assigned to, it causes the name mentioned to refer to
1153 whatever "*" value was assigned to it. Example:
1154
1155 sub doubleary {
1156 local(*someary) = @_;
1157 foreach $elem (@someary) {
1158 $elem *= 2;
1159 }
1160 }
1161 doubleary(*foo);
1162 doubleary(*bar);
1163
1164 Scalars are already passed by reference, so you can modify scalar
1165 arguments without using this mechanism by referring explicitly to $_[0]
1166 etc. You can modify all the elements of an array by passing all the
1167 elements as scalars, but you have to use the "*" mechanism (or the
1168 equivalent reference mechanism) to "push", "pop", or change the size of
1169 an array. It will certainly be faster to pass the typeglob (or
1170 reference).
1171
1172 Even if you don't want to modify an array, this mechanism is useful for
1173 passing multiple arrays in a single LIST, because normally the LIST
1174 mechanism will merge all the array values so that you can't extract out
1175 the individual arrays. For more on typeglobs, see "Typeglobs and
1176 Filehandles" in perldata.
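
For comparison, the same operation with the reference mechanism can
be sketched as follows. The function names are hypothetical, and the
second function only shows that several arrays stay distinct when
passed as references:

```perl
use strict;
use warnings;

sub double_array {
    my ($aref) = @_;
    $_ *= 2 for @$aref;          # modifies the caller's array in place
    return;
}

sub sizes {
    my @arefs = @_;              # each argument is a separate array
    return map { scalar @$_ } @arefs;
}

my @foo = (1, 2, 3);
my @bar = (10, 20);
double_array(\@foo);
my @sizes = sizes(\@foo, \@bar);
print "@foo / @sizes\n";
```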
1177
1178 When to Still Use local()
1179 Despite the existence of "my", there are still three places where the
1180 "local" operator shines. In fact, in these three places, you
1181 must use "local" instead of "my".
1182
1183 1. You need to give a global variable a temporary value, especially
1184 $_.
1185
1186 The global variables, like @ARGV or the punctuation variables, must
1187 be "local"ized with "local()". This block reads in /etc/motd, and
1188 splits it up into chunks separated by lines of equal signs, which
1189 are placed in @Fields.
1190
1191 {
1192 local @ARGV = ("/etc/motd");
1193 local $/ = undef;
1194 local $_ = <>;
1195 @Fields = split /^\s*=+\s*$/;
1196 }
1197
1198 In particular, it's important to "local"ize $_ in any routine that
1199 assigns to it. Look out for implicit assignments in "while"
1200 conditionals.
1201
1202 2. You need to create a local file or directory handle or a local
1203 function.
1204
1205 A function that needs a filehandle of its own must use "local()" on
1206 a complete typeglob. This can be used to create new symbol table
1207 entries:
1208
1209 sub ioqueue {
1210 local (*READER, *WRITER); # not my!
1211 pipe (READER, WRITER) or die "pipe: $!";
1212 return (*READER, *WRITER);
1213 }
1214 ($head, $tail) = ioqueue();
1215
1216 See the Symbol module for a way to create anonymous symbol table
1217 entries.
1218
1219 Because assignment of a reference to a typeglob creates an alias,
1220 this can be used to create what is effectively a local function, or
1221 at least, a local alias.
1222
1223 {
1224 local *grow = \&shrink; # only until this block exits
1225 grow(); # really calls shrink()
1226 move(); # if move() grow()s, it shrink()s too
1227 }
1228 grow(); # get the real grow() again
1229
1230 See "Function Templates" in perlref for more about manipulating
1231 functions by name in this way.
1232
1233 3. You want to temporarily change just one element of an array or
1234 hash.
1235
1236 You can "local"ize just one element of an aggregate. Usually this
1237 is done on dynamics:
1238
1239 {
1240 local $SIG{INT} = 'IGNORE';
1241 funct(); # uninterruptible
1242 }
1243 # interruptibility automatically restored here
1244
1245 But it also works on lexically declared aggregates.
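
A short sketch of the lexical case, using a hypothetical %config hash
and "mode" function:

```perl
use strict;
use warnings;

my %config = (verbose => 0);      # a lexical hash

sub mode { return $config{verbose} ? "loud" : "quiet" }

my @modes;
{
    local $config{verbose} = 1;   # temporary value for one element
    push @modes, mode();
}
push @modes, mode();              # the original value is back
print "@modes\n";
```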
1246
1247 Pass by Reference
1248 If you want to pass more than one array or hash into a function--or
1249 return them from it--and have them maintain their integrity, then
1250 you're going to have to use an explicit pass-by-reference. Before you
1251 do that, you need to understand references as detailed in perlref.
1252 This section may not make much sense to you otherwise.
1253
1254 Here are a few simple examples. First, let's pass in several arrays to
1255 a function and have it "pop" all of them, returning a new list of all
1256 their former last elements:
1257
1258 @tailings = popmany ( \@a, \@b, \@c, \@d );
1259
1260 sub popmany {
1261 my $aref;
1262 my @retlist;
1263 foreach $aref ( @_ ) {
1264 push @retlist, pop @$aref;
1265 }
1266 return @retlist;
1267 }
1268
1269 Here's how you might write a function that returns a list of keys
1270 occurring in all the hashes passed to it:
1271
1272 @common = inter( \%foo, \%bar, \%joe );
1273 sub inter {
1274 my ($k, $href, %seen); # locals
1275 foreach $href (@_) {
1276 while ( $k = each %$href ) {
1277 $seen{$k}++;
1278 }
1279 }
1280 return grep { $seen{$_} == @_ } keys %seen;
1281 }
1282
1283 So far, we're using just the normal list return mechanism. What
1284 happens if you want to pass or return a hash? Well, if you're using
1285 only one of them, or you don't mind them concatenating, then the normal
1286 calling convention is ok, although a little expensive.
1287
1288 Where people get into trouble is here:
1289
1290 (@a, @b) = func(@c, @d);
1291 or
1292 (%a, %b) = func(%c, %d);
1293
1294 That syntax simply won't work. It sets just @a or %a and clears the @b
1295 or %b. Plus the function wasn't passed two separate arrays or
1296 hashes: it got one long list in @_, as always.
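
The flattening is easy to observe with a hypothetical function that
simply returns its arguments:

```perl
use strict;
use warnings;

sub passthru { return @_ }       # hypothetical: hands back its arguments

my @c = (1, 2);
my @d = (3, 4, 5);

my (@first, @second);
(@first, @second) = passthru(@c, @d);

# The first array swallows everything; the second is left empty.
my $report = scalar(@first) . " and " . scalar(@second);
print "$report\n";
```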
1297
1298 If you can arrange for everyone to deal with this through references,
1299 it's cleaner code, although not so nice to look at. Here's a function
1300 that takes two array references as arguments, returning the two
1301 references ordered by how many elements their arrays hold, larger first:
1302
1303 ($aref, $bref) = func(\@c, \@d);
1304 print "@$aref has more than @$bref\n";
1305 sub func {
1306 my ($cref, $dref) = @_;
1307 if (@$cref > @$dref) {
1308 return ($cref, $dref);
1309 } else {
1310 return ($dref, $cref);
1311 }
1312 }
1313
1314 It turns out that you can actually do this also:
1315
1316 (*a, *b) = func(\@c, \@d);
1317 print "@a has more than @b\n";
1318 sub func {
1319 local (*c, *d) = @_;
1320 if (@c > @d) {
1321 return (\@c, \@d);
1322 } else {
1323 return (\@d, \@c);
1324 }
1325 }
1326
1327 Here we're using the typeglobs to do symbol table aliasing. It's a tad
1328 subtle, though, and also won't work if you're using "my" variables,
1329 because only globals (even in disguise as "local"s) are in the symbol
1330 table.
1331
1332 If you're passing around filehandles, you could usually just use the
1333 bare typeglob, like *STDOUT, but typeglob references work, too. For
1334 example:
1335
1336 splutter(\*STDOUT);
1337 sub splutter {
1338 my $fh = shift;
1339 print $fh "her um well a hmmm\n";
1340 }
1341
1342 $rec = get_rec(\*STDIN);
1343 sub get_rec {
1344 my $fh = shift;
1345 return scalar <$fh>;
1346 }
1347
1348 If you're planning on generating new filehandles, you could do this.
1349 Note that it passes back just the bare *FH, not its reference.
1350
1351 sub openit {
1352 my $path = shift;
1353 local *FH;
1354 return open (FH, $path) ? *FH : undef;
1355 }
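
Where lexical filehandles are available ("open" autovivifies a handle
into an undefined scalar), a variant along the following lines avoids
the typeglob altogether. This is only a sketch: the in-memory file
stands in for a real path, and the three-argument "open" mode is an
assumption about how the caller wants the file opened:

```perl
use strict;
use warnings;

sub openit {
    my $path = shift;
    open(my $fh, '<', $path) or return;   # nothing is returned on failure
    return $fh;
}

sub get_rec {
    my $fh = shift;
    return scalar <$fh>;
}

# An in-memory file stands in for a real path here.
my $data = "first record\nsecond record\n";
my $fh   = openit(\$data) or die "open failed: $!";
my $rec  = get_rec($fh);
close $fh;
print $rec;
```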
1356
1357 Prototypes
1358 Perl supports a very limited kind of compile-time argument checking
1359 using function prototyping. This can be declared in either the PROTO
1360 section or with a prototype attribute. If you declare either of
1361
1362 sub mypush (\@@)
1363 sub mypush :prototype(\@@)
1364
1365 then "mypush()" takes arguments exactly like "push()" does.
1366
1367 If subroutine signatures are enabled (see "Signatures"), then the
1368 shorter PROTO syntax is unavailable, because it would clash with
1369 signatures. In that case, a prototype can only be declared in the form
1370 of an attribute.
1371
1372 The function declaration must be visible at compile time. The
1373 prototype affects only interpretation of new-style calls to the
1374 function, where new-style is defined as not using the "&" character.
1375 In other words, if you call it like a built-in function, then it
1376 behaves like a built-in function. If you call it like an old-fashioned
1377 subroutine, then it behaves like an old-fashioned subroutine. It
1378 naturally falls out from this rule that prototypes have no influence on
1379 subroutine references like "\&foo" or on indirect subroutine calls like
1380 "&{$subref}" or "$subref->()".
1381
1382 Method calls are not influenced by prototypes either, because the
1383 function to be called is indeterminate at compile time, since the exact
1384 code called depends on inheritance.
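
The effect of the calling style can be sketched with a hypothetical
one-argument function:

```perl
use strict;
use warnings;

sub first_arg ($) { return $_[0] }

my @list = (10, 20, 30);

my $new_style = first_arg(@list);         # prototype imposes scalar(@list)
my $old_style = &first_arg(@list);        # prototype ignored, @list flattened
my $via_ref   = (\&first_arg)->(@list);   # indirect call, prototype ignored

print "$new_style $old_style $via_ref\n";
```

Only the first call is affected by the prototype, so it receives the
element count rather than the first element.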
1385
1386 Because the intent of this feature is primarily to let you define
1387 subroutines that work like built-in functions, here are prototypes for
1388 some other functions that parse almost exactly like the corresponding
1389 built-in.
1390
1391 Declared as               Called as
1392
1393 sub mylink ($$)           mylink $old, $new
1394 sub myvec ($$$)           myvec $var, $offset, 1
1395 sub myindex ($$;$)        myindex &getstring, "substr"
1396 sub mysyswrite ($$$;$)    mysyswrite $buf, 0, length($buf) - $off, $off
1397 sub myreverse (@)         myreverse $a, $b, $c
1398 sub myjoin ($@)           myjoin ":", $a, $b, $c
1399 sub mypop (\@)            mypop @array
1400 sub mysplice (\@$$@)      mysplice @array, 0, 2, @pushme
1401 sub mykeys (\[%@])        mykeys %{$hashref}
1402 sub myopen (*;$)          myopen HANDLE, $name
1403 sub mypipe (**)           mypipe READHANDLE, WRITEHANDLE
1404 sub mygrep (&@)           mygrep { /foo/ } $a, $b, $c
1405 sub myrand (;$)           myrand 42
1406 sub mytime ()             mytime
1407
1408 Any backslashed prototype character represents an actual argument that
1409 must start with that character (optionally preceded by "my", "our" or
1410 "local"), with the exception of "$", which will accept any scalar
1411 lvalue expression, such as "$foo = 7" or "my_function()->[0]". The
1412 value passed as part of @_ will be a reference to the actual argument
1413 given in the subroutine call, obtained by applying "\" to that
1414 argument.
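
A minimal sketch of a complete "mypush" shows the reference arriving
in @_. The body is an assumption made for illustration, not the
implementation of the built-in:

```perl
use strict;
use warnings;

sub mypush (\@@) {
    my ($aref, @new) = @_;    # $aref is \@stack, supplied by the prototype
    push @$aref, @new;
    return scalar @$aref;
}

my @stack = (1);
my $size  = mypush @stack, 2, 3;
print "size $size: @stack\n";
```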
1415
1416 You can use the "\[]" backslash group notation to specify more than one
1417 allowed argument type. For example:
1418
1419 sub myref (\[$@%&*])
1420
1421 will allow calling myref() as
1422
1423 myref $var
1424 myref @array
1425 myref %hash
1426 myref &sub
1427 myref *glob
1428
1429 and the first argument of myref() will be a reference to a scalar, an
1430 array, a hash, a code, or a glob.
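
As a sketch, a "myref" that merely reports the kind of reference it
received could look like this. The variables and the "helper"
subroutine are hypothetical, and the calls are parenthesized to keep
the list unambiguous:

```perl
use strict;
use warnings;

sub myref (\[$@%&*]) { return ref $_[0] }

my $var = 1;
my @array;
my %hash;
sub helper { return }

my @kinds = (
    myref($var),
    myref(@array),
    myref(%hash),
    myref(&helper),      # takes a reference; helper is not called
    myref(*STDOUT),
);
print "@kinds\n";
```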
1431
1432 Unbackslashed prototype characters have special meanings. Any
1433 unbackslashed "@" or "%" eats all remaining arguments, and forces list
1434 context. An argument represented by "$" forces scalar context. An "&"
1435 requires an anonymous subroutine, which, if passed as the first
1436 argument, does not require the "sub" keyword or a subsequent comma.
1437
1438 A "*" allows the subroutine to accept a bareword, constant, scalar
1439 expression, typeglob, or a reference to a typeglob in that slot. The
1440 value will be available to the subroutine either as a simple scalar, or
1441 (in the latter two cases) as a reference to the typeglob. If you wish
1442 to always convert such arguments to a typeglob reference, use
1443 Symbol::qualify_to_ref() as follows:
1444
1445 use Symbol 'qualify_to_ref';
1446
1447 sub foo (*) {
1448 my $fh = qualify_to_ref(shift, caller);
1449 ...
1450 }
1451
1452 The "+" prototype is a special alternative to "$" that will act like
1453 "\[@%]" when given a literal array or hash variable, but will otherwise
1454 force scalar context on the argument. This is useful for functions
1455 which should accept either a literal array or an array reference as the
1456 argument:
1457
1458 sub mypush (+@) {
1459 my $aref = shift;
1460 die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
1461 push @$aref, @_;
1462 }
1463
1464 When using the "+" prototype, your function must check that the
1465 argument is of an acceptable type.
1466
1467 A semicolon (";") separates mandatory arguments from optional
1468 arguments. It is redundant before "@" or "%", which gobble up
1469 everything else.
1470
1471 As the last character of a prototype, or just before a semicolon, a "@"
1472 or a "%", you can use "_" in place of "$": if this argument is not
1473 provided, $_ will be used instead.
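
A small sketch of the "_" prototype, with a hypothetical "shout"
function:

```perl
use strict;
use warnings;

sub shout (_) { return uc $_[0] }

my @out;
for ("quiet", "words") {
    push @out, shout();          # no argument given: $_ is used
}
push @out, shout("explicit");    # an explicit argument still works
print "@out\n";
```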
1474
1475 Note how the last three examples in the table above are treated
1476 specially by the parser. "mygrep()" is parsed as a true list operator,
1477 "myrand()" is parsed as a true unary operator with unary precedence the
1478 same as "rand()", and "mytime()" is truly without arguments, just like
1479 "time()". That is, if you say
1480
1481 mytime +2;
1482
1483 you'll get "mytime() + 2", not mytime(2), which is how it would be
1484 parsed without a prototype. If you want to force a unary function to
1485 have the same precedence as a list operator, add ";" to the end of the
1486 prototype:
1487
1488 sub mygetprotobynumber($;);
1489 mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b)
1490
1491 The interesting thing about "&" is that you can generate new syntax
1492 with it, provided it's in the initial position:
1493
1494 sub try (&@) {
1495 my($try,$catch) = @_;
1496 eval { &$try };
1497 if ($@) {
1498 local $_ = $@;
1499 &$catch;
1500 }
1501 }
1502 sub catch (&) { $_[0] }
1503
1504 try {
1505 die "phooey";
1506 } catch {
1507 /phooey/ and print "unphooey\n";
1508 };
1509
1510 That prints "unphooey". (Yes, there are still unresolved issues having
1511 to do with visibility of @_. I'm ignoring that question for the
1512 moment. (But note that if we make @_ lexically scoped, those anonymous
1513 subroutines can act like closures... (Gee, is this sounding a little
1514 Lispish? (Never mind.))))
1515
1516 And here's a reimplementation of the Perl "grep" operator:
1517
1518 sub mygrep (&@) {
1519 my $code = shift;
1520 my @result;
1521 foreach $_ (@_) {
1522 push(@result, $_) if &$code;
1523 }
1524 @result;
1525 }
1526
1527 Some folks would prefer full alphanumeric prototypes. Alphanumerics
1528 have been intentionally left out of prototypes for the express purpose
1529 of someday in the future adding named, formal parameters. The current
1530 mechanism's main goal is to let module writers provide better
1531 diagnostics for module users. Larry feels the notation quite
1532 understandable to Perl programmers, and that it will not intrude
1533 greatly upon the meat of the module, nor make it harder to read. The
1534 line noise is visually encapsulated into a small pill that's easy to
1535 swallow.
1536
1537 If you try to use an alphanumeric sequence in a prototype you will
1538 generate an optional warning - "Illegal character in prototype...".
1539 Unfortunately earlier versions of Perl allowed the prototype to be used
1540 as long as its prefix was a valid prototype. The warning may be
1541 upgraded to a fatal error in a future version of Perl once the majority
1542 of offending code is fixed.
1543
1544 It's probably best to prototype new functions, not retrofit prototyping
1545 into older ones. That's because you must be especially careful about
1546 silent impositions of differing list versus scalar contexts. For
1547 example, if you decide that a function should take just one parameter,
1548 like this:
1549
1550 sub func ($) {
1551 my $n = shift;
1552 print "you gave me $n\n";
1553 }
1554
1555 and someone has been calling it with an array or expression returning a
1556 list:
1557
1558 func(@foo);
1559 func( $text =~ /\w+/g );
1560
1561 Then you've just supplied an automatic "scalar" in front of their
1562 argument, which can be more than a bit surprising. The old @foo which
1563 used to hold one thing doesn't get passed in. Instead, "func()" now
1564 gets passed in a 1; that is, the number of elements in @foo. And the
1565 "m//g" gets called in scalar context so instead of a list of words it
1566 returns a boolean result and advances "pos($text)". Ouch!
1567
1568 If a sub has both a PROTO and a BLOCK, the prototype is not applied
1569 until after the BLOCK is completely defined. This means that a
1570 recursive function with a prototype has to be predeclared for the
1571 prototype to take effect, like so:
1572
1573 sub foo($$);
1574 sub foo($$) {
1575 foo 1, 2;
1576 }
1577
1578 This is all very powerful, of course, and should be used only in
1579 moderation to make the world a better place.
1580
1581 Constant Functions
1582 Functions with a prototype of "()" are potential candidates for
1583 inlining. If the result after optimization and constant folding is
1584 either a constant or a lexically-scoped scalar which has no other
1585 references, then it will be used in place of function calls made
1586 without "&". Calls made using "&" are never inlined. (See constant.pm
1587 for an easy way to declare most constants.)
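
A brief sketch of the constant.pm route, with names chosen for
illustration:

```perl
use strict;
use warnings;

use constant PI => 4 * atan2(1, 1);
use constant {
    FLAG_FOO => 1 << 8,
    FLAG_BAR => 1 << 9,
};
use constant FLAG_MASK => FLAG_FOO | FLAG_BAR;

# Each constant is a subroutine with an empty prototype.
my $proto = prototype(\&FLAG_MASK);
printf "mask=%d proto='%s'\n", FLAG_MASK, $proto;
```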
1588
1589 The following functions would all be inlined:
1590
1591 sub pi () { 3.14159 } # Not exact, but close.
1592 sub PI () { 4 * atan2 1, 1 } # As good as it gets,
1593 # and it's inlined, too!
1594 sub ST_DEV () { 0 }
1595 sub ST_INO () { 1 }
1596
1597 sub FLAG_FOO () { 1 << 8 }
1598 sub FLAG_BAR () { 1 << 9 }
1599 sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }
1600
1601 sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) }
1602
1603 sub N () { int(OPT_BAZ) / 3 }
1604
1605 sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO }
1606 sub FOO_SET2 () { if (FLAG_MASK & FLAG_FOO) { 1 } }
1607
1608 (Be aware that the last example was not always inlined in Perl 5.20 and
1609 earlier, which did not behave consistently with subroutines containing
1610 inner scopes.) You can countermand inlining by using an explicit
1611 "return":
1612
1613 sub baz_val () {
1614 if (OPT_BAZ) {
1615 return 23;
1616 }
1617 else {
1618 return 42;
1619 }
1620 }
1621 sub bonk_val () { return 12345 }
1622
1623 As alluded to earlier you can also declare inlined subs dynamically at
1624 BEGIN time if their body consists of a lexically-scoped scalar which
1625 has no other references. Only the first example here will be inlined:
1626
1627 BEGIN {
1628 my $var = 1;
1629 no strict 'refs';
1630 *INLINED = sub () { $var };
1631 }
1632
1633 BEGIN {
1634 my $var = 1;
1635 my $ref = \$var;
1636 no strict 'refs';
1637 *NOT_INLINED = sub () { $var };
1638 }
1639
1640 A not so obvious caveat with this (see [RT #79908]) is that the
1641 variable will be immediately inlined, and will stop behaving like a
1642 normal lexical variable, e.g. this will print 79907, not 79908:
1643
1644 BEGIN {
1645 my $x = 79907;
1646 *RT_79908 = sub () { $x };
1647 $x++;
1648 }
1649 print RT_79908(); # prints 79907
1650
1651 As of Perl 5.22, this buggy behavior, while preserved for backward
1652 compatibility, is detected and emits a deprecation warning. If you
1653 want the subroutine to be inlined (with no warning), make sure the
1654 variable is not used in a context where it could be modified aside from
1655 where it is declared.
1656
1657 # Fine, no warning
1658 BEGIN {
1659 my $x = 54321;
1660 *INLINED = sub () { $x };
1661 }
1662 # Warns. Future Perl versions will stop inlining it.
1663 BEGIN {
1664 my $x;
1665 $x = 54321;
1666 *ALSO_INLINED = sub () { $x };
1667 }
1668
1669 Perl 5.22 also introduces the experimental "const" attribute as an
1670 alternative. (Disable the "experimental::const_attr" warnings if you
1671 want to use it.) When applied to an anonymous subroutine, it forces
1672 the sub to be called when the "sub" expression is evaluated. The
1673 return value is captured and turned into a constant subroutine:
1674
1675 my $x = 54321;
1676 *INLINED = sub : const { $x };
1677 $x++;
1678
1679 The return value of "INLINED" in this example will always be 54321,
1680 regardless of later modifications to $x. You can also put any
1681 arbitrary code inside the sub, as it will be executed immediately and
1682 its return value captured the same way.
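As a minimal sketch of that last point (assuming Perl 5.22 or later; the names $calls and $stamp are made up for illustration), the body below has a side effect, which makes it easy to see that it runs only once, when the "sub" expression is evaluated:

```perl
use strict;
use warnings;
no warnings 'experimental::const_attr';   # the attribute is experimental

my $calls = 0;

# The body runs right here, once; its return value becomes a constant sub.
my $stamp = sub : const { $calls++; "snapshot $calls" };

my $first  = $stamp->();   # does not re-run the body
my $second = $stamp->();   # same captured value again

print "$first / $second / body ran $calls time(s)\n";
```

Both calls should yield the same captured string, and $calls should not advance past its value at the time the sub was created.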
1683
1684 If you really want a subroutine with a "()" prototype that returns a
1685 lexical variable you can easily force it to not be inlined by adding an
1686 explicit "return":
1687
1688 BEGIN {
1689 my $x = 79907;
1690 *RT_79908 = sub () { return $x };
1691 $x++;
1692 }
1693 print RT_79908(); # prints 79908
1694
1695 The easiest way to tell if a subroutine was inlined is by using
1696 B::Deparse. Consider this example of two subroutines returning 1, one
1697 with a "()" prototype causing it to be inlined, and one without (with
1698 deparse output truncated for clarity):
1699
1700 $ perl -MO=Deparse -le 'sub ONE { 1 } if (ONE) { print ONE if ONE }'
1701 sub ONE {
1702 1;
1703 }
1704 if (ONE ) {
1705 print ONE() if ONE ;
1706 }
1707 $ perl -MO=Deparse -le 'sub ONE () { 1 } if (ONE) { print ONE if ONE }'
1708 sub ONE () { 1 }
1709 do {
1710 print 1
1711 };
1712
1713 If you redefine a subroutine that was eligible for inlining, you'll get
1714 a warning by default. You can use this warning to tell whether or not
1715 a particular subroutine is considered inlinable, since it's different
1716 from the warning for overriding non-inlined subroutines:
1717
1718 $ perl -e 'sub one () {1} sub one () {2}'
1719 Constant subroutine one redefined at -e line 1.
1720 $ perl -we 'sub one {1} sub one {2}'
1721 Subroutine one redefined at -e line 1.
1722
1723 The warning is considered severe enough not to be affected by the -w
1724 switch (or its absence) because previously compiled invocations of the
1725 function will still be using the old value of the function. If you
1726 need to be able to redefine the subroutine, you need to ensure that it
1727 isn't inlined, either by dropping the "()" prototype (which changes
1728 calling semantics, so beware) or by thwarting the inlining mechanism in
1729 some other way, e.g. by adding an explicit "return", as mentioned
1730 above:
1731
1732 sub not_inlined () { return 23 }
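The two warnings can also be told apart from inside a program. The following is only a sketch: it compiles the redefinitions with string "eval" and collects the messages through a local $SIG{__WARN__} handler. The sub names "one" and "two" are arbitrary.

```perl
use strict;
use warnings;

my @warnings;
{
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    # Each string is compiled while the handler above is active.
    eval q{ sub one () { 1 } sub one () { 2 } 1 } or die $@;   # inlinable
    eval q{ sub two    { 1 } sub two    { 2 } 1 } or die $@;   # ordinary
}

print for @warnings;
```

The first redefinition should be reported as a constant subroutine being redefined, the second as an ordinary subroutine being redefined.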
1733
1734 Overriding Built-in Functions
1735 Many built-in functions may be overridden, though this should be tried
1736 only occasionally and for good reason. Typically this might be done by
1737 a package attempting to emulate missing built-in functionality on a
1738 non-Unix system.
1739
1740 Overriding may be done only by importing the name from a module at
1741 compile time--ordinary predeclaration isn't good enough. However, the
1742 "use subs" pragma lets you, in effect, predeclare subs via the import
1743 syntax, and these names may then override built-in ones:
1744
1745 use subs 'chdir', 'chroot', 'chmod', 'chown';
1746 chdir $somewhere;
1747 sub chdir { ... }
1748
1749 To unambiguously refer to the built-in form, precede the built-in name
1750 with the special package qualifier "CORE::". For example, saying
1751 "CORE::open()" always refers to the built-in "open()", even if the
1752 current package has imported some other subroutine called "&open()"
1753 from elsewhere. Even though it looks like a regular function call, it
1754 isn't: the CORE:: prefix in that case is part of Perl's syntax, and
1755 works for any keyword, regardless of what is in the CORE package.
1756 Taking a reference to it, that is, "\&CORE::open", only works for some
1757 keywords. See CORE.
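A small sketch of the difference, using a hypothetical package "Frozen" that overrides "time" for itself. The plain name reaches the override, while the "CORE::" form still reaches the built-in:

```perl
use strict;
use warnings;

package Frozen;   # hypothetical package, for illustration only

use subs 'time';                   # predeclare via import so the override applies
sub time { return 1_000_000_000 }  # a fixed, made-up clock value

my $fake = time;                   # calls Frozen::time
my $real = CORE::time;             # always the built-in

print "override: $fake\n";
print "built-in: $real\n";
```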
1758
1759 Library modules should not in general export built-in names like "open"
1760 or "chdir" as part of their default @EXPORT list, because these may
1761 sneak into someone else's namespace and change the semantics
1762 unexpectedly. Instead, if the module adds that name to @EXPORT_OK,
1763 then it's possible for a user to import the name explicitly, but not
1764 implicitly. That is, they could say
1765
1766 use Module 'open';
1767
1768 and it would import the "open" override. But if they said
1769
1770 use Module;
1771
1772 they would get the default imports without overrides.
1773
1774 The foregoing mechanism for overriding built-ins is restricted, quite
1775 deliberately, to the package that requests the import. There is a
1776 second method that is sometimes applicable when you wish to override a
1777 built-in everywhere, without regard to namespace boundaries. This is
1778 achieved by importing a sub into the special namespace
1779 "CORE::GLOBAL::". Here is an example that quite brazenly replaces the
1780 "glob" operator with something that understands regular expressions.
1781
1782 package REGlob;
1783 require Exporter;
1784 @ISA = 'Exporter';
1785 @EXPORT_OK = 'glob';
1786
1787 sub import {
1788 my $pkg = shift;
1789 return unless @_;
1790 my $sym = shift;
1791 my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
1792 $pkg->export($where, $sym, @_);
1793 }
1794
1795 sub glob {
1796 my $pat = shift;
1797 my @got;
1798 if (opendir my $d, '.') {
1799 @got = grep /$pat/, readdir $d;
1800 closedir $d;
1801 }
1802 return @got;
1803 }
1804 1;
1805
1806 And here's how it could be (ab)used:
1807
1808 #use REGlob 'GLOBAL_glob'; # override glob() in ALL namespaces
1809 package Foo;
1810 use REGlob 'glob'; # override glob() in Foo:: only
1811 print for <^[a-z_]+\.pm\$>; # show all pragmatic modules
1812
1813 The initial comment shows a contrived, even dangerous example. By
1814 overriding "glob" globally, you would be forcing the new (and
1815 subversive) behavior for the "glob" operator for every namespace,
1816 without the complete cognizance or cooperation of the modules that own
1817 those namespaces. Naturally, this should be done with extreme
1818 caution--if it must be done at all.
1819
1820 The "REGlob" example above does not implement all the support needed to
1821 cleanly override perl's "glob" operator. The built-in "glob" has
1822 different behaviors depending on whether it appears in a scalar or list
1823 context, but our "REGlob" doesn't. Indeed, many perl built-ins have
1824 such context-sensitive behaviors, and these must be adequately
1825 supported by a properly written override. For a fully functional
1826 example of overriding "glob", study the implementation of
1827 "File::DosGlob" in the standard library.
1828
1829 When you override a built-in, your replacement should be consistent (if
1830 possible) with the built-in native syntax. You can achieve this by
1831 using a suitable prototype. To get the prototype of an overridable
1832 built-in, use the "prototype" function with an argument of
1833 "CORE::builtin_name" (see "prototype" in perlfunc).
1834
1835 Note however that some built-ins can't have their syntax expressed by a
1836 prototype (such as "system" or "chomp"). If you override them you
1837 won't be able to fully mimic their original syntax.
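One way to check this before writing an override is to ask "prototype" directly. This sketch queries a few built-ins (the selection is arbitrary) and records which ones report a prototype at all. The exact prototype strings vary between Perl versions, so none are shown here.

```perl
use strict;
use warnings;

my %proto;
for my $name (qw(open push system chomp)) {
    # undef means the syntax cannot be expressed as a prototype
    $proto{$name} = prototype("CORE::$name");
    printf "%-8s %s\n", $name,
        defined $proto{$name} ? "prototype: $proto{$name}"
                              : "no prototype available";
}
```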
1838
1839 The built-ins "do", "require" and "glob" can also be overridden, but
1840 due to special magic, their original syntax is preserved, and you don't
1841 have to define a prototype for their replacements. (You can't override
1842 the "do BLOCK" syntax, though.)
1843
1844 "require" has special additional dark magic: if you invoke your
1845 "require" replacement as "require Foo::Bar", it will actually receive
1846 the argument "Foo/Bar.pm" in @_. See "require" in perlfunc.
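The following sketch makes that conversion visible. It installs a global "require" override that records its argument and then defers to the built-in. The @requested array is an illustrative name, and File::Spec is used only because it is a core module with "::" in its name.

```perl
use strict;
use warnings;

our @requested;   # package variable, so the override can fill it at any time

BEGIN {
    no warnings 'once';
    *CORE::GLOBAL::require = sub {
        my ($what) = @_;
        push @requested, $what;        # record what perl handed us
        return CORE::require($what);   # then defer to the built-in
    };
}

require File::Spec;   # bareword form in the source

print "first request: $requested[0]\n";
```

Modules loaded afterwards may issue further "require" calls of their own, so only the first recorded entry is inspected here.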
1847
1848 And, as you'll have noticed from the previous example, if you override
1849 "glob", the "<*>" glob operator is overridden as well.
1850
1851 In a similar fashion, overriding the "readline" function also overrides
1852 the equivalent I/O operator "<FILEHANDLE>". Likewise, overriding
1853 "readpipe" overrides the operators "``" and "qx//".
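As a hedged sketch of the "readpipe" case, the global override below never runs an external program. It records the command string and returns a stub, so both operator forms can be observed reaching it. The @commands array and the stub text are invented for the example.

```perl
use strict;
use warnings;

our @commands;

BEGIN {
    no warnings 'once';
    *CORE::GLOBAL::readpipe = sub {
        my ($cmd) = @_;
        push @commands, $cmd;       # remember what would have been run
        return "stubbed: $cmd\n";   # nothing is actually executed
    };
}

my $via_backticks = `echo hello`;
my $via_qx        = qx/echo world/;

print $via_backticks, $via_qx;
```

A real override would also need to honor list versus scalar context, which this stub ignores.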
1854
1855 Finally, some built-ins (e.g. "exists" or "grep") can't be overridden.
1856
1857 Autoloading
1858 If you call a subroutine that is undefined, you would ordinarily get an
1859 immediate, fatal error complaining that the subroutine doesn't exist.
1860 (Likewise for subroutines being used as methods, when the method
1861 doesn't exist in any base class of the class's package.) However, if
1862 an "AUTOLOAD" subroutine is defined in the package or packages used to
1863 locate the original subroutine, then that "AUTOLOAD" subroutine is
1864 called with the arguments that would have been passed to the original
1865 subroutine. The fully qualified name of the original subroutine
1866 magically appears in the global $AUTOLOAD variable of the same package
1867 as the "AUTOLOAD" routine. The name is not passed as an ordinary
1868 argument because, er, well, just because, that's why. (As an
1869 exception, a method call to a nonexistent "import" or "unimport" method
1870 is just skipped instead. Also, if the AUTOLOAD subroutine is an XSUB,
1871 there are other ways to retrieve the subroutine name. See "Autoloading
1872 with XSUBs" in perlguts for details.)
1873
1874 Many "AUTOLOAD" routines load in a definition for the requested
1875 subroutine using eval(), then execute that subroutine using a special
1876 form of goto() that erases the stack frame of the "AUTOLOAD" routine
1877 without a trace. (See the source to the standard module documented in
1878 AutoLoader, for example.) But an "AUTOLOAD" routine can also just
1879 emulate the routine and never define it. For example, let's pretend
1880 that a function that wasn't defined should just invoke "system" with
1881 those arguments. All you'd do is:
1882
1883 sub AUTOLOAD {
1884 our $AUTOLOAD; # keep 'use strict' happy
1885 my $program = $AUTOLOAD;
1886 $program =~ s/.*:://;
1887 system($program, @_);
1888 }
1889 date();
1890 who();
1891 ls('-l');
1892
1893 In fact, if you predeclare functions you want to call that way, you
1894 don't even need parentheses:
1895
1896 use subs qw(date who ls);
1897 date;
1898 who;
1899 ls '-l';
1900
1901 A more complete example of this is the Shell module on CPAN, which can
1902 treat undefined subroutine calls as calls to external programs.
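The other style mentioned above, where "AUTOLOAD" installs a real definition and then jumps to it with "goto", can be sketched as follows. The "Record" class and its field names are hypothetical, and the generated subs are plain accessors chosen only to keep the example short.

```perl
use strict;
use warnings;

package Record;   # hypothetical class, for illustration only

our $AUTOLOAD;
my %is_field = map { $_ => 1 } qw(name count);

sub new {
    my ($class, %fields) = @_;
    return bless { %fields }, $class;
}

sub AUTOLOAD {
    my $method = $AUTOLOAD;
    $method =~ s/.*:://;                # strip the package qualifier
    return if $method eq 'DESTROY';     # never autoload destructors
    die "No such method: $method\n" unless $is_field{$method};

    no strict 'refs';
    *{$AUTOLOAD} = sub {                # install a real accessor
        my $self = shift;
        $self->{$method} = shift if @_;
        return $self->{$method};
    };
    goto &{$AUTOLOAD};                  # re-dispatch with the same @_
}

package main;

my $rec = Record->new(name => 'hits', count => 3);
$rec->count($rec->count + 1);
print $rec->name, ' = ', $rec->count, "\n";
```

After the first call to each accessor, later calls go straight to the installed sub and no longer pass through "AUTOLOAD".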
1903
1904 Mechanisms are available to help module writers split their modules
1905 into autoloadable files. See the standard AutoLoader module described
1906 in AutoLoader and in AutoSplit, the standard SelfLoader module in
1907 SelfLoader, and the document on adding C functions to Perl code in
1908 perlxs.
1909
1910 Subroutine Attributes
1911 A subroutine declaration or definition may have a list of attributes
1912 associated with it. If such an attribute list is present, it is broken
1913 up at space or colon boundaries and treated as though a "use
1914 attributes" had been seen. See attributes for details about what
1915 attributes are currently supported. Unlike the limitation with the
1916 obsolescent "use attrs", the "sub : ATTRLIST" syntax works to associate
1917 the attributes with a pre-declaration, and not just with a subroutine
1918 definition.
1919
1920 The attributes must be valid as simple identifier names (without any
1921 punctuation other than the '_' character). They may have a parameter
1922 list appended, which is only checked for whether its parentheses
1923 ('(',')') nest properly.
1924
1925 Examples of valid syntax (even though the attributes are unknown):
1926
1927 sub fnord (&\%) : switch(10,foo(7,3)) : expensive;
1928 sub plugh () : Ugly('\(") :Bad;
1929 sub xyzzy : _5x5 { ... }
1930
1931 Examples of invalid syntax:
1932
1933 sub fnord : switch(10,foo(); # ()-string not balanced
1934 sub snoid : Ugly('('); # ()-string not balanced
1935 sub xyzzy : 5x5; # "5x5" not a valid identifier
1936 sub plugh : Y2::north; # "Y2::north" not a simple identifier
1937 sub snurt : foo + bar; # "+" not a colon or space
1938
1939 The attribute list is passed as a list of constant strings to the code
1940 which associates them with the subroutine. In particular, the second
1941 example of valid syntax above currently looks like this in terms of how
1942 it's parsed and invoked:
1943
1944 use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';
1945
1946 For further details on attribute lists and their manipulation, see
1947 attributes and Attribute::Handlers.
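To see the strings that reach the handling code, a package can define "MODIFY_CODE_ATTRIBUTES", the hook described in attributes. In this sketch the package "Tagged" and the attribute names "Cached" and "Logged" are made up; the hook simply records what it was given.

```perl
use strict;
use warnings;

package Tagged;   # hypothetical package, for illustration only

our @seen;        # package variable, since the hook runs at compile time

sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $coderef, @attrs) = @_;
    push @seen, @attrs;   # each attribute arrives as a constant string
    return;               # empty list: every attribute was accepted
}

sub fetch : Cached(60) Logged { return 'data' }

print "attribute: $_\n" for @seen;
```

Returning a non-empty list from the hook would instead mark those attributes as invalid and make the declaration fail to compile.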
1948
1950 See "Function Templates" in perlref for more about references and
1951 closures. See perlxs if you'd like to learn about calling C
1952 subroutines from Perl. See perlembed if you'd like to learn about
1953 calling Perl subroutines from C. See perlmod to learn about bundling
1954 up your functions in separate files. See perlmodlib to learn what
1955 library modules come standard on your system. See perlootut to learn
1956 how to make object method calls.
1957
1958
1959
1960perl v5.32.1 2021-03-31 PERLSUB(1)