PERLSUB(1)          Perl Programmers Reference Guide          PERLSUB(1)

6 perlsub - Perl subroutines
7
9 To declare subroutines:
10
11 sub NAME; # A "forward" declaration.
12 sub NAME(PROTO); # ditto, but with prototypes
13 sub NAME : ATTRS; # with attributes
14 sub NAME(PROTO) : ATTRS; # with attributes and prototypes
15
16 sub NAME BLOCK # A declaration and a definition.
17 sub NAME(PROTO) BLOCK # ditto, but with prototypes
18 sub NAME : ATTRS BLOCK # with attributes
19 sub NAME(PROTO) : ATTRS BLOCK # with prototypes and attributes
20
21 use feature 'signatures';
22 sub NAME(SIG) BLOCK # with signature
23 sub NAME :ATTRS (SIG) BLOCK # with signature, attributes
24 sub NAME :prototype(PROTO) (SIG) BLOCK # with signature, prototype
25
26 To define an anonymous subroutine at runtime:
27
28 $subref = sub BLOCK; # no proto
29 $subref = sub (PROTO) BLOCK; # with proto
30 $subref = sub : ATTRS BLOCK; # with attributes
31 $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes
32
33 use feature 'signatures';
34 $subref = sub (SIG) BLOCK; # with signature
35 $subref = sub : ATTRS(SIG) BLOCK; # with signature, attributes
36
37 To import subroutines:
38
39 use MODULE qw(NAME1 NAME2 NAME3);
40
41 To call subroutines:
42
43 NAME(LIST); # & is optional with parentheses.
44 NAME LIST; # Parentheses optional if predeclared/imported.
45 &NAME(LIST); # Circumvent prototypes.
46 &NAME; # Makes current @_ visible to called subroutine.
47
49 Like many languages, Perl provides for user-defined subroutines. These
50 may be located anywhere in the main program, loaded in from other files
51 via the "do", "require", or "use" keywords, or generated on the fly
52 using "eval" or anonymous subroutines. You can even call a function
53 indirectly using a variable containing its name or a CODE reference.
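
As a minimal sketch of the two indirect call styles just mentioned, the following uses a hypothetical greet() subroutine (it is not part of this document). Calling through a name held in a variable is a symbolic reference, so the sketch relaxes "strict 'refs'" for that one call.

```perl
use strict;
use warnings;

# Hypothetical helper, defined only for this illustration.
sub greet {
    my ($name) = @_;
    return "Hello, $name";
}

# Call through a CODE reference; this works under "use strict".
my $code_ref = \&greet;
my $via_ref  = $code_ref->("Alice");

# Call through a variable containing the subroutine's name.
# This is a symbolic reference, so strict 'refs' is relaxed locally.
my $sub_name = "greet";
my $via_name = do {
    no strict 'refs';
    &$sub_name("Bob");
};

print "$via_ref\n";     # Hello, Alice
print "$via_name\n";    # Hello, Bob
```

The CODE reference form is usually preferable because it needs no exemption from "strict".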
54
55 The Perl model for function call and return values is simple: all
56 functions are passed as parameters one single flat list of scalars, and
57 all functions likewise return to their caller one single flat list of
58 scalars. Any arrays or hashes in these call and return lists will
59 collapse, losing their identities--but you may always use pass-by-
60 reference instead to avoid this. Both call and return lists may
61 contain as many or as few scalar elements as you'd like. (Often a
62 function without an explicit return statement is called a subroutine,
63 but there's really no difference from Perl's perspective.)
64
65 Any arguments passed in show up in the array @_. (They may also show
66 up in lexical variables introduced by a signature; see "Signatures"
67 below.) Therefore, if you called a function with two arguments, those
68 would be stored in $_[0] and $_[1]. The array @_ is a local array, but
69 its elements are aliases for the actual scalar parameters. In
70 particular, if an element $_[0] is updated, the corresponding argument
71 is updated (or an error occurs if it is not updatable). If an argument
72 is an array or hash element which did not exist when the function was
73 called, that element is created only when (and if) it is modified or a
74 reference to it is taken. (Some earlier versions of Perl created the
75 element whether or not the element was assigned to.) Assigning to the
76 whole array @_ removes that aliasing, and does not update any
77 arguments.
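
The aliasing rules above can be sketched as follows. All four subroutine names are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;

# Modifies the caller's variable through the alias in $_[0].
sub double_in_place { $_[0] *= 2; return }

# Assigning to the whole of @_ breaks the aliasing first,
# so the caller's variable is left untouched.
sub no_effect { @_ = (42); return }

# Only reads its argument.
sub peek { return defined $_[0] ? "set" : "unset" }

# Assigns to its argument.
sub set_it { $_[0] = 1; return }

my $n = 5;
double_in_place($n);
print "$n\n";                                    # 10
no_effect($n);
print "$n\n";                                    # 10

my %h;
peek($h{missing});                               # read only: no element created
print exists $h{missing} ? "yes" : "no", "\n";   # no
set_it($h{created});                             # modified: element is created
print exists $h{created} ? "yes" : "no", "\n";   # yes
```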
78
79 A "return" statement may be used to exit a subroutine, optionally
80 specifying the returned value, which will be evaluated in the
81 appropriate context (list, scalar, or void) depending on the context of
82 the subroutine call. If you specify no return value, the subroutine
83 returns an empty list in list context, the undefined value in scalar
84 context, or nothing in void context. If you return one or more
85 aggregates (arrays and hashes), these will be flattened together into
86 one large indistinguishable list.
87
88 If no "return" is found and if the last statement is an expression, its
89 value is returned. If the last statement is a loop control structure
90 like a "foreach" or a "while", the returned value is unspecified. The
91 empty sub returns the empty list.
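
A short sketch of these return rules, using three hypothetical subroutines that are not part of this document:

```perl
use strict;
use warnings;

sub bare_return { return }                  # "return" with no value
sub implicit    { my $x = 20; $x + 1 }      # last statement is an expression
sub empty       { }                         # the empty sub

my @list   = bare_return();                 # empty list in list context
my $scalar = bare_return();                 # undef in scalar context
my $value  = implicit();                    # value of the last expression
my @none   = empty();                       # empty list

print scalar(@list), "\n";                          # 0
print defined $scalar ? "defined" : "undef", "\n";  # undef
print "$value\n";                                   # 21
print scalar(@none), "\n";                          # 0
```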
92
93 Aside from an experimental facility (see "Signatures" below), Perl does
94 not have named formal parameters. In practice all you do is assign to
95 a "my()" list of these. Variables that aren't declared to be private
96 are global variables. For gory details on creating private variables,
97 see "Private Variables via my()" and "Temporary Values via local()".
98 To create protected environments for a set of functions in a separate
99 package (and probably a separate file), see "Packages" in perlmod.
100
101 Example:
102
103 sub max {
104 my $max = shift(@_);
105 foreach $foo (@_) {
106 $max = $foo if $max < $foo;
107 }
108 return $max;
109 }
110 $bestday = max($mon,$tue,$wed,$thu,$fri);
111
112 Example:
113
114 # get a line, combining continuation lines
115 # that start with whitespace
116
117 sub get_line {
118 $thisline = $lookahead; # global variables!
119 LINE: while (defined($lookahead = <STDIN>)) {
120 if ($lookahead =~ /^[ \t]/) {
121 $thisline .= $lookahead;
122 }
123 else {
124 last LINE;
125 }
126 }
127 return $thisline;
128 }
129
130 $lookahead = <STDIN>; # get first line
131 while (defined($line = get_line())) {
132 ...
133 }
134
135 Assigning to a list of private variables to name your arguments:
136
137 sub maybeset {
138 my($key, $value) = @_;
139 $Foo{$key} = $value unless $Foo{$key};
140 }
141
142 Because the assignment copies the values, this also has the effect of
143 turning call-by-reference into call-by-value. Otherwise a function is
144 free to do in-place modifications of @_ and change its caller's values.
145
146 upcase_in($v1, $v2); # this changes $v1 and $v2
147 sub upcase_in {
148 for (@_) { tr/a-z/A-Z/ }
149 }
150
151 You aren't allowed to modify constants in this way, of course. If an
152 argument were actually literal and you tried to change it, you'd take a
153 (presumably fatal) exception. For example, this won't work:
154
155 upcase_in("frederick");
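
To see the failure without ending the program, the call can be wrapped in an "eval" block. This is only a sketch: it repeats the upcase_in() definition from above, and the exact wording of the error message is left unasserted because it may differ between Perl versions.

```perl
use strict;
use warnings;

sub upcase_in {
    for (@_) { tr/a-z/A-Z/ }
}

# A variable can be changed in place.
my $name = "frederick";
upcase_in($name);
print "$name\n";                 # FREDERICK

# A literal cannot; the attempt throws an exception that eval traps.
my $ok = eval { upcase_in("frederick"); 1 };
print "literal could not be modified: $@" unless $ok;
```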
156
157 It would be much safer if the "upcase_in()" function were written to
158 return a copy of its parameters instead of changing them in place:
159
160 ($v3, $v4) = upcase($v1, $v2); # this doesn't change $v1 and $v2
161 sub upcase {
162 return unless defined wantarray; # void context, do nothing
163 my @parms = @_;
164 for (@parms) { tr/a-z/A-Z/ }
165 return wantarray ? @parms : $parms[0];
166 }
167
168 Notice how this (unprototyped) function doesn't care whether it was
169 passed real scalars or arrays. Perl sees all arguments as one big,
170 long, flat parameter list in @_. This is one area where Perl's simple
171 argument-passing style shines. The "upcase()" function would work
172 perfectly well without changing the "upcase()" definition even if we
173 fed it things like this:
174
175 @newlist = upcase(@list1, @list2);
176 @newlist = upcase( split /:/, $var );
177
178 Do not, however, be tempted to do this:
179
180 (@a, @b) = upcase(@list1, @list2);
181
Like the flattened incoming parameter list, the return list is also
flattened on return. So all you have managed to do here is store
everything in @a and make @b empty. See "Pass by Reference" for
alternatives.
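
One such alternative, sketched here under assumptions, is to pass and return array references so that each list keeps its identity. The name upcase_lists() is hypothetical, and the sketch uses the built-in uc() for brevity rather than the tr/a-z/A-Z/ used by upcase() above.

```perl
use strict;
use warnings;

# Hypothetical variant of upcase(): takes array references and returns
# one new array reference per input list.
sub upcase_lists {
    my @results;
    for my $list (@_) {
        push @results, [ map { uc } @$list ];
    }
    return @results;
}

my @list1 = qw(a b);
my @list2 = qw(c d e);
my ($first, $second) = upcase_lists(\@list1, \@list2);

print "@$first\n";     # A B
print "@$second\n";    # C D E
```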
186
187 A subroutine may be called using an explicit "&" prefix. The "&" is
188 optional in modern Perl, as are parentheses if the subroutine has been
189 predeclared. The "&" is not optional when just naming the subroutine,
190 such as when it's used as an argument to defined() or undef(). Nor is
191 it optional when you want to do an indirect subroutine call with a
192 subroutine name or reference using the "&$subref()" or "&{$subref}()"
193 constructs, although the "$subref->()" notation solves that problem.
194 See perlref for more about all that.
195
Subroutines may be called recursively. If a subroutine is called using
the "&" form, the argument list is optional, and if omitted, no @_
array is set up for the subroutine: the @_ array at the time of the
call is visible to the subroutine instead. This is an efficiency
mechanism that new users may wish to avoid.
201
    &foo(1,2,3);        # pass three arguments
    foo(1,2,3);         # the same

    foo();              # pass a null list
    &foo();             # the same

    &foo;               # foo() gets current args, like foo(@_) !!
    use strict 'subs';
    foo;                # like foo() iff sub foo predeclared, else
                        # a compile-time error
    no strict 'subs';
    foo;                # like foo() iff sub foo predeclared, else
                        # a literal string "foo"
215
216 Not only does the "&" form make the argument list optional, it also
217 disables any prototype checking on arguments you do provide. This is
218 partly for historical reasons, and partly for having a convenient way
219 to cheat if you know what you're doing. See "Prototypes" below.
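
Both effects of the "&" form can be observed in a small sketch. The subroutines count_args(), with_ampersand(), without_ampersand() and one_scalar() are hypothetical names chosen for this illustration.

```perl
use strict;
use warnings;

sub count_args { return scalar @_ }

# "&count_args" with no argument list shares the caller's @_;
# "count_args()" receives a fresh, empty @_.
sub with_ampersand    { return &count_args; }
sub without_ampersand { return count_args(); }

print with_ampersand(1, 2, 3), "\n";       # 3
print without_ampersand(1, 2, 3), "\n";    # 0

# A ($) prototype puts its argument in scalar context, so the array is
# passed as a single value (its length). The "&" form skips the prototype.
sub one_scalar($) { return scalar @_ }

my @items = (10, 20, 30);
print one_scalar(@items), "\n";            # 1
print &one_scalar(@items), "\n";           # 3
```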
220
221 Since Perl 5.16.0, the "__SUB__" token is available under "use feature
222 'current_sub'" and "use 5.16.0". It will evaluate to a reference to
223 the currently-running sub, which allows for recursive calls without
224 knowing your subroutine's name.
225
226 use 5.16.0;
    my $factorial = sub {
        my ($x) = @_;
        return 1 if $x <= 1;    # also stops at 0, avoiding endless recursion
        return($x * __SUB__->( $x - 1 ) );
    };
232
233 The behavior of "__SUB__" within a regex code block (such as
234 "/(?{...})/") is subject to change.
235
Subroutines whose names are in all upper case are reserved to the Perl
core, as are modules whose names are in all lower case. A subroutine
in all capitals is a loosely-held convention meaning it will be called
indirectly by the run-time system itself, usually due to a triggered
event. Subroutines whose names start with a left parenthesis are also
reserved in the same way. The following is a list of some subroutines
that currently do special, pre-defined things.
243
244 documented later in this document
245 "AUTOLOAD"
246
247 documented in perlmod
248 "CLONE", "CLONE_SKIP"
249
250 documented in perlobj
251 "DESTROY", "DOES"
252
253 documented in perltie
254 "BINMODE", "CLEAR", "CLOSE", "DELETE", "DESTROY", "EOF", "EXISTS",
255 "EXTEND", "FETCH", "FETCHSIZE", "FILENO", "FIRSTKEY", "GETC",
256 "NEXTKEY", "OPEN", "POP", "PRINT", "PRINTF", "PUSH", "READ",
257 "READLINE", "SCALAR", "SEEK", "SHIFT", "SPLICE", "STORE",
258 "STORESIZE", "TELL", "TIEARRAY", "TIEHANDLE", "TIEHASH",
259 "TIESCALAR", "UNSHIFT", "UNTIE", "WRITE"
260
261 documented in PerlIO::via
262 "BINMODE", "CLEARERR", "CLOSE", "EOF", "ERROR", "FDOPEN", "FILENO",
263 "FILL", "FLUSH", "OPEN", "POPPED", "PUSHED", "READ", "SEEK",
264 "SETLINEBUF", "SYSOPEN", "TELL", "UNREAD", "UTF8", "WRITE"
265
266 documented in perlfunc
267 "import" , "unimport" , "INC"
268
269 documented in UNIVERSAL
270 "VERSION"
271
272 documented in perldebguts
273 "DB::DB", "DB::sub", "DB::lsub", "DB::goto", "DB::postponed"
274
275 undocumented, used internally by the overload feature
276 any starting with "("
277
The "BEGIN", "UNITCHECK", "CHECK", "INIT" and "END" subroutines are not
so much subroutines as named special code blocks, of which you can have
more than one in a package, and which you cannot call explicitly. See
"BEGIN, UNITCHECK, CHECK, INIT and END" in perlmod.
282
283 Signatures
284 WARNING: Subroutine signatures are experimental. The feature may be
285 modified or removed in future versions of Perl.
286
287 Perl has an experimental facility to allow a subroutine's formal
288 parameters to be introduced by special syntax, separate from the
289 procedural code of the subroutine body. The formal parameter list is
290 known as a signature. The facility must be enabled first by a
291 pragmatic declaration, "use feature 'signatures'", and it will produce
292 a warning unless the "experimental::signatures" warnings category is
293 disabled.
294
295 The signature is part of a subroutine's body. Normally the body of a
296 subroutine is simply a braced block of code, but when using a
297 signature, the signature is a parenthesised list that goes immediately
298 before the block, after any name or attributes.
299
300 For example,
301
302 sub foo :lvalue ($a, $b = 1, @c) { .... }
303
304 The signature declares lexical variables that are in scope for the
305 block. When the subroutine is called, the signature takes control
306 first. It populates the signature variables from the list of arguments
307 that were passed. If the argument list doesn't meet the requirements
308 of the signature, then it will throw an exception. When the signature
309 processing is complete, control passes to the block.
310
311 Positional parameters are handled by simply naming scalar variables in
312 the signature. For example,
313
314 sub foo ($left, $right) {
315 return $left + $right;
316 }
317
318 takes two positional parameters, which must be filled at runtime by two
319 arguments. By default the parameters are mandatory, and it is not
320 permitted to pass more arguments than expected. So the above is
321 equivalent to
322
323 sub foo {
324 die "Too many arguments for subroutine" unless @_ <= 2;
325 die "Too few arguments for subroutine" unless @_ >= 2;
326 my $left = $_[0];
327 my $right = $_[1];
328 return $left + $right;
329 }
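
The argument-count checks can be observed directly by trapping the exceptions with "eval". This is a sketch only: add() is a hypothetical name, and the exception text is not asserted because it may vary between Perl versions.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical sub with two mandatory positional parameters.
sub add ($left, $right) {
    return $left + $right;
}

print add(2, 3), "\n";                                   # 5

# Too few and too many arguments both throw at call time.
print "too few rejected\n"  unless eval { add(2); 1 };
print "too many rejected\n" unless eval { add(1, 2, 3); 1 };
```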
330
331 An argument can be ignored by omitting the main part of the name from a
332 parameter declaration, leaving just a bare "$" sigil. For example,
333
334 sub foo ($first, $, $third) {
335 return "first=$first, third=$third";
336 }
337
338 Although the ignored argument doesn't go into a variable, it is still
339 mandatory for the caller to pass it.
340
341 A positional parameter is made optional by giving a default value,
342 separated from the parameter name by "=":
343
344 sub foo ($left, $right = 0) {
345 return $left + $right;
346 }
347
348 The above subroutine may be called with either one or two arguments.
349 The default value expression is evaluated when the subroutine is
350 called, so it may provide different default values for different calls.
351 It is only evaluated if the argument was actually omitted from the
352 call. For example,
353
354 my $auto_id = 0;
355 sub foo ($thing, $id = $auto_id++) {
356 print "$thing has ID $id";
357 }
358
359 automatically assigns distinct sequential IDs to things for which no ID
360 was supplied by the caller. A default value expression may also refer
361 to parameters earlier in the signature, making the default for one
362 parameter vary according to the earlier parameters. For example,
363
364 sub foo ($first_name, $surname, $nickname = $first_name) {
365 print "$first_name $surname is known as \"$nickname\"";
366 }
367
368 An optional parameter can be nameless just like a mandatory parameter.
369 For example,
370
371 sub foo ($thing, $ = 1) {
372 print $thing;
373 }
374
375 The parameter's default value will still be evaluated if the
376 corresponding argument isn't supplied, even though the value won't be
377 stored anywhere. This is in case evaluating it has important side
378 effects. However, it will be evaluated in void context, so if it
379 doesn't have side effects and is not trivial it will generate a warning
380 if the "void" warning category is enabled. If a nameless optional
381 parameter's default value is not important, it may be omitted just as
382 the parameter's name was:
383
384 sub foo ($thing, $=) {
385 print $thing;
386 }
387
Optional positional parameters must come after all mandatory positional
parameters. (If there are no mandatory positional parameters then an
optional positional parameter can be the first thing in the
signature.) If there are multiple optional positional parameters and
not enough arguments are supplied to fill them all, they will be filled
from left to right.
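
Left-to-right filling can be sketched with a hypothetical describe() subroutine that has one mandatory and two optional parameters; the names and default values are assumptions made for this illustration.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical sub: $name is mandatory, $size and $color are optional.
sub describe ($name, $size = "medium", $color = "grey") {
    return "$name: $size, $color";
}

print describe("box"), "\n";                    # box: medium, grey
print describe("box", "large"), "\n";           # box: large, grey
print describe("box", "large", "red"), "\n";    # box: large, red
```

A single extra argument always fills $size; there is no way to supply $color positionally while leaving $size at its default.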
394
395 After positional parameters, additional arguments may be captured in a
396 slurpy parameter. The simplest form of this is just an array variable:
397
398 sub foo ($filter, @inputs) {
399 print $filter->($_) foreach @inputs;
400 }
401
402 With a slurpy parameter in the signature, there is no upper limit on
403 how many arguments may be passed. A slurpy array parameter may be
404 nameless just like a positional parameter, in which case its only
405 effect is to turn off the argument limit that would otherwise apply:
406
407 sub foo ($thing, @) {
408 print $thing;
409 }
410
411 A slurpy parameter may instead be a hash, in which case the arguments
412 available to it are interpreted as alternating keys and values. There
413 must be as many keys as values: if there is an odd argument then an
414 exception will be thrown. Keys will be stringified, and if there are
415 duplicates then the later instance takes precedence over the earlier,
416 as with standard hash construction.
417
418 sub foo ($filter, %inputs) {
419 print $filter->($_, $inputs{$_}) foreach sort keys %inputs;
420 }
421
422 A slurpy hash parameter may be nameless just like other kinds of
423 parameter. It still insists that the number of arguments available to
424 it be even, even though they're not being put into a variable.
425
426 sub foo ($thing, %) {
427 print $thing;
428 }
429
430 A slurpy parameter, either array or hash, must be the last thing in the
431 signature. It may follow mandatory and optional positional parameters;
432 it may also be the only thing in the signature. Slurpy parameters
433 cannot have default values: if no arguments are supplied for them then
434 you get an empty array or empty hash.
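
A slurpy hash is a common way to accept named options. The following sketch assumes a hypothetical endpoint() subroutine and an arbitrary default port; neither comes from this document.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical sub: one positional parameter followed by named options.
sub endpoint ($host, %opts) {
    my $port = $opts{port} // 80;    # assumed default, for illustration only
    return "$host:$port";
}

print endpoint("example.org"), "\n";                         # example.org:80
print endpoint("example.org", port => 8080), "\n";           # example.org:8080
print endpoint("example.org", port => 1, port => 2), "\n";   # example.org:2

# An odd number of arguments for the hash is rejected at call time.
my $ok = eval { endpoint("example.org", "port"); 1 };
print "odd argument list rejected\n" unless $ok;
```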
435
436 A signature may be entirely empty, in which case all it does is check
437 that the caller passed no arguments:
438
439 sub foo () {
440 return 123;
441 }
442
443 When using a signature, the arguments are still available in the
444 special array variable @_, in addition to the lexical variables of the
445 signature. There is a difference between the two ways of accessing the
446 arguments: @_ aliases the arguments, but the signature variables get
447 copies of the arguments. So writing to a signature variable only
448 changes that variable, and has no effect on the caller's variables, but
449 writing to an element of @_ modifies whatever the caller used to supply
450 that argument.
451
452 There is a potential syntactic ambiguity between signatures and
453 prototypes (see "Prototypes"), because both start with an opening
454 parenthesis and both can appear in some of the same places, such as
455 just after the name in a subroutine declaration. For historical
456 reasons, when signatures are not enabled, any opening parenthesis in
457 such a context will trigger very forgiving prototype parsing. Most
458 signatures will be interpreted as prototypes in those circumstances,
459 but won't be valid prototypes. (A valid prototype cannot contain any
460 alphabetic character.) This will lead to somewhat confusing error
461 messages.
462
463 To avoid ambiguity, when signatures are enabled the special syntax for
464 prototypes is disabled. There is no attempt to guess whether a
465 parenthesised group was intended to be a prototype or a signature. To
466 give a subroutine a prototype under these circumstances, use a
467 prototype attribute. For example,
468
469 sub foo :prototype($) { $_[0] }
470
471 It is entirely possible for a subroutine to have both a prototype and a
472 signature. They do different jobs: the prototype affects compilation
473 of calls to the subroutine, and the signature puts argument values into
474 lexical variables at runtime. You can therefore write
475
476 sub foo :prototype($$) ($left, $right) {
477 return $left + $right;
478 }
479
480 The prototype attribute, and any other attributes, must come before the
481 signature. The signature always immediately precedes the block of the
482 subroutine's body.
483
484 Private Variables via my()
485 Synopsis:
486
487 my $foo; # declare $foo lexically local
488 my (@wid, %get); # declare list of variables local
489 my $foo = "flurp"; # declare $foo lexical, and init it
490 my @oof = @bar; # declare @oof lexical, and init it
491 my $x : Foo = $y; # similar, with an attribute applied
492
493 WARNING: The use of attribute lists on "my" declarations is still
494 evolving. The current semantics and interface are subject to change.
495 See attributes and Attribute::Handlers.
496
497 The "my" operator declares the listed variables to be lexically
498 confined to the enclosing block, conditional
499 ("if"/"unless"/"elsif"/"else"), loop
500 ("for"/"foreach"/"while"/"until"/"continue"), subroutine, "eval", or
501 "do"/"require"/"use"'d file. If more than one value is listed, the
502 list must be placed in parentheses. All listed elements must be legal
lvalues. Only alphanumeric identifiers may be lexically
scoped--magical built-ins like $/ must currently be localized with
"local" instead.
506
507 Unlike dynamic variables created by the "local" operator, lexical
508 variables declared with "my" are totally hidden from the outside world,
including any called subroutines. This holds even when it is the same
subroutine called from itself or elsewhere--every call gets its own
copy.
512
513 This doesn't mean that a "my" variable declared in a statically
514 enclosing lexical scope would be invisible. Only dynamic scopes are
515 cut off. For example, the "bumpx()" function below has access to the
516 lexical $x variable because both the "my" and the "sub" occurred at the
517 same scope, presumably file scope.
518
519 my $x = 10;
520 sub bumpx { $x++ }
521
522 An "eval()", however, can see lexical variables of the scope it is
523 being evaluated in, so long as the names aren't hidden by declarations
524 within the "eval()" itself. See perlref.
525
526 The parameter list to my() may be assigned to if desired, which allows
527 you to initialize your variables. (If no initializer is given for a
528 particular variable, it is created with the undefined value.) Commonly
529 this is used to name input parameters to a subroutine. Examples:
530
    $arg = "fred";          # "global" variable
    $n = cube_root(27);
    print "$arg thinks the root is $n\n";
    # prints: fred thinks the root is 3

    sub cube_root {
        my $arg = shift;    # name doesn't matter
        $arg **= 1/3;
        return $arg;
    }
541
542 The "my" is simply a modifier on something you might assign to. So
543 when you do assign to variables in its argument list, "my" doesn't
544 change whether those variables are viewed as a scalar or an array. So
545
546 my ($foo) = <STDIN>; # WRONG?
547 my @FOO = <STDIN>;
548
549 both supply a list context to the right-hand side, while
550
551 my $foo = <STDIN>;
552
553 supplies a scalar context. But the following declares only one
554 variable:
555
556 my $foo, $bar = 1; # WRONG
557
558 That has the same effect as
559
560 my $foo;
561 $bar = 1;
562
563 The declared variable is not introduced (is not visible) until after
564 the current statement. Thus,
565
566 my $x = $x;
567
568 can be used to initialize a new $x with the value of the old $x, and
569 the expression
570
571 my $x = 123 and $x == 123
572
573 is false unless the old $x happened to have the value 123.
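
The same rule is what makes it possible to initialize an inner variable from an outer one of the same name. A minimal sketch, wrapped in a hypothetical shadow_demo() subroutine so that it runs under "use strict":

```perl
use strict;
use warnings;

# Hypothetical sub, for illustration only.
sub shadow_demo {
    my $x = 10;
    my $inner;
    {
        # On the right-hand side the new $x is not yet visible,
        # so $x there still names the outer variable.
        my $x = $x + 1;
        $inner = $x;
    }
    return ($x, $inner);    # the outer $x is unchanged
}

my ($outer, $inner) = shadow_demo();
print "$outer $inner\n";    # 10 11
```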
574
575 Lexical scopes of control structures are not bounded precisely by the
576 braces that delimit their controlled blocks; control expressions are
577 part of that scope, too. Thus in the loop
578
579 while (my $line = <>) {
580 $line = lc $line;
581 } continue {
582 print $line;
583 }
584
585 the scope of $line extends from its declaration throughout the rest of
586 the loop construct (including the "continue" clause), but not beyond
587 it. Similarly, in the conditional
588
589 if ((my $answer = <STDIN>) =~ /^yes$/i) {
590 user_agrees();
591 } elsif ($answer =~ /^no$/i) {
592 user_disagrees();
593 } else {
594 chomp $answer;
595 die "'$answer' is neither 'yes' nor 'no'";
596 }
597
598 the scope of $answer extends from its declaration through the rest of
599 that conditional, including any "elsif" and "else" clauses, but not
600 beyond it. See "Simple Statements" in perlsyn for information on the
601 scope of variables in statements with modifiers.
602
603 The "foreach" loop defaults to scoping its index variable dynamically
604 in the manner of "local". However, if the index variable is prefixed
605 with the keyword "my", or if there is already a lexical by that name in
606 scope, then a new lexical is created instead. Thus in the loop
607
608 for my $i (1, 2, 3) {
609 some_function();
610 }
611
612 the scope of $i extends to the end of the loop, but not beyond it,
613 rendering the value of $i inaccessible within "some_function()".
614
Some users may wish to encourage the use of lexically scoped variables.
As an aid to catching implicit uses of package variables, which are
always global, if you say
618
619 use strict 'vars';
620
621 then any variable mentioned from there to the end of the enclosing
622 block must either refer to a lexical variable, be predeclared via "our"
623 or "use vars", or else must be fully qualified with the package name.
624 A compilation error results otherwise. An inner block may countermand
625 this with "no strict 'vars'".
626
627 A "my" has both a compile-time and a run-time effect. At compile time,
628 the compiler takes notice of it. The principal usefulness of this is
629 to quiet "use strict 'vars'", but it is also essential for generation
630 of closures as detailed in perlref. Actual initialization is delayed
631 until run time, though, so it gets executed at the appropriate time,
632 such as each time through a loop, for example.
633
634 Variables declared with "my" are not part of any package and are
635 therefore never fully qualified with the package name. In particular,
636 you're not allowed to try to make a package variable (or other global)
637 lexical:
638
639 my $pack::var; # ERROR! Illegal syntax
640
In fact, dynamic variables (also known as package or global variables)
are still accessible using the fully qualified "::" notation even while
a lexical of the same name is also visible:
644
645 package main;
646 local $x = 10;
647 my $x = 20;
648 print "$x and $::x\n";
649
650 That will print out 20 and 10.
651
652 You may declare "my" variables at the outermost scope of a file to hide
653 any such identifiers from the world outside that file. This is similar
654 in spirit to C's static variables when they are used at the file level.
To do this with a subroutine requires the use of a closure (an
anonymous function that accesses enclosing lexicals). If you want to
create a private subroutine that cannot be called from outside that
block, you can declare a lexical variable containing an anonymous sub
reference:
660
661 my $secret_version = '1.001-beta';
662 my $secret_sub = sub { print $secret_version };
663 &$secret_sub();
664
665 As long as the reference is never returned by any function within the
666 module, no outside module can see the subroutine, because its name is
667 not in any package's symbol table. Remember that it's not REALLY
668 called $some_pack::secret_version or anything; it's just
669 $secret_version, unqualified and unqualifiable.
670
671 This does not work with object methods, however; all object methods
672 have to be in the symbol table of some package to be found. See
673 "Function Templates" in perlref for something of a work-around to this.
674
675 Persistent Private Variables
As of Perl 5.10, there are two ways to build persistent private
variables. First, you can simply use the "state" feature. Or, you can
use closures, if you want to stay compatible with releases older than
5.10.
679
680 Persistent variables via state()
681
682 Beginning with Perl 5.10.0, you can declare variables with the "state"
683 keyword in place of "my". For that to work, though, you must have
684 enabled that feature beforehand, either by using the "feature" pragma,
685 or by using "-E" on one-liners (see feature). Beginning with Perl
686 5.16, the "CORE::state" form does not require the "feature" pragma.
687
688 The "state" keyword creates a lexical variable (following the same
689 scoping rules as "my") that persists from one subroutine call to the
690 next. If a state variable resides inside an anonymous subroutine, then
691 each copy of the subroutine has its own copy of the state variable.
692 However, the value of the state variable will still persist between
693 calls to the same copy of the anonymous subroutine. (Don't forget that
694 "sub { ... }" creates a new subroutine each time it is executed.)
695
696 For example, the following code maintains a private counter,
697 incremented each time the gimme_another() function is called:
698
699 use feature 'state';
700 sub gimme_another { state $x; return ++$x }
701
702 And this example uses anonymous subroutines to create separate
703 counters:
704
705 use feature 'state';
706 sub create_counter {
707 return sub { state $x; return ++$x }
708 }
709
710 Also, since $x is lexical, it can't be reached or modified by any Perl
711 code outside.
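
A usage sketch for create_counter(), repeating the definition from above so the block is self-contained, shows that each returned subroutine keeps its own count:

```perl
use strict;
use warnings;
use feature 'state';

sub create_counter {
    return sub { state $x; return ++$x };
}

my $first  = create_counter();
my $second = create_counter();

print $first->(), "\n";     # 1
print $first->(), "\n";     # 2
print $second->(), "\n";    # 1  (a separate copy of $x)
print $first->(), "\n";     # 3
```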
712
713 When combined with variable declaration, simple assignment to "state"
714 variables (as in "state $x = 42") is executed only the first time.
715 When such statements are evaluated subsequent times, the assignment is
716 ignored. The behavior of assignment to "state" declarations where the
717 left hand side of the assignment involves any parentheses is currently
718 undefined.
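
The run-once initialization can be sketched with a hypothetical next_ticket() subroutine; the starting value 100 is arbitrary.

```perl
use strict;
use warnings;
use feature 'state';

# Hypothetical sub, for illustration only.
sub next_ticket {
    state $n = 100;    # the assignment runs on the first call only
    return $n++;
}

my @tickets = (next_ticket(), next_ticket(), next_ticket());
print "@tickets\n";    # 100 101 102
```

Had the variable been declared with "my $n = 100" instead, every call would have returned 100.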
719
720 Persistent variables with closures
721
722 Just because a lexical variable is lexically (also called statically)
723 scoped to its enclosing block, "eval", or "do" FILE, this doesn't mean
724 that within a function it works like a C static. It normally works
725 more like a C auto, but with implicit garbage collection.
726
727 Unlike local variables in C or C++, Perl's lexical variables don't
728 necessarily get recycled just because their scope has exited. If
729 something more permanent is still aware of the lexical, it will stick
730 around. So long as something else references a lexical, that lexical
won't be freed--which is as it should be. You wouldn't want memory
being freed until you were done using it, or kept around once you were
done. Automatic garbage collection takes care of this for you.
734
735 This means that you can pass back or save away references to lexical
736 variables, whereas to return a pointer to a C auto is a grave error.
737 It also gives us a way to simulate C's function statics. Here's a
738 mechanism for giving a function private variables with both lexical
739 scoping and a static lifetime. If you do want to create something like
740 C's static variables, just enclose the whole function in an extra
741 block, and put the static variable outside the function but in the
742 block.
743
744 {
745 my $secret_val = 0;
746 sub gimme_another {
747 return ++$secret_val;
748 }
749 }
750 # $secret_val now becomes unreachable by the outside
751 # world, but retains its value between calls to gimme_another
752
753 If this function is being sourced in from a separate file via "require"
754 or "use", then this is probably just fine. If it's all in the main
755 program, you'll need to arrange for the "my" to be executed early,
756 either by putting the whole block above your main program, or more
757 likely, placing merely a "BEGIN" code block around it to make sure it
758 gets executed before your program starts to run:
759
760 BEGIN {
761 my $secret_val = 0;
762 sub gimme_another {
763 return ++$secret_val;
764 }
765 }
766
767 See "BEGIN, UNITCHECK, CHECK, INIT and END" in perlmod about the
768 special triggered code blocks, "BEGIN", "UNITCHECK", "CHECK", "INIT"
769 and "END".
770
771 If declared at the outermost scope (the file scope), then lexicals work
772 somewhat like C's file statics. They are available to all functions in
773 that same file declared below them, but are inaccessible from outside
774 that file. This strategy is sometimes used in modules to create
775 private variables that the whole module can see.
776
777 Temporary Values via local()
778 WARNING: In general, you should be using "my" instead of "local",
779 because it's faster and safer. Exceptions to this include the global
780 punctuation variables, global filehandles and formats, and direct
781 manipulation of the Perl symbol table itself. "local" is mostly used
782 when the current value of a variable must be visible to called
783 subroutines.
784
785 Synopsis:
786
787 # localization of values
788
789 local $foo; # make $foo dynamically local
790 local (@wid, %get); # make list of variables local
791 local $foo = "flurp"; # make $foo dynamic, and init it
792 local @oof = @bar; # make @oof dynamic, and init it
793
794 local $hash{key} = "val"; # sets a local value for this hash entry
795 delete local $hash{key}; # delete this entry for the current block
796 local ($cond ? $v1 : $v2); # several types of lvalues support
797 # localization
798
799 # localization of symbols
800
801 local *FH; # localize $FH, @FH, %FH, &FH ...
802 local *merlyn = *randal; # now $merlyn is really $randal, plus
803 # @merlyn is really @randal, etc
804 local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal
805 local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc
806
807 A "local" modifies its listed variables to be "local" to the enclosing
808 block, "eval", or "do FILE"--and to any subroutine called from within
809 that block. A "local" just gives temporary values to global (meaning
810 package) variables. It does not create a local variable. This is
811 known as dynamic scoping. Lexical scoping is done with "my", which
812 works more like C's auto declarations.
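
The difference is easiest to see when a called subroutine reads the
variable. In this minimal sketch (the names $indent, current_indent()
and nested() are hypothetical), the callee sees the temporary value
only while the localizing block is active:

```perl
use strict;
use warnings;

our $indent = 0;                 # package variable; "local" needs one

sub current_indent { return $indent }   # reads whatever value is current

sub nested {
    local $indent = $indent + 4;        # temporary value for this call
    return current_indent();            # the callee sees the new value
}

my $inside  = nested();
my $outside = current_indent();         # old value restored on exit
print "inside=$inside outside=$outside\n";
```

This prints "inside=4 outside=0". Had $indent been declared with "my"
inside nested(), current_indent() would not have seen the change.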
813
814 Some types of lvalues can be localized as well: hash and array elements
815 and slices, conditionals (provided that their result is always
816 localizable), and symbolic references. As for simple variables, this
817 creates new, dynamically scoped values.
818
819 If more than one variable or expression is given to "local", they must
820 be placed in parentheses. This operator works by saving the current
821 values of those variables in its argument list on a hidden stack and
822 restoring them upon exiting the block, subroutine, or eval. This means
823 that called subroutines can also reference the local variable, but not
824 the global one. The argument list may be assigned to if desired, which
825 allows you to initialize your local variables. (If no initializer is
826 given for a particular variable, it is created with an undefined
827 value.)
828
829 Because "local" is a run-time operator, it gets executed each time
830 through a loop. Consequently, it's more efficient to localize your
831 variables outside the loop.
832
833 Grammatical note on local()
834
835 A "local" is simply a modifier on an lvalue expression. When you
836 assign to a "local"ized variable, the "local" doesn't change whether
837 its list is viewed as a scalar or an array. So
838
839 local($foo) = <STDIN>;
840 local @FOO = <STDIN>;
841
842 both supply a list context to the right-hand side, while
843
844 local $foo = <STDIN>;
845
846 supplies a scalar context.
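
A small sketch can confirm which context each form supplies. It assumes
a hypothetical helper, context(), that reports how it was called by
consulting "wantarray":

```perl
use strict;
use warnings;

our ($x, $y, @z);                 # package variables to localize

# Hypothetical helper: reports the context it was called in.
sub context { return wantarray ? 'list' : 'scalar' }

my @seen;
{
    local ($x) = context();       # parenthesized: list assignment
    local $y   = context();       # bare scalar: scalar assignment
    local @z   = context();       # array: list assignment
    @seen = ($x, $y, $z[0]);
}
print "@seen\n";
```

This prints "list scalar list".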
847
848 Localization of special variables
849
850 If you localize a special variable, you'll be giving a new value to it,
851 but its magic won't go away. That means that all side-effects related
852 to this magic still work with the localized value.
853
854 This feature allows code like this to work:
855
856 # Read the whole contents of FILE in $slurp
857 { local $/ = undef; $slurp = <FILE>; }
858
859 Note, however, that this restricts localization of some values; for
860 example, the following statement dies, as of perl 5.10.0, with the error
861 "Modification of a read-only value attempted", because the $1 variable
862 is magical and read-only:
863
864 local $1 = 2;
865
866 One exception is the default scalar variable: starting with perl 5.14
867 "local($_)" will always strip all magic from $_, to make it possible to
868 safely reuse $_ in a subroutine.
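
To see that a localized special variable keeps its magic, the sketch
below returns to the input record separator. It reads from an in-memory
filehandle (an assumption made only to keep the example self-contained)
and checks that $/ is back to its default once the block exits:

```perl
use strict;
use warnings;

my $text = "one\ntwo\nthree\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my $first = <$fh>;                        # $/ is "\n": reads one line
my $rest  = do { local $/; <$fh> };       # $/ is undef: reads the rest
my $restored = defined $/ && $/ eq "\n";  # the old separator is back

close $fh;
print "first: $first", "rest: $rest";
```

The localized $/ still controls readline, which is exactly the
side-effect the magic provides.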
869
870 WARNING: Localization of tied arrays and hashes does not currently work
871 as described. This will be fixed in a future release of Perl; in the
872 meantime, avoid code that relies on any particular behavior of
873 localising tied arrays or hashes (localising individual elements is
874 still okay). See "Localising Tied Arrays and Hashes Is Broken" in
875 perl58delta for more details.
876
877 Localization of globs
878
879 The construct
880
881 local *name;
882
883 creates a whole new symbol table entry for the glob "name" in the
884 current package. That means that all variables in its glob slot
885 ($name, @name, %name, &name, and the "name" filehandle) are dynamically
886 reset.
887
888 This implies, among other things, that any magic eventually carried by
889 those variables is locally lost. In other words, saying "local */"
890 will not have any effect on the internal value of the input record
891 separator.
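
As a minimal sketch of that reset, using a hypothetical glob called
"name" with a scalar and an array slot filled in:

```perl
use strict;
use warnings;

our $name = 'outer scalar';
our @name = (1, 2, 3);

my ($scalar_inside, $count_inside);
{
    local *name;                              # fresh, empty glob
    $scalar_inside = defined $name ? 1 : 0;   # $name has been reset
    $count_inside  = scalar @name;            # @name has been reset too
}
print "inside: $scalar_inside/$count_inside, ",
      "after: $name/", scalar(@name), "\n";
```

Both slots are empty inside the block and hold their old contents again
afterwards.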
892
893 Localization of elements of composite types
894
895 It's also worth taking a moment to explain what happens when you
896 "local"ize a member of a composite type (i.e. an array or hash
897 element). In this case, the element is "local"ized by name. This
898 means that when the scope of the "local()" ends, the saved value will
899 be restored to the hash element whose key was named in the "local()",
900 or the array element whose index was named in the "local()". If that
901 element was deleted while the "local()" was in effect (e.g. by a
902 "delete()" from a hash or a "shift()" of an array), it will spring back
903 into existence, possibly extending an array and filling in the skipped
904 elements with "undef". For instance, if you say
905
906 %hash = ( 'This' => 'is', 'a' => 'test' );
907 @ary = ( 0..5 );
908 {
909 local($ary[5]) = 6;
910 local($hash{'a'}) = 'drill';
911 while (my $e = pop(@ary)) {
912 print "$e . . .\n";
913 last unless $e > 3;
914 }
915 if (@ary) {
916 $hash{'only a'} = 'test';
917 delete $hash{'a'};
918 }
919 }
920 print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
921 print "The array has ",scalar(@ary)," elements: ",
922 join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";
923
924 Perl will print
925
926 6 . . .
927 4 . . .
928 3 . . .
929 This is a test only a test.
930 The array has 6 elements: 0, 1, 2, undef, undef, 5
931
932 The behavior of local() on non-existent members of composite types is
933 subject to change in future. The behavior of local() on array elements
934 specified using negative indexes is particularly surprising, and is
935 very likely to change.
936
937 Localized deletion of elements of composite types
938
939 You can use the "delete local $array[$idx]" and "delete local
940 $hash{key}" constructs to delete a composite type entry for the current
941 block and restore it when it ends. They return the array/hash value
942 before the localization, which means that they are respectively
943 equivalent to
944
945 do {
946 my $val = $array[$idx];
947 local $array[$idx];
948 delete $array[$idx];
949 $val
950 }
951
952 and
953
954 do {
955 my $val = $hash{key};
956 local $hash{key};
957 delete $hash{key};
958 $val
959 }
960
961 except that for those the "local" is scoped to the "do" block. Slices
962 are also accepted.
963
964 my %hash = (
965 a => [ 7, 8, 9 ],
966 b => 1,
967 );
968
969 {
970 my $a = delete local $hash{a};
971 # $a is [ 7, 8, 9 ]
972 # %hash is (b => 1)
973
974 {
975 my @nums = delete local @$a[0, 2];
976 # @nums is (7, 9)
977 # $a is [ undef, 8 ]
978
979 $a->[0] = 999; # will be erased when the scope ends
980 }
981 # $a is back to [ 7, 8, 9 ]
982
983 }
984 # %hash is back to its original state
985
986 This construct is supported since Perl v5.12.
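
A shorter sketch of the same construct, assuming a hypothetical
settings hash %config and a helper debug_is_set() that checks for one
key:

```perl
use strict;
use warnings;

our %config = (debug => 1, name => 'app');   # hypothetical settings

sub debug_is_set { return exists $config{debug} ? 1 : 0 }

my ($removed, $during);
{
    $removed = delete local $config{debug};  # returns the old value
    $during  = debug_is_set();               # key is gone in this block
}
my $after = debug_is_set();                  # key and value are back
print "removed=$removed during=$during after=$after\n";
```

This prints "removed=1 during=0 after=1".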
987
988 Lvalue subroutines
989 It is possible to return a modifiable value from a subroutine. To do
990 this, you have to declare the subroutine to return an lvalue.
991
992 my $val;
993 sub canmod : lvalue {
994 $val; # or: return $val;
995 }
996 sub nomod {
997 $val;
998 }
999
1000 canmod() = 5; # assigns to $val
1001 nomod() = 5; # ERROR
1002
1003 The scalar/list context for the subroutine and for the right-hand side
1004 of assignment is determined as if the subroutine call is replaced by a
1005 scalar. For example, consider:
1006
1007 data(2,3) = get_data(3,4);
1008
1009 Both subroutines here are called in a scalar context, while in:
1010
1011 (data(2,3)) = get_data(3,4);
1012
1013 and in:
1014
1015 (data(2),data(3)) = get_data(3,4);
1016
1017 all the subroutines are called in a list context.
1018
1019 Lvalue subroutines are convenient, but you have to keep in mind that,
1020 when used with objects, they may violate encapsulation. A normal
1021 mutator can check the supplied argument before setting the attribute it
1022 is protecting; an lvalue subroutine cannot. If you require any special
1023 processing when storing and retrieving the values, consider using the
1024 CPAN module Sentinel or something similar.
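
The following sketch shows an lvalue subroutine used as an accessor for
a hash element. The names %settings and setting() are made up for the
example:

```perl
use strict;
use warnings;

my %settings = (volume => 3);      # hypothetical data store

# The last expression evaluated is the lvalue handed back to the caller.
sub setting : lvalue {
    my ($key) = @_;
    $settings{$key};
}

setting('volume') = 11;            # assigns straight into %settings
print "volume is $settings{volume}\n";
```

Nothing in setting() gets a chance to inspect the value 11 before it is
stored, which is the encapsulation concern described above.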
1025
1026 Lexical Subroutines
1027 Beginning with Perl 5.18, you can declare a private subroutine with
1028 "my" or "state". As with state variables, the "state" keyword is only
1029 available under "use feature 'state'" or "use 5.010" or higher.
1030
1031 Prior to Perl 5.26, lexical subroutines were deemed experimental and
1032 were available only under the "use feature 'lexical_subs'" pragma.
1033 They also produced a warning unless the "experimental::lexical_subs"
1034 warnings category was disabled.
1035
1036 These subroutines are only visible within the block in which they are
1037 declared, and only after that declaration:
1038
1039 # Include these two lines if your code is intended to run under Perl
1040 # versions earlier than 5.26.
1041 no warnings "experimental::lexical_subs";
1042 use feature 'lexical_subs';
1043
1044 foo(); # calls the package/global subroutine
1045 state sub foo {
1046 foo(); # also calls the package subroutine
1047 }
1048 foo(); # calls "state" sub
1049 my $ref = \&foo; # take a reference to "state" sub
1050
1051 my sub bar { ... }
1052 bar(); # calls "my" sub
1053
1054 You can't (directly) write a recursive lexical subroutine:
1055
1056 # WRONG
1057 my sub baz {
1058 baz();
1059 }
1060
1061 This example fails because "baz()" refers to the package/global
1062 subroutine "baz", not the lexical subroutine currently being defined.
1063
1064 The solution is to use "__SUB__":
1065
1066 my sub baz {
1067 __SUB__->(); # calls itself
1068 }
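
Note that "__SUB__" is only available under "use feature 'current_sub'"
or "use v5.16" or higher. A runnable sketch, with a hypothetical
factorial() as the recursive lexical subroutine:

```perl
use v5.26;          # stable lexical subs; also enables "current_sub"
use warnings;

my sub factorial {
    my ($n) = @_;
    return 1 if $n <= 1;
    return $n * __SUB__->($n - 1);   # recurse without naming the sub
}

my $result = factorial(5);
print "5! = $result\n";
```

This prints "5! = 120".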
1069
1070 It is possible to predeclare a lexical subroutine. The "sub foo {...}"
1071 subroutine definition syntax respects any previous "my sub;" or "state
1072 sub;" declaration. Using this to define recursive subroutines is a bad
1073 idea, however:
1074
1075 my sub baz; # predeclaration
1076 sub baz { # define the "my" sub
1077 baz(); # WRONG: calls itself, but leaks memory
1078 }
1079
1080 Just like "my $f; $f = sub { $f->() }", this example leaks memory. The
1081 name "baz" is a reference to the subroutine, and the subroutine uses
1082 the name "baz"; they keep each other alive (see "Circular References"
1083 in perlref).
1084
1085 "state sub" vs "my sub"
1086
1087 What is the difference between "state" subs and "my" subs? Each time
1088 that execution enters a block in which "my" subs are declared, a new copy
1089 of each sub is created. "State" subroutines persist from one execution
1090 of the containing block to the next.
1091
1092 So, in general, "state" subroutines are faster. But "my" subs are
1093 necessary if you want to create closures:
1094
1095 sub whatever {
1096 my $x = shift;
1097 my sub inner {
1098 ... do something with $x ...
1099 }
1100 inner();
1101 }
1102
1103 In this example, a new $x is created when "whatever" is called, and
1104 also a new "inner", which can see the new $x. A "state" sub will only
1105 see the $x from the first call to "whatever".
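
A runnable version of the "my" sub case might look like this; the name
describe() is invented for the sketch:

```perl
use v5.26;
use warnings;

sub describe {
    my $x = shift;
    my sub inner {            # a fresh closure on every call to describe()
        return "x is $x";
    }
    return inner();
}

my @results = (describe(1), describe(2));
print join(', ', @results), "\n";
```

This prints "x is 1, x is 2", because each call to describe() creates a
new "inner" bound to that call's $x.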
1106
1107 "our" subroutines
1108
1109 Like "our $variable", "our sub" creates a lexical alias to the package
1110 subroutine of the same name.
1111
1112 The two main uses for this are to switch back to using the package sub
1113 inside an inner scope:
1114
1115 sub foo { ... }
1116
1117 sub bar {
1118 my sub foo { ... }
1119 {
1120 # need to use the outer foo here
1121 our sub foo;
1122 foo();
1123 }
1124 }
1125
1126 and to make a subroutine visible to other packages in the same scope:
1127
1128 package MySneakyModule;
1129
1130 our sub do_something { ... }
1131
1132 sub do_something_with_caller {
1133 package DB;
1134 () = caller 1; # sets @DB::args
1135 do_something(@args); # uses MySneakyModule::do_something
1136 }
1137
1138 Passing Symbol Table Entries (typeglobs)
1139 WARNING: The mechanism described in this section was originally the
1140 only way to simulate pass-by-reference in older versions of Perl.
1141 While it still works fine in modern versions, the new reference
1142 mechanism is generally easier to work with. See below.
1143
1144 Sometimes you don't want to pass the value of an array to a subroutine
1145 but rather the name of it, so that the subroutine can modify the global
1146 copy of it rather than working with a local copy. In perl you can
1147 refer to all objects of a particular name by prefixing the name with a
1148 star: *foo. This is often known as a "typeglob", because the star on
1149 the front can be thought of as a wildcard match for all the funny
1150 prefix characters on variables and subroutines and such.
1151
1152 When evaluated, the typeglob produces a scalar value that represents
1153 all the objects of that name, including any filehandle, format, or
1154 subroutine. When assigned to, it causes the name mentioned to refer to
1155 whatever "*" value was assigned to it. Example:
1156
1157 sub doubleary {
1158 local(*someary) = @_;
1159 foreach $elem (@someary) {
1160 $elem *= 2;
1161 }
1162 }
1163 doubleary(*foo);
1164 doubleary(*bar);
1165
1166 Scalars are already passed by reference, so you can modify scalar
1167 arguments without using this mechanism by referring explicitly to $_[0]
1168 etc. You can modify all the elements of an array by passing all the
1169 elements as scalars, but you have to use the "*" mechanism (or the
1170 equivalent reference mechanism) to "push", "pop", or change the size of
1171 an array. It will certainly be faster to pass the typeglob (or
1172 reference).
1173
1174 Even if you don't want to modify an array, this mechanism is useful for
1175 passing multiple arrays in a single LIST, because normally the LIST
1176 mechanism will merge all the array values so that you can't extract out
1177 the individual arrays. For more on typeglobs, see "Typeglobs and
1178 Filehandles" in perldata.
1179
1180 When to Still Use local()
1181 Despite the existence of "my", there are three places where the
1182 "local" operator still shines. In fact, in these three places, you
1183 must use "local" instead of "my".
1184
1185 1. You need to give a global variable a temporary value, especially
1186 $_.
1187
1188 The global variables, like @ARGV or the punctuation variables, must
1189 be "local"ized with "local()". This block reads in /etc/motd, and
1190 splits it up into chunks separated by lines of equal signs, which
1191 are placed in @Fields.
1192
1193 {
1194 local @ARGV = ("/etc/motd");
1195 local $/ = undef;
1196 local $_ = <>;
1197 @Fields = split /^\s*=+\s*$/;
1198 }
1199
1200 In particular, it's important to "local"ize $_ in any routine that
1201 assigns to it. Look out for implicit assignments in "while"
1202 conditionals.
1203
1204 2. You need to create a local file or directory handle or a local
1205 function.
1206
1207 A function that needs a filehandle of its own must use "local()" on
1208 a complete typeglob. This can be used to create new symbol table
1209 entries:
1210
1211 sub ioqueue {
1212 local (*READER, *WRITER); # not my!
1213 pipe (READER, WRITER) or die "pipe: $!";
1214 return (*READER, *WRITER);
1215 }
1216 ($head, $tail) = ioqueue();
1217
1218 See the Symbol module for a way to create anonymous symbol table
1219 entries.
1220
1221 Because assignment of a reference to a typeglob creates an alias,
1222 this can be used to create what is effectively a local function, or
1223 at least, a local alias.
1224
1225 {
1226 local *grow = \&shrink; # only until this block exits
1227 grow(); # really calls shrink()
1228 move(); # if move() grow()s, it shrink()s too
1229 }
1230 grow(); # get the real grow() again
1231
1232 See "Function Templates" in perlref for more about manipulating
1233 functions by name in this way.
1234
1235 3. You want to temporarily change just one element of an array or
1236 hash.
1237
1238 You can "local"ize just one element of an aggregate. Usually this
1239 is done on dynamics:
1240
1241 {
1242 local $SIG{INT} = 'IGNORE';
1243 funct(); # uninterruptible
1244 }
1245 # interruptibility automatically restored here
1246
1247 But it also works on lexically declared aggregates.
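
For the third case, here is a minimal sketch of localizing one element
of a lexically declared hash. The names %option and mode() are
hypothetical:

```perl
use strict;
use warnings;

my %option = (verbose => 0);        # lexical aggregate

sub mode { return $option{verbose} ? 'loud' : 'quiet' }

my $during;
{
    local $option{verbose} = 1;     # the element is localized, not the hash
    $during = mode();
}
my $after = mode();
print "during=$during after=$after\n";
```

This prints "during=loud after=quiet".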
1248
1249 Pass by Reference
1250 If you want to pass more than one array or hash into a function--or
1251 return them from it--and have them maintain their integrity, then
1252 you're going to have to use an explicit pass-by-reference. Before you
1253 do that, you need to understand references as detailed in perlref.
1254 This section may not make much sense to you otherwise.
1255
1256 Here are a few simple examples. First, let's pass in several arrays to
1257 a function and have it "pop" all of them, returning a new list of all
1258 their former last elements:
1259
1260 @tailings = popmany ( \@a, \@b, \@c, \@d );
1261
1262 sub popmany {
1263 my $aref;
1264 my @retlist;
1265 foreach $aref ( @_ ) {
1266 push @retlist, pop @$aref;
1267 }
1268 return @retlist;
1269 }
1270
1271 Here's how you might write a function that returns a list of keys
1272 occurring in all the hashes passed to it:
1273
1274 @common = inter( \%foo, \%bar, \%joe );
1275 sub inter {
1276 my ($k, $href, %seen); # locals
1277 foreach $href (@_) {
1278 while ( $k = each %$href ) {
1279 $seen{$k}++;
1280 }
1281 }
1282 return grep { $seen{$_} == @_ } keys %seen;
1283 }
1284
1285 So far, we're using just the normal list return mechanism. What
1286 happens if you want to pass or return a hash? Well, if you're using
1287 only one of them, or you don't mind them concatenating, then the normal
1288 calling convention is ok, although a little expensive.
1289
1290 Where people get into trouble is here:
1291
1292 (@a, @b) = func(@c, @d);
1293 or
1294 (%a, %b) = func(%c, %d);
1295
1296 That syntax simply won't work. It sets just @a or %a and clears @b or
1297 %b. Moreover, the function was never passed two separate arrays or
1298 hashes: it got one long list in @_, as always.
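
The flattening is easy to observe. In this sketch, pass_through() is a
hypothetical function that simply returns its argument list:

```perl
use strict;
use warnings;

sub pass_through { return @_ }      # hypothetical: returns what it got

my @c = (1, 2);
my @d = (3, 4, 5);

my (@first, @second) = pass_through(@c, @d);
printf "first has %d elements, second has %d\n",
    scalar @first, scalar @second;
```

This prints "first has 5 elements, second has 0": the first array
swallows the whole returned list.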
1299
1300 If you can arrange for everyone to deal with this through references,
1301 it's cleaner code, although not so nice to look at. Here's a function
1302 that takes two array references as arguments, returning the two array
1303 references ordered by how many elements each array holds, largest first:
1304
1305 ($aref, $bref) = func(\@c, \@d);
1306 print "@$aref has more than @$bref\n";
1307 sub func {
1308 my ($cref, $dref) = @_;
1309 if (@$cref > @$dref) {
1310 return ($cref, $dref);
1311 } else {
1312 return ($dref, $cref);
1313 }
1314 }
1315
1316 It turns out that you can actually do this also:
1317
1318 (*a, *b) = func(\@c, \@d);
1319 print "@a has more than @b\n";
1320 sub func {
1321 local (*c, *d) = @_;
1322 if (@c > @d) {
1323 return (\@c, \@d);
1324 } else {
1325 return (\@d, \@c);
1326 }
1327 }
1328
1329 Here we're using the typeglobs to do symbol table aliasing. It's a tad
1330 subtle, though, and also won't work if you're using "my" variables,
1331 because only globals (even in disguise as "local"s) are in the symbol
1332 table.
1333
1334 If you're passing around filehandles, you could usually just use the
1335 bare typeglob, like *STDOUT, but typeglob references work, too. For
1336 example:
1337
1338 splutter(\*STDOUT);
1339 sub splutter {
1340 my $fh = shift;
1341 print $fh "her um well a hmmm\n";
1342 }
1343
1344 $rec = get_rec(\*STDIN);
1345 sub get_rec {
1346 my $fh = shift;
1347 return scalar <$fh>;
1348 }
1349
1350 If you're planning on generating new filehandles, you could do this.
1351 Note that you pass back just the bare *FH, not a reference to it.
1352
1353 sub openit {
1354 my $path = shift;
1355 local *FH;
1356 return open (FH, $path) ? *FH : undef;
1357 }
1358
1359 Prototypes
1360 Perl supports a very limited kind of compile-time argument checking
1361 using function prototyping. This can be declared in either the PROTO
1362 section or with a prototype attribute. If you declare either of
1363
1364 sub mypush (\@@)
1365 sub mypush :prototype(\@@)
1366
1367 then "mypush()" takes arguments exactly like "push()" does.
1368
1369 If subroutine signatures are enabled (see "Signatures"), then the
1370 shorter PROTO syntax is unavailable, because it would clash with
1371 signatures. In that case, a prototype can only be declared in the form
1372 of an attribute.
1373
1374 The function declaration must be visible at compile time. The
1375 prototype affects only interpretation of new-style calls to the
1376 function, where new-style is defined as not using the "&" character.
1377 In other words, if you call it like a built-in function, then it
1378 behaves like a built-in function. If you call it like an old-fashioned
1379 subroutine, then it behaves like an old-fashioned subroutine. It
1380 naturally falls out from this rule that prototypes have no influence on
1381 subroutine references like "\&foo" or on indirect subroutine calls like
1382 "&{$subref}" or "$subref->()".
1383
1384 Method calls are not influenced by prototypes either, because the
1385 function to be called is indeterminate at compile time, since the exact
1386 code called depends on inheritance.
1387
1388 Because the intent of this feature is primarily to let you define
1389 subroutines that work like built-in functions, here are prototypes for
1390 some other functions that parse almost exactly like the corresponding
1391 built-in.
1392
1393 Declared as Called as
1394
1395 sub mylink ($$) mylink $old, $new
1396 sub myvec ($$$) myvec $var, $offset, 1
1397 sub myindex ($$;$) myindex &getstring, "substr"
1398 sub mysyswrite ($$$;$) mysyswrite $buf, 0, length($buf) - $off, $off
1399 sub myreverse (@) myreverse $a, $b, $c
1400 sub myjoin ($@) myjoin ":", $a, $b, $c
1401 sub mypop (\@) mypop @array
1402 sub mysplice (\@$$@) mysplice @array, 0, 2, @pushme
1403 sub mykeys (\[%@]) mykeys $hashref->%*
1404 sub myopen (*;$) myopen HANDLE, $name
1405 sub mypipe (**) mypipe READHANDLE, WRITEHANDLE
1406 sub mygrep (&@) mygrep { /foo/ } $a, $b, $c
1407 sub myrand (;$) myrand 42
1408 sub mytime () mytime
1409
1410 Any backslashed prototype character represents an actual argument that
1411 must start with that character (optionally preceded by "my", "our" or
1412 "local"), with the exception of "$", which will accept any scalar
1413 lvalue expression, such as "$foo = 7" or "my_function()->[0]". The
1414 value passed as part of @_ will be a reference to the actual argument
1415 given in the subroutine call, obtained by applying "\" to that
1416 argument.
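
As a sketch of how a backslashed prototype character behaves, here is
one possible body for the "mypush()" declared earlier. The body and the
@stack variable are illustrative only:

```perl
use strict;
use warnings;

# Declared before use so the prototype is visible at compile time.
sub mypush (\@@) {
    my ($aref, @items) = @_;     # $aref is \@stack, supplied by the parser
    push @$aref, @items;
    return scalar @$aref;
}

my @stack = ('a');
my $size  = mypush @stack, 'b', 'c';
print "size=$size stack=@stack\n";
```

This prints "size=3 stack=a b c". The caller writes @stack, but the
subroutine receives a reference to it as its first argument.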
1417
1418 You can use the "\[]" backslash group notation to specify more than one
1419 allowed argument type. For example:
1420
1421 sub myref (\[$@%&*])
1422
1423 will allow calling myref() as
1424
1425 myref $var
1426 myref @array
1427 myref %hash
1428 myref &sub
1429 myref *glob
1430
1431 and the first argument of myref() will be a reference to a scalar, an
1432 array, a hash, a subroutine, or a glob.
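
A reduced sketch of the group notation, using a hypothetical kind_of()
that accepts a scalar, an array, or a hash and reports what kind of
reference arrived:

```perl
use strict;
use warnings;

sub kind_of (\[$@%]) { return ref $_[0] }   # hypothetical helper

my ($scalar, @array, %hash);
my @kinds = (kind_of($scalar), kind_of(@array), kind_of(%hash));
print "@kinds\n";
```

This prints "SCALAR ARRAY HASH".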
1433
1434 Unbackslashed prototype characters have special meanings. Any
1435 unbackslashed "@" or "%" eats all remaining arguments, and forces list
1436 context. An argument represented by "$" forces scalar context. An "&"
1437 requires an anonymous subroutine, which, if passed as the first
1438 argument, does not require the "sub" keyword or a subsequent comma.
1439
1440 A "*" allows the subroutine to accept a bareword, constant, scalar
1441 expression, typeglob, or a reference to a typeglob in that slot. The
1442 value will be available to the subroutine either as a simple scalar, or
1443 (in the latter two cases) as a reference to the typeglob. If you wish
1444 to always convert such arguments to a typeglob reference, use
1445 Symbol::qualify_to_ref() as follows:
1446
1447 use Symbol 'qualify_to_ref';
1448
1449 sub foo (*) {
1450 my $fh = qualify_to_ref(shift, caller);
1451 ...
1452 }
1453
1454 The "+" prototype is a special alternative to "$" that will act like
1455 "\[@%]" when given a literal array or hash variable, but will otherwise
1456 force scalar context on the argument. This is useful for functions
1457 which should accept either a literal array or an array reference as the
1458 argument:
1459
1460 sub mypush (+@) {
1461 my $aref = shift;
1462 die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
1463 push @$aref, @_;
1464 }
1465
1466 When using the "+" prototype, your function must check that the
1467 argument is of an acceptable type.
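
The calling side can be sketched as follows, with a hypothetical
count_items() that takes a single "+" argument:

```perl
use strict;
use warnings;

sub count_items (+) {
    my $aref = shift;
    die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
    return scalar @$aref;
}

my @list = (1, 2, 3);
my @counts = (
    count_items(@list),      # literal array: passed as \@list
    count_items(\@list),     # explicit reference: passed through
    count_items([4, 5]),     # anonymous array reference
);
print "@counts\n";
```

This prints "3 3 2".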
1468
1469 A semicolon (";") separates mandatory arguments from optional
1470 arguments. It is redundant before "@" or "%", which gobble up
1471 everything else.
1472
1473 As the last character of a prototype, or just before a semicolon, a "@"
1474 or a "%", you can use "_" in place of "$": if this argument is not
1475 provided, $_ will be used instead.
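
For example, with a hypothetical one-argument function shout():

```perl
use strict;
use warnings;

sub shout (_) { return uc $_[0] }    # hypothetical helper

my @loud;
for (qw(red green)) {
    push @loud, shout();             # no argument: $_ is supplied
}
push @loud, shout('blue');           # an explicit argument is used as is
print "@loud\n";
```

This prints "RED GREEN BLUE".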
1476
1477 Note how the last three examples in the table above are treated
1478 specially by the parser. "mygrep()" is parsed as a true list operator,
1479 "myrand()" is parsed as a true unary operator with unary precedence the
1480 same as "rand()", and "mytime()" is truly without arguments, just like
1481 "time()". That is, if you say
1482
1483 mytime +2;
1484
1485 you'll get "mytime() + 2", not mytime(2), which is how it would be
1486 parsed without a prototype. If you want to force a unary function to
1487 have the same precedence as a list operator, add ";" to the end of the
1488 prototype:
1489
1490 sub mygetprotobynumber($;);
1491 mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b)
1492
1493 The interesting thing about "&" is that you can generate new syntax
1494 with it, provided it's in the initial position:
1495
1496 sub try (&@) {
1497 my($try,$catch) = @_;
1498 eval { &$try };
1499 if ($@) {
1500 local $_ = $@;
1501 &$catch;
1502 }
1503 }
1504 sub catch (&) { $_[0] }
1505
1506 try {
1507 die "phooey";
1508 } catch {
1509 /phooey/ and print "unphooey\n";
1510 };
1511
1512 That prints "unphooey". (Yes, there are still unresolved issues having
1513 to do with visibility of @_. I'm ignoring that question for the
1514 moment. (But note that if we make @_ lexically scoped, those anonymous
1515 subroutines can act like closures... (Gee, is this sounding a little
1516 Lispish? (Never mind.))))
1517
1518 And here's a reimplementation of the Perl "grep" operator:
1519
1520 sub mygrep (&@) {
1521 my $code = shift;
1522 my @result;
1523 foreach $_ (@_) {
1524 push(@result, $_) if &$code;
1525 }
1526 @result;
1527 }
1528
1529 Some folks would prefer full alphanumeric prototypes. Alphanumerics
1530 have been intentionally left out of prototypes for the express purpose
1531 of someday in the future adding named, formal parameters. The current
1532 mechanism's main goal is to let module writers provide better
1533 diagnostics for module users. Larry feels the notation quite
1534 understandable to Perl programmers, and that it will not intrude
1535 greatly upon the meat of the module, nor make it harder to read. The
1536 line noise is visually encapsulated into a small pill that's easy to
1537 swallow.
1538
1539 If you try to use an alphanumeric sequence in a prototype you will
1540 generate an optional warning - "Illegal character in prototype...".
1541 Unfortunately earlier versions of Perl allowed the prototype to be used
1542 as long as its prefix was a valid prototype. The warning may be
1543 upgraded to a fatal error in a future version of Perl once the majority
1544 of offending code is fixed.
1545
1546 It's probably best to prototype new functions, not retrofit prototyping
1547 into older ones. That's because you must be especially careful about
1548 silent impositions of differing list versus scalar contexts. For
1549 example, if you decide that a function should take just one parameter,
1550 like this:
1551
1552 sub func ($) {
1553 my $n = shift;
1554 print "you gave me $n\n";
1555 }
1556
1557 and someone has been calling it with an array or expression returning a
1558 list:
1559
1560 func(@foo);
1561 func( $text =~ /\w+/g );
1562
1563 Then you've just supplied an automatic "scalar" in front of their
1564 argument, which can be more than a bit surprising. The old @foo which
1565 used to hold one thing doesn't get passed in. Instead, "func()" now
1566 gets passed in a 1; that is, the number of elements in @foo. And the
1567 "m//g" gets called in scalar context so instead of a list of words it
1568 returns a boolean result and advances "pos($text)". Ouch!
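
The effect can be reproduced with a variant of "func()" that returns
its message instead of printing it (a change made only so the result
can be inspected):

```perl
use strict;
use warnings;

sub func ($) {
    my $n = shift;
    return "you gave me $n";
}

my @foo = ('x', 'y', 'z');
my $new_style = func(@foo);      # prototype applies: scalar(@foo) is passed
my $old_style = &func(@foo);     # "&" bypasses the prototype: list is passed
print "$new_style\n$old_style\n";
```

This prints "you gave me 3" followed by "you gave me x".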
1569
1570 If a sub has both a PROTO and a BLOCK, the prototype is not applied
1571 until after the BLOCK is completely defined. This means that a
1572 recursive function with a prototype has to be predeclared for the
1573 prototype to take effect, like so:
1574
1575 sub foo($$);
1576 sub foo($$) {
1577 foo 1, 2;
1578 }
1579
1580 This is all very powerful, of course, and should be used only in
1581 moderation to make the world a better place.
1582
1583 Constant Functions
1584 Functions with a prototype of "()" are potential candidates for
1585 inlining. If the result after optimization and constant folding is
1586 either a constant or a lexically-scoped scalar which has no other
1587 references, then it will be used in place of function calls made
1588 without "&". Calls made using "&" are never inlined. (See constant
1589 for an easy way to declare most constants.)
1590
1591 The following functions would all be inlined:
1592
1593 sub pi () { 3.14159 } # Not exact, but close.
1594 sub PI () { 4 * atan2 1, 1 } # As good as it gets,
1595 # and it's inlined, too!
1596 sub ST_DEV () { 0 }
1597 sub ST_INO () { 1 }
1598
1599 sub FLAG_FOO () { 1 << 8 }
1600 sub FLAG_BAR () { 1 << 9 }
1601 sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }
1602
1603 sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) }
1604
1605 sub N () { int(OPT_BAZ) / 3 }
1606
1607 sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO }
1608 sub FOO_SET2 () { if (FLAG_MASK & FLAG_FOO) { 1 } }
1609
1610 (Be aware that the last example was not always inlined in Perl 5.20 and
1611 earlier, which did not behave consistently with subroutines containing
1612 inner scopes.) You can countermand inlining by using an explicit
1613 "return":
1614
1615 sub baz_val () {
1616 if (OPT_BAZ) {
1617 return 23;
1618 }
1619 else {
1620 return 42;
1621 }
1622 }
1623 sub bonk_val () { return 12345 }
1624
1625 As alluded to earlier you can also declare inlined subs dynamically at
1626 BEGIN time if their body consists of a lexically-scoped scalar which
1627 has no other references. Only the first example here will be inlined:
1628
1629 BEGIN {
1630 my $var = 1;
1631 no strict 'refs';
1632 *INLINED = sub () { $var };
1633 }
1634
1635 BEGIN {
1636 my $var = 1;
1637 my $ref = \$var;
1638 no strict 'refs';
1639 *NOT_INLINED = sub () { $var };
1640 }
1641
1642 A not so obvious caveat with this (see [RT #79908]) is that the
1643 variable will be immediately inlined, and will stop behaving like a
1644 normal lexical variable, e.g. this will print 79907, not 79908:
1645
1646 BEGIN {
1647 my $x = 79907;
1648 *RT_79908 = sub () { $x };
1649 $x++;
1650 }
1651 print RT_79908(); # prints 79907
1652
As of Perl 5.22, this buggy behavior, while preserved for backward
compatibility, is detected and emits a deprecation warning. If you
want the subroutine to be inlined (with no warning), make sure the
variable is not used in a context where it could be modified aside from
where it is declared.

    # Fine, no warning
    BEGIN {
        my $x = 54321;
        *INLINED = sub () { $x };
    }
    # Warns. Future Perl versions will stop inlining it.
    BEGIN {
        my $x;
        $x = 54321;
        *ALSO_INLINED = sub () { $x };
    }
1670
Perl 5.22 also introduces the experimental "const" attribute as an
alternative. (Disable the "experimental::const_attr" warnings if you
want to use it.) When applied to an anonymous subroutine, it forces
the sub to be called when the "sub" expression is evaluated. The
return value is captured and turned into a constant subroutine:

    my $x = 54321;
    *INLINED = sub : const { $x };
    $x++;
1680
The return value of "INLINED" in this example will always be 54321,
regardless of later modifications to $x. You can also put any
arbitrary code inside the sub, and it will be executed immediately and
its return value captured the same way.
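A short sketch of both points follows. The variable names are
illustrative only; the second sub uses a counter to show that the body
runs once, when the "sub" expression is evaluated, and not again on
later calls. This assumes Perl 5.22 or later:

```perl
use strict;
use warnings;
no warnings 'experimental::const_attr';

my $x = 54321;
my $frozen = sub : const { $x };    # body runs here; its result is captured
$x++;                               # too late to affect $frozen

my $runs = 0;
my $once = sub : const { ++$runs; 'done' };    # arbitrary code, run immediately
$once->() for 1 .. 3;               # these calls only return the captured value

print "frozen value: ", $frozen->(), "\n";
print "body executed $runs time(s)\n";
```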
1685
If you really want a subroutine with a "()" prototype that returns a
lexical variable, you can easily force it not to be inlined by adding an
explicit "return":

    BEGIN {
        my $x = 79907;
        *RT_79908 = sub () { return $x };
        $x++;
    }
    print RT_79908(); # prints 79908
1696
The easiest way to tell if a subroutine was inlined is by using
B::Deparse. Consider this example of two subroutines returning 1, one
with a "()" prototype causing it to be inlined, and one without (with
deparse output truncated for clarity):

    $ perl -MO=Deparse -le 'sub ONE { 1 } if (ONE) { print ONE if ONE }'
    sub ONE {
        1;
    }
    if (ONE ) {
        print ONE() if ONE ;
    }
    $ perl -MO=Deparse -le 'sub ONE () { 1 } if (ONE) { print ONE if ONE }'
    sub ONE () { 1 }
    do {
        print 1
    };
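The same check can be made from inside a program with the coderef2text
method of B::Deparse. This is only a rough sketch: the sub names are
made up for the example, and the test simply looks for the callee's
name in the deparsed text of its caller.

```perl
use strict;
use warnings;
use B::Deparse;

sub PLAIN       { 1 }    # no prototype: a real call at run time
sub CONSTANT () { 1 }    # empty prototype: eligible for inlining

# Two hypothetical callers, compiled after the subs above are known.
sub uses_plain    { PLAIN }
sub uses_constant { CONSTANT }

my $deparse       = B::Deparse->new;
my $plain_body    = $deparse->coderef2text(\&uses_plain);
my $constant_body = $deparse->coderef2text(\&uses_constant);

# If the callee's name no longer appears in the caller, the call was
# replaced by its value at compile time.
my $plain_inlined    = $plain_body    !~ /PLAIN/;
my $constant_inlined = $constant_body !~ /CONSTANT/;

print "PLAIN inlined: ",    ($plain_inlined    ? "yes" : "no"), "\n";
print "CONSTANT inlined: ", ($constant_inlined ? "yes" : "no"), "\n";
```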
1714
If you redefine a subroutine that was eligible for inlining, you'll get
a warning by default. You can use this warning to tell whether or not
a particular subroutine is considered inlinable, since it's different
from the warning for overriding non-inlined subroutines:

    $ perl -e 'sub one () {1} sub one () {2}'
    Constant subroutine one redefined at -e line 1.
    $ perl -we 'sub one {1} sub one {2}'
    Subroutine one redefined at -e line 1.
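To make the same distinction from within a program, the warnings can be
captured with a $SIG{__WARN__} handler. A minimal sketch, assuming a
throwaway package name and a string eval so that the deliberate
redefinitions stay out of the main program:

```perl
use strict;
use warnings;

my @messages;
{
    local $SIG{__WARN__} = sub { push @messages, $_[0] };

    # Each sub is deliberately defined twice to provoke the warnings.
    eval q{
        package Demo::Redefine;               # hypothetical package name
        sub inlinable ()     { 1 }
        sub inlinable ()     { 2 }
        sub not_inlinable () { return 1 }     # explicit return defeats inlining
        sub not_inlinable () { return 2 }
        1;
    } or die $@;
}

my ($constant_msg) = grep { /^Constant subroutine/ } @messages;
my ($plain_msg)    = grep { /^Subroutine/ }          @messages;
print for @messages;
```

The wording of the captured message shows which of the two subs Perl
regarded as a constant.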
1724
The warning is considered severe enough not to be affected by the -w
switch (or its absence) because previously compiled invocations of the
function will still be using the old value of the function. If you
need to be able to redefine the subroutine, you need to ensure that it
isn't inlined, either by dropping the "()" prototype (which changes
calling semantics, so beware) or by thwarting the inlining mechanism in
some other way, e.g. by adding an explicit "return", as mentioned
above:

    sub not_inlined () { return 23 }
1735
Overriding Built-in Functions
Many built-in functions may be overridden, though this should be tried
only occasionally and for good reason. Typically this might be done by
a package attempting to emulate missing built-in functionality on a
non-Unix system.
1741
Overriding may be done only by importing the name from a module at
compile time--ordinary predeclaration isn't good enough. However, the
"use subs" pragma lets you, in effect, predeclare subs via the import
syntax, and these names may then override built-in ones:

    use subs 'chdir', 'chroot', 'chmod', 'chown';
    chdir $somewhere;
    sub chdir { ... }
1750
To unambiguously refer to the built-in form, precede the built-in name
with the special package qualifier "CORE::". For example, saying
"CORE::open()" always refers to the built-in "open()", even if the
current package has imported some other subroutine called "&open()"
from elsewhere. Even though it looks like a regular function call, it
isn't: the CORE:: prefix in that case is part of Perl's syntax, and
works for any keyword, regardless of what is in the CORE package.
Taking a reference to it, that is, "\&CORE::open", only works for some
keywords. See CORE.
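As a small illustration of both points, the sketch below overrides
"sleep" in the current package so that it records the request instead of
pausing, and then reaches the real built-in through "CORE::". The @naps
array and the recording behavior are assumptions made for the example:

```perl
use strict;
use warnings;
use subs 'sleep';    # predeclare via import so that the override takes effect

our @naps;           # durations requested through the override

# Hypothetical replacement: note the request and return at once.
sub sleep {
    my ($seconds) = @_;
    push @naps, $seconds;
    return $seconds;
}

sleep 2;             # calls main::sleep and does not pause
sleep 5;
CORE::sleep(0);      # always the built-in, whatever has been imported

print "requested naps: @naps\n";
```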
1760
Library modules should not in general export built-in names like "open"
or "chdir" as part of their default @EXPORT list, because these may
sneak into someone else's namespace and change the semantics
unexpectedly. Instead, if the module adds that name to @EXPORT_OK,
then it's possible for a user to import the name explicitly, but not
implicitly. That is, they could say

    use Module 'open';

and it would import the "open" override. But if they said

    use Module;

they would get the default imports without overrides.
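The difference can be seen in a single file. In this sketch the module
My::Clock and the packages Opted::In and Opted::Out are hypothetical, and
the module is defined inline and registered in %INC so that "use" can
find it. It assumes Perl 5.14 or later for the "package NAME BLOCK"
syntax:

```perl
use strict;
use warnings;

BEGIN {
    package My::Clock;                 # hypothetical module, defined inline
    use Exporter 'import';
    our @EXPORT    = ('ticks');        # harmless name: exported by default
    our @EXPORT_OK = ('time');         # built-in name: exported only on request
    sub ticks { 100 }
    sub time  { 0 }                    # a frozen clock
    $INC{'My/Clock.pm'} = __FILE__;    # let "use My::Clock" find it
}

package Opted::In  { use My::Clock 'time'; sub now { time() } }
package Opted::Out { use My::Clock;        sub now { time() } }

package main;

print "explicit import: ", Opted::In::now(),  "\n";    # the override
print "default import:  ", Opted::Out::now(), "\n";    # the built-in
```

Only the package that asked for "time" by name sees the frozen clock.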
1775
The foregoing mechanism for overriding built-ins is restricted, quite
deliberately, to the package that requests the import. There is a
second method that is sometimes applicable when you wish to override a
built-in everywhere, without regard to namespace boundaries. This is
achieved by importing a sub into the special namespace
"CORE::GLOBAL::". Here is an example that quite brazenly replaces the
"glob" operator with something that understands regular expressions.

    package REGlob;
    require Exporter;
    @ISA = 'Exporter';
    @EXPORT_OK = 'glob';

    sub import {
        my $pkg = shift;
        return unless @_;
        my $sym = shift;
        my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
        $pkg->export($where, $sym, @_);
    }

    sub glob {
        my $pat = shift;
        my @got;
        if (opendir my $d, '.') {
            @got = grep /$pat/, readdir $d;
            closedir $d;
        }
        return @got;
    }
    1;
1807
And here's how it could be (ab)used:

    #use REGlob 'GLOBAL_glob';      # override glob() in ALL namespaces
    package Foo;
    use REGlob 'glob';              # override glob() in Foo:: only
    print for <^[a-z_]+\.pm\$>;     # show all pragmatic modules
1814
The initial comment shows a contrived, even dangerous example. By
overriding "glob" globally, you would be forcing the new (and
subversive) behavior for the "glob" operator for every namespace,
without the complete cognizance or cooperation of the modules that own
those namespaces. Naturally, this should be done with extreme
caution--if it must be done at all.
1821
The "REGlob" example above does not implement all the support needed to
cleanly override Perl's "glob" operator. The built-in "glob" has
different behaviors depending on whether it appears in a scalar or list
context, but our "REGlob" doesn't. Indeed, many Perl built-ins have
such context-sensitive behaviors, and these must be adequately
supported by a properly written override. For a fully functional
example of overriding "glob", study the implementation of
"File::DosGlob" in the standard library.
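To illustrate what supporting both contexts involves, here is a minimal
sketch of a global override of "gmtime" that consults "wantarray". The
call counter and the package name are assumptions made for the example;
it is not a substitute for studying File::DosGlob. It assumes Perl 5.14
or later for the "package NAME BLOCK" syntax:

```perl
use strict;
use warnings;

our $gmtime_calls = 0;    # how often the override has been used

# A global override must be installed at compile time, before the code
# that calls the built-in is compiled; hence the BEGIN block.
BEGIN {
    no warnings 'once';
    *CORE::GLOBAL::gmtime = sub (;$) {
        $gmtime_calls++;
        my $when = defined $_[0] ? $_[0] : CORE::time();
        return wantarray
            ? CORE::gmtime($when)             # list context: nine fields
            : scalar CORE::gmtime($when);     # scalar context: one string
    };
}

package Some::Other::Package {    # the override applies here as well
    sub stamp  { return scalar gmtime($_[0]) }
    sub fields { return gmtime($_[0]) }
}

package main;

my $stamp  = Some::Other::Package::stamp(0);
my @fields = Some::Other::Package::fields(0);
print "$stamp (", scalar(@fields), " fields)\n";
```

Passing the epoch (0) keeps the result independent of the current time.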
1830
When you override a built-in, your replacement should be consistent (if
possible) with the built-in's native syntax. You can achieve this by
using a suitable prototype. To get the prototype of an overridable
built-in, use the "prototype" function with an argument of
"CORE::builtin_name" (see "prototype" in perlfunc).

Note, however, that some built-ins can't have their syntax expressed by a
prototype (such as "system" or "chomp"). If you override them you
won't be able to fully mimic their original syntax.
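The following sketch queries a few built-ins. The choice of names is
arbitrary; "system" and "chomp" are the two cases named above whose
syntax no prototype can express, so "prototype" returns undef for them:

```perl
use strict;
use warnings;

# Map each built-in name to its prototype, or to undef when the syntax
# cannot be described by one.
my %proto = map { $_ => prototype("CORE::$_") } qw(time open system chomp);

for my $name (sort keys %proto) {
    my $shown = defined $proto{$name} ? "'$proto{$name}'" : 'undef';
    print "$name: $shown\n";
}
```

A defined result, even an empty string as for "time", can be copied
into the declaration of the replacement sub.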
1840
The built-ins "do", "require" and "glob" can also be overridden, but
due to special magic, their original syntax is preserved, and you don't
have to define a prototype for their replacements. (You can't override
the "do BLOCK" syntax, though.)

"require" has special additional dark magic: if you invoke your
"require" replacement as "require Foo::Bar", it will actually receive
the argument "Foo/Bar.pm" in @_. See "require" in perlfunc.
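The conversion can be observed with a small global override that
records what it receives before handing over to the real "require".
This is a sketch only: Foo::Bar is a hypothetical module, registered in
%INC so that nothing has to be loaded from disk, and @requested is an
assumed name.

```perl
use strict;
use warnings;

our @requested;    # arguments seen by the override

BEGIN {
    no warnings 'once';

    # Pretend the hypothetical Foo::Bar is already loaded, so that the
    # built-in require has nothing to search for.
    $INC{'Foo/Bar.pm'} = __FILE__;

    *CORE::GLOBAL::require = sub {
        push @requested, $_[0];
        return CORE::require($_[0]);    # delegate to the built-in
    };
}

require Foo::Bar;    # the override receives a file name, not "Foo::Bar"

print "require was asked for: $requested[-1]\n";
```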
1849
And, as you'll have noticed from the previous example, if you override
"glob", the "<*>" glob operator is overridden as well.

In a similar fashion, overriding the "readline" function also overrides
the equivalent I/O operator "<FILEHANDLE>". Likewise, overriding
"readpipe" overrides the operators "``" and "qx//".
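A sketch of the "readpipe" case follows. The replacement here is a
hypothetical stand-in that reports the command instead of running it,
which also makes the example safe to run anywhere:

```perl
use strict;
use warnings;

BEGIN {
    no warnings 'once';
    # Hypothetical stand-in: describe the command, do not execute it.
    *CORE::GLOBAL::readpipe = sub {
        my ($command) = @_;
        return "would run: $command\n";
    };
}

my $from_backticks = `who`;         # goes through the override
my $from_qx        = qx(ls -l);     # so does qx//

print $from_backticks, $from_qx;
```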
1856
Finally, some built-ins (e.g. "exists" or "grep") can't be overridden.
1858
Autoloading
If you call a subroutine that is undefined, you would ordinarily get an
immediate, fatal error complaining that the subroutine doesn't exist.
(Likewise for subroutines being used as methods, when the method
doesn't exist in any base class of the class's package.) However, if
an "AUTOLOAD" subroutine is defined in the package or packages used to
locate the original subroutine, then that "AUTOLOAD" subroutine is
called with the arguments that would have been passed to the original
subroutine. The fully qualified name of the original subroutine
magically appears in the global $AUTOLOAD variable of the same package
as the "AUTOLOAD" routine. The name is not passed as an ordinary
argument because, er, well, just because, that's why. (As an
exception, a method call to a nonexistent "import" or "unimport" method
is just skipped instead. Also, if the AUTOLOAD subroutine is an XSUB,
there are other ways to retrieve the subroutine name. See "Autoloading
with XSUBs" in perlguts for details.)
1875
Many "AUTOLOAD" routines load in a definition for the requested
subroutine using eval(), then execute that subroutine using a special
form of goto() that erases the stack frame of the "AUTOLOAD" routine
without a trace. (See the source to the standard module documented in
AutoLoader, for example.) But an "AUTOLOAD" routine can also just
emulate the routine and never define it. For example, let's pretend
that a function that wasn't defined should just invoke "system" with
those arguments. All you'd do is:

    sub AUTOLOAD {
        our $AUTOLOAD; # keep 'use strict' happy
        my $program = $AUTOLOAD;
        $program =~ s/.*:://;
        system($program, @_);
    }
    date();
    who();
    ls('-l');
1894
In fact, if you predeclare functions you want to call that way, you
don't even need parentheses:

    use subs qw(date who ls);
    date;
    who;
    ls '-l';

A more complete example of this is the Shell module on CPAN, which can
treat undefined subroutine calls as calls to external programs.
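The other style mentioned above, in which "AUTOLOAD" defines the missing
subroutine and then jumps to it with goto, can be sketched as follows.
The Counter class and its fields are hypothetical, and the sketch
installs a closure through a glob assignment instead of using eval():

```perl
package Counter;    # hypothetical class used only for this sketch
use strict;
use warnings;

our $AUTOLOAD;

sub new { return bless { hits => 0, misses => 0 }, shift }

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;
    return if $name eq 'DESTROY';    # never autoload the destructor

    die "No method '$name' in class Counter\n"
        unless ref $_[0] && exists $_[0]{$name};

    # Define a real accessor so that AUTOLOAD is bypassed next time...
    no strict 'refs';
    *{$AUTOLOAD} = sub {
        my $self = shift;
        $self->{$name} = shift if @_;
        return $self->{$name};
    };
    # ...then jump to it, replacing AUTOLOAD's stack frame.
    goto &{$AUTOLOAD};
}

package main;

my $counter = Counter->new;
$counter->hits(3);    # the first call goes through AUTOLOAD
print "hits: ", $counter->hits, "\n";    # now an ordinary method call
```

After the first call the accessor exists as a normal sub, so later calls
no longer pay the cost of the AUTOLOAD lookup.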
1905
Mechanisms are available to help module writers split their modules
into autoloadable files. See the standard AutoLoader module described
in AutoLoader and in AutoSplit, the standard SelfLoader module in
SelfLoader, and the document on adding C functions to Perl code in
perlxs.
1911
Subroutine Attributes
A subroutine declaration or definition may have a list of attributes
associated with it. If such an attribute list is present, it is broken
up at space or colon boundaries and treated as though a "use
attributes" had been seen. See attributes for details about what
attributes are currently supported. Unlike the limitation with the
obsolescent "use attrs", the "sub : ATTRLIST" syntax works to associate
the attributes with a pre-declaration, and not just with a subroutine
definition.
1921
The attributes must be valid as simple identifier names (without any
punctuation other than the '_' character). They may have a parameter
list appended, which is only checked for whether its parentheses
('(',')') nest properly.

Examples of valid syntax (even though the attributes are unknown):

    sub fnord (&\%) : switch(10,foo(7,3)) : expensive;
    sub plugh () : Ugly('\(") :Bad;
    sub xyzzy : _5x5 { ... }

Examples of invalid syntax:

    sub fnord : switch(10,foo(); # ()-string not balanced
    sub snoid : Ugly('(');       # ()-string not balanced
    sub xyzzy : 5x5;             # "5x5" not a valid identifier
    sub plugh : Y2::north;       # "Y2::north" not a simple identifier
    sub snurt : foo + bar;       # "+" not a colon or space
1940
The attribute list is passed as a list of constant strings to the code
which associates them with the subroutine. In particular, the second
example of valid syntax above currently looks like this in terms of how
it's parsed and invoked:

    use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';

For further details on attribute lists and their manipulation, see
attributes and Attribute::Handlers.
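For a package to accept attributes of its own, it can define a
MODIFY_CODE_ATTRIBUTES method, as described in attributes. The sketch
below simply records whatever strings it is handed. The package name,
the %attrs_for hash and the attrs_of helper are assumptions made for the
example:

```perl
package Tagged;    # hypothetical package for this sketch
use strict;
use warnings;

our %attrs_for;    # stringified code reference => attribute strings

# Called at compile time for each sub in this package that carries
# attributes. Returning an empty list means every attribute was accepted.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $code, @attrs) = @_;
    $attrs_for{$code} = [@attrs];
    return;
}

sub fnord : Switch(10,foo(7,3)) : Expensive { return 1 }
sub plain { return 2 }

# Hypothetical helper: look up what was recorded for a code reference.
sub attrs_of { return @{ $attrs_for{ $_[0] } || [] } }

package main;

my @found = Tagged::attrs_of(\&Tagged::fnord);
print "fnord carries: @found\n";
```

Each attribute arrives as one string, with its parameter list still
attached and uninterpreted.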
1950
SEE ALSO
See "Function Templates" in perlref for more about references and
closures. See perlxs if you'd like to learn about calling C
subroutines from Perl. See perlembed if you'd like to learn about
calling Perl subroutines from C. See perlmod to learn about bundling
up your functions in separate files. See perlmodlib to learn what
library modules come standard on your system. See perlootut to learn
how to make object method calls.
1959
perl v5.34.0 2021-10-18 PERLSUB(1)