1PERLSUB(1) Perl Programmers Reference Guide PERLSUB(1)
2
3
4
5 NAME
6 perlsub - Perl subroutines
7
8 SYNOPSIS
9 To declare subroutines:
10
11 sub NAME; # A "forward" declaration.
12 sub NAME(PROTO); # ditto, but with prototypes
13 sub NAME : ATTRS; # with attributes
14 sub NAME(PROTO) : ATTRS; # with attributes and prototypes
15
16 sub NAME BLOCK # A declaration and a definition.
17 sub NAME(PROTO) BLOCK # ditto, but with prototypes
18 sub NAME : ATTRS BLOCK # with attributes
19 sub NAME(PROTO) : ATTRS BLOCK # with prototypes and attributes
20
21 use feature 'signatures';
22 sub NAME(SIG) BLOCK # with signature
23 sub NAME :ATTRS (SIG) BLOCK # with signature, attributes
24 sub NAME :prototype(PROTO) (SIG) BLOCK # with signature, prototype
25
26 To define an anonymous subroutine at runtime:
27
28 $subref = sub BLOCK; # no proto
29 $subref = sub (PROTO) BLOCK; # with proto
30 $subref = sub : ATTRS BLOCK; # with attributes
31 $subref = sub (PROTO) : ATTRS BLOCK; # with proto and attributes
32
33 use feature 'signatures';
34 $subref = sub (SIG) BLOCK; # with signature
35 $subref = sub : ATTRS(SIG) BLOCK; # with signature, attributes
36
37 To import subroutines:
38
39 use MODULE qw(NAME1 NAME2 NAME3);
40
41 To call subroutines:
42
43 NAME(LIST); # & is optional with parentheses.
44 NAME LIST; # Parentheses optional if predeclared/imported.
45 &NAME(LIST); # Circumvent prototypes.
46 &NAME; # Makes current @_ visible to called subroutine.
47
48 DESCRIPTION
49 Like many languages, Perl provides for user-defined subroutines. These
50 may be located anywhere in the main program, loaded in from other files
51 via the "do", "require", or "use" keywords, or generated on the fly
52 using "eval" or anonymous subroutines. You can even call a function
53 indirectly using a variable containing its name or a CODE reference.
54
55 The Perl model for function call and return values is simple: all
56 functions are passed as parameters one single flat list of scalars, and
57 all functions likewise return to their caller one single flat list of
58 scalars. Any arrays or hashes in these call and return lists will
59 collapse, losing their identities--but you may always use pass-by-
60 reference instead to avoid this. Both call and return lists may
61 contain as many or as few scalar elements as you'd like. (Often a
62 function without an explicit return statement is called a subroutine,
63 but there's really no difference from Perl's perspective.)
64
65 In a subroutine that uses signatures (see "Signatures" below),
66 arguments are assigned into lexical variables introduced by the
67 signature. In the current implementation of perl they are also
68 accessible in the @_ array in the same way as for non-signature
69 subroutines, but accessing them in this manner is now discouraged
70 inside such a signature-using subroutine.
71
72 In a subroutine that does not use signatures, any arguments passed in
73 show up in the array @_. Therefore, if you called a function with two
74 arguments, those would be stored in $_[0] and $_[1]. The array @_ is a
75 local array, but its elements are aliases for the actual scalar
76 parameters. In particular, if an element $_[0] is updated, the
77 corresponding argument is updated (or an error occurs if it is not
78 updatable). If an argument is an array or hash element which did not
79 exist when the function was called, that element is created only when
80 (and if) it is modified or a reference to it is taken. (Some earlier
81 versions of Perl created the element whether or not the element was
82 assigned to.) Assigning to the whole array @_ removes that aliasing,
83 and does not update any arguments.
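
The aliasing and deferred-creation rules above can be seen in a small sketch. The subroutine names (peek, poke, replace_args) and the variables are hypothetical, chosen only for illustration:

```perl
use strict;
use warnings;

my %h;
my $name = 'orig';

sub peek         { my $copy = $_[0]; return }   # only reads the argument
sub poke         { $_[0] = 'set';    return }   # writes through the alias
sub replace_args { @_ = ('other');   return }   # assigns to the whole array

peek($h{a});           # $h{a} is not created: it was never modified
poke($h{b});           # $h{b} is created by the assignment to $_[0]
replace_args($name);   # $name is untouched: the aliasing was removed

my @result = (exists $h{a} ? 1 : 0, exists $h{b} ? 1 : 0, $name);
print "@result\n";     # prints "0 1 orig"
```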
84
85 When not using signatures, Perl does not otherwise provide a means to
86 create named formal parameters. In practice all you do is assign to a
87 "my()" list of these. Variables that aren't declared to be private are
88 global variables. For gory details on creating private variables, see
89 "Private Variables via my()" and "Temporary Values via local()". To
90 create protected environments for a set of functions in a separate
91 package (and probably a separate file), see "Packages" in perlmod.
92
93 A "return" statement may be used to exit a subroutine, optionally
94 specifying the returned value, which will be evaluated in the
95 appropriate context (list, scalar, or void) depending on the context of
96 the subroutine call. If you specify no return value, the subroutine
97 returns an empty list in list context, the undefined value in scalar
98 context, or nothing in void context. If you return one or more
99 aggregates (arrays and hashes), these will be flattened together into
100 one large indistinguishable list.
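
As a rough illustration of these rules, the sketch below uses two hypothetical subroutines: nothing() has a bare "return", and both() returns an array and a hash together:

```perl
use strict;
use warnings;

sub nothing { return }     # bare return, no value given
sub both {
    my @a = (1, 2);
    my %h = (k => 'v');
    return (@a, %h);       # aggregates are flattened into one list
}

my @list   = nothing();    # list context: the empty list
my $scalar = nothing();    # scalar context: the undefined value
my @flat   = both();       # 1, 2, 'k', 'v'

printf "%d %s %d\n",
    scalar(@list), (defined $scalar ? 'defined' : 'undef'), scalar(@flat);
# prints "0 undef 4"
```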
101
102 If no "return" is found and if the last statement is an expression, its
103 value is returned. If the last statement is a loop control structure
104 like a "foreach" or a "while", the returned value is unspecified. The
105 empty sub returns the empty list.
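
A minimal sketch of the implicit return value, using the hypothetical subroutines add() and empty():

```perl
use strict;
use warnings;

sub add   { $_[0] + $_[1] }   # value of the last expression is returned
sub empty { }                 # the empty sub returns the empty list

my $sum  = add(2, 3);
my @none = empty();

print "$sum ", scalar(@none), "\n";   # prints "5 0"
```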
106
107 Example:
108
109 sub max {
110 my $max = shift(@_);
111 foreach $foo (@_) {
112 $max = $foo if $max < $foo;
113 }
114 return $max;
115 }
116 $bestday = max($mon,$tue,$wed,$thu,$fri);
117
118 Example:
119
120 # get a line, combining continuation lines
121 # that start with whitespace
122
123 sub get_line {
124 $thisline = $lookahead; # global variables!
125 LINE: while (defined($lookahead = <STDIN>)) {
126 if ($lookahead =~ /^[ \t]/) {
127 $thisline .= $lookahead;
128 }
129 else {
130 last LINE;
131 }
132 }
133 return $thisline;
134 }
135
136 $lookahead = <STDIN>; # get first line
137 while (defined($line = get_line())) {
138 ...
139 }
140
141 Assigning to a list of private variables to name your arguments:
142
143 sub maybeset {
144 my($key, $value) = @_;
145 $Foo{$key} = $value unless $Foo{$key};
146 }
147
148 Because the assignment copies the values, this also has the effect of
149 turning call-by-reference into call-by-value. Otherwise a function is
150 free to do in-place modifications of @_ and change its caller's values.
151
152 upcase_in($v1, $v2); # this changes $v1 and $v2
153 sub upcase_in {
154 for (@_) { tr/a-z/A-Z/ }
155 }
156
157 You aren't allowed to modify constants in this way, of course. If an
158 argument were actually literal and you tried to change it, you'd take a
159 (presumably fatal) exception. For example, this won't work:
160
161 upcase_in("frederick");
162
163 It would be much safer if the "upcase_in()" function were written to
164 return a copy of its parameters instead of changing them in place:
165
166 ($v3, $v4) = upcase($v1, $v2); # this doesn't change $v1 and $v2
167 sub upcase {
168 return unless defined wantarray; # void context, do nothing
169 my @parms = @_;
170 for (@parms) { tr/a-z/A-Z/ }
171 return wantarray ? @parms : $parms[0];
172 }
173
174 Notice how this (unprototyped) function doesn't care whether it was
175 passed real scalars or arrays. Perl sees all arguments as one big,
176 long, flat parameter list in @_. This is one area where Perl's simple
177 argument-passing style shines. The "upcase()" function would work
178 perfectly well without changing the "upcase()" definition even if we
179 fed it things like this:
180
181 @newlist = upcase(@list1, @list2);
182 @newlist = upcase( split /:/, $var );
183
184 Do not, however, be tempted to do this:
185
186 (@a, @b) = upcase(@list1, @list2);
187
188 Like the flattened incoming parameter list, the return list is also
189 flattened on return. So all you have managed to do here is store
190 everything in @a and make @b empty. See "Pass by Reference" for
191 alternatives.
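
One possible alternative, sketched here under the assumption that the caller is happy to work with array references, is to pass and return references so that each list keeps its identity. The name upcase_each() is hypothetical, and it uses uc() rather than tr///:

```perl
use strict;
use warnings;

# Takes array references and returns one new array reference per argument.
sub upcase_each {
    return map { [ map { uc $_ } @$_ ] } @_;
}

my @list1 = qw(a b);
my @list2 = qw(c d e);
my ($x, $y) = upcase_each(\@list1, \@list2);

print "@$x | @$y\n";   # prints "A B | C D E"
```

The original arrays are left unchanged because the inner map builds new lists.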
192
193 A subroutine may be called using an explicit "&" prefix. The "&" is
194 optional in modern Perl, as are parentheses if the subroutine has been
195 predeclared. The "&" is not optional when just naming the subroutine,
196 such as when it's used as an argument to defined() or undef(). Nor is
197 it optional when you want to do an indirect subroutine call with a
198 subroutine name or reference using the "&$subref()" or "&{$subref}()"
199 constructs, although the "$subref->()" notation solves that problem.
200 See perlref for more about all that.
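
The cases described above might look like this in practice; greet() and the variable names are hypothetical:

```perl
use strict;
use warnings;

sub greet { return "hello, $_[0]" }

my $exists = defined(&greet) ? 1 : 0;   # "&" names the sub without calling it
my $subref = \&greet;                   # likewise when taking a reference

my $r1 = &$subref('a');                 # indirect call, "&" required
my $r2 = &{$subref}('b');               # the same, with braces
my $r3 = $subref->('c');                # arrow notation, no "&" needed

print "$exists $r1; $r2; $r3\n";
# prints "1 hello, a; hello, b; hello, c"
```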
201
202 Subroutines may be called recursively. If a subroutine is called using
203 the "&" form, the argument list is optional, and if omitted, no @_
204 array is set up for the subroutine: the @_ array at the time of the
205 call is visible to the subroutine instead. This is an efficiency mechanism
206 that new users may wish to avoid.
207
208 &foo(1,2,3); # pass three arguments
209 foo(1,2,3); # the same
210
211 foo(); # pass a null list
212 &foo(); # the same
213
214 &foo; # foo() gets the current args, like foo(@_) !!
215 use strict 'subs';
216 foo; # like foo() iff sub foo predeclared, else
217 # a compile-time error
218 no strict 'subs';
219 foo; # like foo() iff sub foo predeclared, else
220 # a literal string "foo"
221
222 Not only does the "&" form make the argument list optional, it also
223 disables any prototype checking on arguments you do provide. This is
224 partly for historical reasons, and partly for having a convenient way
225 to cheat if you know what you're doing. See "Prototypes" below.
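
Both effects of the "&" form can be sketched as follows. The subroutines count_of(), inner() and outer() are hypothetical; count_of() is given a prototype so that the difference is visible:

```perl
use strict;
use warnings;

# The (\@) prototype makes a plain call pass the array by reference.
sub count_of (\@) { my ($aref) = @_; return scalar @$aref }

my @items = (10, 20, 30);

my $n1 = count_of(@items);      # prototype applied: receives \@items
my $n2 = &count_of(\@items);    # "&" form: no prototype, pass the ref yourself

sub inner { return scalar @_ }
sub outer { return &inner; }    # no argument list: inner() sees outer's @_

my $n3 = outer('a', 'b');

print "$n1 $n2 $n3\n";   # prints "3 3 2"
```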
226
227 Since Perl 5.16.0, the "__SUB__" token is available under "use feature
228 'current_sub'" and "use v5.16". It will evaluate to a reference to the
229 currently-running sub, which allows for recursive calls without knowing
230 your subroutine's name.
231
232 use v5.16;
233 my $factorial = sub {
234 my ($x) = @_;
235 return 1 if $x == 1;
236 return($x * __SUB__->( $x - 1 ) );
237 };
238
239 The behavior of "__SUB__" within a regex code block (such as
240 "/(?{...})/") is subject to change.
241
242 Subroutines whose names are in all upper case are reserved to the Perl
243 core, as are modules whose names are in all lower case. A subroutine
244 in all capitals is a loosely-held convention meaning it will be called
245 indirectly by the run-time system itself, usually due to a triggered
246 event. Subroutines whose names start with a left parenthesis are also
247 reserved the same way. The following is a list of some subroutines
248 that currently do special, pre-defined things.
249
250 documented later in this document
251 "AUTOLOAD"
252
253 documented in perlmod
254 "CLONE", "CLONE_SKIP"
255
256 documented in perlobj
257 "DESTROY", "DOES"
258
259 documented in perltie
260 "BINMODE", "CLEAR", "CLOSE", "DELETE", "DESTROY", "EOF", "EXISTS",
261 "EXTEND", "FETCH", "FETCHSIZE", "FILENO", "FIRSTKEY", "GETC",
262 "NEXTKEY", "OPEN", "POP", "PRINT", "PRINTF", "PUSH", "READ",
263 "READLINE", "SCALAR", "SEEK", "SHIFT", "SPLICE", "STORE",
264 "STORESIZE", "TELL", "TIEARRAY", "TIEHANDLE", "TIEHASH",
265 "TIESCALAR", "UNSHIFT", "UNTIE", "WRITE"
266
267 documented in PerlIO::via
268 "BINMODE", "CLEARERR", "CLOSE", "EOF", "ERROR", "FDOPEN", "FILENO",
269 "FILL", "FLUSH", "OPEN", "POPPED", "PUSHED", "READ", "SEEK",
270 "SETLINEBUF", "SYSOPEN", "TELL", "UNREAD", "UTF8", "WRITE"
271
272 documented in perlfunc
273 "import", "unimport", "INC"
274
275 documented in UNIVERSAL
276 "VERSION"
277
278 documented in perldebguts
279 "DB::DB", "DB::sub", "DB::lsub", "DB::goto", "DB::postponed"
280
281 undocumented, used internally by the overload feature
282 any starting with "("
283
284 The "BEGIN", "UNITCHECK", "CHECK", "INIT" and "END" subroutines are not
285 so much subroutines as named special code blocks, of which you can have
286 more than one in a package, and which you can not call explicitly. See
287 "BEGIN, UNITCHECK, CHECK, INIT and END" in perlmod.
288
289 Signatures
290 Perl has a facility to allow a subroutine's formal parameters to be
291 declared by special syntax, separate from the procedural code of the
292 subroutine body. The formal parameter list is known as a signature.
293
294 This facility must be enabled before it can be used. It is enabled
295 automatically by a "use v5.36" (or higher) declaration, or more
296 directly by "use feature 'signatures'", in the current scope.
297
298 The signature is part of a subroutine's body. Normally the body of a
299 subroutine is simply a braced block of code, but when using a
300 signature, the signature is a parenthesised list that goes immediately
301 before the block, after any name or attributes.
302
303 For example,
304
305 sub foo :lvalue ($a, $b = 1, @c) { .... }
306
307 The signature declares lexical variables that are in scope for the
308 block. When the subroutine is called, the signature takes control
309 first. It populates the signature variables from the list of arguments
310 that were passed. If the argument list doesn't meet the requirements
311 of the signature, then it will throw an exception. When the signature
312 processing is complete, control passes to the block.
313
314 Positional parameters are handled by simply naming scalar variables in
315 the signature. For example,
316
317 sub foo ($left, $right) {
318 return $left + $right;
319 }
320
321 takes two positional parameters, which must be filled at runtime by two
322 arguments. By default the parameters are mandatory, and it is not
323 permitted to pass more arguments than expected. So the above is
324 equivalent to
325
326 sub foo {
327 die "Too many arguments for subroutine" unless @_ <= 2;
328 die "Too few arguments for subroutine" unless @_ >= 2;
329 my $left = $_[0];
330 my $right = $_[1];
331 return $left + $right;
332 }
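
The argument-count check can be observed by trapping the exception with "eval". This is only a sketch: add() is a hypothetical name, a perl with signature support is assumed, and the exact wording of the messages beyond their opening words varies between perl versions:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';   # only needed before Perl 5.36

sub add ($left, $right) { return $left + $right }

my $ok = add(1, 2);                        # exactly two arguments: fine

my $few      = eval { add(1) };            # one argument too few
my $few_err  = $@;
my $many     = eval { add(1, 2, 3) };      # one argument too many
my $many_err = $@;

print "ok=$ok\n";
print "few:  $few_err";
print "many: $many_err";
```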
333
334 An argument can be ignored by omitting the main part of the name from a
335 parameter declaration, leaving just a bare "$" sigil. For example,
336
337 sub foo ($first, $, $third) {
338 return "first=$first, third=$third";
339 }
340
341 Although the ignored argument doesn't go into a variable, it is still
342 mandatory for the caller to pass it.
343
344 A positional parameter is made optional by giving a default value,
345 separated from the parameter name by "=":
346
347 sub foo ($left, $right = 0) {
348 return $left + $right;
349 }
350
351 The above subroutine may be called with either one or two arguments.
352 The default value expression is evaluated when the subroutine is
353 called, so it may provide different default values for different calls.
354 It is only evaluated if the argument was actually omitted from the
355 call. For example,
356
357 my $auto_id = 0;
358 sub foo ($thing, $id = $auto_id++) {
359 print "$thing has ID $id";
360 }
361
362 automatically assigns distinct sequential IDs to things for which no ID
363 was supplied by the caller. A default value expression may also refer
364 to parameters earlier in the signature, making the default for one
365 parameter vary according to the earlier parameters. For example,
366
367 sub foo ($first_name, $surname, $nickname = $first_name) {
368 print "$first_name $surname is known as \"$nickname\"";
369 }
370
371 An optional parameter can be nameless just like a mandatory parameter.
372 For example,
373
374 sub foo ($thing, $ = 1) {
375 print $thing;
376 }
377
378 The parameter's default value will still be evaluated if the
379 corresponding argument isn't supplied, even though the value won't be
380 stored anywhere. This is in case evaluating it has important side
381 effects. However, it will be evaluated in void context, so if it
382 doesn't have side effects and is not trivial it will generate a warning
383 if the "void" warning category is enabled. If a nameless optional
384 parameter's default value is not important, it may be omitted just as
385 the parameter's name was:
386
387 sub foo ($thing, $=) {
388 print $thing;
389 }
390
391 Optional positional parameters must come after all mandatory positional
392 parameters. (If there are no mandatory positional parameters then an
393 optional positional parameter can be the first thing in the
394 signature.) If there are multiple optional positional parameters and
395 not enough arguments are supplied to fill them all, they will be filled
396 from left to right.
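
A short sketch of left-to-right filling, with a hypothetical show() subroutine that has one mandatory and two optional parameters:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';   # only needed before Perl 5.36

sub show ($first, $second = 'B', $third = 'C') {
    return "$first$second$third";
}

my @out = (show('a'), show('a', 'b'), show('a', 'b', 'c'));
print "@out\n";   # prints "aBC abC abc"
```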
397
398 After positional parameters, additional arguments may be captured in a
399 slurpy parameter. The simplest form of this is just an array variable:
400
401 sub foo ($filter, @inputs) {
402 print $filter->($_) foreach @inputs;
403 }
404
405 With a slurpy parameter in the signature, there is no upper limit on
406 how many arguments may be passed. A slurpy array parameter may be
407 nameless just like a positional parameter, in which case its only
408 effect is to turn off the argument limit that would otherwise apply:
409
410 sub foo ($thing, @) {
411 print $thing;
412 }
413
414 A slurpy parameter may instead be a hash, in which case the arguments
415 available to it are interpreted as alternating keys and values. There
416 must be as many keys as values: if there is an odd argument then an
417 exception will be thrown. Keys will be stringified, and if there are
418 duplicates then the later instance takes precedence over the earlier,
419 as with standard hash construction.
420
421 sub foo ($filter, %inputs) {
422 print $filter->($_, $inputs{$_}) foreach sort keys %inputs;
423 }
424
425 A slurpy hash parameter may be nameless just like other kinds of
426 parameter. It still insists that the number of arguments available to
427 it be even, even though they're not being put into a variable.
428
429 sub foo ($thing, %) {
430 print $thing;
431 }
432
433 A slurpy parameter, either array or hash, must be the last thing in the
434 signature. It may follow mandatory and optional positional parameters;
435 it may also be the only thing in the signature. Slurpy parameters
436 cannot have default values: if no arguments are supplied for them then
437 you get an empty array or empty hash.
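
A slurpy hash is one way to accept named options after the positional parameters. The sketch below assumes a hypothetical connect_to() subroutine with a "port" option; it is not taken from any real module:

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';   # only needed before Perl 5.36

sub connect_to ($host, %opt) {
    my $port = $opt{port} // 80;           # empty hash if nothing was passed
    return "$host:$port";
}

my $default = connect_to('example.org');
my $later   = connect_to('example.org', port => 8080, port => 443);

my $odd     = eval { connect_to('example.org', 'port') };   # key without value
my $odd_err = $@;

print "$default $later\n";   # prints "example.org:80 example.org:443"
print "error: $odd_err";
```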
438
439 A signature may be entirely empty, in which case all it does is check
440 that the caller passed no arguments:
441
442 sub foo () {
443 return 123;
444 }
445
446 Prior to Perl 5.36 these were considered experimental, and emitted a
447 warning in the "experimental::signatures" category. From Perl 5.36
448 onwards this no longer happens, though the warning category still
449 exists for back-compatibility with code that attempts to disable it
450 with a statement such as:
451
452 no warnings 'experimental::signatures';
453
454 In the current perl implementation, when using a signature the
455 arguments are still also available in the special array variable @_.
456 However, accessing them via this array is now discouraged, and should
457 not be relied upon in newly-written code as this ability may change in
458 a future version. Code that attempts to access the @_ array will
459 produce warnings in the "experimental::args_array_with_signatures"
460 category when compiled:
461
462 sub f ($x) {
463 # This line emits the warning seen below
464 print "Arguments are @_";
465 }
466
467 Use of @_ in join or string with signatured subroutine is
468 experimental at ...
469
470 There is a difference between the two ways of accessing the arguments:
471 @_ aliases the arguments, but the signature variables get copies of the
472 arguments. So writing to a signature variable only changes that
473 variable, and has no effect on the caller's variables, but writing to
474 an element of @_ modifies whatever the caller used to supply that
475 argument.
476
477 There is a potential syntactic ambiguity between signatures and
478 prototypes (see "Prototypes"), because both start with an opening
479 parenthesis and both can appear in some of the same places, such as
480 just after the name in a subroutine declaration. For historical
481 reasons, when signatures are not enabled, any opening parenthesis in
482 such a context will trigger very forgiving prototype parsing. Most
483 signatures will be interpreted as prototypes in those circumstances,
484 but won't be valid prototypes. (A valid prototype cannot contain any
485 alphabetic character.) This will lead to somewhat confusing error
486 messages.
487
488 To avoid ambiguity, when signatures are enabled the special syntax for
489 prototypes is disabled. There is no attempt to guess whether a
490 parenthesised group was intended to be a prototype or a signature. To
491 give a subroutine a prototype under these circumstances, use a
492 prototype attribute. For example,
493
494 sub foo :prototype($) { $_[0] }
495
496 It is entirely possible for a subroutine to have both a prototype and a
497 signature. They do different jobs: the prototype affects compilation
498 of calls to the subroutine, and the signature puts argument values into
499 lexical variables at runtime. You can therefore write
500
501 sub foo :prototype($$) ($left, $right) {
502 return $left + $right;
503 }
504
505 The prototype attribute, and any other attributes, must come before the
506 signature. The signature always immediately precedes the block of the
507 subroutine's body.
508
509 Private Variables via my()
510 Synopsis:
511
512 my $foo; # declare $foo lexically local
513 my (@wid, %get); # declare list of variables local
514 my $foo = "flurp"; # declare $foo lexical, and init it
515 my @oof = @bar; # declare @oof lexical, and init it
516 my $x : Foo = $y; # similar, with an attribute applied
517
518 WARNING: The use of attribute lists on "my" declarations is still
519 evolving. The current semantics and interface are subject to change.
520 See attributes and Attribute::Handlers.
521
522 The "my" operator declares the listed variables to be lexically
523 confined to the enclosing block, conditional
524 ("if"/"unless"/"elsif"/"else"), loop
525 ("for"/"foreach"/"while"/"until"/"continue"), subroutine, "eval", or
526 "do"/"require"/"use"'d file. If more than one value is listed, the
527 list must be placed in parentheses. All listed elements must be legal
528 lvalues. Only alphanumeric identifiers may be lexically
529 scoped--magical built-ins like $/ must currently be localized with
530 "local" instead.
531
532 Unlike dynamic variables created by the "local" operator, lexical
533 variables declared with "my" are totally hidden from the outside world,
534 including any called subroutines. This is true if it's the same
535 subroutine called from itself or elsewhere--every call gets its own
536 copy.
537
538 This doesn't mean that a "my" variable declared in a statically
539 enclosing lexical scope would be invisible. Only dynamic scopes are
540 cut off. For example, the "bumpx()" function below has access to the
541 lexical $x variable because both the "my" and the "sub" occurred at the
542 same scope, presumably file scope.
543
544 my $x = 10;
545 sub bumpx { $x++ }
546
547 An "eval()", however, can see lexical variables of the scope it is
548 being evaluated in, so long as the names aren't hidden by declarations
549 within the "eval()" itself. See perlref.
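
For instance, in this small sketch (variable names are illustrative only) the first string "eval" sees the enclosing $x, while the second hides it with its own declaration:

```perl
use strict;
use warnings;

my $x = 10;

my $seen   = eval '$x + 1';               # uses the enclosing lexical $x
my $hidden = eval 'my $x = 1; $x + 1';    # inner declaration hides it

print "$seen $hidden $x\n";   # prints "11 2 10"
```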
550
551 The parameter list to my() may be assigned to if desired, which allows
552 you to initialize your variables. (If no initializer is given for a
553 particular variable, it is created with the undefined value.) Commonly
554 this is used to name input parameters to a subroutine. Examples:
555
556 $arg = "fred"; # "global" variable
557 $n = cube_root(27);
558 print "$arg thinks the root is $n\n";
559 fred thinks the root is 3
560
561 sub cube_root {
562 my $arg = shift; # name doesn't matter
563 $arg **= 1/3;
564 return $arg;
565 }
566
567 The "my" is simply a modifier on something you might assign to. So
568 when you do assign to variables in its argument list, "my" doesn't
569 change whether those variables are viewed as a scalar or an array. So
570
571 my ($foo) = <STDIN>; # WRONG?
572 my @FOO = <STDIN>;
573
574 both supply a list context to the right-hand side, while
575
576 my $foo = <STDIN>;
577
578 supplies a scalar context. But the following declares only one
579 variable:
580
581 my $foo, $bar = 1; # WRONG
582
583 That has the same effect as
584
585 my $foo;
586 $bar = 1;
587
588 The declared variable is not introduced (is not visible) until after
589 the current statement. Thus,
590
591 my $x = $x;
592
593 can be used to initialize a new $x with the value of the old $x, and
594 the expression
595
596 my $x = 123 and $x == 123
597
598 is false unless the old $x happened to have the value 123.
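
The delayed introduction is easiest to see with a nested block. In this sketch (names are illustrative), the right-hand $x still refers to the outer variable:

```perl
use strict;
use warnings;

my $x = 5;
my @seen;
{
    my $x = $x * 2;    # the right-hand $x is still the outer one (5)
    push @seen, $x;    # the new $x is visible from this statement on
}
push @seen, $x;        # the outer $x is unchanged

print "@seen\n";       # prints "10 5"
```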
599
600 Lexical scopes of control structures are not bounded precisely by the
601 braces that delimit their controlled blocks; control expressions are
602 part of that scope, too. Thus in the loop
603
604 while (my $line = <>) {
605 $line = lc $line;
606 } continue {
607 print $line;
608 }
609
610 the scope of $line extends from its declaration throughout the rest of
611 the loop construct (including the "continue" clause), but not beyond
612 it. Similarly, in the conditional
613
614 if ((my $answer = <STDIN>) =~ /^yes$/i) {
615 user_agrees();
616 } elsif ($answer =~ /^no$/i) {
617 user_disagrees();
618 } else {
619 chomp $answer;
620 die "'$answer' is neither 'yes' nor 'no'";
621 }
622
623 the scope of $answer extends from its declaration through the rest of
624 that conditional, including any "elsif" and "else" clauses, but not
625 beyond it. See "Simple Statements" in perlsyn for information on the
626 scope of variables in statements with modifiers.
627
628 The "foreach" loop defaults to scoping its index variable dynamically
629 in the manner of "local". However, if the index variable is prefixed
630 with the keyword "my", or if there is already a lexical by that name in
631 scope, then a new lexical is created instead. Thus in the loop
632
633 for my $i (1, 2, 3) {
634 some_function();
635 }
636
637 the scope of $i extends to the end of the loop, but not beyond it,
638 rendering the value of $i inaccessible within "some_function()".
639
640 Some users may wish to encourage the use of lexically scoped variables.
641 As an aid to catching implicit uses of package variables, which are
642 always global, if you say
643
644 use strict 'vars';
645
646 then any variable mentioned from there to the end of the enclosing
647 block must either refer to a lexical variable, be predeclared via "our"
648 or "use vars", or else must be fully qualified with the package name.
649 A compilation error results otherwise. An inner block may countermand
650 this with "no strict 'vars'".
651
652 A "my" has both a compile-time and a run-time effect. At compile time,
653 the compiler takes notice of it. The principal usefulness of this is
654 to quiet "use strict 'vars'", but it is also essential for generation
655 of closures as detailed in perlref. Actual initialization is delayed
656 until run time, though, so it gets executed at the appropriate time,
657 such as each time through a loop, for example.
658
659 Variables declared with "my" are not part of any package and are
660 therefore never fully qualified with the package name. In particular,
661 you're not allowed to try to make a package variable (or other global)
662 lexical:
663
664 my $pack::var; # ERROR! Illegal syntax
665
666 In fact, dynamic variables (also known as package or global variables)
667 are still accessible using the fully qualified "::" notation even while
668 a lexical of the same name is also visible:
669
670 package main;
671 local $x = 10;
672 my $x = 20;
673 print "$x and $::x\n";
674
675 That will print out 20 and 10.
676
677 You may declare "my" variables at the outermost scope of a file to hide
678 any such identifiers from the world outside that file. This is similar
679 in spirit to C's static variables when they are used at the file level.
680 To do this with a subroutine requires the use of a closure (an
681 anonymous function that accesses enclosing lexicals). If you want to
682 create a private subroutine that cannot be called from outside that
683 block, you can declare a lexical variable containing an anonymous sub
684 reference:
685
686 my $secret_version = '1.001-beta';
687 my $secret_sub = sub { print $secret_version };
688 &$secret_sub();
689
690 As long as the reference is never returned by any function within the
691 module, no outside module can see the subroutine, because its name is
692 not in any package's symbol table. Remember that it's not REALLY
693 called $some_pack::secret_version or anything; it's just
694 $secret_version, unqualified and unqualifiable.
695
696 This does not work with object methods, however; all object methods
697 have to be in the symbol table of some package to be found. See
698 "Function Templates" in perlref for something of a work-around to this.
699
700 Persistent Private Variables
701 There are two ways to build persistent private variables in Perl 5.10 and later.
702 First, you can simply use the "state" feature. Or, you can use
703 closures, if you want to stay compatible with releases older than 5.10.
704
705 Persistent variables via state()
706
707 Beginning with Perl 5.10.0, you can declare variables with the "state"
708 keyword in place of "my". For that to work, though, you must have
709 enabled that feature beforehand, either by using the "feature" pragma,
710 or by using "-E" on one-liners (see feature). Beginning with Perl
711 5.16, the "CORE::state" form does not require the "feature" pragma.
712
713 The "state" keyword creates a lexical variable (following the same
714 scoping rules as "my") that persists from one subroutine call to the
715 next. If a state variable resides inside an anonymous subroutine, then
716 each copy of the subroutine has its own copy of the state variable.
717 However, the value of the state variable will still persist between
718 calls to the same copy of the anonymous subroutine. (Don't forget that
719 "sub { ... }" creates a new subroutine each time it is executed.)
720
721 For example, the following code maintains a private counter,
722 incremented each time the gimme_another() function is called:
723
724 use feature 'state';
725 sub gimme_another { state $x; return ++$x }
726
727 And this example uses anonymous subroutines to create separate
728 counters:
729
730 use feature 'state';
731 sub create_counter {
732 return sub { state $x; return ++$x }
733 }
734
735 Also, since $x is lexical, it can't be reached or modified by any Perl
736 code outside.
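
A usage sketch for create_counter(), assuming a perl recent enough that each anonymous subroutine gets its own state variable as described above. The variables $c1 and $c2 are hypothetical:

```perl
use strict;
use warnings;
use feature 'state';

sub create_counter {
    return sub { state $x; return ++$x };
}

my $c1 = create_counter();
my $c2 = create_counter();

# Each counter keeps its own $x between calls.
my @got = ($c1->(), $c1->(), $c2->(), $c1->());
print "@got\n";   # prints "1 2 1 3"
```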
737
738 When combined with variable declaration, simple assignment to "state"
739 variables (as in "state $x = 42") is executed only the first time.
740 When such statements are evaluated subsequent times, the assignment is
741 ignored. The behavior of assignment to "state" declarations where the
742 left hand side of the assignment involves any parentheses is currently
743 undefined.
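
A minimal sketch of the initialize-once behavior, using a hypothetical next_id() subroutine:

```perl
use strict;
use warnings;
use feature 'state';

sub next_id {
    state $id = 100;    # the assignment runs on the first call only
    return $id++;
}

my @ids = (next_id(), next_id(), next_id());
print "@ids\n";   # prints "100 101 102"
```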
744
745 Persistent variables with closures
746
747 Just because a lexical variable is lexically (also called statically)
748 scoped to its enclosing block, "eval", or "do" FILE, this doesn't mean
749 that within a function it works like a C static. It normally works
750 more like a C auto, but with implicit garbage collection.
751
752 Unlike local variables in C or C++, Perl's lexical variables don't
753 necessarily get recycled just because their scope has exited. If
754 something more permanent is still aware of the lexical, it will stick
755 around. So long as something else references a lexical, that lexical
756 won't be freed--which is as it should be. You wouldn't want memory
757 being freed until you were done using it, or kept around once you were
758 done. Automatic garbage collection takes care of this for you.
759
760 This means that you can pass back or save away references to lexical
761 variables, whereas to return a pointer to a C auto is a grave error.
762 It also gives us a way to simulate C's function statics. Here's a
763 mechanism for giving a function private variables with both lexical
764 scoping and a static lifetime. If you do want to create something like
765 C's static variables, just enclose the whole function in an extra
766 block, and put the static variable outside the function but in the
767 block.
768
769 {
770 my $secret_val = 0;
771 sub gimme_another {
772 return ++$secret_val;
773 }
774 }
775 # $secret_val now becomes unreachable by the outside
776 # world, but retains its value between calls to gimme_another
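
The same technique lets several functions share one private variable. In this sketch the names deposit(), balance() and $balance are hypothetical, and the block is placed above the code that calls the functions so that the "my" has already been executed:

```perl
use strict;
use warnings;

{
    my $balance = 0;    # private to the two subs below
    sub deposit { $balance += shift; return $balance }
    sub balance { return $balance }
}

deposit(50);
deposit(25);
print balance(), "\n";   # prints "75"
```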
777
778 If this function is being sourced in from a separate file via "require"
779 or "use", then this is probably just fine. If it's all in the main
780 program, you'll need to arrange for the "my" to be executed early,
781 either by putting the whole block above your main program, or more
782 likely, placing merely a "BEGIN" code block around it to make sure it
783 gets executed before your program starts to run:
784
785 BEGIN {
786 my $secret_val = 0;
787 sub gimme_another {
788 return ++$secret_val;
789 }
790 }
791
792 See "BEGIN, UNITCHECK, CHECK, INIT and END" in perlmod about the
793 special triggered code blocks, "BEGIN", "UNITCHECK", "CHECK", "INIT"
794 and "END".
795
796 If declared at the outermost scope (the file scope), then lexicals work
797 somewhat like C's file statics. They are available to all functions in
798 that same file declared below them, but are inaccessible from outside
799 that file. This strategy is sometimes used in modules to create
800 private variables that the whole module can see.
801
802 Temporary Values via local()
803 WARNING: In general, you should be using "my" instead of "local",
804 because it's faster and safer. Exceptions to this include the global
805 punctuation variables, global filehandles and formats, and direct
806 manipulation of the Perl symbol table itself. "local" is mostly used
807 when the current value of a variable must be visible to called
808 subroutines.
809
810 Synopsis:
811
812 # localization of values
813
814 local $foo; # make $foo dynamically local
815 local (@wid, %get); # make list of variables local
816 local $foo = "flurp"; # make $foo dynamic, and init it
817 local @oof = @bar; # make @oof dynamic, and init it
818
819 local $hash{key} = "val"; # sets a local value for this hash entry
820 delete local $hash{key}; # delete this entry for the current block
821 local ($cond ? $v1 : $v2); # several types of lvalues support
822 # localization
823
824 # localization of symbols
825
826 local *FH; # localize $FH, @FH, %FH, &FH ...
827 local *merlyn = *randal; # now $merlyn is really $randal, plus
828 # @merlyn is really @randal, etc
829 local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal
830 local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc
831
832 A "local" modifies its listed variables to be "local" to the enclosing
833 block, "eval", or "do FILE"--and to any subroutine called from within
834 that block. A "local" just gives temporary values to global (meaning
835 package) variables. It does not create a local variable. This is
836 known as dynamic scoping. Lexical scoping is done with "my", which
837 works more like C's auto declarations.
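
A short sketch of dynamic scoping, assuming a package variable $level
and two hypothetical subroutines: the called subroutine sees the
temporary value, and the old value comes back when the localizing
subroutine returns.

```perl
use strict;
use warnings;

our $level = 'global';

# current_level() reads the package variable at the time it is called.
sub current_level { return $level }

sub with_local {
    local $level = 'temporary';
    return current_level();    # sees the localized value
}

print with_local(), "\n";       # temporary
print current_level(), "\n";    # global
```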
838
839 Some types of lvalues can be localized as well: hash and array elements
840 and slices, conditionals (provided that their result is always
841 localizable), and symbolic references. As for simple variables, this
842 creates new, dynamically scoped values.
843
844 If more than one variable or expression is given to "local", they must
845 be placed in parentheses. This operator works by saving the current
846 values of those variables in its argument list on a hidden stack and
847 restoring them upon exiting the block, subroutine, or eval. This means
848 that called subroutines can also reference the local variable, but not
849 the global one. The argument list may be assigned to if desired, which
850 allows you to initialize your local variables. (If no initializer is
851 given for a particular variable, it is created with an undefined
852 value.)
853
854 Because "local" is a run-time operator, it gets executed each time
855 through a loop. Consequently, it's more efficient to localize your
856 variables outside the loop.
857
858 Grammatical note on local()
859
860 A "local" is simply a modifier on an lvalue expression. When you
861 assign to a "local"ized variable, the "local" doesn't change whether
862 its list is viewed as a scalar or an array. So
863
864 local($foo) = <STDIN>;
865 local @FOO = <STDIN>;
866
867 both supply a list context to the right-hand side, while
868
869 local $foo = <STDIN>;
870
871 supplies a scalar context.
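
One way to observe the difference without reading from STDIN is to call
a subroutine that reports its own context. In this sketch, context()
and probe() are hypothetical helpers:

```perl
use strict;
use warnings;

our $foo;

# context() reports the context in which it was called.
sub context { return wantarray ? 'list' : 'scalar' }

sub probe {
    local ($foo) = context();    # parentheses: list assignment
    my $with_parens = $foo;
    local $foo = context();      # no parentheses: scalar assignment
    return ($with_parens, $foo);
}

print join(', ', probe()), "\n";    # list, scalar
```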
872
873 Localization of special variables
874
875 If you localize a special variable, you'll be giving a new value to it,
876 but its magic won't go away. That means that all side-effects related
877 to this magic still work with the localized value.
878
This feature allows code like this to work:
880
881 # Read the whole contents of FILE in $slurp
882 { local $/ = undef; $slurp = <FILE>; }
883
Note, however, that this restricts localization of some values; for
example, the following statement dies, as of perl 5.10.0, with the error
"Modification of a read-only value attempted", because the $1 variable
is magical and read-only:
888
889 local $1 = 2;
890
891 One exception is the default scalar variable: starting with perl 5.14
892 "local($_)" will always strip all magic from $_, to make it possible to
893 safely reuse $_ in a subroutine.
894
895 WARNING: Localization of tied arrays and hashes does not currently work
896 as described. This will be fixed in a future release of Perl; in the
897 meantime, avoid code that relies on any particular behavior of
898 localising tied arrays or hashes (localising individual elements is
899 still okay). See "Localising Tied Arrays and Hashes Is Broken" in
900 perl58delta for more details.
901
902 Localization of globs
903
904 The construct
905
906 local *name;
907
908 creates a whole new symbol table entry for the glob "name" in the
909 current package. That means that all variables in its glob slot
910 ($name, @name, %name, &name, and the "name" filehandle) are dynamically
911 reset.
912
This implies, among other things, that any magic those variables may
carry is locally lost. In other words, saying "local */" will not have
any effect on the internal value of the input record separator.
917
918 Localization of elements of composite types
919
920 It's also worth taking a moment to explain what happens when you
921 "local"ize a member of a composite type (i.e. an array or hash
922 element). In this case, the element is "local"ized by name. This
923 means that when the scope of the "local()" ends, the saved value will
924 be restored to the hash element whose key was named in the "local()",
925 or the array element whose index was named in the "local()". If that
926 element was deleted while the "local()" was in effect (e.g. by a
927 "delete()" from a hash or a "shift()" of an array), it will spring back
928 into existence, possibly extending an array and filling in the skipped
929 elements with "undef". For instance, if you say
930
931 %hash = ( 'This' => 'is', 'a' => 'test' );
932 @ary = ( 0..5 );
933 {
934 local($ary[5]) = 6;
935 local($hash{'a'}) = 'drill';
936 while (my $e = pop(@ary)) {
937 print "$e . . .\n";
938 last unless $e > 3;
939 }
940 if (@ary) {
941 $hash{'only a'} = 'test';
942 delete $hash{'a'};
943 }
944 }
945 print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n";
946 print "The array has ",scalar(@ary)," elements: ",
947 join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n";
948
949 Perl will print
950
951 6 . . .
952 4 . . .
953 3 . . .
954 This is a test only a test.
955 The array has 6 elements: 0, 1, 2, undef, undef, 5
956
957 The behavior of local() on non-existent members of composite types is
958 subject to change in future. The behavior of local() on array elements
959 specified using negative indexes is particularly surprising, and is
960 very likely to change.
961
962 Localized deletion of elements of composite types
963
964 You can use the "delete local $array[$idx]" and "delete local
965 $hash{key}" constructs to delete a composite type entry for the current
966 block and restore it when it ends. They return the array/hash value
967 before the localization, which means that they are respectively
968 equivalent to
969
970 do {
971 my $val = $array[$idx];
972 local $array[$idx];
973 delete $array[$idx];
974 $val
975 }
976
977 and
978
979 do {
980 my $val = $hash{key};
981 local $hash{key};
982 delete $hash{key};
983 $val
984 }
985
986 except that for those the "local" is scoped to the "do" block. Slices
987 are also accepted.
988
    my %hash = (
        a => [ 7, 8, 9 ],
        b => 1,
    );

    {
        my $a = delete local $hash{a};
        # $a is [ 7, 8, 9 ]
        # %hash is (b => 1)

        {
            my @nums = delete local @$a[0, 2];
            # @nums is (7, 9)
            # $a is [ undef, 8 ]

            $a->[0] = 999; # will be erased when the scope ends
        }
        # $a is back to [ 7, 8, 9 ]

    }
    # %hash is back to its original state
1010
1011 This construct is supported since Perl v5.12.
1012
1013 Lvalue subroutines
1014 It is possible to return a modifiable value from a subroutine. To do
1015 this, you have to declare the subroutine to return an lvalue.
1016
1017 my $val;
1018 sub canmod : lvalue {
1019 $val; # or: return $val;
1020 }
1021 sub nomod {
1022 $val;
1023 }
1024
1025 canmod() = 5; # assigns to $val
1026 nomod() = 5; # ERROR
1027
1028 The scalar/list context for the subroutine and for the right-hand side
1029 of assignment is determined as if the subroutine call is replaced by a
1030 scalar. For example, consider:
1031
1032 data(2,3) = get_data(3,4);
1033
1034 Both subroutines here are called in a scalar context, while in:
1035
1036 (data(2,3)) = get_data(3,4);
1037
1038 and in:
1039
1040 (data(2),data(3)) = get_data(3,4);
1041
1042 all the subroutines are called in a list context.
1043
Lvalue subroutines are convenient, but you have to keep in mind that,
when used with objects, they may violate encapsulation. A normal
mutator can check the supplied argument before setting the attribute it
is protecting; an lvalue subroutine cannot. If you require any special
processing when storing and retrieving the values, consider using the
CPAN module Sentinel or something similar.
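
The encapsulation concern can be sketched with a hypothetical
Thermostat class (the class, its field, and its validation rule are
all assumptions made for illustration). The lvalue accessor stores
whatever it is given, while the ordinary mutator can refuse:

```perl
use strict;
use warnings;

package Thermostat;    # made-up class, for illustration only

sub new { my ($class) = @_; return bless { kelvin => 273 }, $class }

# lvalue accessor: callers assign straight into the hash slot
sub kelvin : lvalue { $_[0]{kelvin} }

# ordinary mutator: can validate before storing
sub set_kelvin {
    my ($self, $k) = @_;
    die "kelvin must not be negative\n" if $k < 0;
    $self->{kelvin} = $k;
    return $self;
}

package main;

my $t = Thermostat->new;
$t->kelvin = -5;                            # accepted without any check
my $ok = eval { $t->set_kelvin(-5); 1 };    # dies, so $ok stays undef
print $t->kelvin, " ", ($ok ? "stored" : "rejected"), "\n";    # -5 rejected
```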
1050
1051 Lexical Subroutines
1052 Beginning with Perl 5.18, you can declare a private subroutine with
1053 "my" or "state". As with state variables, the "state" keyword is only
1054 available under "use feature 'state'" or "use v5.10" or higher.
1055
1056 Prior to Perl 5.26, lexical subroutines were deemed experimental and
1057 were available only under the "use feature 'lexical_subs'" pragma.
1058 They also produced a warning unless the "experimental::lexical_subs"
1059 warnings category was disabled.
1060
1061 These subroutines are only visible within the block in which they are
1062 declared, and only after that declaration:
1063
1064 # Include these two lines if your code is intended to run under Perl
1065 # versions earlier than 5.26.
1066 no warnings "experimental::lexical_subs";
1067 use feature 'lexical_subs';
1068
1069 foo(); # calls the package/global subroutine
1070 state sub foo {
1071 foo(); # also calls the package subroutine
1072 }
1073 foo(); # calls "state" sub
1074 my $ref = \&foo; # take a reference to "state" sub
1075
1076 my sub bar { ... }
1077 bar(); # calls "my" sub
1078
1079 You can't (directly) write a recursive lexical subroutine:
1080
1081 # WRONG
1082 my sub baz {
1083 baz();
1084 }
1085
1086 This example fails because "baz()" refers to the package/global
1087 subroutine "baz", not the lexical subroutine currently being defined.
1088
The solution is to use "__SUB__", which is available under "use feature
'current_sub'" or "use v5.16" or higher:
1090
1091 my sub baz {
1092 __SUB__->(); # calls itself
1093 }
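
A runnable sketch of this pattern, assuming Perl 5.26 or later so that
"use v5.26" enables the "current_sub" feature and lexical subroutines
need no extra pragma. The factorial() wrapper is a hypothetical
example:

```perl
use v5.26;    # enables strict, "say", and the 'current_sub' feature
use warnings;

sub factorial {
    my ($n) = @_;
    my sub fact {
        my ($k) = @_;
        # __SUB__ refers to the sub being run, so no name lookup is needed
        return $k <= 1 ? 1 : $k * __SUB__->($k - 1);
    }
    return fact($n);
}

say factorial(5);    # 120
```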
1094
1095 It is possible to predeclare a lexical subroutine. The "sub foo {...}"
1096 subroutine definition syntax respects any previous "my sub;" or "state
1097 sub;" declaration. Using this to define recursive subroutines is a bad
1098 idea, however:
1099
1100 my sub baz; # predeclaration
1101 sub baz { # define the "my" sub
1102 baz(); # WRONG: calls itself, but leaks memory
1103 }
1104
1105 Just like "my $f; $f = sub { $f->() }", this example leaks memory. The
1106 name "baz" is a reference to the subroutine, and the subroutine uses
1107 the name "baz"; they keep each other alive (see "Circular References"
1108 in perlref).
1109
1110 "state sub" vs "my sub"
1111
What is the difference between "state" subs and "my" subs? Each time
that execution enters a block in which "my" subs are declared, a new
copy of each sub is created. "state" subroutines persist from one
execution of the containing block to the next.
1116
1117 So, in general, "state" subroutines are faster. But "my" subs are
1118 necessary if you want to create closures:
1119
1120 sub whatever {
1121 my $x = shift;
1122 my sub inner {
1123 ... do something with $x ...
1124 }
1125 inner();
1126 }
1127
1128 In this example, a new $x is created when "whatever" is called, and
1129 also a new "inner", which can see the new $x. A "state" sub will only
1130 see the $x from the first call to "whatever".
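
As a concrete sketch of the "my" case (greeting_for() is a hypothetical
name, and Perl 5.26 or later is assumed), each call gets its own inner
sub bound to that call's variable:

```perl
use v5.26;
use warnings;

sub greeting_for {
    my $name = shift;
    # recreated on every call, so it sees this call's $name
    my sub greet { return "hello, $name" }
    return greet();
}

say greeting_for('alice');    # hello, alice
say greeting_for('bob');      # hello, bob
```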
1131
1132 "our" subroutines
1133
1134 Like "our $variable", "our sub" creates a lexical alias to the package
1135 subroutine of the same name.
1136
1137 The two main uses for this are to switch back to using the package sub
1138 inside an inner scope:
1139
1140 sub foo { ... }
1141
1142 sub bar {
1143 my sub foo { ... }
1144 {
1145 # need to use the outer foo here
1146 our sub foo;
1147 foo();
1148 }
1149 }
1150
1151 and to make a subroutine visible to other packages in the same scope:
1152
1153 package MySneakyModule;
1154
1155 our sub do_something { ... }
1156
1157 sub do_something_with_caller {
1158 package DB;
1159 () = caller 1; # sets @DB::args
1160 do_something(@args); # uses MySneakyModule::do_something
1161 }
1162
1163 Passing Symbol Table Entries (typeglobs)
1164 WARNING: The mechanism described in this section was originally the
1165 only way to simulate pass-by-reference in older versions of Perl.
1166 While it still works fine in modern versions, the new reference
1167 mechanism is generally easier to work with. See below.
1168
1169 Sometimes you don't want to pass the value of an array to a subroutine
1170 but rather the name of it, so that the subroutine can modify the global
1171 copy of it rather than working with a local copy. In perl you can
1172 refer to all objects of a particular name by prefixing the name with a
1173 star: *foo. This is often known as a "typeglob", because the star on
1174 the front can be thought of as a wildcard match for all the funny
1175 prefix characters on variables and subroutines and such.
1176
1177 When evaluated, the typeglob produces a scalar value that represents
1178 all the objects of that name, including any filehandle, format, or
1179 subroutine. When assigned to, it causes the name mentioned to refer to
1180 whatever "*" value was assigned to it. Example:
1181
1182 sub doubleary {
1183 local(*someary) = @_;
1184 foreach $elem (@someary) {
1185 $elem *= 2;
1186 }
1187 }
1188 doubleary(*foo);
1189 doubleary(*bar);
1190
1191 Scalars are already passed by reference, so you can modify scalar
1192 arguments without using this mechanism by referring explicitly to $_[0]
1193 etc. You can modify all the elements of an array by passing all the
1194 elements as scalars, but you have to use the "*" mechanism (or the
1195 equivalent reference mechanism) to "push", "pop", or change the size of
1196 an array. It will certainly be faster to pass the typeglob (or
1197 reference).
1198
1199 Even if you don't want to modify an array, this mechanism is useful for
1200 passing multiple arrays in a single LIST, because normally the LIST
1201 mechanism will merge all the array values so that you can't extract out
1202 the individual arrays. For more on typeglobs, see "Typeglobs and
1203 Filehandles" in perldata.
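
For comparison, here is a sketch of the reference-based equivalent
mentioned above. The subroutine names are hypothetical; the point is
that an array reference lets the callee both modify elements and change
the array's size:

```perl
use strict;
use warnings;

# Doubles every element of the caller's array in place.
sub double_all {
    my ($aref) = @_;
    $_ *= 2 for @$aref;
    return;
}

# Appends the sum of the elements, changing the array's length.
sub append_sum {
    my ($aref) = @_;
    my $sum = 0;
    $sum += $_ for @$aref;
    push @$aref, $sum;
    return;
}

my @foo = (1, 2, 3);
double_all(\@foo);
append_sum(\@foo);
print "@foo\n";    # 2 4 6 12
```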
1204
1205 When to Still Use local()
Despite the existence of "my", there are still three places where the
"local" operator shines. In fact, in these three places, you must use
"local" instead of "my".
1209
1210 1. You need to give a global variable a temporary value, especially
1211 $_.
1212
1213 The global variables, like @ARGV or the punctuation variables, must
1214 be "local"ized with "local()". This block reads in /etc/motd, and
1215 splits it up into chunks separated by lines of equal signs, which
1216 are placed in @Fields.
1217
1218 {
1219 local @ARGV = ("/etc/motd");
1220 local $/ = undef;
1221 local $_ = <>;
1222 @Fields = split /^\s*=+\s*$/;
1223 }
1224
In particular, it's important to "local"ize $_ in any routine that
assigns to it. Look out for implicit assignments in "while"
conditionals.
1228
1229 2. You need to create a local file or directory handle or a local
1230 function.
1231
1232 A function that needs a filehandle of its own must use "local()" on
1233 a complete typeglob. This can be used to create new symbol table
1234 entries:
1235
1236 sub ioqueue {
1237 local (*READER, *WRITER); # not my!
1238 pipe (READER, WRITER) or die "pipe: $!";
1239 return (*READER, *WRITER);
1240 }
1241 ($head, $tail) = ioqueue();
1242
1243 See the Symbol module for a way to create anonymous symbol table
1244 entries.
1245
1246 Because assignment of a reference to a typeglob creates an alias,
1247 this can be used to create what is effectively a local function, or
1248 at least, a local alias.
1249
1250 {
1251 local *grow = \&shrink; # only until this block exits
1252 grow(); # really calls shrink()
1253 move(); # if move() grow()s, it shrink()s too
1254 }
1255 grow(); # get the real grow() again
1256
1257 See "Function Templates" in perlref for more about manipulating
1258 functions by name in this way.
1259
1260 3. You want to temporarily change just one element of an array or
1261 hash.
1262
1263 You can "local"ize just one element of an aggregate. Usually this
1264 is done on dynamics:
1265
1266 {
1267 local $SIG{INT} = 'IGNORE';
1268 funct(); # uninterruptible
1269 }
1270 # interruptibility automatically restored here
1271
1272 But it also works on lexically declared aggregates.
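
A minimal sketch of the lexical case, assuming a file-scoped "my" hash
named %config and two hypothetical subroutines. The element is
localized for the duration of one call and restored afterwards:

```perl
use strict;
use warnings;

my %config = (verbose => 0);    # a lexical, not a package variable

sub verbosity { return $config{verbose} }

sub run_verbosely {
    local $config{verbose} = 1;    # restored when this sub returns
    return verbosity();
}

print run_verbosely(), " ", verbosity(), "\n";    # 1 0
```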
1273
1274 Pass by Reference
1275 If you want to pass more than one array or hash into a function--or
1276 return them from it--and have them maintain their integrity, then
1277 you're going to have to use an explicit pass-by-reference. Before you
1278 do that, you need to understand references as detailed in perlref.
1279 This section may not make much sense to you otherwise.
1280
Here are a few simple examples. First, let's pass in several arrays to
a function and have it "pop" all of them, returning a new list of all
their former last elements:
1284
1285 @tailings = popmany ( \@a, \@b, \@c, \@d );
1286
1287 sub popmany {
1288 my $aref;
1289 my @retlist;
1290 foreach $aref ( @_ ) {
1291 push @retlist, pop @$aref;
1292 }
1293 return @retlist;
1294 }
1295
1296 Here's how you might write a function that returns a list of keys
1297 occurring in all the hashes passed to it:
1298
1299 @common = inter( \%foo, \%bar, \%joe );
1300 sub inter {
1301 my ($k, $href, %seen); # locals
1302 foreach $href (@_) {
1303 while ( $k = each %$href ) {
1304 $seen{$k}++;
1305 }
1306 }
1307 return grep { $seen{$_} == @_ } keys %seen;
1308 }
1309
1310 So far, we're using just the normal list return mechanism. What
1311 happens if you want to pass or return a hash? Well, if you're using
1312 only one of them, or you don't mind them concatenating, then the normal
1313 calling convention is ok, although a little expensive.
1314
1315 Where people get into trouble is here:
1316
1317 (@a, @b) = func(@c, @d);
1318 or
1319 (%a, %b) = func(%c, %d);
1320
That syntax simply won't work. It sets just @a or %a and clears the @b
or %b. Plus, the function wasn't passed two separate arrays or hashes:
it got one long list in @_, as always.
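
The flattening is easy to observe. In this sketch, sizes() is a
hypothetical subroutine that tries to receive two arrays and reports
how many elements each one actually got:

```perl
use strict;
use warnings;

sub sizes {
    my (@first, @second) = @_;    # @first takes the whole flat list
    return (scalar @first, scalar @second);
}

my @c = (1, 2);
my @d = (3, 4, 5);
my ($n_first, $n_second) = sizes(@c, @d);
print "$n_first $n_second\n";    # 5 0
```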
1324
If you can arrange for everyone to deal with this through references,
it's cleaner code, although not so nice to look at. Here's a function
that takes two array references as arguments, returning the two array
references in order of how many elements the arrays have in them:
1329
1330 ($aref, $bref) = func(\@c, \@d);
1331 print "@$aref has more than @$bref\n";
1332 sub func {
1333 my ($cref, $dref) = @_;
1334 if (@$cref > @$dref) {
1335 return ($cref, $dref);
1336 } else {
1337 return ($dref, $cref);
1338 }
1339 }
1340
1341 It turns out that you can actually do this also:
1342
1343 (*a, *b) = func(\@c, \@d);
1344 print "@a has more than @b\n";
1345 sub func {
1346 local (*c, *d) = @_;
1347 if (@c > @d) {
1348 return (\@c, \@d);
1349 } else {
1350 return (\@d, \@c);
1351 }
1352 }
1353
1354 Here we're using the typeglobs to do symbol table aliasing. It's a tad
1355 subtle, though, and also won't work if you're using "my" variables,
1356 because only globals (even in disguise as "local"s) are in the symbol
1357 table.
1358
If you're passing around filehandles, you could usually just use the
bare typeglob, like *STDOUT, but typeglob references work, too. For
example:
1362
1363 splutter(\*STDOUT);
1364 sub splutter {
1365 my $fh = shift;
1366 print $fh "her um well a hmmm\n";
1367 }
1368
1369 $rec = get_rec(\*STDIN);
1370 sub get_rec {
1371 my $fh = shift;
1372 return scalar <$fh>;
1373 }
1374
If you're planning on generating new filehandles, you could do this.
Note that it passes back just the bare *FH, not its reference.
1377
1378 sub openit {
1379 my $path = shift;
1380 local *FH;
1381 return open (FH, $path) ? *FH : undef;
1382 }
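
On Perl 5.6 and later, "open" can also autovivify a filehandle into a
lexical scalar, which avoids the typeglob entirely. The following is a
sketch of that alternative, not the approach shown above;
open_for_reading() is a hypothetical name:

```perl
use strict;
use warnings;

# Returns a lexical filehandle, or nothing (undef in scalar context)
# if the file cannot be opened.
sub open_for_reading {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;
    return $fh;
}

my $fh = open_for_reading($0);    # $0 is the running script's own path
print defined $fh ? "opened\n" : "could not open: $!\n";
```

The handle is closed automatically when the last reference to it goes
away.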
1383
1384 Prototypes
1385 Perl supports a very limited kind of compile-time argument checking
1386 using function prototyping. This can be declared in either the PROTO
1387 section or with a prototype attribute. If you declare either of
1388
1389 sub mypush (\@@)
1390 sub mypush :prototype(\@@)
1391
1392 then "mypush()" takes arguments exactly like "push()" does.
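
A minimal sketch of such a "mypush()", assuming signatures are not
enabled so that the short PROTO syntax applies. The body is one
possible implementation, not a definitive one:

```perl
use strict;
use warnings;

sub mypush (\@@) {
    my ($aref, @items) = @_;    # $aref is \@stack, supplied by the prototype
    push @$aref, @items;
    return scalar @$aref;
}

my @stack = (1);
mypush @stack, 2, 3;    # no backslash needed at the call site
print "@stack\n";       # 1 2 3
```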
1393
1394 If subroutine signatures are enabled (see "Signatures"), then the
1395 shorter PROTO syntax is unavailable, because it would clash with
1396 signatures. In that case, a prototype can only be declared in the form
1397 of an attribute.
1398
1399 The function declaration must be visible at compile time. The
1400 prototype affects only interpretation of new-style calls to the
1401 function, where new-style is defined as not using the "&" character.
1402 In other words, if you call it like a built-in function, then it
1403 behaves like a built-in function. If you call it like an old-fashioned
1404 subroutine, then it behaves like an old-fashioned subroutine. It
1405 naturally falls out from this rule that prototypes have no influence on
1406 subroutine references like "\&foo" or on indirect subroutine calls like
1407 "&{$subref}" or "$subref->()".
1408
1409 Method calls are not influenced by prototypes either, because the
1410 function to be called is indeterminate at compile time, since the exact
1411 code called depends on inheritance.
1412
1413 Because the intent of this feature is primarily to let you define
1414 subroutines that work like built-in functions, here are prototypes for
1415 some other functions that parse almost exactly like the corresponding
1416 built-in.
1417
    Declared as             Called as

    sub mylink ($$)         mylink $old, $new
    sub myvec ($$$)         myvec $var, $offset, 1
    sub myindex ($$;$)      myindex &getstring, "substr"
    sub mysyswrite ($$$;$)  mysyswrite $buf, 0, length($buf) - $off, $off
    sub myreverse (@)       myreverse $a, $b, $c
    sub myjoin ($@)         myjoin ":", $a, $b, $c
    sub mypop (\@)          mypop @array
    sub mysplice (\@$$@)    mysplice @array, 0, 2, @pushme
    sub mykeys (\[%@])      mykeys $hashref->%*
    sub myopen (*;$)        myopen HANDLE, $name
    sub mypipe (**)         mypipe READHANDLE, WRITEHANDLE
    sub mygrep (&@)         mygrep { /foo/ } $a, $b, $c
    sub myrand (;$)         myrand 42
    sub mytime ()           mytime
1434
1435 Any backslashed prototype character represents an actual argument that
1436 must start with that character (optionally preceded by "my", "our" or
1437 "local"), with the exception of "$", which will accept any scalar
1438 lvalue expression, such as "$foo = 7" or "my_function()->[0]". The
1439 value passed as part of @_ will be a reference to the actual argument
1440 given in the subroutine call, obtained by applying "\" to that
1441 argument.
1442
1443 You can use the "\[]" backslash group notation to specify more than one
1444 allowed argument type. For example:
1445
1446 sub myref (\[$@%&*])
1447
1448 will allow calling myref() as
1449
1450 myref $var
1451 myref @array
1452 myref %hash
1453 myref &sub
1454 myref *glob
1455
1456 and the first argument of myref() will be a reference to a scalar, an
1457 array, a hash, a code, or a glob.
1458
1459 Unbackslashed prototype characters have special meanings. Any
1460 unbackslashed "@" or "%" eats all remaining arguments, and forces list
1461 context. An argument represented by "$" forces scalar context. An "&"
1462 requires an anonymous subroutine, which, if passed as the first
1463 argument, does not require the "sub" keyword or a subsequent comma.
1464
1465 A "*" allows the subroutine to accept a bareword, constant, scalar
1466 expression, typeglob, or a reference to a typeglob in that slot. The
1467 value will be available to the subroutine either as a simple scalar, or
1468 (in the latter two cases) as a reference to the typeglob. If you wish
1469 to always convert such arguments to a typeglob reference, use
1470 Symbol::qualify_to_ref() as follows:
1471
1472 use Symbol 'qualify_to_ref';
1473
1474 sub foo (*) {
1475 my $fh = qualify_to_ref(shift, caller);
1476 ...
1477 }
1478
1479 The "+" prototype is a special alternative to "$" that will act like
1480 "\[@%]" when given a literal array or hash variable, but will otherwise
1481 force scalar context on the argument. This is useful for functions
1482 which should accept either a literal array or an array reference as the
1483 argument:
1484
1485 sub mypush (+@) {
1486 my $aref = shift;
1487 die "Not an array or arrayref" unless ref $aref eq 'ARRAY';
1488 push @$aref, @_;
1489 }
1490
1491 When using the "+" prototype, your function must check that the
1492 argument is of an acceptable type.
1493
1494 A semicolon (";") separates mandatory arguments from optional
1495 arguments. It is redundant before "@" or "%", which gobble up
1496 everything else.
1497
1498 As the last character of a prototype, or just before a semicolon, a "@"
1499 or a "%", you can use "_" in place of "$": if this argument is not
1500 provided, $_ will be used instead.
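
A small sketch of the "_" prototype, using a hypothetical shout()
function. When called with no argument, it receives $_:

```perl
use strict;
use warnings;

sub shout (_) { return uc $_[0] }

my @loud;
push @loud, shout() for qw(a b);    # no argument given: $_ is used

print shout("c"), " @loud\n";       # C A B
```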
1501
1502 Note how the last three examples in the table above are treated
1503 specially by the parser. "mygrep()" is parsed as a true list operator,
1504 "myrand()" is parsed as a true unary operator with unary precedence the
1505 same as "rand()", and "mytime()" is truly without arguments, just like
1506 "time()". That is, if you say
1507
1508 mytime +2;
1509
1510 you'll get "mytime() + 2", not mytime(2), which is how it would be
1511 parsed without a prototype. If you want to force a unary function to
1512 have the same precedence as a list operator, add ";" to the end of the
1513 prototype:
1514
1515 sub mygetprotobynumber($;);
1516 mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b)
1517
1518 The interesting thing about "&" is that you can generate new syntax
1519 with it, provided it's in the initial position:
1520
1521 sub try (&@) {
1522 my($try,$catch) = @_;
1523 eval { &$try };
1524 if ($@) {
1525 local $_ = $@;
1526 &$catch;
1527 }
1528 }
1529 sub catch (&) { $_[0] }
1530
1531 try {
1532 die "phooey";
1533 } catch {
1534 /phooey/ and print "unphooey\n";
1535 };
1536
1537 That prints "unphooey". (Yes, there are still unresolved issues having
1538 to do with visibility of @_. I'm ignoring that question for the
1539 moment. (But note that if we make @_ lexically scoped, those anonymous
1540 subroutines can act like closures... (Gee, is this sounding a little
1541 Lispish? (Never mind.))))
1542
1543 And here's a reimplementation of the Perl "grep" operator:
1544
1545 sub mygrep (&@) {
1546 my $code = shift;
1547 my @result;
1548 foreach $_ (@_) {
1549 push(@result, $_) if &$code;
1550 }
1551 @result;
1552 }
1553
1554 Some folks would prefer full alphanumeric prototypes. Alphanumerics
1555 have been intentionally left out of prototypes for the express purpose
1556 of someday in the future adding named, formal parameters. The current
1557 mechanism's main goal is to let module writers provide better
1558 diagnostics for module users. Larry feels the notation quite
1559 understandable to Perl programmers, and that it will not intrude
1560 greatly upon the meat of the module, nor make it harder to read. The
1561 line noise is visually encapsulated into a small pill that's easy to
1562 swallow.
1563
1564 If you try to use an alphanumeric sequence in a prototype you will
1565 generate an optional warning - "Illegal character in prototype...".
1566 Unfortunately earlier versions of Perl allowed the prototype to be used
1567 as long as its prefix was a valid prototype. The warning may be
1568 upgraded to a fatal error in a future version of Perl once the majority
1569 of offending code is fixed.
1570
1571 It's probably best to prototype new functions, not retrofit prototyping
1572 into older ones. That's because you must be especially careful about
1573 silent impositions of differing list versus scalar contexts. For
1574 example, if you decide that a function should take just one parameter,
1575 like this:
1576
1577 sub func ($) {
1578 my $n = shift;
1579 print "you gave me $n\n";
1580 }
1581
1582 and someone has been calling it with an array or expression returning a
1583 list:
1584
1585 func(@foo);
1586 func( $text =~ /\w+/g );
1587
1588 Then you've just supplied an automatic "scalar" in front of their
1589 argument, which can be more than a bit surprising. The old @foo which
1590 used to hold one thing doesn't get passed in. Instead, "func()" now
1591 gets passed in a 1; that is, the number of elements in @foo. And the
1592 "m//g" gets called in scalar context so instead of a list of words it
1593 returns a boolean result and advances "pos($text)". Ouch!
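
The effect on an array argument can be sketched as follows. show_arg()
is a hypothetical function, and a three-element array is used so that
the count stands out from the contents:

```perl
use strict;
use warnings;

sub show_arg ($) { return $_[0] }

my @foo = qw(x y z);
print show_arg(@foo), "\n";     # 3: the prototype imposed scalar(@foo)
print &show_arg(@foo), "\n";    # x: the & form bypasses the prototype
```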
1594
1595 If a sub has both a PROTO and a BLOCK, the prototype is not applied
1596 until after the BLOCK is completely defined. This means that a
1597 recursive function with a prototype has to be predeclared for the
1598 prototype to take effect, like so:
1599
1600 sub foo($$);
1601 sub foo($$) {
1602 foo 1, 2;
1603 }
1604
1605 This is all very powerful, of course, and should be used only in
1606 moderation to make the world a better place.
1607
1608 Constant Functions
1609 Functions with a prototype of "()" are potential candidates for
1610 inlining. If the result after optimization and constant folding is
1611 either a constant or a lexically-scoped scalar which has no other
1612 references, then it will be used in place of function calls made
1613 without "&". Calls made using "&" are never inlined. (See constant
1614 for an easy way to declare most constants.)
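
As a brief sketch of the constant pragma mentioned above, the flag
names here are illustrative only. Each name it declares is a subroutine
with an empty prototype, so it can be used without parentheses:

```perl
use strict;
use warnings;

use constant {
    FLAG_FOO => 1 << 8,
    FLAG_BAR => 1 << 9,
};
use constant FLAG_MASK => FLAG_FOO | FLAG_BAR;

print FLAG_MASK, "\n";    # 768
```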
1615
1616 The following functions would all be inlined:
1617
1618 sub pi () { 3.14159 } # Not exact, but close.
1619 sub PI () { 4 * atan2 1, 1 } # As good as it gets,
1620 # and it's inlined, too!
1621 sub ST_DEV () { 0 }
1622 sub ST_INO () { 1 }
1623
1624 sub FLAG_FOO () { 1 << 8 }
1625 sub FLAG_BAR () { 1 << 9 }
1626 sub FLAG_MASK () { FLAG_FOO | FLAG_BAR }
1627
1628 sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) }
1629
1630 sub N () { int(OPT_BAZ) / 3 }
1631
1632 sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO }
1633 sub FOO_SET2 () { if (FLAG_MASK & FLAG_FOO) { 1 } }
1634
1635 (Be aware that the last example was not always inlined in Perl 5.20 and
1636 earlier, which did not behave consistently with subroutines containing
1637 inner scopes.) You can countermand inlining by using an explicit
1638 "return":
1639
1640 sub baz_val () {
1641 if (OPT_BAZ) {
1642 return 23;
1643 }
1644 else {
1645 return 42;
1646 }
1647 }
1648 sub bonk_val () { return 12345 }
1649
1650 As alluded to earlier you can also declare inlined subs dynamically at
1651 BEGIN time if their body consists of a lexically-scoped scalar which
1652 has no other references. Only the first example here will be inlined:
1653
    BEGIN {
        my $var = 1;
        no strict 'refs';
        *INLINED = sub () { $var };
    }

    BEGIN {
        my $var = 1;
        my $ref = \$var;
        no strict 'refs';
        *NOT_INLINED = sub () { $var };
    }
1666
1667 A not so obvious caveat with this (see [RT #79908]) is what happens if
1668 the variable is potentially modifiable. For example:
1669
    BEGIN {
        my $x = 10;
        *FOO = sub () { $x };
        $x++;
    }
    print FOO(); # printed 10 prior to 5.32.0
1676
1677 From Perl 5.22 onwards this gave a deprecation warning, and from Perl
1678 5.32 onwards it became a run-time error. Previously the variable was
1679 immediately inlined, and stopped behaving like a normal lexical
1680 variable; so it printed 10, not 11.
1681
1682 If you still want such a subroutine to be inlined (with no warning),
1683 make sure the variable is not used in a context where it could be
1684 modified aside from where it is declared.
1685
    # Fine, no warning
    BEGIN {
        my $x = 54321;
        *INLINED = sub () { $x };
    }
    # Error
    BEGIN {
        my $x;
        $x = 54321;
        *ALSO_INLINED = sub () { $x };
    }
1697
1698 Perl 5.22 also introduces the experimental "const" attribute as an
1699 alternative. (Disable the "experimental::const_attr" warnings if you
1700 want to use it.) When applied to an anonymous subroutine, it forces
1701 the sub to be called when the "sub" expression is evaluated. The
1702 return value is captured and turned into a constant subroutine:
1703
    my $x = 54321;
    *INLINED = sub : const { $x };
    $x++;
1707
1708 The return value of "INLINED" in this example will always be 54321,
1709 regardless of later modifications to $x. You can also put any
1710 arbitrary code inside the sub, as it will be executed immediately and
1711 its return value captured the same way.
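To make the "executed immediately" point concrete, here is a minimal
sketch, assuming Perl 5.22 or later (where the experimental "const"
attribute exists). The names $evaluations and $answer are hypothetical.
The body runs once, when the "sub" expression is evaluated, and later
calls only return the captured value:

```perl
use strict;
use warnings;
no warnings 'experimental::const_attr';    # the attribute is experimental

my $evaluations = 0;

# The body is run here, once; its return value becomes a constant sub.
my $answer = sub : const { $evaluations++; 6 * 7 };

# Calling the resulting sub does not run the body again.
my @values = ($answer->(), $answer->(), $answer->());
print "values: @values, body evaluations: $evaluations\n";
```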
1712
1713 If you really want a subroutine with a "()" prototype that returns a
1714 lexical variable you can easily force it to not be inlined by adding an
1715 explicit "return":
1716
    BEGIN {
        my $x = 10;
        *FOO = sub () { return $x };
        $x++;
    }
    print FOO(); # prints 11
1723
1724 The easiest way to tell if a subroutine was inlined is by using
1725 B::Deparse. Consider this example of two subroutines returning 1, one
1726 with a "()" prototype causing it to be inlined, and one without (with
1727 deparse output truncated for clarity):
1728
    $ perl -MO=Deparse -le 'sub ONE { 1 } if (ONE) { print ONE if ONE }'
    sub ONE {
        1;
    }
    if (ONE ) {
        print ONE() if ONE ;
    }

    $ perl -MO=Deparse -le 'sub ONE () { 1 } if (ONE) { print ONE if ONE }'
    sub ONE () { 1 }
    do {
        print 1
    };
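The same check can be made from inside a program. The following is a
rough sketch that uses the "coderef2text" method of B::Deparse. The
names DEBUG and DEBUG_CALL are hypothetical, and the exact deparsed
text varies between Perl versions, so the sketch only prints it:

```perl
use strict;
use warnings;
use B::Deparse;

sub DEBUG      () { 0 }           # eligible for inlining
sub DEBUG_CALL () { return 0 }    # explicit "return" prevents inlining

my $deparse = B::Deparse->new;

# In the first sub the constant is folded at compile time, so the
# guarded statement is optimized away; in the second it remains a call.
my $inlined = $deparse->coderef2text(sub { print "trace\n" if DEBUG;      1 });
my $called  = $deparse->coderef2text(sub { print "trace\n" if DEBUG_CALL; 1 });

print "with DEBUG:\n$inlined\n\nwith DEBUG_CALL:\n$called\n";
```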
1742
1743 If you redefine a subroutine that was eligible for inlining, you'll get
1744 a warning by default. You can use this warning to tell whether or not
1745 a particular subroutine is considered inlinable, since it's different
1746 from the warning for overriding non-inlined subroutines:
1747
    $ perl -e 'sub one () {1} sub one () {2}'
    Constant subroutine one redefined at -e line 1.
    $ perl -we 'sub one {1} sub one {2}'
    Subroutine one redefined at -e line 1.
1752
1753 The warning is considered severe enough not to be affected by the -w
1754 switch (or its absence) because previously compiled invocations of the
1755 function will still be using the old value of the function. If you
1756 need to be able to redefine the subroutine, you need to ensure that it
1757 isn't inlined, either by dropping the "()" prototype (which changes
1758 calling semantics, so beware) or by thwarting the inlining mechanism in
1759 some other way, e.g. by adding an explicit "return", as mentioned
1760 above:
1761
    sub not_inlined () { return 23 }
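The effect on previously compiled invocations can be observed
directly. In this minimal sketch (LIMIT, limit_value and snapshot are
hypothetical names), both subs are redefined at run time, but only the
non-inlined one is seen to change by code compiled earlier:

```perl
use strict;
use warnings;

sub LIMIT       () { 10 }           # inlinable constant
sub limit_value () { return 10 }    # not inlined because of "return"

# Both calls are compiled here; LIMIT is replaced by the constant 10.
sub snapshot { return (LIMIT, limit_value) }

{
    no warnings 'redefine';         # silence the redefinition warnings
    *LIMIT       = sub () { 20 };
    *limit_value = sub () { return 20 };
}

my @seen = snapshot();
print "@seen\n";                    # prints "10 20"
```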
1763
1764 Overriding Built-in Functions
1765 Many built-in functions may be overridden, though this should be tried
1766 only occasionally and for good reason. Typically this might be done by
1767 a package attempting to emulate missing built-in functionality on a
1768 non-Unix system.
1769
1770 Overriding may be done only by importing the name from a module at
1771 compile time--ordinary predeclaration isn't good enough. However, the
1772 "use subs" pragma lets you, in effect, predeclare subs via the import
1773 syntax, and these names may then override built-in ones:
1774
    use subs 'chdir', 'chroot', 'chmod', 'chown';
    chdir $somewhere;
    sub chdir { ... }
1778
1779 To unambiguously refer to the built-in form, precede the built-in name
1780 with the special package qualifier "CORE::". For example, saying
1781 "CORE::open()" always refers to the built-in "open()", even if the
1782 current package has imported some other subroutine called "&open()"
1783 from elsewhere. Even though it looks like a regular function call, it
1784 isn't: the CORE:: prefix in that case is part of Perl's syntax, and
1785 works for any keyword, regardless of what is in the CORE package.
1786 Taking a reference to it, that is, "\&CORE::open", only works for some
1787 keywords. See CORE.
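Putting "use subs" and "CORE::" together, here is a minimal sketch of a
package-local override of "time". The fixed timestamp and the variable
names are hypothetical and serve only to make the two calls
distinguishable:

```perl
use strict;
use warnings;
use subs 'time';                  # compile-time import, so "time" now means our sub

my $FROZEN_AT = 1_000_000_000;    # hypothetical fixed timestamp for this sketch

sub time { return $FROZEN_AT }

my $fake = time;                  # calls the override defined above
my $real = CORE::time;            # always the built-in
print "override: $fake, built-in: $real\n";
```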
1788
1789 Library modules should not in general export built-in names like "open"
1790 or "chdir" as part of their default @EXPORT list, because these may
1791 sneak into someone else's namespace and change the semantics
1792 unexpectedly. Instead, if the module adds that name to @EXPORT_OK,
1793 then it's possible for a user to import the name explicitly, but not
1794 implicitly. That is, they could say
1795
    use Module 'open';
1797
1798 and it would import the "open" override. But if they said
1799
    use Module;
1801
1802 they would get the default imports without overrides.
1803
1804 The foregoing mechanism for overriding built-ins is restricted, quite
1805 deliberately, to the package that requests the import. There is a
1806 second method that is sometimes applicable when you wish to override a
1807 built-in everywhere, without regard to namespace boundaries. This is
1808 achieved by importing a sub into the special namespace
1809 "CORE::GLOBAL::". Here is an example that quite brazenly replaces the
1810 "glob" operator with something that understands regular expressions.
1811
    package REGlob;
    require Exporter;
    @ISA = 'Exporter';
    @EXPORT_OK = 'glob';

    sub import {
        my $pkg = shift;
        return unless @_;
        my $sym = shift;
        my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
        $pkg->export($where, $sym, @_);
    }

    sub glob {
        my $pat = shift;
        my @got;
        if (opendir my $d, '.') {
            @got = grep /$pat/, readdir $d;
            closedir $d;
        }
        return @got;
    }
    1;
1835
1836 And here's how it could be (ab)used:
1837
    #use REGlob 'GLOBAL_glob';      # override glob() in ALL namespaces
    package Foo;
    use REGlob 'glob';              # override glob() in Foo:: only
    print for <^[a-z_]+\.pm\$>;     # show all pragmatic modules
1842
1843 The initial comment shows a contrived, even dangerous example. By
1844 overriding "glob" globally, you would be forcing the new (and
1845 subversive) behavior for the "glob" operator for every namespace,
1846 without the complete cognizance or cooperation of the modules that own
1847 those namespaces. Naturally, this should be done with extreme
1848 caution--if it must be done at all.
1849
1850 The "REGlob" example above does not implement all the support needed to
1851 cleanly override perl's "glob" operator. The built-in "glob" has
1852 different behaviors depending on whether it appears in a scalar or list
1853 context, but our "REGlob" doesn't. Indeed, many Perl built-ins have
1854 such context-sensitive behaviors, and these must be adequately
1855 supported by a properly written override. For a fully functional
1856 example of overriding "glob", study the implementation of
1857 "File::DosGlob" in the standard library.
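For a smaller view of the "CORE::GLOBAL::" mechanism by itself, here is
a minimal sketch that replaces "sleep" everywhere with a sub that only
records the requested duration. The package name and the @naps array
are hypothetical. The assignment is made in a BEGIN block because the
override must be in place before the affected code is compiled:

```perl
use strict;
use warnings;
no warnings 'once';    # the CORE::GLOBAL glob is mentioned only once

our @naps;             # requested durations, recorded instead of sleeping

BEGIN {
    # Assigning from package main marks the sub as imported into
    # CORE::GLOBAL, which is what makes it an override.
    *CORE::GLOBAL::sleep = sub { push @naps, $_[0]; return $_[0] };
}

package Some::Other::Package;    # hypothetical, unrelated namespace

my $slept = sleep 5;             # resolves to the global override

package main;
print "recorded: @naps\n";
```

Note that this sketch ignores the case of "sleep" called with no
argument, which a careful override would have to handle.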
1858
1859 When you override a built-in, your replacement should be consistent (if
1860 possible) with the built-in native syntax. You can achieve this by
1861 using a suitable prototype. To get the prototype of an overridable
1862 built-in, use the "prototype" function with an argument of
1863 "CORE::builtin_name" (see "prototype" in perlfunc).
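As a quick illustration, this sketch prints the prototypes of a few
built-ins. The choice of names is arbitrary; "system" and "chomp" are
included because their syntax cannot be expressed as a prototype, so
"prototype" returns undef for them:

```perl
use strict;
use warnings;

for my $name (qw(open rename system chomp)) {
    my $proto = prototype("CORE::$name");
    printf "%-8s %s\n", $name,
        defined $proto ? "prototype '$proto'" : "no expressible prototype";
}
```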
1864
1865 Note however that some built-ins can't have their syntax expressed by a
1866 prototype (such as "system" or "chomp"). If you override them you
1867 won't be able to fully mimic their original syntax.
1868
1869 The built-ins "do", "require" and "glob" can also be overridden, but
1870 due to special magic, their original syntax is preserved, and you don't
1871 have to define a prototype for their replacements. (You can't override
1872 the "do BLOCK" syntax, though.)
1873
1874 "require" has special additional dark magic: if you invoke your
1875 "require" replacement as "require Foo::Bar", it will actually receive
1876 the argument "Foo/Bar.pm" in @_. See "require" in perlfunc.
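The following sketch shows the argument that a "require" replacement
receives. It installs a global override that logs each request and
then hands over to the built-in. The @requested array is a hypothetical
name, and File::Spec is used only because it is a core module:

```perl
use strict;
use warnings;
no warnings 'once';    # the CORE::GLOBAL glob is mentioned only once

our @requested;        # arguments as seen by the replacement

BEGIN {
    *CORE::GLOBAL::require = sub {
        push @requested, $_[0];
        return CORE::require($_[0]);    # hand over to the built-in
    };
}

require File::Spec;    # bareword form; the override receives a file name

print "$_\n" for @requested;
```

The list may also contain files that File::Spec itself requires, since
the override is global.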
1877
1878 And, as you'll have noticed from the previous example, if you override
1879 "glob", the "<*>" glob operator is overridden as well.
1880
1881 In a similar fashion, overriding the "readline" function also overrides
1882 the equivalent I/O operator "<FILEHANDLE>". Likewise, overriding
1883 "readpipe" overrides the operators "``" and "qx//".
1884
1885 Finally, some built-ins (e.g. "exists" or "grep") can't be overridden.
1886
1887 Autoloading
1888 If you call a subroutine that is undefined, you would ordinarily get an
1889 immediate, fatal error complaining that the subroutine doesn't exist.
1890 (Likewise for subroutines being used as methods, when the method
1891 doesn't exist in any base class of the class's package.) However, if
1892 an "AUTOLOAD" subroutine is defined in the package or packages used to
1893 locate the original subroutine, then that "AUTOLOAD" subroutine is
1894 called with the arguments that would have been passed to the original
1895 subroutine. The fully qualified name of the original subroutine
1896 magically appears in the global $AUTOLOAD variable of the same package
1897 as the "AUTOLOAD" routine. The name is not passed as an ordinary
1898 argument because, er, well, just because, that's why. (As an
1899 exception, a method call to a nonexistent "import" or "unimport" method
1900 is just skipped instead. Also, if the AUTOLOAD subroutine is an XSUB,
1901 there are other ways to retrieve the subroutine name. See "Autoloading
1902 with XSUBs" in perlguts for details.)
1903
1904 Many "AUTOLOAD" routines load in a definition for the requested
1905 subroutine using eval(), then execute that subroutine using a special
1906 form of goto() that erases the stack frame of the "AUTOLOAD" routine
1907 without a trace. (See the source to the standard module documented in
1908 AutoLoader, for example.) But an "AUTOLOAD" routine can also just
1909 emulate the routine and never define it. For example, let's pretend
1910 that a function that wasn't defined should just invoke "system" with
1911 those arguments. All you'd do is:
1912
    sub AUTOLOAD {
        our $AUTOLOAD;              # keep 'use strict' happy
        my $program = $AUTOLOAD;
        $program =~ s/.*:://;
        system($program, @_);
    }
    date();
    who();
    ls('-l');
1922
1923 In fact, if you predeclare functions you want to call that way, you
1924 don't even need parentheses:
1925
    use subs qw(date who ls);
    date;
    who;
    ls '-l';
1930
1931 A more complete example of this is the Shell module on CPAN, which can
1932 treat undefined subroutine calls as calls to external programs.
1933
1934 Mechanisms are available to help module writers split their modules
1935 into autoloadable files. See the standard AutoLoader module described
1936 in AutoLoader and in AutoSplit, the standard SelfLoader module in
1937 SelfLoader, and the document on adding C functions to Perl code in
1938 perlxs.
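To illustrate the other style mentioned above, where "AUTOLOAD"
defines the missing routine and then jumps to it with "goto", here is a
minimal sketch. The "Record" class and its fields are hypothetical, and
the routine is built as a closure assigned to a glob rather than with
eval(), which is a variation on the technique described:

```perl
use strict;
use warnings;

package Record;    # hypothetical class used only for this sketch

our $AUTOLOAD;
my %IS_FIELD = map { $_ => 1 } qw(name size);

sub new {
    my ($class, %fields) = @_;
    return bless {%fields}, $class;
}

sub AUTOLOAD {
    my $full_name = $AUTOLOAD;                 # e.g. "Record::name"
    (my $method = $full_name) =~ s/.*:://;
    return if $method eq 'DESTROY';            # do not autoload destructors
    die "No method $method in package " . __PACKAGE__ . "\n"
        unless $IS_FIELD{$method};

    my $accessor = sub {
        my $self = shift;
        $self->{$method} = shift if @_;
        return $self->{$method};
    };
    no strict 'refs';
    *{$full_name} = $accessor;    # define it, so AUTOLOAD is skipped next time
    goto &$accessor;              # replace this frame; @_ is passed along
}

package main;

my $rec = Record->new(name => 'log.txt', size => 10);
print $rec->name, "\n";           # first call goes through AUTOLOAD
print $rec->size(20), "\n";       # setter form, also autoloaded once
```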
1939
1940 Subroutine Attributes
1941 A subroutine declaration or definition may have a list of attributes
1942 associated with it. If such an attribute list is present, it is broken
1943 up at space or colon boundaries and treated as though a "use
1944 attributes" had been seen. See attributes for details about what
1945 attributes are currently supported. Unlike the limitation with the
1946 obsolescent "use attrs", the "sub : ATTRLIST" syntax works to associate
1947 the attributes with a pre-declaration, and not just with a subroutine
1948 definition.
1949
1950 The attributes must be valid as simple identifier names (without any
1951 punctuation other than the '_' character). They may have a parameter
1952 list appended, which is only checked for whether its parentheses
1953 ('(',')') nest properly.
1954
1955 Examples of valid syntax (even though the attributes are unknown):
1956
    sub fnord (&\%) : switch(10,foo(7,3))  :  expensive;
    sub plugh () : Ugly('\(") :Bad;
    sub xyzzy : _5x5 { ... }
1960
1961 Examples of invalid syntax:
1962
    sub fnord : switch(10,foo(); # ()-string not balanced
    sub snoid : Ugly('(');        # ()-string not balanced
    sub xyzzy : 5x5;              # "5x5" not a valid identifier
    sub plugh : Y2::north;        # "Y2::north" not a simple identifier
    sub snurt : foo + bar;        # "+" not a colon or space
1968
1969 The attribute list is passed as a list of constant strings to the code
1970 which associates them with the subroutine. In particular, the second
1971 example of valid syntax above currently looks like this in terms of how
1972 it's parsed and invoked:
1973
    use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';
1975
1976 For further details on attribute lists and their manipulation, see
1977 attributes and Attribute::Handlers.
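As a rough sketch of what "the code which associates them with the
subroutine" can look like, a package may define a
"MODIFY_CODE_ATTRIBUTES" handler, as described in attributes. The
package name, the attribute names and the %ATTRS_FOR hash below are all
hypothetical:

```perl
use strict;
use warnings;

package Tagged;    # hypothetical package for this sketch

our %ATTRS_FOR;    # stringified code reference => attribute strings

# Called at compile time for each sub in this package that is
# declared with attributes not built into Perl.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $code, @attrs) = @_;
    $ATTRS_FOR{$code} = [@attrs];
    return;        # empty list: no attribute was rejected
}

sub fetch : Cached(60) : Role(admin) { return 'data' }

package main;

my @seen = @{ $Tagged::ATTRS_FOR{ \&Tagged::fetch } };
print join(', ', @seen), "\n";
```

The handler receives each attribute, including its parameter list, as
one constant string, which matches the "use attributes" form shown
above.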
1978
1979 SEE ALSO
1980 See "Function Templates" in perlref for more about references and
1981 closures. See perlxs if you'd like to learn about calling C
1982 subroutines from Perl. See perlembed if you'd like to learn about
1983 calling Perl subroutines from C. See perlmod to learn about bundling
1984 up your functions in separate files. See perlmodlib to learn what
1985 library modules come standard on your system. See perlootut to learn
1986 how to make object method calls.
1987
1988
1989
1990perl v5.36.0 2022-08-30 PERLSUB(1)