perlfaq4(3pm)

perlfaq4(3)           User Contributed Perl Documentation          perlfaq4(3)

NAME

       perlfaq4 - Data Manipulation

VERSION

       version 5.20230812

DESCRIPTION

       This section of the FAQ answers questions related to manipulating
       numbers, dates, strings, arrays, hashes, and miscellaneous data issues.

Data: Numbers

16   Why am I getting long decimals (eg, 19.9499999999999) instead of the
17       numbers I should be getting (eg, 19.95)?
18       For the long explanation, see David Goldberg's "What Every Computer
19       Scientist Should Know About Floating-Point Arithmetic"
20       (<http://web.cse.msu.edu/~cse320/Documents/FloatingPoint.pdf>).
21
22       Internally, your computer represents floating-point numbers in binary.
23       Digital (as in powers of two) computers cannot store all numbers
24       exactly. Some real numbers lose precision in the process. This is a
25       problem with how computers store numbers and affects all computer
26       languages, not just Perl.
27
28       perlnumber shows the gory details of number representations and
29       conversions.
30
31       To limit the number of decimal places in your numbers, you can use the
32       "printf" or "sprintf" function. See "Floating-point Arithmetic" in
33       perlop for more details.
34
35           printf "%.2f", 10/3;
36
37           my $number = sprintf "%.2f", 10/3;
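
       As a rough illustration, you can print more digits than the value
       really has to see the stored approximation, and compare floating-point
       values with a tolerance instead of "==". The helper name nearly_equal
       and the tolerance 1e-9 are choices made for this sketch, not something
       Perl provides:

```perl
use strict;
use warnings;

# Hypothetical helper: true when two numbers differ by less than $eps.
sub nearly_equal {
    my ($x, $y, $eps) = @_;
    $eps = 1e-9 unless defined $eps;
    return abs($x - $y) < $eps;
}

my $price = 19.95;

# Asking for more digits than the value really has exposes the
# binary approximation that is actually stored.
printf "%.20f\n", $price;

# Rounding for display hides the representation error again.
printf "%.2f\n", $price;

# Sums are rarely bit-for-bit equal to the "obvious" decimal result,
# so compare with a tolerance.
print nearly_equal(0.1 + 0.2, 0.3) ? "close enough\n" : "different\n";
```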
38
39   Why is int() broken?
40       Your int() is most probably working just fine. It's the numbers that
41       aren't quite what you think.
42
43       First, see the answer to "Why am I getting long decimals (eg,
44       19.9499999999999) instead of the numbers I should be getting (eg,
45       19.95)?".
46
47       For example, this
48
49           print int(0.6/0.2-2), "\n";
50
       will on most computers print 0, not 1, because even such simple numbers
       as 0.6 and 0.2 cannot be represented exactly by floating-point numbers.
       What you think of above as 'three' is really more like
       2.9999999999999995559.
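
       If what you want is the nearest integer rather than truncation, one
       possible approach is to round before converting. This sketch uses
       sprintf() for that; the variable names are illustrative only:

```perl
use strict;
use warnings;

my $raw = 0.6 / 0.2 - 2;    # slightly below 1 on typical IEEE doubles

my $truncated = int($raw);              # drops the fractional part
my $rounded   = sprintf("%.0f", $raw);  # rounds to the nearest integer

print "truncated: $truncated, rounded: $rounded\n";
```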
55
56   Why isn't my octal data interpreted correctly?
57       (contributed by brian d foy)
58
59       You're probably trying to convert a string to a number, which Perl only
60       converts as a decimal number. When Perl converts a string to a number,
61       it ignores leading spaces and zeroes, then assumes the rest of the
62       digits are in base 10:
63
64           my $string = '0644';
65
66           print $string + 0;  # prints 644
67
68           print $string + 44; # prints 688, certainly not octal!
69
       This problem usually involves one of the Perl built-ins that has the
       same name as a Unix command that uses octal numbers as arguments on
       the command line. In this example, "chmod" on the command line knows
       that its first argument is octal because that's what it does:
74
75           %prompt> chmod 644 file
76
77       If you want to use the same literal digits (644) in Perl, you have to
78       tell Perl to treat them as octal numbers either by prefixing the digits
79       with a 0 or using "oct":
80
81           chmod(     0644, $filename );  # right, has leading zero
82           chmod( oct(644), $filename );  # also correct
83
84       The problem comes in when you take your numbers from something that
85       Perl thinks is a string, such as a command line argument in @ARGV:
86
87           chmod( $ARGV[0],      $filename );  # wrong, even if "0644"
88
89           chmod( oct($ARGV[0]), $filename );  # correct, treat string as octal
90
       You can always check the value you're using by printing it in octal
       notation to ensure it matches what you think it should be. Print it in
       octal and decimal format:
94
95           printf "0%o %d", $number, $number;
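
       The following sketch shows the same check applied to several string
       spellings of one permission value. The sample strings are made up for
       the example; "oct" treats a bare digit string as octal and also
       accepts the "0x" and "0b" prefixes:

```perl
use strict;
use warnings;

# oct() interprets its string argument as octal by default, and also
# understands "0x" (hexadecimal) and "0b" (binary) prefixes.
for my $text ('644', '0644', '0x1a4', '0b110100100') {
    my $number = oct($text);
    printf "%-12s -> 0%o octal, %d decimal\n", $text, $number, $number;
}
```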
96
97   Does Perl have a round() function? What about ceil() and floor()? Trig
98       functions?
99       Remember that int() merely truncates toward 0. For rounding to a
100       certain number of digits, sprintf() or printf() is usually the easiest
101       route.
102
103           printf("%.3f", 3.1415926535);   # prints 3.142
104
105       The POSIX module (part of the standard Perl distribution) implements
106       ceil(), floor(), and a number of other mathematical and trigonometric
107       functions.
108
109           use POSIX;
110           my $ceil   = ceil(3.5);   # 4
111           my $floor  = floor(3.5);  # 3
112
113       In 5.000 to 5.003 perls, trigonometry was done in the Math::Complex
114       module. With 5.004, the Math::Trig module (part of the standard Perl
115       distribution) implements the trigonometric functions. Internally it
116       uses the Math::Complex module and some functions can break out from the
117       real axis into the complex plane, for example the inverse sine of 2.
118
119       Rounding in financial applications can have serious implications, and
120       the rounding method used should be specified precisely. In these cases,
121       it probably pays not to trust whichever system of rounding is being
122       used by Perl, but instead to implement the rounding function you need
123       yourself.
124
125       To see why, notice how you'll still have an issue on half-way-point
126       alternation:
127
128           for (my $i = -5; $i <= 5; $i += 0.5) { printf "%.0f ",$i }
129
130           -5 -4 -4 -4 -3 -2 -2 -2 -1 -0 0 0 1 2 2 2 3 4 4 4 5
131
132       Don't blame Perl. It's the same as in C. IEEE says we have to do this.
133       Perl numbers whose absolute values are integers under 2**31 (on 32-bit
134       machines) will work pretty much like mathematical integers.  Other
135       numbers are not guaranteed.
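
       A minimal sketch of one hand-written rule, rounding exact halves away
       from zero. The name round_half_away is invented for this example.
       Inputs that are not exactly representable in binary can still fall on
       the wrong side of a half, which is one reason financial code often
       works in integer cents instead:

```perl
use strict;
use warnings;
use POSIX qw(floor ceil);

# Hypothetical helper: round to the nearest integer, with exact
# halves going away from zero (2.5 becomes 3, -2.5 becomes -3).
sub round_half_away {
    my ($n) = @_;
    return $n >= 0 ? floor($n + 0.5) : ceil($n - 0.5);
}

print join(' ', map { round_half_away($_) } -2.5, -1.5, -0.5, 0.5, 1.5, 2.5), "\n";
```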
136
137   How do I convert between numeric representations/bases/radixes?
138       As always with Perl there is more than one way to do it. Below are a
139       few examples of approaches to making common conversions between number
140       representations. This is intended to be representational rather than
141       exhaustive.
142
143       Some of the examples later in perlfaq4 use the Bit::Vector module from
144       CPAN. The reason you might choose Bit::Vector over the perl built-in
145       functions is that it works with numbers of ANY size, that it is
146       optimized for speed on some operations, and for at least some
147       programmers the notation might be familiar.
148
149       How do I convert hexadecimal into decimal
150           Using perl's built in conversion of "0x" notation:
151
152               my $dec = 0xDEADBEEF;
153
154           Using the "hex" function:
155
156               my $dec = hex("DEADBEEF");
157
158           Using "pack":
159
160               my $dec = unpack("N", pack("H8", substr("0" x 8 . "DEADBEEF", -8)));
161
162           Using the CPAN module "Bit::Vector":
163
164               use Bit::Vector;
165               my $vec = Bit::Vector->new_Hex(32, "DEADBEEF");
166               my $dec = $vec->to_Dec();
167
168       How do I convert from decimal to hexadecimal
169           Using "sprintf":
170
171               my $hex = sprintf("%X", 3735928559); # upper case A-F
172               my $hex = sprintf("%x", 3735928559); # lower case a-f
173
174           Using "unpack":
175
176               my $hex = unpack("H*", pack("N", 3735928559));
177
178           Using Bit::Vector:
179
180               use Bit::Vector;
181               my $vec = Bit::Vector->new_Dec(32, -559038737);
182               my $hex = $vec->to_Hex();
183
184           And Bit::Vector supports odd bit counts:
185
186               use Bit::Vector;
187               my $vec = Bit::Vector->new_Dec(33, 3735928559);
188               $vec->Resize(32); # suppress leading 0 if unwanted
189               my $hex = $vec->to_Hex();
190
191       How do I convert from octal to decimal
192           Using Perl's built in conversion of numbers with leading zeros:
193
194               my $dec = 033653337357; # note the leading 0!
195
196           Using the "oct" function:
197
198               my $dec = oct("33653337357");
199
200           Using Bit::Vector:
201
202               use Bit::Vector;
203               my $vec = Bit::Vector->new(32);
204               $vec->Chunk_List_Store(3, split(//, reverse "33653337357"));
205               my $dec = $vec->to_Dec();
206
207       How do I convert from decimal to octal
208           Using "sprintf":
209
210               my $oct = sprintf("%o", 3735928559);
211
212           Using Bit::Vector:
213
214               use Bit::Vector;
215               my $vec = Bit::Vector->new_Dec(32, -559038737);
216               my $oct = reverse join('', $vec->Chunk_List_Read(3));
217
218       How do I convert from binary to decimal
219           Perl 5.6 lets you write binary numbers directly with the "0b"
220           notation:
221
222               my $number = 0b10110110;
223
224           Using "oct":
225
226               my $input = "10110110";
227               my $decimal = oct( "0b$input" );
228
229           Using "pack" and "ord":
230
231               my $decimal = ord(pack('B8', '10110110'));
232
233           Using "pack" and "unpack" for larger strings:
234
235               my $int = unpack("N", pack("B32",
236               substr("0" x 32 . "11110101011011011111011101111", -32)));
237               my $dec = sprintf("%d", $int);
238
239               # substr() is used to left-pad a 32-character string with zeros.
240
241           Using Bit::Vector:
242
               use Bit::Vector;
               my $vec = Bit::Vector->new_Bin(32, "11011110101011011011111011101111");
               my $dec = $vec->to_Dec();
245
246       How do I convert from decimal to binary
247           Using "sprintf" (perl 5.6+):
248
249               my $bin = sprintf("%b", 3735928559);
250
251           Using "unpack":
252
253               my $bin = unpack("B*", pack("N", 3735928559));
254
255           Using Bit::Vector:
256
257               use Bit::Vector;
258               my $vec = Bit::Vector->new_Dec(32, -559038737);
259               my $bin = $vec->to_Bin();
260
261           The remaining transformations (e.g. hex -> oct, bin -> hex, etc.)
262           are left as an exercise to the inclined reader.
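
           For readers who would rather not do the exercise, one possible
           approach is to chain a parsing built-in ("hex" or "oct") with a
           "sprintf" format. The sample values reuse numbers from the
           examples above:

```perl
use strict;
use warnings;

# hex -> oct: parse the hexadecimal string, then format as octal
my $oct = sprintf "%o", hex "DEADBEEF";

# bin -> hex: oct() accepts a "0b" prefix for binary strings
my $hex = sprintf "%x", oct "0b10110110";

# oct -> bin: without a prefix, oct() treats the digits as octal
my $bin = sprintf "%b", oct "266";

print "$oct $hex $bin\n";
```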
263
264   Why doesn't & work the way I want it to?
265       Perl's "&" bitwise operator works on both numbers and strings,
266       sometimes producing surprising results when you expected a number but
267       received a string. You probably expected perl to automatically convert
268       the operands to numbers like the mathematical operators would.
269       Instead, perl treats string operands as bitvectors.
270
271       Consider the bitwise difference between the number 3 and the bitvector
272       represented by "3". A number has the bit pattern for its magnitude. The
273       number 3 is 0b11 (a 2 and a 1). The bitvector has the bit pattern that
274       is the ordinal value for each octet, and that value is unrelated to any
275       numeric value that the digit represents. The character "3" is the
276       bitvector 0b0011_0011.
277
278       These operations have different results even though you might think
279       they look like the same "number":
280
281                11  &  3;   # 0b0000_1011 & 0b0000_0011
282                            #     -> 0b0000_0011   (number 3)
283               "11" & "3";  # 0b0011_0001_0011_0001 & 0b0011_0011
284                            #     -> 0b0011_0001   (ASCII char "1")
285
286       Note that if any operand has a numeric value, perl uses numeric
287       semantics (although you should not count on this):
288
289               my($i, $j) = ( 11,   3 );   # $i & $j  # 11  &  3 -> 3
290               my($i, $j) = ("11",  3 );   # $i & $j  # 11  &  3 -> 3
291               my($i, $j) = ("11", "3");   # $i & $j  # "11"  &  "3" -> 1
292
       Remember that a perl scalar can have both string and numeric values at
       the same time. A value that starts as a string and has never
       encountered a numeric operation has no numeric value yet. Perl does
       this to save time and work so it doesn't have to decide a numeric value
       for a scalar it might never use as a number. In that case, string
       semantics win. But if there is a numeric value already, numeric
       semantics win. Force perl to compute the numeric value by adding 0:
300
301           my($i, $j) = ("11", "3"); $j += 0  # $i & $j  # "11"  &  3 -> 3
302
       However, this is not a documented feature, or as perlop says, it "is
       not well defined". One way to ensure numeric semantics is to
       explicitly convert both values to numbers:
306
307               (0+$i) & (0+$j)
308
309       To fix this annoyance, Perl v5.22 separated the string and number
310       behavior. The "bitwise" feature introduced four new operators that
311       would work with only string semantics: "&.", "|.", "^.", and "~.".  The
312       original operators, "&", "|", "^", and "~", would then apply only
313       numeric semantics.
314
315       Enable this feature explicitly with feature:
316
317               use feature qw(bitwise);
318
319       Or, as of v5.28, require the minimum version of perl with "use":
320
321               use v5.28;  # bitwise feature for free
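
       A short sketch of the difference, assuming a perl of v5.28 or later so
       that the feature is enabled by the version declaration. The same two
       strings give different answers depending on which operator you pick:

```perl
use v5.28;    # enables the "bitwise" feature (and strict)
use warnings;

my ($i, $j) = ("11", "3");

my $numeric = $i &  $j;   # operands treated as the numbers 11 and 3
my $string  = $i &. $j;   # operands treated as the strings "11" and "3"

say "numeric: $numeric";
say "string:  $string";
```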
322
323   How do I multiply matrices?
324       Use the Math::Matrix or Math::MatrixReal modules (available from CPAN)
325       or the PDL extension (also available from CPAN).
326
327   How do I perform an operation on a series of integers?
328       To call a function on each element in an array, and collect the
329       results, use:
330
331           my @results = map { my_func($_) } @array;
332
333       For example:
334
335           my @triple = map { 3 * $_ } @single;
336
337       To call a function on each element of an array, but ignore the results:
338
339           foreach my $iterator (@array) {
340               some_func($iterator);
341           }
342
343       To call a function on each integer in a (small) range, you can use:
344
345           my @results = map { some_func($_) } (5 .. 25);
346
347       but you should be aware that in this form, the ".." operator creates a
348       list of all integers in the range, which can take a lot of memory for
349       large ranges. However, the problem does not occur when using ".."
350       within a "for" loop, because in that case the range operator is
351       optimized to iterate over the range, without creating the entire list.
352       So
353
354           my @results = ();
355           for my $i (5 .. 500_005) {
356               push(@results, some_func($i));
357           }
358
359       or even
360
361          push(@results, some_func($_)) for 5 .. 500_005;
362
363       will not create an intermediate list of 500,000 integers.
364
365   How can I output Roman numerals?
       Get the Roman module from CPAN
       (<http://www.cpan.org/modules/by-module/Roman>).
367
368   Why aren't my random numbers random?
369       If you're using a version of Perl before 5.004, you must call "srand"
370       once at the start of your program to seed the random number generator.
371
372            BEGIN { srand() if $] < 5.004 }
373
374       5.004 and later automatically call "srand" at the beginning. Don't call
375       "srand" more than once--you make your numbers less random, rather than
376       more.
377
378       Computers are good at being predictable and bad at being random
379       (despite appearances caused by bugs in your programs :-). The random
380       article in the "Far More Than You Ever Wanted To Know" collection in
381       <http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz>, courtesy of Tom
382       Phoenix, talks more about this. John von Neumann said, "Anyone who
383       attempts to generate random numbers by deterministic means is, of
384       course, living in a state of sin."
385
       Before v5.20, Perl relied on the underlying system for the
       implementation of "rand" and "srand"; on some systems, the generated
       numbers were not random enough (especially on Windows; see
       <http://www.perlmonks.org/?node_id=803632>). Since v5.20, Perl uses
       its own drand48() implementation on all platforms, which gives
       consistent behavior but is still not cryptographically secure. Several
       CPAN modules in the "Math" namespace implement better pseudorandom
       generators; see for example Math::Random::MT ("Mersenne Twister",
       fast), or Math::TrulyRandom (uses the imperfections in the system's
       timer to generate random numbers, which is rather slow). More
       algorithms for random numbers are described in "Numerical Recipes in
       C" at <http://www.nr.com/>.
396
397   How do I get a random number between X and Y?
398       To get a random number between two values, you can use the rand()
399       built-in to get a random number between 0 and 1. From there, you shift
400       that into the range that you want.
401
402       rand($x) returns a number such that "0 <= rand($x) < $x". Thus what you
403       want to have perl figure out is a random number in the range from 0 to
404       the difference between your X and Y.
405
406       That is, to get a number between 10 and 15, inclusive, you want a
407       random number between 0 and 5 that you can then add to 10.
408
409           my $number = 10 + int rand( 15-10+1 ); # ( 10,11,12,13,14, or 15 )
410
411       Hence you derive the following simple function to abstract that. It
412       selects a random integer between the two given integers (inclusive).
413       For example: "random_int_between(50,120)".
414
415           sub random_int_between {
416               my($min, $max) = @_;
417               # Assumes that the two arguments are integers themselves!
418               return $min if $min == $max;
419               ($min, $max) = ($max, $min)  if  $min > $max;
420               return $min + int rand(1 + $max - $min);
421           }
422

Data: Dates

424   How do I find the day or week of the year?
       The day of the year is in the list returned by the "localtime"
       function, counted from 0 (so January 1 is day 0). Without an argument
       "localtime" uses the current time.
427
428           my $day_of_year = (localtime)[7];
429
430       The POSIX module can also format a date as the day of the year or week
431       of the year.
432
433           use POSIX qw/strftime/;
434           my $day_of_year  = strftime "%j", localtime;
435           my $week_of_year = strftime "%W", localtime;
436
437       To get the day of year for any date, use POSIX's "mktime" to get a time
438       in epoch seconds for the argument to "localtime".
439
           use POSIX qw/mktime strftime/;
           my $day_of_year = strftime "%j",
               localtime( mktime( 0, 0, 0, 18, 11, 87 ) );
443
444       You can also use Time::Piece, which comes with Perl and provides a
445       "localtime" that returns an object:
446
447           use Time::Piece;
448           my $day_of_year  = localtime->yday;
449           my $week_of_year = localtime->week;
450
451       The Date::Calc module provides two functions to calculate these, too:
452
           use Date::Calc qw( Day_of_Year Week_of_Year );
           my $day_of_year  = Day_of_Year(  1987, 12, 18 );
           my $week_of_year = Week_of_Year( 1987, 12, 18 );
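
       Note that these approaches do not all count from the same starting
       point. The sketch below uses Time::Piece's "strptime" on the fixed
       date from the examples above to show the zero-based "yday" next to the
       one-based "%j" format; the variable names are illustrative:

```perl
use strict;
use warnings;
use Time::Piece;

# Parse a fixed date; strptime returns a Time::Piece object in UTC.
my $date = Time::Piece->strptime("1987-12-18", "%Y-%m-%d");

my $zero_based = $date->yday;            # January 1 is day 0
my $one_based  = $date->strftime("%j");  # January 1 is day 001

print "yday: $zero_based, strftime: $one_based\n";
```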
456
457   How do I find the current century or millennium?
458       Use the following simple functions:
459
460           sub get_century    {
461               return int((((localtime(shift || time))[5] + 1999))/100);
462           }
463
464           sub get_millennium {
465               return 1+int((((localtime(shift || time))[5] + 1899))/1000);
466           }
467
468       On some systems, the POSIX module's strftime() function has been
469       extended in a non-standard way to use a %C format, which they sometimes
470       claim is the "century". It isn't, because on most such systems, this is
471       only the first two digits of the four-digit year, and thus cannot be
472       used to determine reliably the current century or millennium.
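
       The arithmetic in those functions is easier to check when applied to a
       plain four-digit year. This sketch restates it that way, assuming the
       convention that a century starts in a year ending in 01; the function
       names are invented for the example:

```perl
use strict;
use warnings;

# Hypothetical helpers that take a four-digit year instead of a time.
sub century_of_year    { my ($year) = @_; return int(($year + 99) / 100) }
sub millennium_of_year { my ($year) = @_; return 1 + int(($year - 1) / 1000) }

for my $year (1900, 2000, 2001) {
    printf "%d: century %d, millennium %d\n",
        $year, century_of_year($year), millennium_of_year($year);
}
```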
473
474   How can I compare two dates and find the difference?
475       (contributed by brian d foy)
476
477       You could just store all your dates as a number and then subtract.
478       Life isn't always that simple though.
479
480       The Time::Piece module, which comes with Perl, replaces localtime with
481       a version that returns an object. It also overloads the comparison
482       operators so you can compare them directly:
483
484           use Time::Piece;
485           my $date1 = localtime( $some_time );
486           my $date2 = localtime( $some_other_time );
487
488           if( $date1 < $date2 ) {
489               print "The date was in the past\n";
490           }
491
492       You can also get differences with a subtraction, which returns a
493       Time::Seconds object:
494
495           my $date_diff = $date1 - $date2;
496           print "The difference is ", $date_diff->days, " days\n";
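
       As a self-contained variation, the dates can come from strings instead
       of epoch times. This sketch assumes both strings are in year-month-day
       form and parses them with Time::Piece's "strptime"; the dates
       themselves are arbitrary:

```perl
use strict;
use warnings;
use Time::Piece;

my $format = "%Y-%m-%d";
my $start  = Time::Piece->strptime("2024-02-01", $format);
my $end    = Time::Piece->strptime("2024-03-01", $format);

# Comparison operators are overloaded for Time::Piece objects.
print "start is earlier\n" if $start < $end;

# Subtraction yields a Time::Seconds object.
my $diff = $end - $start;
print "The difference is ", $diff->days, " days\n";
```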
497
498       If you want to work with formatted dates, the Date::Manip, Date::Calc,
499       or DateTime modules can help you.
500
501   How can I take a string and turn it into epoch seconds?
502       If it's a regular enough string that it always has the same format, you
503       can split it up and pass the parts to "timelocal" in the standard
504       Time::Local module. Otherwise, you should look into the Date::Calc,
505       Date::Parse, and Date::Manip modules from CPAN.
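
       A minimal sketch of the split-and-convert approach, assuming a
       timestamp that is always in "YYYY-MM-DD HH:MM:SS" form and expressed
       in UTC. It uses "timegm" so the result does not depend on the local
       time zone; "timelocal" takes the same arguments for local times:

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my $stamp = "2023-08-12 10:30:00";    # sample input, assumed to be UTC

my ($year, $mon, $mday, $hour, $min, $sec) =
    $stamp =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)$/
    or die "Unrecognized timestamp: $stamp";

# Months are numbered from 0, so subtract 1.
my $epoch = timegm($sec, $min, $hour, $mday, $mon - 1, $year);

print "$epoch\n";
```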
506
507   How can I find the Julian Day?
508       (contributed by brian d foy and Dave Cross)
509
510       You can use the Time::Piece module, part of the Standard Library, which
511       can convert a date/time to a Julian Day:
512
513           $ perl -MTime::Piece -le 'print localtime->julian_day'
514           2455607.7959375
515
516       Or the modified Julian Day:
517
518           $ perl -MTime::Piece -le 'print localtime->mjd'
519           55607.2961226851
520
521       Or even the day of the year (which is what some people think of as a
522       Julian day):
523
524           $ perl -MTime::Piece -le 'print localtime->yday'
525           45
526
527       You can also do the same things with the DateTime module:
528
529           $ perl -MDateTime -le'print DateTime->today->jd'
530           2453401.5
531           $ perl -MDateTime -le'print DateTime->today->mjd'
532           53401
533           $ perl -MDateTime -le'print DateTime->today->doy'
534           31
535
536       You can use the Time::JulianDay module available on CPAN. Ensure that
537       you really want to find a Julian day, though, as many people have
538       different ideas about Julian days (see
539       <http://www.hermetic.ch/cal_stud/jdn.htm> for instance):
540
541           $  perl -MTime::JulianDay -le 'print local_julian_day( time )'
542           55608
543
544   How do I find yesterday's date?
545       (contributed by brian d foy)
546
       To do it correctly, you can use one of the "Date" modules since they
       work with calendars instead of times. The DateTime module makes it
       simple, and gives you the same time of day, only the day before,
       despite daylight saving time changes:
551
552           use DateTime;
553
554           my $yesterday = DateTime->now->subtract( days => 1 );
555
556           print "Yesterday was $yesterday\n";
557
558       You can also use the Date::Calc module using its "Today_and_Now"
559       function.
560
561           use Date::Calc qw( Today_and_Now Add_Delta_DHMS );
562
563           my @date_time = Add_Delta_DHMS( Today_and_Now(), -1, 0, 0, 0 );
564
565           print "@date_time\n";
566
567       Most people try to use the time rather than the calendar to figure out
568       dates, but that assumes that days are twenty-four hours each. For most
569       people, there are two days a year when they aren't: the switch to and
570       from summer time throws this off. For example, the rest of the
571       suggestions will be wrong sometimes:
572
573       Starting with Perl 5.10, Time::Piece and Time::Seconds are part of the
574       standard distribution, so you might think that you could do something
575       like this:
576
577           use Time::Piece;
578           use Time::Seconds;
579
580           my $yesterday = localtime() - ONE_DAY; # WRONG
581           print "Yesterday was $yesterday\n";
582
583       The Time::Piece module exports a new "localtime" that returns an
584       object, and Time::Seconds exports the "ONE_DAY" constant that is a set
585       number of seconds. This means that it always gives the time 24 hours
586       ago, which is not always yesterday. This can cause problems around the
587       end of daylight saving time when there's one day that is 25 hours long.
588
589       You have the same problem with Time::Local, which will give the wrong
590       answer for those same special cases:
591
592           # contributed by Gunnar Hjalmarsson
593            use Time::Local;
594            my $today = timelocal 0, 0, 12, ( localtime )[3..5];
595            my ($d, $m, $y) = ( localtime $today-86400 )[3..5]; # WRONG
596            printf "Yesterday: %d-%02d-%02d\n", $y+1900, $m+1, $d;
597
598   Does Perl have a Year 2000 or 2038 problem? Is Perl Y2K compliant?
599       (contributed by brian d foy)
600
601       Perl itself never had a Y2K problem, although that never stopped people
602       from creating Y2K problems on their own. See the documentation for
603       "localtime" for its proper use.
604
605       Starting with Perl 5.12, "localtime" and "gmtime" can handle dates past
606       03:14:08 January 19, 2038, when a 32-bit based time would overflow. You
607       still might get a warning on a 32-bit "perl":
608
609           % perl5.12 -E 'say scalar localtime( 0x9FFF_FFFFFFFF )'
610           Integer overflow in hexadecimal number at -e line 1.
611           Wed Nov  1 19:42:39 5576711
612
613       On a 64-bit "perl", you can get even larger dates for those really long
614       running projects:
615
616           % perl5.12 -E 'say scalar gmtime( 0x9FFF_FFFFFFFF )'
617           Thu Nov  2 00:42:39 5576711
618
619       You're still out of luck if you need to keep track of decaying protons
620       though.
621

Data: Strings

623   How do I validate input?
624       (contributed by brian d foy)
625
626       There are many ways to ensure that values are what you expect or want
627       to accept. Besides the specific examples that we cover in the perlfaq,
628       you can also look at the modules with "Assert" and "Validate" in their
629       names, along with other modules such as Regexp::Common.
630
631       Some modules have validation for particular types of input, such as
632       Business::ISBN, Business::CreditCard, Email::Valid, and
633       Data::Validate::IP.
634
635   How do I unescape a string?
       It depends on just what you mean by "escape". URL escapes are dealt
       with in perlfaq9. Shell escapes with the backslash ("\") character are
       removed with
639
640           s/\\(.)/$1/g;
641
642       This won't expand "\n" or "\t" or any other special escapes.
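
       If you do want a few special escapes expanded, one possible approach
       is a lookup table consulted from the replacement side. The function
       name unescape and the set of escapes in %expansion are choices made
       for this sketch:

```perl
use strict;
use warnings;

# Escapes this sketch chooses to expand; any other escaped
# character simply loses its backslash.
my %expansion = ( n => "\n", t => "\t", r => "\r", 0 => "\0" );

sub unescape {
    my ($text) = @_;
    $text =~ s/\\(.)/exists $expansion{$1} ? $expansion{$1} : $1/gse;
    return $text;
}

print unescape('name\tvalue\n');
```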
643
644   How do I remove consecutive pairs of characters?
645       (contributed by brian d foy)
646
647       You can use the substitution operator to find pairs of characters (or
648       runs of characters) and replace them with a single instance. In this
649       substitution, we find a character in "(.)". The memory parentheses
650       store the matched character in the back-reference "\g1" and we use that
651       to require that the same thing immediately follow it. We replace that
652       part of the string with the character in $1.
653
654           s/(.)\g1/$1/g;
655
656       We can also use the transliteration operator, "tr///". In this example,
657       the search list side of our "tr///" contains nothing, but the "c"
658       option complements that so it contains everything. The replacement list
659       also contains nothing, so the transliteration is almost a no-op since
660       it won't do any replacements (or more exactly, replace the character
661       with itself). However, the "s" option squashes duplicated and
662       consecutive characters in the string so a character does not show up
663       next to itself
664
665           my $str = 'Haarlem';   # in the Netherlands
666           $str =~ tr///cs;       # Now Harlem, like in New York
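
       The two techniques differ on runs longer than two characters, and
       "tr///" can be limited to a character class. A small comparison, with
       a made-up sample string:

```perl
use strict;
use warnings;

my $text = "aaa bookkeeper";

(my $pairs   = $text) =~ s/(.)\g1/$1/g;    # each pair becomes one character
(my $runs    = $text) =~ s/(.)\g1+/$1/g;   # each run becomes one character
(my $letters = $text) =~ tr/a-z//s;        # squeeze only lowercase letters

print "$pairs\n$runs\n$letters\n";
```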
667
668   How do I expand function calls in a string?
669       (contributed by brian d foy)
670
671       This is documented in perlref, and although it's not the easiest thing
672       to read, it does work. In each of these examples, we call the function
673       inside the braces used to dereference a reference. If we have more than
674       one return value, we can construct and dereference an anonymous array.
675       In this case, we call the function in list context.
676
677           print "The time values are @{ [localtime] }.\n";
678
679       If we want to call the function in scalar context, we have to do a bit
680       more work. We can really have any code we like inside the braces, so we
681       simply have to end with the scalar reference, although how you do that
682       is up to you, and you can use code inside the braces. Note that the use
683       of parens creates a list context, so we need "scalar" to force the
684       scalar context on the function:
685
           print "The time is ${\(scalar localtime)}.\n";
687
688           print "The time is ${ my $x = localtime; \$x }.\n";
689
690       If your function already returns a reference, you don't need to create
691       the reference yourself.
692
693           sub timestamp { my $t = localtime; \$t }
694
695           print "The time is ${ timestamp() }.\n";
696
697       The "Interpolation" module can also do a lot of magic for you. You can
698       specify a variable name, in this case "E", to set up a tied hash that
699       does the interpolation for you. It has several other methods to do this
700       as well.
701
702           use Interpolation E => 'eval';
703           print "The time values are $E{localtime()}.\n";
704
705       In most cases, it is probably easier to simply use string
706       concatenation, which also forces scalar context.
707
708           print "The time is " . localtime() . ".\n";
709
710   How do I find matching/nesting anything?
711       To find something between two single characters, a pattern like
712       "/x([^x]*)x/" will get the intervening bits in $1. For multiple ones,
713       then something more like "/alpha(.*?)omega/" would be needed. For
714       nested patterns and/or balanced expressions, see the so-called (?PARNO)
715       construct (available since perl 5.10).  The CPAN module Regexp::Common
716       can help to build such regular expressions (see in particular
717       Regexp::Common::balanced and Regexp::Common::delimited).
718
       More complex cases will require writing a parser, probably using a
       parsing module from CPAN, like Regexp::Grammars, Parse::RecDescent,
       Parse::Yapp, Text::Balanced, or Marpa::R2.
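
       As an illustration of the (?PARNO) construct, this sketch matches
       balanced parentheses by recursing into capture group 1. It assumes the
       pattern is used on its own; inside a larger pattern with earlier
       capture groups, the group number would need adjusting:

```perl
use strict;
use warnings;

# Group 1 matches "(", then any mix of non-parenthesis characters and
# nested groups (by recursing into group 1 with (?1)), then ")".
my $balanced = qr/(\((?:[^()]++|(?1))*\))/;

my $source = "f(a(b)c) + g(d)";
my @groups = $source =~ /$balanced/g;

print "$_\n" for @groups;
```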
722
723   How do I reverse a string?
724       Use reverse() in scalar context, as documented in "reverse" in
725       perlfunc.
726
727           my $reversed = reverse $string;
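
       The context matters: in list context, reverse() reverses the order of
       its arguments rather than the characters of a string. A short sketch
       of the difference:

```perl
use strict;
use warnings;

my $string = "hello";

my $reversed = reverse $string;          # scalar context: characters reversed
my @list     = reverse $string;          # list context: a one-element list, unchanged
my $forced   = scalar reverse $string;   # explicit scalar context

print "$reversed @list $forced\n";
```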
728
729   How do I expand tabs in a string?
730       You can do it yourself:
731
732           1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e;
733
734       Or you can just use the Text::Tabs module (part of the standard Perl
735       distribution).
736
737           use Text::Tabs;
738           my @expanded_lines = expand(@lines_with_tabs);
739
740   How do I reformat a paragraph?
741       Use Text::Wrap (part of the standard Perl distribution):
742
743           use Text::Wrap;
744           print wrap("\t", '  ', @paragraphs);
745
746       The paragraphs you give to Text::Wrap should not contain embedded
747       newlines. Text::Wrap doesn't justify the lines (flush-right).
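Text::Wrap takes its line width from the package variable $columns. A minimal sketch, assuming a 40-column target and an invented paragraph:

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap $columns);

$columns = 40;    # wrapped lines stay shorter than this many characters

my $paragraph = 'The quick brown fox jumps over the lazy dog, '
              . 'then wanders off to find something else to do.';

# First-line indent of four spaces, no indent on continuation lines.
my $wrapped = wrap( '    ', '', $paragraph );

print $wrapped, "\n";
```

The first argument is the prefix for the first line and the second is the prefix for every following line, so swapping them gives a hanging indent instead.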
748
749       Or use the CPAN module Text::Autoformat. Formatting files can be easily
750       done by making a shell alias, like so:
751
752           alias fmt="perl -i -MText::Autoformat -n0777 \
753               -e 'print autoformat $_, {all=>1}' $*"
754
755       See the documentation for Text::Autoformat to appreciate its many
756       capabilities.
757
758   How can I access or change N characters of a string?
759       You can access the first characters of a string with substr().  To get
760       the first character, for example, start at position 0 and grab the
761       string of length 1.
762
763           my $string = "Just another Perl Hacker";
764           my $first_char = substr( $string, 0, 1 );  #  'J'
765
766       To change part of a string, you can use the optional fourth argument
767       which is the replacement string.
768
769           substr( $string, 13, 4, "Perl 5.8.0" );
770
771       You can also use substr() as an lvalue.
772
773           substr( $string, 13, 4 ) =  "Perl 5.8.0";
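Two more details of substr() that often help: a negative offset counts from the end of the string, and the four-argument form returns the text it replaced. A short sketch using the same sample string (the replacement text is arbitrary):

```perl
use strict;
use warnings;

my $string = "Just another Perl Hacker";

my $last_word = substr( $string, -6 );              # negative offset counts from the end
my $old       = substr( $string, 13, 4, "PERL" );   # 4-arg form returns what it replaced

print "$last_word / $old / $string\n";
```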
774
775   How do I change the Nth occurrence of something?
776       You have to keep track of N yourself. For example, let's say you want
777       to change the fifth occurrence of "whoever" or "whomever" into
778       "whosoever" or "whomsoever", case insensitively. These all assume that
779       $_ contains the string to be altered.
780
781           $count = 0;
782           s{((whom?)ever)}{
783           ++$count == 5       # is it the 5th?
784               ? "${2}soever"  # yes, swap
785               : $1            # renege and leave it there
786               }ige;
787
788       In the more general case, you can use the "/g" modifier in a "while"
789       loop, keeping count of matches.
790
791           $WANT = 3;
792           $count = 0;
793           $_ = "One fish two fish red fish blue fish";
794           while (/(\w+)\s+fish\b/gi) {
795               if (++$count == $WANT) {
796                   print "The third fish is a $1 one.\n";
797               }
798           }
799
800       That prints out: "The third fish is a red one."  You can also use a
801       repetition count and repeated pattern like this:
802
803           /(?:\w+\s+fish\s+){2}(\w+)\s+fish/i;
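If you do this often, the counting trick can be wrapped in a small subroutine. The helper below is a hypothetical sketch (replace_nth is not a built-in or a module function); it works on a copy and returns the changed string.

```perl
use strict;
use warnings;

# Hypothetical helper: replace only the $n-th match of $pattern in $text.
sub replace_nth {
    my ( $text, $pattern, $n, $replacement ) = @_;
    my $count = 0;
    $text =~ s{($pattern)}{ ++$count == $n ? $replacement : $1 }ge;
    return $text;
}

my $fish    = "One fish two fish red fish blue fish";
my $changed = replace_nth( $fish, qr/fish/, 3, 'herring' );

print "$changed\n";
```

Every match is still visited, so on very long strings with many matches a "while" loop that stops after the Nth match may be cheaper.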
804
805   How can I count the number of occurrences of a substring within a string?
806       There are a number of ways, with varying efficiency. If you want a
807       count of a certain single character (X) within a string, you can use
808       the "tr///" function like so:
809
810           my $string = "ThisXlineXhasXsomeXx'sXinXit";
811           my $count = ($string =~ tr/X//);
812           print "There are $count X characters in the string";
813
814       This is fine if you are just looking for a single character. However,
815       if you are trying to count multiple character substrings within a
816       larger string, "tr///" won't work. What you can do is wrap a while()
817       loop around a global pattern match. For example, let's count negative
818       integers:
819
820           my $string = "-9 55 48 -2 23 -76 4 14 -44";
821           my $count = 0;
822           while ($string =~ /-\d+/g) { $count++ }
823           print "There are $count negative numbers in the string";
824
825       Another version uses a global match in list context, then assigns the
826       result to a scalar, producing a count of the number of matches.
827
828           my $count = () = $string =~ /-\d+/g;
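For a literal substring, you can also avoid the regex engine and walk the string with index(). The subroutine below is a hypothetical helper, shown next to the list-assignment idiom for comparison; it assumes the substring is not empty and counts non-overlapping occurrences.

```perl
use strict;
use warnings;

# Hypothetical helper: count non-overlapping occurrences of a literal,
# non-empty substring.
sub count_substr {
    my ( $haystack, $needle ) = @_;
    my ( $count, $pos ) = ( 0, 0 );
    while ( ( $pos = index( $haystack, $needle, $pos ) ) != -1 ) {
        $count++;
        $pos += length $needle;    # skip past this occurrence
    }
    return $count;
}

my $string = "abracadabra";
my $needle = "abra";

my $via_index = count_substr( $string, $needle );

# The same count with a global match; \Q...\E quotes any metacharacters.
my $via_regex = () = $string =~ /\Q$needle\E/g;

print "$via_index $via_regex\n";
```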
829
830   How do I capitalize all the words on one line?
831       (contributed by brian d foy)
832
833       Damian Conway's Text::Autoformat handles all of the thinking for you.
834
835           use Text::Autoformat;
836           my $x = "Dr. Strangelove or: How I Learned to Stop ".
837             "Worrying and Love the Bomb";
838
839           print $x, "\n";
840           for my $style (qw( sentence title highlight )) {
841               print autoformat($x, { case => $style }), "\n";
842           }
843
844       How do you want to capitalize those words?
845
846           FRED AND BARNEY'S LODGE        # all uppercase
847           Fred And Barney's Lodge        # title case
848           Fred and Barney's Lodge        # highlight case
849
850       It's not as easy a problem as it looks. How many words do you think are
851       in there? Wait for it... wait for it.... If you answered 5 you're
852       right. Perl words are groups of "\w+", but that's not what you want to
853       capitalize. How is Perl supposed to know not to capitalize that "s"
854       after the apostrophe? You could try a regular expression:
855
856           $string =~ s/ (
857                        (^\w)    #at the beginning of the line
858                          |      # or
859                        (\s\w)   #preceded by whitespace
860                          )
861                       /\U$1/xg;
862
863           $string =~ s/([\w']+)/\u\L$1/g;
864
865       Now, what if you don't want to capitalize that "and"? Just use
866       Text::Autoformat and get on with the next problem. :)
867
868   How can I split a [character]-delimited string except when inside
869       [character]?
870       Several modules can handle this sort of parsing--Text::Balanced,
871       Text::CSV, Text::CSV_XS, and Text::ParseWords, among others.
872
873       Take the example case of trying to split a string that is comma-
874       separated into its different fields. You can't use "split(/,/)" because
875       you shouldn't split if the comma is inside quotes. For example, take a
876       data line like this:
877
878           SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"
879
880       Due to the restriction of the quotes, this is a fairly complex problem.
881       Thankfully, we have Jeffrey Friedl, author of Mastering Regular
882       Expressions, to handle these for us. He suggests (assuming your string
883       is contained in $text):
884
885            my @new = ();
886            push(@new, $+) while $text =~ m{
887                "([^\"\\]*(?:\\.[^\"\\]*)*)",? # groups the phrase inside the quotes
888               | ([^,]+),?
889               | ,
890            }gx;
891            push(@new, undef) if substr($text,-1,1) eq ',';
892
893       If you want to represent quotation marks inside a quotation-mark-
894       delimited field, escape them with backslashes (eg, "like \"this\"").
895
896       Alternatively, the Text::ParseWords module (part of the standard Perl
897       distribution) lets you say:
898
899           use Text::ParseWords;
900           @new = quotewords(",", 0, $text);
901
902       For parsing or generating CSV, though, using Text::CSV rather than
903       implementing it yourself is highly recommended; you'll save yourself
904       odd bugs popping up later by just using code which has already been
905       tried and tested in production for years.
906
907   How do I strip blank space from the beginning/end of a string?
908       (contributed by brian d foy)
909
910       A substitution can do this for you. For a single line, you want to
911       replace all the leading or trailing whitespace with nothing. You can do
912       that with a pair of substitutions:
913
914           s/^\s+//;
915           s/\s+$//;
916
917       You can also write that as a single substitution, although it turns out
918       the combined statement is slower than the separate ones. That might not
919       matter to you, though:
920
921           s/^\s+|\s+$//g;
922
923       In this regular expression, the alternation matches either at the
924       beginning or the end of the string since the anchors bind more
925       tightly than the alternation. With the "/g" flag, the substitution
926       makes all possible matches, so it gets both. Remember, the trailing
927       newline matches the "\s+", and the "$" anchor can match at the
928       absolute end of the string, so the newline disappears too. Just add the
929       newline to the output, which has the added benefit of preserving
930       "blank" (consisting entirely of whitespace) lines which the "^\s+"
931       would remove all by itself:
932
933           while( <> ) {
934               s/^\s+|\s+$//g;
935               print "$_\n";
936           }
937
938       For a multi-line string, you can apply the regular expression to each
939       logical line in the string by adding the "/m" flag (for "multi-line").
940       With the "/m" flag, the "$" matches before an embedded newline, so it
941       doesn't remove it. This pattern still removes the newline at the end of
942       the string:
943
944           $string =~ s/^\s+|\s+$//gm;
945
946       Remember that lines consisting entirely of whitespace will disappear,
947       since the first part of the alternation can match the entire string and
948       replace it with nothing. If you need to keep embedded blank lines, you
949       have to do a little more work. Instead of matching any whitespace
950       (since that includes a newline), just match the other whitespace:
951
952           $string =~ s/^[\t\f ]+|[\t\f ]+$//mg;
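Put together, the single-string and multi-line cases might look like the sketch below. The trim() subroutine and the sample strings are hypothetical, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper for single strings: returns a trimmed copy.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;
    $s =~ s/\s+$//;
    return $s;
}

my $single = trim("  \t hello world \n");

# Multi-line text with an embedded blank line that should survive.
my $block = "  one  \n\n\t two\t\n";
( my $cleaned = $block ) =~ s/^[\t\f ]+|[\t\f ]+$//mg;

print "[$single]\n$cleaned";
```

With the restricted character class, $cleaned keeps all three newlines, so the blank line between "one" and "two" is still there.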
953
954   How do I pad a string with blanks or pad a number with zeroes?
955       In the following examples, $pad_len is the length to which you wish to
956       pad the string, $text or $num contains the string to be padded, and
957       $pad_char contains the padding character. You can use a single
958       character string constant instead of the $pad_char variable if you know
959       what it is in advance. And in the same way you can use an integer in
960       place of $pad_len if you know the pad length in advance.
961
962       The simplest method uses the "sprintf" function. It can pad on the left
963       or right with blanks and on the left with zeroes and it will not
964       truncate the result. The "pack" function can only pad strings on the
965       right with blanks and it will truncate the result to a maximum length
966       of $pad_len.
967
968           # Left padding a string with blanks (no truncation):
969           my $padded = sprintf("%${pad_len}s", $text);
970           my $padded = sprintf("%*s", $pad_len, $text);  # same thing
971
972           # Right padding a string with blanks (no truncation):
973           my $padded = sprintf("%-${pad_len}s", $text);
974           my $padded = sprintf("%-*s", $pad_len, $text); # same thing
975
976           # Left padding a number with 0 (no truncation):
977           my $padded = sprintf("%0${pad_len}d", $num);
978           my $padded = sprintf("%0*d", $pad_len, $num); # same thing
979
980           # Right padding a string with blanks using pack (will truncate):
981           my $padded = pack("A$pad_len",$text);
982
983       If you need to pad with a character other than blank or zero you can
984       use one of the following methods. They all generate a pad string with
985       the "x" operator and combine that with $text. These methods do not
986       truncate $text.
987
988       Left and right padding with any character, creating a new string:
989
990           my $padded = $pad_char x ( $pad_len - length( $text ) ) . $text;
991           my $padded = $text . $pad_char x ( $pad_len - length( $text ) );
992
993       Left and right padding with any character, modifying $text directly:
994
995           substr( $text, 0, 0 ) = $pad_char x ( $pad_len - length( $text ) );
996           $text .= $pad_char x ( $pad_len - length( $text ) );
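Here is the same set of techniques with concrete values filled in, as a quick sanity check. The values chosen for $text, $num, $pad_len, and $pad_char are arbitrary.

```perl
use strict;
use warnings;

my ( $text, $num, $pad_len, $pad_char ) = ( 'abc', 42, 8, '*' );

my $left   = sprintf( "%${pad_len}s",  $text );    # left pad with blanks
my $right  = sprintf( "%-${pad_len}s", $text );    # right pad with blanks
my $zeroes = sprintf( "%0*d", $pad_len, $num );    # left pad with zeroes
my $stars  = $pad_char x ( $pad_len - length($text) ) . $text;
my $packed = pack( "A$pad_len", 'abcdefghijkl' );  # pack truncates long input

print "[$left] [$right] [$zeroes] [$stars] [$packed]\n";
```

When $text is already longer than $pad_len, the repeat count for "x" is negative and the pad string is simply empty, which is why these methods never truncate.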
997
998   How do I extract selected columns from a string?
999       (contributed by brian d foy)
1000
1001       If you know the columns that contain the data, you can use "substr" to
1002       extract a single column.
1003
1004           my $column = substr( $line, $start_column, $length );
1005
1006       You can use "split" if the columns are separated by whitespace or some
1007       other delimiter, as long as whitespace or the delimiter cannot appear
1008       as part of the data.
1009
1010           my $line    = ' fred barney   betty   ';
1011           my @columns = split /\s+/, $line;
1012               # ( '', 'fred', 'barney', 'betty' );
1013
1014           my $line    = 'fred||barney||betty';
1015           my @columns = split /\|/, $line;
1016               # ( 'fred', '', 'barney', '', 'betty' );
1017
1018       If you want to work with comma-separated values, don't do this since
1019       that format is a bit more complicated. Use one of the modules that
1020       handle that format, such as Text::CSV, Text::CSV_XS, or Text::CSV_PP.
1021
1022       If you want to break apart an entire line of fixed columns, you can use
1023       "unpack" with the A (ASCII) format. By using a number after the format
1024       specifier, you can denote the column width. See the "pack" and "unpack"
1025       entries in perlfunc for more details.
1026
1027           my @fields = unpack( "A8 A8 A8 A16 A4", $line );
1028
1029       Note that spaces in the format argument to "unpack" do not denote
1030       literal spaces. If you have space separated data, you may want "split"
1031       instead.
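A small worked example may make the "A" format clearer. The record layout here (two 8-column fields and one 4-column field) is invented for the illustration.

```perl
use strict;
use warnings;

# Build a fixed-width record: two left-justified 8-column fields
# and one right-justified 4-column field.
my $line = sprintf '%-8s%-8s%4s', 'fred', 'barney', 42;

# unpack takes the template first, then the data.
my @fields = unpack 'A8 A8 A4', $line;

print join( '|', @fields ), "\n";
```

The "A" format strips trailing spaces but not leading ones, so the last field comes back as "  42"; numeric use still works because Perl ignores leading whitespace when converting a string to a number.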
1032
1033   How do I find the soundex value of a string?
1034       (contributed by brian d foy)
1035
1036       You can use the "Text::Soundex" module. If you want to do fuzzy or
1037       close matching, you might also try the String::Approx,
1038       Text::Metaphone, and Text::DoubleMetaphone modules.
1039
1040   How can I expand variables in text strings?
1041       (contributed by brian d foy)
1042
1043       If you can avoid it, don't, or if you can use a templating system, such
1044       as Text::Template or Template Toolkit, do that instead. You might even
1045       be able to get the job done with "sprintf" or "printf":
1046
1047           my $string = sprintf 'Say hello to %s and %s', $foo, $bar;
1048
1049       However, for the one-off simple case where I don't want to pull out a
1050       full templating system, I'll use a string that has two Perl scalar
1051       variables in it. In this example, I want to expand $foo and $bar to
1052       their variables' values:
1053
1054           my $foo = 'Fred';
1055           my $bar = 'Barney';
1056           $string = 'Say hello to $foo and $bar';
1057
1058       One way I can do this involves the substitution operator and a double
1059       "/e" flag. The first "/e" evaluates $1 on the replacement side and
1060       turns it into $foo. The second /e starts with $foo and replaces it with
1061       its value. $foo, then, turns into 'Fred', and that's finally what's
1062       left in the string:
1063
1064           $string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney'
1065
1066       The "/e" will also silently ignore violations of strict, replacing
1067       undefined variable names with the empty string. Since I'm using the
1068       "/e" flag (twice even!), I have all of the same security problems I
1069       have with "eval" in its string form. If there's something odd in $foo,
1070       perhaps something like "@{[ system "rm -rf /" ]}", then I could get
1071       myself in trouble.
1072
1073       To get around the security problem, I could also pull the values from a
1074       hash instead of evaluating variable names. Using a single "/e", I can
1075       check the hash to ensure the value exists, and if it doesn't, I can
1076       replace the missing value with a marker, in this case "???" to signal
1077       that I missed something:
1078
1079           my $string = 'This has $foo and $bar';
1080
1081           my %Replacements = (
1082               foo  => 'Fred',
1083               );
1084
1085           # $string =~ s/\$(\w+)/$Replacements{$1}/g;
1086           $string =~ s/\$(\w+)/
1087               exists $Replacements{$1} ? $Replacements{$1} : '???'
1088               /eg;
1089
1090           print $string;
1091
1092   Does Perl have anything like Ruby's #{} or Python's f string?
1093       Unlike the others, Perl allows you to embed a variable naked in a
1094       double quoted string, e.g. "variable $variable". When there isn't
1095       whitespace or other non-word characters following the variable name,
1096       you can add braces (e.g. "foo ${foo}bar") to ensure correct parsing.
1097
1098       An array can also be embedded directly in a string, and will be
1099       expanded by default with spaces between the elements. The default
1100       LIST_SEPARATOR can be changed by assigning a different string to the
1101       special variable $", such as "local $" = ', ';".
1102
1103       Perl also supports references within a string providing the equivalent
1104       of the features in the other two languages.
1105
1106       "${\ ... }" embedded within a string will work for most simple
1107       statements such as an object->method call. More complex code can be
1108       wrapped in a do block "${\ do{...} }".
1109
1110       When you want a list to be expanded per $", use "@{[ ... ]}".
1111
               use feature 'say';   # say() is not enabled by default
1112           use Time::Piece;
1113           use Time::Seconds;
1114           my $scalar = 'STRING';
1115           my @array = ( 'zorro', 'a', 1, 'B', 3 );
1116
1117           # Print the current date and time and then tomorrow
1118           my $t = Time::Piece->new;
1119           say "Now is: ${\ $t->cdate() }";
1120           say "Tomorrow: ${\ do{ my $T=Time::Piece->new + ONE_DAY ; $T->fullday }}";
1121
1122           # some variables in strings
1123           say "This is some scalar I have $scalar, this is an array @array.";
1124           say "You can also write it like this ${scalar} @{array}.";
1125
1126           # Change the $LIST_SEPARATOR
1127           local $" = ':';
1128           say "Set \$\" to delimit with ':' and sort the Array @{[ sort @array ]}";
1129
1130       You may also want to look at the module Quote::Code, and templating
1131       tools such as Template::Toolkit and Mojo::Template.
1132
1133       See also: "How can I expand variables in text strings?" and "How do I
1134       expand function calls in a string?" in this FAQ.
1135
1136   What's wrong with always quoting "$vars"?
1137       The problem is that those double-quotes force stringification--coercing
1138       numbers and references into strings--even when you don't want them to
1139       be strings. Think of it this way: double-quote expansion is used to
1140       produce new strings. If you already have a string, why do you need
1141       more?
1142
1143       If you get used to writing odd things like these:
1144
1145           print "$var";       # BAD
1146           my $new = "$old";       # BAD
1147           somefunc("$var");    # BAD
1148
1149       You'll be in trouble. Those should (in 99.8% of the cases) be the
1150       simpler and more direct:
1151
1152           print $var;
1153           my $new = $old;
1154           somefunc($var);
1155
1156       Otherwise, besides slowing you down, you're going to break code when
1157       the thing in the scalar is actually neither a string nor a number, but
1158       a reference:
1159
1160           func(\@array);
1161           sub func {
1162               my $aref = shift;
1163               my $oref = "$aref";  # WRONG
1164           }
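To make the damage visible, the sketch below compares a real reference with its stringified copy. The variable names follow the snippet above; the eval is only there so the failed dereference can be reported instead of killing the program.

```perl
use strict;
use warnings;

my @array = ( 1, 2, 3 );
my $aref  = \@array;
my $oref  = "$aref";    # now just a string such as "ARRAY(0x...)"

my $still_ref = ref $aref;    # 'ARRAY'
my $not_ref   = ref $oref;    # empty string: no longer a reference

# Under strict refs, dereferencing the string copy dies.
my $ok = eval { my @copy = @$oref; 1 };

print "ref: '$still_ref', stringified: '$not_ref', deref ok: ",
    ( $ok ? 'yes' : 'no' ), "\n";
```

Without "use strict", the dereference would not die; Perl would quietly treat the string as the name of a package array, which is usually a worse outcome.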
1165
1166       You can also get into subtle problems on those few operations in Perl
1167       that actually do care about the difference between a string and a
1168       number, such as the magical "++" autoincrement operator or the
1169       syscall() function.
1170
1171       Stringification also destroys arrays.
1172
1173           my @lines = `command`;
1174           print "@lines";     # WRONG - extra blanks
1175           print @lines;       # right
1176
1177   Why don't my <<HERE documents work?
1178       Here documents are found in perlop. Check for these four things:
1179
1180       There must be no space after the << part.
1181       There (probably) should be a semicolon at the end of the opening token.
1182       You can't (easily) have any space in front of the tag.
1183       There needs to be at least a line separator after the end token.
1184
1185       If you want to indent the text in the here document, you can do this:
1186
1187           # all in one
1188           (my $VAR = <<HERE_TARGET) =~ s/^\s+//gm;
1189               your text
1190               goes here
1191           HERE_TARGET
1192
1193       But the HERE_TARGET must still be flush against the margin.  If you
1194       want that indented also, you'll have to quote in the indentation.
1195
1196           (my $quote = <<'    FINIS') =~ s/^\s+//gm;
1197                   ...we will have peace, when you and all your works have
1198                   perished--and the works of your dark master to whom you
1199                   would deliver us. You are a liar, Saruman, and a corrupter
1200                   of men's hearts. --Theoden in /usr/src/perl/taint.c
1201               FINIS
1202           $quote =~ s/\s+--/\n--/;
1203
1204       A nice general-purpose fixer-upper function for indented here documents
1205       follows. It expects to be called with a here document as its argument.
1206       It looks to see whether each line begins with a common substring, and
1207       if so, strips that substring off. Otherwise, it takes the amount of
1208       leading whitespace found on the first line and removes that much off
1209       each subsequent line.
1210
1211           sub fix {
1212               local $_ = shift;
1213               my ($white, $leader);  # common whitespace and common leading string
1214               if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\g1\g2?.*\n)+$/) {
1215                   ($white, $leader) = ($2, quotemeta($1));
1216               } else {
1217                   ($white, $leader) = (/^(\s+)/, '');
1218               }
1219               s/^\s*?$leader(?:$white)?//gm;
1220               return $_;
1221           }
1222
1223       This works with leading special strings, dynamically determined:
1224
1225           my $remember_the_main = fix<<'    MAIN_INTERPRETER_LOOP';
1226           @@@ int
1227           @@@ runops() {
1228           @@@     SAVEI32(runlevel);
1229           @@@     runlevel++;
1230           @@@     while ( op = (*op->op_ppaddr)() );
1231           @@@     TAINT_NOT;
1232           @@@     return 0;
1233           @@@ }
1234           MAIN_INTERPRETER_LOOP
1235
1236       Or with a fixed amount of leading whitespace, with remaining
1237       indentation correctly preserved:
1238
1239           my $poem = fix<<EVER_ON_AND_ON;
1240              Now far ahead the Road has gone,
1241             And I must follow, if I can,
1242              Pursuing it with eager feet,
1243             Until it joins some larger way
1244              Where many paths and errands meet.
1245             And whither then? I cannot say.
1246               --Bilbo in /usr/src/perl/pp_ctl.c
1247           EVER_ON_AND_ON
1248
1249       Beginning with Perl version 5.26, a much simpler and cleaner way to
1250       write indented here documents has been added to the language: the tilde
1251       (~) modifier. See "Indented Here-docs" in perlop for details.
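For comparison with the workarounds above, a minimal sketch of the tilde form follows. It assumes perl 5.26 or later; the usage() subroutine is just a hypothetical wrapper to show the here document sitting inside indented code.

```perl
use v5.26;    # the ~ modifier needs perl 5.26 or later

sub usage {
    my $text = <<~EOT;
        your text
          goes here
        EOT
    return $text;
}

print usage();
```

The indentation of the closing EOT decides how much leading whitespace is removed from each line, so the extra two spaces in front of "goes here" are kept.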
1252

Data: Arrays

1254   What is the difference between a list and an array?
1255       (contributed by brian d foy)
1256
1257       A list is a fixed collection of scalars. An array is a variable that
1258       holds a variable collection of scalars. An array can supply its
1259       collection for list operations, so list operations also work on arrays:
1260
1261           # slices
1262           ( 'dog', 'cat', 'bird' )[2,3];
1263           @animals[2,3];
1264
1265           # iteration
1266           foreach ( qw( dog cat bird ) ) { ... }
1267           foreach ( @animals ) { ... }
1268
1269           my @three = grep { length == 3 } qw( dog cat bird );
1270           my @three = grep { length == 3 } @animals;
1271
1272           # supply an argument list
1273           wash_animals( qw( dog cat bird ) );
1274           wash_animals( @animals );
1275
1276       Array operations, which change the scalars, rearrange them, or add or
1277       subtract some scalars, only work on arrays. These can't work on a list,
1278       which is fixed. Array operations include "shift", "unshift", "push",
1279       "pop", and "splice".
1280
1281       An array can also change its length:
1282
1283           $#animals = 1;  # truncate to two elements
1284           $#animals = 10000; # pre-extend to 10,001 elements
1285
1286       You can change an array element, but you can't change a list element:
1287
1288           $animals[0] = 'Rottweiler';
1289           qw( dog cat bird )[0] = 'Rottweiler'; # syntax error!
1290
1291           foreach ( @animals ) {
1292               s/^d/fr/;  # works fine
1293           }
1294
1295           foreach ( qw( dog cat bird ) ) {
1296               s/^d/fr/;  # Error! Modification of read only value!
1297           }
1298
1299       If the list element is itself a variable, it may appear that you
1300       can change a list element. However, the list element is the variable,
1301       not the data. You're not changing the list element, but something the
1302       list element refers to. The list element itself doesn't change: it's
1303       still the same variable.
1304
1305       You also have to be careful about context. You can assign an array to a
1306       scalar to get the number of elements in the array. This only works for
1307       arrays, though:
1308
1309           my $count = @animals;  # only works with arrays
1310
1311       If you try to do the same thing with what you think is a list, you get
1312       a quite different result. Although it looks like you have a list on the
1313       righthand side, Perl actually sees a bunch of scalars separated by a
1314       comma:
1315
1316           my $scalar = ( 'dog', 'cat', 'bird' );  # $scalar gets bird
1317
1318       Since you're assigning to a scalar, the righthand side is in scalar
1319       context. The comma operator (yes, it's an operator!) in scalar context
1320       evaluates its lefthand side, throws away the result, and evaluates its
1321       righthand side and returns the result. In effect, that list-lookalike
1322       assigns to $scalar its rightmost value. Many people mess this up
1323       because they choose a list-lookalike whose last element is also the
1324       count they expect:
1325
1326           my $scalar = ( 1, 2, 3 );  # $scalar gets 3, accidentally
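The three cases side by side, as a small sketch. The last line shows a common idiom for counting the elements of a list: assigning to an empty list in scalar context yields the number of elements on the right-hand side. Warnings are deliberately left off here, because perl would (rightly) complain about the discarded constants in the second assignment.

```perl
use strict;

my @animals = qw( dog cat bird );

my $count = @animals;                       # array in scalar context: 3
my $last  = ( 'dog', 'cat', 'bird' );       # comma operator: 'bird'
my $n     = () = ( 'dog', 'cat', 'bird' );  # list assignment in scalar context: 3

print "$count $last $n\n";
```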
1327
1328   What is the difference between $array[1] and @array[1]?
1329       (contributed by brian d foy)
1330
1331       The difference is the sigil, that special character in front of the
1332       array name. The "$" sigil means "exactly one item", while the "@" sigil
1333       means "zero or more items". The "$" gets you a single scalar, while the
1334       "@" gets you a list.
1335
1336       The confusion arises because people incorrectly assume that the sigil
1337       denotes the variable type.
1338
1339       The $array[1] is a single-element access to the array. It's going to
1340       return the item in index 1 (or undef if there is no item there).  If
1341       you intend to get exactly one element from the array, this is the form
1342       you should use.
1343
1344       The @array[1] is an array slice, although it has only one index.  You
1345       can pull out multiple elements simultaneously by specifying additional
1346       indices as a list, like @array[1,4,3,0].
1347
1348       Using a slice on the lefthand side of the assignment supplies list
1349       context to the righthand side. This can lead to unexpected results.
1350       For instance, if you want to read a single line from a filehandle,
1351       assigning to a scalar value is fine:
1352
1353           $array[1] = <STDIN>;
1354
1355       However, in list context, the line input operator returns all of the
1356       lines as a list. The first line goes into @array[1] and the rest of the
1357       lines mysteriously disappear:
1358
1359           @array[1] = <STDIN>;  # most likely not what you want
1360
1361       Either the "use warnings" pragma or the -w flag will warn you when you
1362       use an array slice with a single index.
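The disappearing lines are easy to reproduce without a terminal. In this sketch an in-memory filehandle stands in for STDIN; the data and variable names are invented for the example.

```perl
use strict;
use warnings;

# An in-memory filehandle stands in for STDIN in this sketch.
my $data = "first\nsecond\nthird\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @array;
$array[0] = <$fh>;    # scalar context: reads exactly one line
@array[1] = <$fh>;    # list context: reads all remaining lines, keeps one
                      # (perl warns about this slice under "use warnings")
my $leftover = <$fh>; # undef: "third\n" was already read and thrown away

print defined $leftover ? "line left\n" : "nothing left\n";
```

After the slice assignment the filehandle is at end of file, which is how a single stray "@" can swallow the rest of your input.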
1363
1364   How can I remove duplicate elements from a list or array?
1365       (contributed by brian d foy)
1366
1367       Use a hash. When you think the words "unique" or "duplicated", think
1368       "hash keys".
1369
1370       If you don't care about the order of the elements, you could just
1371       create the hash then extract the keys. It's not important how you
1372       create that hash: just that you use "keys" to get the unique elements.
1373
1374           my %hash   = map { $_, 1 } @array;
1375           # or a hash slice: @hash{ @array } = ();
1376           # or a foreach: $hash{$_} = 1 foreach ( @array );
1377
1378           my @unique = keys %hash;
1379
1380       If you want to use a module, try the "uniq" function from
1381       List::MoreUtils. In list context it returns the unique elements,
1382       preserving their order in the list. In scalar context, it returns the
1383       number of unique elements.
1384
1385           use List::MoreUtils qw(uniq);
1386
1387           my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7
1388           my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7
1389
1390       You can also go through each element and skip the ones you've seen
1391       before. Use a hash to keep track. The first time the loop sees an
1392       element, that element has no key in %seen. The "next" statement creates
1393       the key and immediately uses its value, which is "undef", so the loop
1394       continues to the "push" and increments the value for that key. The next
1395       time the loop sees that same element, its key exists in the hash and
1396       the value for that key is true (since it's not 0 or "undef"), so the
1397       "next" skips that iteration and the loop goes to the next element.
1398
1399           my @unique = ();
1400           my %seen   = ();
1401
1402           foreach my $elem ( @array ) {
1403               next if $seen{ $elem }++;
1404               push @unique, $elem;
1405           }
1406
1407       You can write this more briefly using a grep, which does the same
1408       thing.
1409
1410           my %seen = ();
1411           my @unique = grep { ! $seen{ $_ }++ } @array;
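Because the hash key decides what counts as a duplicate, you can normalize the key to change the notion of "same". The sketch below contrasts the plain version with a case-insensitive one; the sample data is made up.

```perl
use strict;
use warnings;

my @array = qw( pear Apple pear fig apple fig );

# Exact duplicates only; first occurrences keep their order.
my %seen;
my @unique = grep { !$seen{$_}++ } @array;

# Same idea, but the hash key is lowercased first.
my %seen_ci;
my @unique_ci = grep { !$seen_ci{ lc($_) }++ } @array;

print "@unique\n@unique_ci\n";
```

The first list keeps both "Apple" and "apple"; the second keeps only whichever spelling appeared first.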
1412
1413   How can I tell whether a certain element is contained in a list or array?
1414       (portions of this answer contributed by Anno Siegel and brian d foy)
1415
1416       Hearing the word "in" is an indication that you probably should have
1417       used a hash, not a list or array, to store your data. Hashes are
1418       designed to answer this question quickly and efficiently. Arrays
1419       aren't.
1420
1421       That being said, there are several ways to approach this. If you are
1422       going to make this query many times over arbitrary string values, the
1423       fastest way is probably to invert the original array and maintain a
1424       hash whose keys are the first array's values:
1425
1426           my @blues = qw/azure cerulean teal turquoise lapis-lazuli/;
1427           my %is_blue = ();
1428           for (@blues) { $is_blue{$_} = 1 }
1429
1430       Now you can check whether $is_blue{$some_color}. It might have been a
1431       good idea to keep the blues all in a hash in the first place.
1432
1433       If the values are all small integers, you could use a simple indexed
1434       array. This kind of an array will take up less space:
1435
1436           my @primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31);
1437           my @is_tiny_prime = ();
1438           for (@primes) { $is_tiny_prime[$_] = 1 }
1439           # or simply  @is_tiny_prime[@primes] = (1) x @primes;
1440
1441       Now you check whether $is_tiny_prime[$some_number].
1442
1443       If the values in question are integers instead of strings, you can save
1444       quite a lot of space by using bit strings instead:
1445
1446           my @articles = ( 1..10, 150..2000, 2017 );
           my $read = '';   # start with an empty bit string
1448           for (@articles) { vec($read,$_,1) = 1 }
1449
1450       Now check whether "vec($read,$n,1)" is true for some $n.
1451
1452       These methods guarantee fast individual tests but require a re-
1453       organization of the original list or array. They only pay off if you
1454       have to test multiple values against the same array.
1455
1456       If you are testing only once, the standard module List::Util exports
1457       the function "any" for this purpose. It works by stopping once it finds
1458       the element. It's written in C for speed, and its Perl equivalent looks
1459       like this subroutine:
1460
1461           sub any (&@) {
1462               my $code = shift;
1463               foreach (@_) {
1464                   return 1 if $code->();
1465               }
1466               return 0;
1467           }
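
       In practice you would import the function rather than write it
       yourself. A short usage sketch, assuming a List::Util recent enough
       to export any; the array contents and $whatever are sample values:

```perl
use strict;
use warnings;
use List::Util qw(any);

my @array    = qw(apple banana cherry);
my $whatever = 'banana';

# any() stops at the first element for which the block returns true
my $is_there = any { $_ eq $whatever } @array;

print $is_there ? "found\n" : "not found\n";
```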
1468
1469       If speed is of little concern, the common idiom uses grep in scalar
1470       context (which returns the number of items that passed its condition)
1471       to traverse the entire list. This does have the benefit of telling you
1472       how many matches it found, though.
1473
1474           my $is_there = grep $_ eq $whatever, @array;
1475
1476       If you want to actually extract the matching elements, simply use grep
1477       in list context.
1478
1479           my @matches = grep $_ eq $whatever, @array;
1480
1481   How do I compute the difference of two arrays? How do I compute the
1482       intersection of two arrays?
1483       Use a hash. Here's code to do both and more. It assumes that each
1484       element is unique in a given array:
1485
1486           my (@union, @intersection, @difference);
1487           my %count = ();
1488           foreach my $element (@array1, @array2) { $count{$element}++ }
1489           foreach my $element (keys %count) {
1490               push @union, $element;
1491               push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element;
1492           }
1493
1494       Note that this is the symmetric difference, that is, all elements in
1495       either A or in B but not in both. Think of it as an xor operation.
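
       If you want a one-sided difference instead (the elements of the
       first array that do not appear in the second), one possible sketch
       uses a lookup hash and grep. The array contents and the names
       %in_array2 and @only_in_array1 are illustrative:

```perl
use strict;
use warnings;

my @array1 = qw(a b c d);
my @array2 = qw(c d e f);

# mark everything that appears in @array2
my %in_array2 = map { $_ => 1 } @array2;

# keep only the elements of @array1 that are absent from @array2
my @only_in_array1 = grep { !$in_array2{$_} } @array1;

print "@only_in_array1\n";
```

       Unlike the counting approach, this keeps the original order of
       @array1 and does not require the elements to be unique.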
1496
1497   How do I test whether two arrays or hashes are equal?
1498       The following code works for single-level arrays. It uses a stringwise
1499       comparison, and does not distinguish defined versus undefined empty
1500       strings. Modify if you have other needs.
1501
           my $are_equal = compare_arrays(\@frogs, \@toads);
1503
1504           sub compare_arrays {
1505               my ($first, $second) = @_;
1506               no warnings;  # silence spurious -w undef complaints
1507               return 0 unless @$first == @$second;
1508               for (my $i = 0; $i < @$first; $i++) {
1509                   return 0 if $first->[$i] ne $second->[$i];
1510               }
1511               return 1;
1512           }
1513
1514       For multilevel structures, you may wish to use an approach more like
1515       this one. It uses the CPAN module FreezeThaw:
1516
1517           use FreezeThaw qw(cmpStr);
1518           my @a = my @b = ( "this", "that", [ "more", "stuff" ] );
1519
1520           printf "a and b contain %s arrays\n",
1521               cmpStr(\@a, \@b) == 0
1522               ? "the same"
1523               : "different";
1524
1525       This approach also works for comparing hashes. Here we'll demonstrate
1526       two different answers:
1527
1528           use FreezeThaw qw(cmpStr cmpStrHard);
1529
1530           my %a = my %b = ( "this" => "that", "extra" => [ "more", "stuff" ] );
1531           $a{EXTRA} = \%b;
1532           $b{EXTRA} = \%a;
1533
1534           printf "a and b contain %s hashes\n",
1535           cmpStr(\%a, \%b) == 0 ? "the same" : "different";
1536
1537           printf "a and b contain %s hashes\n",
1538           cmpStrHard(\%a, \%b) == 0 ? "the same" : "different";
1539
       The first reports that both hashes contain the same data, while the
       second reports that they do not. Which you prefer is left as an
       exercise to the reader.
1543
1544   How do I find the first array element for which a condition is true?
1545       To find the first array element which satisfies a condition, you can
1546       use the first() function in the List::Util module, which comes with
1547       Perl 5.8. This example finds the first element that contains "Perl".
1548
1549           use List::Util qw(first);
1550
1551           my $element = first { /Perl/ } @array;
1552
1553       If you cannot use List::Util, you can make your own loop to do the same
1554       thing. Once you find the element, you stop the loop with last.
1555
1556           my $found;
1557           foreach ( @array ) {
1558               if( /Perl/ ) { $found = $_; last }
1559           }
1560
1561       If you want the array index, use the firstidx() function from
1562       "List::MoreUtils":
1563
1564           use List::MoreUtils qw(firstidx);
1565           my $index = firstidx { /Perl/ } @array;
1566
1567       Or write it yourself, iterating through the indices and checking the
1568       array element at each index until you find one that satisfies the
1569       condition:
1570
1571           my( $found, $index ) = ( undef, -1 );
           for( my $i = 0; $i < @array; $i++ ) {
1573               if( $array[$i] =~ /Perl/ ) {
1574                   $found = $array[$i];
1575                   $index = $i;
1576                   last;
1577               }
1578           }
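
       A middle road, if you have List::Util but not List::MoreUtils, is to
       hand first() the list of indices instead of the elements. This is a
       sketch with made-up data; note that it yields undef rather than -1
       when nothing matches:

```perl
use strict;
use warnings;
use List::Util qw(first);

my @array = ( 'Python', 'Ruby', 'Perl 5', 'Perl 4' );

# search the indices rather than the elements; first() returns undef
# when no index satisfies the condition
my $index = first { $array[$_] =~ /Perl/ } 0 .. $#array;

print defined $index ? "index $index: $array[$index]\n" : "no match\n";
```

       Test the result with defined, since a match at index 0 is a false
       value.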
1579
1580   How do I handle linked lists?
1581       (contributed by brian d foy)
1582
1583       Perl's arrays do not have a fixed size, so you don't need linked lists
1584       if you just want to add or remove items. You can use array operations
1585       such as "push", "pop", "shift", "unshift", or "splice" to do that.
1586
1587       Sometimes, however, linked lists can be useful in situations where you
1588       want to "shard" an array so you have many small arrays instead of a
1589       single big array. You can keep arrays longer than Perl's largest array
1590       index, lock smaller arrays separately in threaded programs, reallocate
1591       less memory, or quickly insert elements in the middle of the chain.
1592
1593       Steve Lembark goes through the details in his YAPC::NA 2009 talk "Perly
1594       Linked Lists" ( <http://www.slideshare.net/lembark/perly-linked-lists>
1595       ), although you can just use his LinkedList::Single module.
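
       To make the idea concrete, here is a minimal singly linked list
       built from hash references. It is a hand-rolled sketch, not the
       interface of LinkedList::Single; the field names value and next are
       arbitrary choices:

```perl
use strict;
use warnings;

# each node is a hash reference holding a value and a link to the next node
my $head;
for my $value ( reverse qw(a b d) ) {
    $head = { value => $value, next => $head };
}

# insert 'c' after 'b' by relinking two references; no elements are moved
my $node = $head;
$node = $node->{next} until $node->{value} eq 'b';
$node->{next} = { value => 'c', next => $node->{next} };

# walk the chain from the head
my @values;
for ( my $n = $head; $n; $n = $n->{next} ) {
    push @values, $n->{value};
}
print "@values\n";
```

       Inserting in the middle touches only the neighboring node, which is
       the property the paragraph above refers to.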
1596
1597   How do I handle circular lists?
1598       (contributed by brian d foy)
1599
1600       If you want to cycle through an array endlessly, you can increment the
1601       index modulo the number of elements in the array:
1602
1603           my @array = qw( a b c );
1604           my $i = 0;
1605
1606           while( 1 ) {
1607               print $array[ $i++ % @array ], "\n";
1608               last if $i > 20;
1609           }
1610
1611       You can also use Tie::Cycle to use a scalar that always has the next
1612       element of the circular array:
1613
1614           use Tie::Cycle;
1615
1616           tie my $cycle, 'Tie::Cycle', [ qw( FFFFFF 000000 FFFF00 ) ];
1617
1618           print $cycle; # FFFFFF
1619           print $cycle; # 000000
1620           print $cycle; # FFFF00
1621
       The Array::Iterator::Circular module creates an iterator object for
       circular arrays:
1624
1625           use Array::Iterator::Circular;
1626
1627           my $color_iterator = Array::Iterator::Circular->new(
1628               qw(red green blue orange)
1629               );
1630
1631           foreach ( 1 .. 20 ) {
1632               print $color_iterator->next, "\n";
1633           }
1634
1635   How do I shuffle an array randomly?
       If you have Perl 5.8.0 or later installed, or Scalar-List-Utils 1.03
       or later, you can say:
1638
1639           use List::Util 'shuffle';
1640
1641           @shuffled = shuffle(@list);
1642
1643       If not, you can use a Fisher-Yates shuffle.
1644
1645           sub fisher_yates_shuffle {
1646               my $deck = shift;  # $deck is a reference to an array
1647               return unless @$deck; # must not be empty!
1648
1649               my $i = @$deck;
1650               while (--$i) {
1651                   my $j = int rand ($i+1);
1652                   @$deck[$i,$j] = @$deck[$j,$i];
1653               }
1654           }
1655
1656           # shuffle my mpeg collection
1657           #
1658           my @mpeg = <audio/*/*.mp3>;
1659           fisher_yates_shuffle( \@mpeg );    # randomize @mpeg in place
1660           print @mpeg;
1661
1662       Note that the above implementation shuffles an array in place, unlike
1663       the List::Util::shuffle() which takes a list and returns a new shuffled
1664       list.
1665
       You've probably seen shuffling algorithms that work using splice,
       randomly picking an element to move from the old list to the new one:

1669           srand;
1670           @new = ();
1671           @old = 1 .. 10;  # just a demo
1672           while (@old) {
1673               push(@new, splice(@old, rand @old, 1));
1674           }
1675
1676       This is bad because splice is already O(N), and since you do it N
1677       times, you just invented a quadratic algorithm; that is, O(N**2).  This
1678       does not scale, although Perl is so efficient that you probably won't
1679       notice this until you have rather largish arrays.
1680
1681   How do I process/modify each element of an array?
1682       Use "for"/"foreach":
1683
1684           for (@lines) {
1685               s/foo/bar/;    # change that word
1686               tr/XZ/ZX/;    # swap those letters
1687           }
1688
1689       Here's another; let's compute spherical volumes:
1690
1691           my @volumes = @radii;
1692           for (@volumes) {   # @volumes has changed parts
1693               $_ **= 3;
1694               $_ *= (4/3) * 3.14159;  # this will be constant folded
1695           }
1696
       This can also be done with map(), which is made to transform one
       list into another:
1699
1700           my @volumes = map {$_ ** 3 * (4/3) * 3.14159} @radii;
1701
       If you want to do the same thing to modify the values of a hash, you
       can use the "values" function. As of Perl 5.6 the values are not
       copied, so if you modify $orbit (in this case), you modify the value.
1705
1706           for my $orbit ( values %orbits ) {
1707               ($orbit **= 3) *= (4/3) * 3.14159;
1708           }
1709
1710       Prior to perl 5.6 "values" returned copies of the values, so older perl
1711       code often contains constructions such as @orbits{keys %orbits} instead
1712       of "values %orbits" where the hash is to be modified.
1713
1714   How do I select a random element from an array?
1715       Use the rand() function (see "rand" in perlfunc):
1716
1717           my $index   = rand @array;
1718           my $element = $array[$index];
1719
1720       Or, simply:
1721
1722           my $element = $array[ rand @array ];
1723
1724   How do I permute N elements of a list?
1725       Use the List::Permutor module on CPAN. If the list is actually an
1726       array, try the Algorithm::Permute module (also on CPAN). It's written
1727       in XS code and is very efficient:
1728
1729           use Algorithm::Permute;
1730
1731           my @array = 'a'..'d';
1732           my $p_iterator = Algorithm::Permute->new ( \@array );
1733
1734           while (my @perm = $p_iterator->next) {
1735              print "next permutation: (@perm)\n";
1736           }
1737
1738       For even faster execution, you could do:
1739
1740           use Algorithm::Permute;
1741
1742           my @array = 'a'..'d';
1743
1744           Algorithm::Permute::permute {
1745               print "next permutation: (@array)\n";
1746           } @array;
1747
       Here's a little program that generates all permutations of all the
       words on each line of input. The algorithm embodied in the permute()
       function is discussed in Volume 4A of Knuth's The Art of Computer
       Programming and will work on any list:
1752
1753           #!/usr/bin/perl -n
1754           # Fischer-Krause ordered permutation generator
1755
1756           sub permute (&@) {
1757               my $code = shift;
1758               my @idx = 0..$#_;
1759               while ( $code->(@_[@idx]) ) {
1760                   my $p = $#idx;
1761                   --$p while $idx[$p-1] > $idx[$p];
1762                   my $q = $p or return;
1763                   push @idx, reverse splice @idx, $p;
1764                   ++$q while $idx[$p-1] > $idx[$q];
1765                   @idx[$p-1,$q]=@idx[$q,$p-1];
1766               }
1767           }
1768
1769           permute { print "@_\n" } split;
1770
1771       The Algorithm::Loops module also provides the "NextPermute" and
1772       "NextPermuteNum" functions which efficiently find all unique
1773       permutations of an array, even if it contains duplicate values,
1774       modifying it in-place: if its elements are in reverse-sorted order then
1775       the array is reversed, making it sorted, and it returns false;
1776       otherwise the next permutation is returned.
1777
1778       "NextPermute" uses string order and "NextPermuteNum" numeric order, so
1779       you can enumerate all the permutations of 0..9 like this:
1780
1781           use Algorithm::Loops qw(NextPermuteNum);
1782
1783           my @list= 0..9;
1784           do { print "@list\n" } while NextPermuteNum @list;
1785
1786   How do I sort an array by (anything)?
1787       Supply a comparison function to sort() (described in "sort" in
1788       perlfunc):
1789
1790           @list = sort { $a <=> $b } @list;
1791
1792       The default sort function is cmp, string comparison, which would sort
1793       "(1, 2, 10)" into "(1, 10, 2)". "<=>", used above, is the numerical
1794       comparison operator.
1795
1796       If you have a complicated function needed to pull out the part you want
1797       to sort on, then don't do it inside the sort function. Pull it out
1798       first, because the sort BLOCK can be called many times for the same
1799       element. Here's an example of how to pull out the first word after the
1800       first number on each item, and then sort those words case-
1801       insensitively.
1802
1803           my @idx;
1804           for (@data) {
1805               my $item;
1806               ($item) = /\d+\s*(\S+)/;
1807               push @idx, uc($item);
1808           }
1809           my @sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0 .. $#idx ];
1810
1811       which could also be written this way, using a trick that's come to be
1812       known as the Schwartzian Transform:
1813
1814           my @sorted = map  { $_->[0] }
1815               sort { $a->[1] cmp $b->[1] }
1816               map  { [ $_, uc( (/\d+\s*(\S+)/)[0]) ] } @data;
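
       Read the transform from the bottom up. The sketch below runs it on
       three made-up records so you can follow each stage:

```perl
use strict;
use warnings;

# hypothetical records: a number followed by a word
my @data = ( '10 pear', '2 Apple', '33 banana' );

my @sorted =
    map  { $_->[0] }                              # 3. recover the original item
    sort { $a->[1] cmp $b->[1] }                  # 2. compare the cached keys
    map  { [ $_, uc( ( /\d+\s*(\S+)/ )[0] ) ] }   # 1. pair each item with its key
    @data;

print "$_\n" for @sorted;
```

       The expensive key extraction runs once per element instead of once
       per comparison.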
1817
1818       If you need to sort on several fields, the following paradigm is
1819       useful.
1820
1821           my @sorted = sort {
1822               field1($a) <=> field1($b) ||
1823               field2($a) cmp field2($b) ||
1824               field3($a) cmp field3($b)
1825           } @data;
1826
1827       This can be conveniently combined with precalculation of keys as given
1828       above.
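
       One way to combine the two is sketched here. The "name:age" record
       format is an assumption made for the example; the fields are split
       out once and cached beside each record before sorting:

```perl
use strict;
use warnings;

# hypothetical records in "name:age" form
my @data = ( 'carol:30', 'alice:25', 'bob:30', 'dave:25' );

my @sorted =
    map  { $_->[0] }
    sort {
        $a->[2] <=> $b->[2]    # age, numerically
            ||
        $a->[1] cmp $b->[1]    # then name, as a string
    }
    map  { [ $_, split( /:/, $_ ) ] }   # cache both fields next to the record
    @data;

print "@sorted\n";
```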
1829
1830       See the sort article in the "Far More Than You Ever Wanted To Know"
1831       collection in <http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> for more
1832       about this approach.
1833
1834       See also the question later in perlfaq4 on sorting hashes.
1835
1836   How do I manipulate arrays of bits?
1837       Use pack() and unpack(), or else vec() and the bitwise operations.
1838
1839       For example, you don't have to store individual bits in an array (which
1840       would mean that you're wasting a lot of space). To convert an array of
1841       bits to a string, use vec() to set the right bits. This sets $vec to
1842       have bit N set only if $ints[N] was set:
1843
1844           my @ints = (...); # array of bits, e.g. ( 1, 0, 0, 1, 1, 0 ... )
1845           my $vec = '';
1846           foreach( 0 .. $#ints ) {
1847               vec($vec,$_,1) = 1 if $ints[$_];
1848           }
1849
1850       The string $vec only takes up as many bits as it needs. For instance,
1851       if you had 16 entries in @ints, $vec only needs two bytes to store them
1852       (not counting the scalar variable overhead).
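
       You can check the size claim yourself. This sketch packs 16 sample
       bits and inspects the result; the string only grows as far as the
       byte that holds the highest set bit, which here is bit 15:

```perl
use strict;
use warnings;

# 16 sample bits; bits 0, 3, 4 and 15 are set
my @ints = ( 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1 );

my $vec = '';
foreach ( 0 .. $#ints ) {
    vec( $vec, $_, 1 ) = 1 if $ints[$_];
}

print "bytes used: ", length($vec), "\n";
print "bit string: ", unpack( "b*", $vec ), "\n";
```

       The "b*" template lists the bits of each byte in ascending order,
       which matches the bit numbering that vec() uses.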
1853
1854       Here's how, given a vector in $vec, you can get those bits into your
1855       @ints array:
1856
1857           sub bitvec_to_list {
1858               my $vec = shift;
1859               my @ints;
1860               # Find null-byte density then select best algorithm
1861               if ($vec =~ tr/\0// / length $vec > 0.95) {
1862                   use integer;
1863                   my $i;
1864
1865                   # This method is faster with mostly null-bytes
1866                   while($vec =~ /[^\0]/g ) {
1867                       $i = -9 + 8 * pos $vec;
1868                       push @ints, $i if vec($vec, ++$i, 1);
1869                       push @ints, $i if vec($vec, ++$i, 1);
1870                       push @ints, $i if vec($vec, ++$i, 1);
1871                       push @ints, $i if vec($vec, ++$i, 1);
1872                       push @ints, $i if vec($vec, ++$i, 1);
1873                       push @ints, $i if vec($vec, ++$i, 1);
1874                       push @ints, $i if vec($vec, ++$i, 1);
1875                       push @ints, $i if vec($vec, ++$i, 1);
1876                   }
1877               }
1878               else {
1879                   # This method is a fast general algorithm
1880                   use integer;
1881                   my $bits = unpack "b*", $vec;
1882                   push @ints, 0 if $bits =~ s/^(\d)// && $1;
1883                   push @ints, pos $bits while($bits =~ /1/g);
1884               }
1885
1886               return \@ints;
1887           }
1888
1889       This method gets faster the more sparse the bit vector is.  (Courtesy
1890       of Tim Bunce and Winfried Koenig.)
1891
1892       You can make the while loop a lot shorter with this suggestion from
1893       Benjamin Goldberg:
1894
1895           while($vec =~ /[^\0]+/g ) {
1896               push @ints, grep vec($vec, $_, 1), $-[0] * 8 .. $+[0] * 8;
1897           }
1898
1899       Or use the CPAN module Bit::Vector:
1900
           use Bit::Vector;
           my $vector = Bit::Vector->new($num_of_bits);
1902           $vector->Index_List_Store(@ints);
1903           my @ints = $vector->Index_List_Read();
1904
       Bit::Vector provides efficient methods for bit vectors, sets of small
       integers, and "big int" math.
1907
1908       Here's a more extensive illustration using vec():
1909
1910           # vec demo
1911           my $vector = "\xff\x0f\xef\xfe";
1912           print "Ilya's string \\xff\\x0f\\xef\\xfe represents the number ",
1913           unpack("N", $vector), "\n";
1914           my $is_set = vec($vector, 23, 1);
1915           print "Its 23rd bit is ", $is_set ? "set" : "clear", ".\n";
1916           pvec($vector);
1917
1918           set_vec(1,1,1);
1919           set_vec(3,1,1);
1920           set_vec(23,1,1);
1921
1922           set_vec(3,1,3);
1923           set_vec(3,2,3);
1924           set_vec(3,4,3);
1925           set_vec(3,4,7);
1926           set_vec(3,8,3);
1927           set_vec(3,8,7);
1928
1929           set_vec(0,32,17);
1930           set_vec(1,32,17);
1931
1932           sub set_vec {
1933               my ($offset, $width, $value) = @_;
1934               my $vector = '';
1935               vec($vector, $offset, $width) = $value;
1936               print "offset=$offset width=$width value=$value\n";
1937               pvec($vector);
1938           }
1939
1940           sub pvec {
1941               my $vector = shift;
1942               my $bits = unpack("b*", $vector);
1943               my $i = 0;
1944               my $BASE = 8;
1945
1946               print "vector length in bytes: ", length($vector), "\n";
               my @bytes = unpack("A8" x length($vector), $bits);
1948               print "bits are: @bytes\n\n";
1949           }
1950
1951   Why does defined() return true on empty arrays and hashes?
       The short story is that you should only use defined on scalars or
       functions, not on aggregates (arrays and hashes). Since Perl 5.22,
       defined(@array) and defined(%hash) are fatal errors; to find out
       whether an aggregate is empty, test it in boolean context instead.
       See "defined" in perlfunc for more detail.
1955

Data: Hashes (Associative Arrays)

1957   How do I process an entire hash?
1958       (contributed by brian d foy)
1959
       There are a couple of ways that you can process an entire hash. You
       can get a list of keys, then go through each key, or grab one
       key-value pair at a time.
1963
1964       To go through all of the keys, use the "keys" function. This extracts
1965       all of the keys of the hash and gives them back to you as a list. You
1966       can then get the value through the particular key you're processing:
1967
1968           foreach my $key ( keys %hash ) {
               my $value = $hash{$key};
1970               ...
1971           }
1972
1973       Once you have the list of keys, you can process that list before you
1974       process the hash elements. For instance, you can sort the keys so you
1975       can process them in lexical order:
1976
1977           foreach my $key ( sort keys %hash ) {
               my $value = $hash{$key};
1979               ...
1980           }
1981
1982       Or, you might want to only process some of the items. If you only want
1983       to deal with the keys that start with "text:", you can select just
1984       those using "grep":
1985
1986           foreach my $key ( grep /^text:/, keys %hash ) {
               my $value = $hash{$key};
1988               ...
1989           }
1990
1991       If the hash is very large, you might not want to create a long list of
1992       keys. To save some memory, you can grab one key-value pair at a time
1993       using each(), which returns a pair you haven't seen yet:
1994
1995           while( my( $key, $value ) = each( %hash ) ) {
1996               ...
1997           }
1998
1999       The "each" operator returns the pairs in apparently random order, so if
2000       ordering matters to you, you'll have to stick with the "keys" method.
2001
2002       The each() operator can be a bit tricky though. You can't add or delete
2003       keys of the hash while you're using it without possibly skipping or re-
2004       processing some pairs after Perl internally rehashes all of the
2005       elements. Additionally, a hash has only one iterator, so if you mix
2006       "keys", "values", or "each" on the same hash, you risk resetting the
2007       iterator and messing up your processing. See the "each" entry in
2008       perlfunc for more details.
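
       The shared iterator is easiest to see after leaving an each() loop
       early. In this sketch with a three-key sample hash, calling keys in
       void context resets the iterator so that the second loop sees every
       pair:

```perl
use strict;
use warnings;

my %hash = ( a => 1, b => 2, c => 3 );

# stop after the first pair; the iterator stays parked inside the hash
while ( my ( $key, $value ) = each %hash ) {
    last;
}

# calling keys in void context resets the iterator
keys %hash;

# this loop now starts from the beginning and sees every pair
my $count = 0;
while ( my ( $key, $value ) = each %hash ) {
    $count++;
}

print "saw $count pairs\n";
```

       Without the reset, the second loop would pick up where the first one
       stopped and miss a pair.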
2009
2010   How do I merge two hashes?
2011       (contributed by brian d foy)
2012
2013       Before you decide to merge two hashes, you have to decide what to do if
2014       both hashes contain keys that are the same and if you want to leave the
2015       original hashes as they were.
2016
2017       If you want to preserve the original hashes, copy one hash (%hash1) to
       a new hash (%new_hash), then add the keys from the other hash (%hash2)
2019       to the new hash. Checking that the key already exists in %new_hash
2020       gives you a chance to decide what to do with the duplicates:
2021
2022           my %new_hash = %hash1; # make a copy; leave %hash1 alone
2023
2024           foreach my $key2 ( keys %hash2 ) {
2025               if( exists $new_hash{$key2} ) {
2026                   warn "Key [$key2] is in both hashes!";
2027                   # handle the duplicate (perhaps only warning)
2028                   ...
2029                   next;
2030               }
2031               else {
2032                   $new_hash{$key2} = $hash2{$key2};
2033               }
2034           }
2035
2036       If you don't want to create a new hash, you can still use this looping
2037       technique; just change the %new_hash to %hash1.
2038
2039           foreach my $key2 ( keys %hash2 ) {
2040               if( exists $hash1{$key2} ) {
2041                   warn "Key [$key2] is in both hashes!";
2042                   # handle the duplicate (perhaps only warning)
2043                   ...
2044                   next;
2045               }
2046               else {
2047                   $hash1{$key2} = $hash2{$key2};
2048               }
           }
2050
2051       If you don't care that one hash overwrites keys and values from the
2052       other, you could just use a hash slice to add one hash to another. In
2053       this case, values from %hash2 replace values from %hash1 when they have
2054       keys in common:
2055
2056           @hash1{ keys %hash2 } = values %hash2;
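
       If you want the same overwrite behavior but would rather leave both
       originals untouched, you can flatten the two hashes into a new one.
       A sketch with sample data; %merged is a name chosen for the example:

```perl
use strict;
use warnings;

my %hash1 = ( a => 1,  b => 2 );
my %hash2 = ( b => 20, c => 30 );

# in list context each hash flattens to key/value pairs; when a key
# repeats, the later pair wins, so %hash2 takes precedence
my %merged = ( %hash1, %hash2 );

print join( ", ", map { "$_=$merged{$_}" } sort keys %merged ), "\n";
```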
2057
2058   What happens if I add or remove keys from a hash while iterating over it?
2059       (contributed by brian d foy)
2060
2061       The easy answer is "Don't do that!"
2062
2063       If you iterate through the hash with each(), you can delete the key
2064       most recently returned without worrying about it. If you delete or add
2065       other keys, the iterator may skip or double up on them since perl may
2066       rearrange the hash table. See the entry for each() in perlfunc.
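
       The one safe case looks like this in practice. The sketch removes
       every pair with an even value from a sample hash, deleting only the
       key that each() has just returned:

```perl
use strict;
use warnings;

my %hash = ( a => 1, b => 2, c => 3, d => 4 );

while ( my ( $key, $value ) = each %hash ) {
    # deleting the pair that each() just returned is the one safe change
    delete $hash{$key} if $value % 2 == 0;
}

print join( " ", sort keys %hash ), "\n";
```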
2067
2068   How do I look up a hash element by value?
2069       Create a reverse hash:
2070
2071           my %by_value = reverse %by_key;
2072           my $key = $by_value{$value};
2073
       That's not particularly space-efficient, because reverse has to
       build the whole list of pairs before the new hash is filled. It uses
       less memory to copy one pair at a time:

           my %by_value;
           while (my ($key, $value) = each %by_key) {
               $by_value{$value} = $key;
           }
2080
2081       If your hash could have repeated values, the methods above will only
2082       find one of the associated keys.  This may or may not worry you. If it
2083       does worry you, you can always reverse the hash into a hash of arrays
2084       instead:
2085
           my %key_list_by_value;
           while (my ($key, $value) = each %by_key) {
               push @{$key_list_by_value{$value}}, $key;
2088           }
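
       Looking up a value then gives you an array reference holding every
       matching key. A short sketch with made-up data:

```perl
use strict;
use warnings;

my %by_key = ( apple => 'red', cherry => 'red', banana => 'yellow' );

my %key_list_by_value;
while ( my ( $key, $value ) = each %by_key ) {
    push @{ $key_list_by_value{$value} }, $key;
}

# every key that maps to 'red'; sorted because each() order is not fixed
my @red = sort @{ $key_list_by_value{red} };
print "@red\n";
```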
2089
2090   How can I know how many entries are in a hash?
2091       (contributed by brian d foy)
2092
2093       This is very similar to "How do I process an entire hash?", also in
2094       perlfaq4, but a bit simpler in the common cases.
2095
2096       You can use the keys() built-in function in scalar context to find out
2097       how many entries you have in a hash:
2098
2099           my $key_count = keys %hash; # must be scalar context!
2100
2101       If you want to find out how many entries have a defined value, that's a
2102       bit different. You have to check each value. A "grep" is handy:
2103
2104           my $defined_value_count = grep { defined } values %hash;
2105
2106       You can use that same structure to count the entries any way that you
2107       like. If you want the count of the keys with vowels in them, you just
2108       test for that instead:
2109
2110           my $vowel_count = grep { /[aeiou]/ } keys %hash;
2111
2112       The "grep" in scalar context returns the count. If you want the list of
2113       matching items, just use it in list context instead:
2114
2115           my @defined_values = grep { defined } values %hash;
2116
2117       The keys() function also resets the iterator, which means that you may
2118       see strange results if you use this between uses of other hash
2119       operators such as each().
2120
2121   How do I sort a hash (optionally by value instead of key)?
2122       (contributed by brian d foy)
2123
2124       To sort a hash, start with the keys. In this example, we give the list
2125       of keys to the sort function which then compares them as strings. The
2126       output list has the keys in string order. Once we have the keys, we can
2127       go through them to create a report which lists the keys in string
2128       order:
2129
2130           my @keys = sort { $a cmp $b } keys %hash;
2131
2132           foreach my $key ( @keys ) {
2133               printf "%-20s %6d\n", $key, $hash{$key};
2134           }
2135
2136       We could get more fancy in the sort() block though. Instead of
2137       comparing the keys, we can compute a value with them and use that value
2138       as the comparison.
2139
2140       For instance, to make our report order case-insensitive, we use "fc" to
2141       safely lowercase the keys before comparing them:
2142
2143           use v5.16;
2144           my @keys = sort { fc($a) cmp fc($b) } keys %hash;
2145
2146       Earlier versions of this answer used "lc", but that could give
2147       unexpected results with some Unicode strings. See "fc" in perlfunc for
2148       the details. The Unicode::UCD module does the same thing for earlier
2149       perls.
2150
2151       Note: if the computation is expensive or the hash has many elements,
2152       you may want to look at the Schwartzian Transform to cache the
2153       computation results.
2154
2155       If we want to sort by the hash value instead, we use the hash key to
2156       look it up. We still get out a list of keys, but this time they are
2157       ordered by their value:
2158
2159           my @keys = sort { $hash{$a} <=> $hash{$b} } keys %hash;
2160
2161       From there we can get more complex. If the hash values are the same, we
2162       can provide a secondary sort on the hash key:
2163
2164           use v5.16;
2165           my @keys = sort {
2166               $hash{$a} <=> $hash{$b}
2167                   or
2168               fc($a) cmp fc($b)
2169           } keys %hash;
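
       To put the largest values first, swap $a and $b in the numeric
       comparison. The sketch below does that on a sample hash and keeps
       the case-insensitive tie-break on the key:

```perl
use v5.16;
use warnings;

my %hash = ( apple => 3, Banana => 7, cherry => 3 );

my @keys = sort {
    $hash{$b} <=> $hash{$a}    # larger values first
        or
    fc($a) cmp fc($b)          # ties broken by key, ignoring case
} keys %hash;

print "$_ $hash{$_}\n" for @keys;
```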
2170
2171   How can I always keep my hash sorted?
2172       You can look into using the "DB_File" module and tie() using the
2173       $DB_BTREE hash bindings as documented in "In Memory Databases" in
2174       DB_File. The Tie::IxHash module from CPAN might also be instructive.
2175       Although this does keep your hash sorted, you might not like the
2176       slowdown you suffer from the tie interface. Are you sure you need to do
2177       this? :)

   What's the difference between "delete" and "undef" with hashes?
       Hashes contain pairs of scalars: the first is the key, the second is
       the value. The key will be coerced to a string, although the value can
       be any kind of scalar: string, number, or reference. If a key $key is
       present in %hash, exists($hash{$key}) will return true. The value for a
       given key can be "undef", in which case $hash{$key} will be "undef"
       while "exists $hash{$key}" will return true. This corresponds to ($key,
       "undef") being in the hash.

       Pictures help... Here's the %hash table:

             keys  values
           +------+------+
           |  a   |  3   |
           |  x   |  7   |
           |  d   |  0   |
           |  e   |  2   |
           +------+------+

       And these conditions hold

           $hash{'a'}                       is true
           $hash{'d'}                       is false
           defined $hash{'d'}               is true
           defined $hash{'a'}               is true
           exists $hash{'a'}                is true (Perl 5 only)
           grep ($_ eq 'a', keys %hash)     is true

       If you now say

           undef $hash{'a'}

       your table now reads:

             keys  values
           +------+------+
           |  a   | undef|
           |  x   |  7   |
           |  d   |  0   |
           |  e   |  2   |
           +------+------+

       and these conditions now hold; changes in caps:

           $hash{'a'}                       is FALSE
           $hash{'d'}                       is false
           defined $hash{'d'}               is true
           defined $hash{'a'}               is FALSE
           exists $hash{'a'}                is true (Perl 5 only)
           grep ($_ eq 'a', keys %hash)     is true

       Notice the last two: you have an undef value, but a defined key!

       Now, consider this:

           delete $hash{'a'}

       your table now reads:

             keys  values
           +------+------+
           |  x   |  7   |
           |  d   |  0   |
           |  e   |  2   |
           +------+------+

       and these conditions now hold; changes in caps:

           $hash{'a'}                       is false
           $hash{'d'}                       is false
           defined $hash{'d'}               is true
           defined $hash{'a'}               is false
           exists $hash{'a'}                is FALSE (Perl 5 only)
           grep ($_ eq 'a', keys %hash)     is FALSE

       See, the whole entry is gone!
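
       The tables above can be checked with a short script. This is only a
       sketch that reuses the same sample data; the variable names that
       record the results are made up for illustration.

```perl
use strict;
use warnings;

my %hash = ( a => 3, x => 7, d => 0, e => 2 );

undef $hash{a};    # the key stays, only the value becomes undef
my $exists_after_undef  = exists  $hash{a} ? 1 : 0;
my $defined_after_undef = defined $hash{a} ? 1 : 0;

delete $hash{a};   # the key and its value are both removed
my $exists_after_delete = exists $hash{a} ? 1 : 0;

print "after undef:  exists=$exists_after_undef defined=$defined_after_undef\n";
print "after delete: exists=$exists_after_delete keys=", scalar( keys %hash ), "\n";
```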

   Why don't my tied hashes make the defined/exists distinction?
       This depends on the tied hash's implementation of EXISTS().  For
       example, there isn't the concept of undef with hashes that are tied to
       DBM* files. It also means that exists() and defined() do the same thing
       with a DBM* file, and what they end up doing is not what they do with
       ordinary hashes.

   How do I reset an each() operation part-way through?
       (contributed by brian d foy)

       You can use the "keys" or "values" functions to reset "each". To simply
       reset the iterator used by "each" without doing anything else, use one
       of them in void context:

           keys %hash; # resets iterator, nothing else.
           values %hash; # resets iterator, nothing else.

       See the documentation for "each" in perlfunc.
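
       As a rough illustration of the reset, the sketch below abandons an
       iteration after one pair, resets the iterator, and then counts how
       many pairs a fresh loop sees. The hash contents and variable names
       are invented for the example.

```perl
use strict;
use warnings;

my %hash = ( one => 1, two => 2, three => 3 );

# Consume one pair, leaving the iterator part-way through the hash.
my ($first_key) = each %hash;

keys %hash;    # void context: resets the iterator, nothing else

# A fresh loop now starts from the beginning and visits every pair.
my $count = 0;
while ( my ( $key, $value ) = each %hash ) {
    $count++;
}
print "visited $count pairs\n";
```

       Without the reset, the loop would have picked up where the first
       "each" call left off and visited one pair fewer.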

   How can I get the unique keys from two hashes?
       First you extract the keys from the hashes into lists, then solve the
       "removing duplicates" problem described above. For example:

           my %seen = ();
           for my $element (keys(%foo), keys(%bar)) {
               $seen{$element}++;
           }
           my @uniq = keys %seen;

       Or more succinctly:

           my @uniq = keys %{{%foo,%bar}};

       Or if you really want to save space:

           my %seen = ();
           while (defined (my $key = each %foo)) {
               $seen{$key}++;
           }
           while (defined (my $key = each %bar)) {
               $seen{$key}++;
           }
           my @uniq = keys %seen;

   How can I store a multidimensional array in a DBM file?
       Either stringify the structure yourself (no fun), or else get the MLDBM
       (which uses Data::Dumper) module from CPAN and layer it on top of
       either DB_File or GDBM_File. You might also try DBM::Deep, but it can
       be a bit slow.

   How can I make my hash remember the order I put elements into it?
       Use the Tie::IxHash module from CPAN.

           use Tie::IxHash;

           tie my %myhash, 'Tie::IxHash';

           for (my $i=0; $i<20; $i++) {
               $myhash{$i} = 2*$i;
           }

           my @keys = keys %myhash;
           # @keys = (0,1,2,3,...)

   Why does passing a subroutine an undefined element in a hash create it?
       (contributed by brian d foy)

       Are you using a really old version of Perl?

       Normally, accessing a hash key's value for a nonexistent key will not
       create the key.

           my %hash  = ();
           my $value = $hash{ 'foo' };
           print "This won't print\n" if exists $hash{ 'foo' };

       Passing $hash{ 'foo' } to a subroutine used to be a special case,
       though.  Since you could assign directly to $_[0], Perl had to be ready
       to make that assignment so it created the hash key ahead of time:

           my_sub( $hash{ 'foo' } );
           print "This will print before 5.004\n" if exists $hash{ 'foo' };

           sub my_sub {
               # $_[0] = 'bar'; # create hash key in case you do this
               1;
           }

       Since Perl 5.004, however, this situation is a special case and Perl
       creates the hash key only when you make the assignment:

           my_sub( $hash{ 'foo' } );
           print "This will print, even after 5.004\n" if exists $hash{ 'foo' };

           sub my_sub {
               $_[0] = 'bar';
           }

       However, if you want the old behavior (and think carefully about that
       because it's a weird side effect), you can pass a hash slice instead.
       Perl 5.004 didn't make this a special case:

           my_sub( @hash{ qw/foo/ } );

   How can I make the Perl equivalent of a C structure/C++ class/hash or array
       of hashes or arrays?
       Usually a hash ref, perhaps like this:

           $record = {
               NAME   => "Jason",
               EMPNO  => 132,
               TITLE  => "deputy peon",
               AGE    => 23,
               SALARY => 37_000,
               PALS   => [ "Norbert", "Rhys", "Phineas"],
           };

       References are documented in perlref and perlreftut.  Examples of
       complex data structures are given in perldsc and perllol. Examples of
       structures and object-oriented classes are in perlootut.
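
       Once you have such a record, you reach its fields with the arrow
       operator. The following sketch uses a trimmed copy of the record
       above; the variables holding the extracted values are named only
       for illustration.

```perl
use strict;
use warnings;

my $record = {
    NAME => "Jason",
    AGE  => 23,
    PALS => [ "Norbert", "Rhys", "Phineas" ],
};

my $name      = $record->{NAME};           # a plain scalar field
my $first_pal = $record->{PALS}[0];        # an element of the nested array
my $pal_count = @{ $record->{PALS} };      # the array in scalar context

$record->{AGE}++;                          # fields can be updated in place

print "$name (age $record->{AGE}) has $pal_count pals; the first is $first_pal\n";
```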

   How can I use a reference as a hash key?
       (contributed by brian d foy and Ben Morrow)

       Hash keys are strings, so you can't really use a reference as the key.
       When you try to do that, perl turns the reference into its stringified
       form (for instance, HASH(0xDEADBEEF)). From there you can't get back
       the reference from the stringified form, at least without doing some
       extra work on your own.

       Remember that the entry in the hash will still be there even if the
       referenced variable goes out of scope, and that it is entirely
       possible for Perl to subsequently allocate a different variable at the
       same address. This will mean a new variable might accidentally be
       associated with the value for an old one.

       If you have Perl 5.10 or later, and you just want to store a value
       against the reference for lookup later, you can use the core
       Hash::Util::FieldHash module. This will also handle renaming the keys
       if you use multiple threads (which causes all variables to be
       reallocated at new addresses, changing their stringification), and
       garbage-collecting the entries when the referenced variable goes out of
       scope.

       If you actually need to be able to get a real reference back from each
       hash entry, you can use the Tie::RefHash module, which does the
       required work for you.
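
       A minimal sketch of the Tie::RefHash approach follows. The hash name
       %color_of and the sample data are invented for the example; the point
       is only that the keys come back as usable references rather than as
       strings.

```perl
use strict;
use warnings;
use Tie::RefHash;

tie my %color_of, 'Tie::RefHash';

my $point = [ 3, 4 ];
$color_of{$point} = 'red';

# keys() hands back the original reference, not "ARRAY(0x...)".
my ($key) = keys %color_of;
print "key is still an array reference\n" if ref $key eq 'ARRAY';
print "x = $key->[0], y = $key->[1], color = $color_of{$key}\n";
```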

   How can I check if a key exists in a multilevel hash?
       (contributed by brian d foy)

       The trick to this problem is avoiding accidental autovivification. If
       you want to check three keys deep, you might naïvely try this:

           my %hash;
           if( exists $hash{key1}{key2}{key3} ) {
               ...;
           }

       Even though you started with a completely empty hash, after that call
       to "exists" you've created the structure you needed to check for
       "key3":

           %hash = (
                     'key1' => {
                                 'key2' => {}
                               }
                   );

       That's autovivification. You can get around this in a few ways. The
       easiest way is to just turn it off. The lexical "autovivification"
       pragma is available on CPAN. Now you don't add to the hash:

           {
               no autovivification;
               my %hash;
               if( exists $hash{key1}{key2}{key3} ) {
                   ...;
               }
           }

       The Data::Diver module on CPAN can do it for you too. Its "Dive"
       subroutine can tell you not only if the keys exist but also get the
       value:

           use Data::Diver qw(Dive);

           my @exists = Dive( \%hash, qw(key1 key2 key3) );
           if(  ! @exists  ) {
               ...; # keys do not exist
           }
           elsif(  ! defined $exists[0]  ) {
               ...; # keys exist but value is undef
           }

       You can easily do this yourself too by checking each level of the hash
       before you move onto the next level. This is essentially what
       Data::Diver does for you:

           if( check_hash( \%hash, qw(key1 key2 key3) ) ) {
               ...;
           }

           sub check_hash {
               my( $hash, @keys ) = @_;

               return unless @keys;

               foreach my $key ( @keys ) {
                   return unless eval { exists $hash->{$key} };
                   $hash = $hash->{$key};
               }

               return 1;
           }

   How can I prevent addition of unwanted keys into a hash?
       Since version 5.8.0, hashes can be restricted to a fixed number of
       given keys. Methods for creating and dealing with restricted hashes are
       exported by the Hash::Util module.
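
       For instance, the "lock_keys" function from Hash::Util restricts a
       hash to the keys it currently has. The sketch below uses an invented
       %config hash and a deliberately misspelled key to show the effect;
       existing keys stay writable, while storing under any other key dies.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys);

my %config = ( host => 'localhost', port => 8080 );
lock_keys(%config);

$config{port} = 9090;    # allowed: the key already exists

# A typo in the key name is now a fatal error instead of a silent new entry.
my $ok = eval { $config{prot} = 1; 1 };
my $error = $@;
print "rejected: $error" unless $ok;
```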


Data: Misc

   How do I handle binary data correctly?
       Perl is binary-clean, so it can handle binary data just fine.  On
       Windows or DOS, however, you have to use "binmode" for binary files to
       avoid conversions for line endings. In general, you should use
       "binmode" any time you want to work with binary data.

       Also see "binmode" in perlfunc or perlopentut.
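
       As a small sketch of that advice, the following writes all 256 byte
       values to a scratch file and reads them back, calling "binmode" on
       both handles. The use of File::Temp for the scratch file and the
       variable names are choices made for this example only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write every possible byte value to a temporary file.
my ( $out, $filename ) = tempfile( UNLINK => 1 );
binmode $out;
my $bytes = pack 'C*', 0 .. 255;
print {$out} $bytes;
close $out or die "close failed: $!";

# Read the file back, again without any line-ending translation.
open my $in, '<', $filename or die "open failed: $!";
binmode $in;
my $read_back = do { local $/; <$in> };
close $in;

print "round trip preserved all bytes\n" if $read_back eq $bytes;
```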

       If you're concerned about 8-bit textual data then see perllocale.  If
       you want to deal with multibyte characters, however, there are some
       gotchas. See the section on Regular Expressions.

   How do I determine whether a scalar is a number/whole/integer/float?
       Assuming that you don't care about IEEE notations like "NaN" or
       "Infinity", you probably just want to use a regular expression (see
       also perlretut and perlre):

           use 5.010;

           if ( /\D/ )
               { say "\thas nondigits"; }
           if ( /^\d+\z/ )
               { say "\tis a whole number"; }
           if ( /^-?\d+\z/ )
               { say "\tis an integer"; }
           if ( /^[+-]?\d+\z/ )
               { say "\tis a +/- integer"; }
           if ( /^-?(?:\d+\.?|\.\d)\d*\z/ )
               { say "\tis a real number"; }
           if ( /^[+-]?(?=\.?\d)\d*\.?\d*(?:e[+-]?\d+)?\z/i )
               { say "\tis a C float" }

       There are also some commonly used modules for the task.  Scalar::Util
       (distributed with 5.8) provides access to perl's internal function
       "looks_like_number" for determining whether a variable looks like a
       number. Data::Types exports functions that validate data types using
       both the above and other regular expressions. Thirdly, there is
       Regexp::Common which has regular expressions to match various types of
       numbers. Those three modules are available from the CPAN.
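
       A short sketch of the Scalar::Util route might look like this. The
       candidate strings and the %is_number hash are made up for the
       example; "looks_like_number" itself is the function named above.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my %is_number;
for my $candidate ( '42', '-3.14', '1e5', 'abc', '12abc' ) {
    # Normalize the result to 1 or 0 so it is easy to print and compare.
    $is_number{$candidate} = looks_like_number($candidate) ? 1 : 0;
    printf "%-7s %s\n", $candidate,
        $is_number{$candidate} ? 'looks like a number' : 'does not';
}
```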

       If you're on a POSIX system, Perl supports the "POSIX::strtod" function
       for converting strings to doubles (and also "POSIX::strtol" for longs).
       Its semantics are somewhat cumbersome, so here's a "getnum" wrapper
       function for more convenient access. This function takes a string and
       returns the number it found, or "undef" for input that isn't a C float.
       The "is_numeric" function is a front end to "getnum" if you just want
       to say, "Is this a float?"

           sub getnum {
               use POSIX qw(strtod);
               my $str = shift;
               $str =~ s/^\s+//;
               $str =~ s/\s+$//;
               $! = 0;
               my($num, $unparsed) = strtod($str);
               if (($str eq '') || ($unparsed != 0) || $!) {
                   return undef;
               }
               else {
                   return $num;
               }
           }

           sub is_numeric { defined getnum($_[0]) }

       Or you could check out the String::Scanf module on the CPAN instead.

   How do I keep persistent data across program calls?
       For some specific applications, you can use one of the DBM modules.
       See AnyDBM_File. More generically, you should consult the FreezeThaw or
       Storable modules from CPAN. Starting from Perl 5.8, Storable is part of
       the standard distribution. Here's one example using Storable's "store"
       and "retrieve" functions:

           use Storable;
           store(\%hash, "filename");

           # later on...
           $href = retrieve("filename");        # by ref
           %hash = %{ retrieve("filename") };   # direct to hash

   How do I print out or copy a recursive data structure?
       The Data::Dumper module on CPAN (or the 5.005 release of Perl) is great
       for printing out data structures. The Storable module on CPAN (or the
       5.8 release of Perl) provides a function called "dclone" that
       recursively copies its argument.

           use Storable qw(dclone);
           $r2 = dclone($r1);

       Here $r1 can be a reference to any kind of data structure you'd like.
       It will be deeply copied. Because "dclone" takes and returns
       references, you'd have to add extra punctuation if you had a hash of
       arrays that you wanted to copy.

           %newhash = %{ dclone(\%oldhash) };

   How do I define methods for every class/object?
       (contributed by Ben Morrow)

       You can use the "UNIVERSAL" class (see UNIVERSAL). However, please be
       very careful to consider the consequences of doing this: adding methods
       to every object is very likely to have unintended consequences. If
       possible, it would be better to have all your objects inherit from some
       common base class, or to use an object system like Moose that supports
       roles.

   How do I verify a credit card checksum?
       Get the Business::CreditCard module from CPAN.
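
       If you only want to see what such a check involves, card numbers
       carry a Luhn (mod 10) check digit. The sketch below is a hand-rolled
       illustration of that algorithm, not code taken from
       Business::CreditCard; the name "luhn_ok" is hypothetical, and the
       number used is a widely published test number, not a real account.

```perl
use strict;
use warnings;

sub luhn_ok {
    my ($number) = @_;
    ( my $digits = $number ) =~ s/[\s-]//g;    # ignore spaces and dashes
    return 0 unless $digits =~ /^[0-9]+\z/;

    my $sum    = 0;
    my $double = 0;                            # double every second digit from the right
    for my $digit ( reverse split //, $digits ) {
        if ($double) {
            $digit *= 2;
            $digit -= 9 if $digit > 9;         # same as adding the two digits of the product
        }
        $sum += $digit;
        $double = !$double;
    }
    return $sum % 10 == 0 ? 1 : 0;
}

for my $candidate ( '4111 1111 1111 1111', '4111 1111 1111 1112' ) {
    printf "%s: %s\n", $candidate, luhn_ok($candidate) ? 'checksum ok' : 'checksum bad';
}
```

       A passing checksum only means the digits are self-consistent; it says
       nothing about whether the card exists or is valid for payment.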

   How do I pack arrays of doubles or floats for XS code?
       The arrays.h/arrays.c code in the PGPLOT module on CPAN does just this.
       If you're doing a lot of float or double processing, consider using the
       PDL module from CPAN instead--it makes number-crunching easy.

       See <https://metacpan.org/release/PGPLOT> for the code.
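
       On the Perl side, one possible approach (an assumption here, not a
       description of the PGPLOT code) is to use the built-in "pack" with
       the "d" template, which produces a string of native doubles laid out
       back to back. The sketch below only shows that round trip in pure
       Perl; how the XS side reads the buffer is left out.

```perl
use strict;
use warnings;

my @values = ( 1.5, -2.25, 3.0 );

# 'd*' packs each list element as a double in the machine's native format.
my $packed           = pack 'd*', @values;
my $bytes_per_double = length pack( 'd', 0 );

# unpack with the same template recovers the original numbers.
my @round_trip = unpack 'd*', $packed;

print scalar(@round_trip), " doubles in ", length($packed), " bytes\n";
```

       Use the "f" template instead of "d" if the C code expects
       single-precision floats.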


AUTHOR AND COPYRIGHT

       Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and other
       authors as noted. All rights reserved.

       This documentation is free; you can redistribute it and/or modify it
       under the same terms as Perl itself.

       Irrespective of its distribution, all code examples in this file are
       hereby placed into the public domain. You are permitted and encouraged
       to use this code in your own programs for fun or for profit as you see
       fit. A simple comment in the code giving credit would be courteous but
       is not required.



perl v5.38.0                      2023-08-24                       perlfaq4(3)