perlreftut(1)

PERLREFTUT(1)          Perl Programmers Reference Guide          PERLREFTUT(1)

NAME

       perlreftut - Mark's very short tutorial about references

DESCRIPTION

       One of the most important new features in Perl 5 was the capability to
       manage complicated data structures like multidimensional arrays and
       nested hashes.  To enable these, Perl 5 introduced a feature called
       `references', and using references is the key to managing complicated,
       structured data in Perl.  Unfortunately, there's a lot of funny syntax
       to learn, and the main manual page can be hard to follow.  The manual
       is quite complete, and sometimes people find that a problem, because it
       can be hard to tell what is important and what isn't.

       Fortunately, you only need to know 10% of what's in the main page to
       get 90% of the benefit.  This page will show you that 10%.

Who Needs Complicated Data Structures?

       One problem that came up all the time in Perl 4 was how to represent a
       hash whose values were lists.  Perl 4 had hashes, of course, but the
       values had to be scalars; they couldn't be lists.

       Why would you want a hash of lists?  Let's take a simple example: You
       have a file of city and country names, like this:

               Chicago, USA
               Frankfurt, Germany
               Berlin, Germany
               Washington, USA
               Helsinki, Finland
               New York, USA

       and you want to produce an output like this, with each country
       mentioned once, and then an alphabetical list of the cities in that
       country:

               Finland: Helsinki.
               Germany: Berlin, Frankfurt.
               USA: Chicago, New York, Washington.

       The natural way to do this is to have a hash whose keys are country
       names.  Associated with each country name key is a list of the cities
       in that country.  Each time you read a line of input, split it into a
       country and a city, look up the list of cities already known to be in
       that country, and append the new city to the list.  When you're done
       reading the input, iterate over the hash as usual, sorting each list of
       cities before you print it out.

       If hash values can't be lists, you lose.  In Perl 4, hash values can't
       be lists; they can only be strings.  You lose.  You'd probably have to
       combine all the cities into a single string somehow, and then when time
       came to write the output, you'd have to break the string into a list,
       sort the list, and turn it back into a string.  This is messy and
       error-prone.  And it's frustrating, because Perl already has perfectly
       good lists that would solve the problem if only you could use them.

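       To make the pain concrete, here is a rough sketch of the string-packing
       workaround described above.  It is written in modern Perl rather than
       real Perl 4, and the "|" separator, the inline @lines sample data, and
       the names %cities_in and @report are all assumptions made up for this
       illustration.

```perl
use strict;
use warnings;

# Sample input, copied from the city/country file shown above.
my @lines = (
    "Chicago, USA",    "Frankfurt, Germany", "Berlin, Germany",
    "Washington, USA", "Helsinki, Finland",  "New York, USA",
);

# country => "city1|city2|..." : one packed string per country,
# because (in this sketch) hash values may only be plain strings.
my %cities_in;
for my $line (@lines) {
    my ($city, $country) = split /, /, $line;
    $cities_in{$country} = defined $cities_in{$country}
        ? "$cities_in{$country}|$city"    # append with a separator
        : $city;                          # first city for this country
}

my @report;
for my $country (sort keys %cities_in) {
    # Unpack the string into a list, sort it, and pack it again for output.
    my @cities = sort split /\|/, $cities_in{$country};
    push @report, "$country: " . join(', ', @cities) . ".";
}
print "$_\n" for @report;
```

       Notice that this quietly breaks as soon as a city name contains the
       separator character, which is one reason the approach is error-prone.
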

The Solution

       By the time Perl 5 rolled around, we were already stuck with this
       design: Hash values must be scalars.  The solution to this is
       references.

       A reference is a scalar value that refers to an entire array or an
       entire hash (or to just about anything else).  Names are one kind of
       reference that you're already familiar with.  Think of the President of
       the United States: a messy, inconvenient bag of blood and bones.  But
       to talk about him, or to represent him in a computer program, all you
       need is the easy, convenient scalar string "George Bush".

       References in Perl are like names for arrays and hashes.  They're
       Perl's private, internal names, so you can be sure they're unambiguous.
       Unlike "George Bush", a reference only refers to one thing, and you
       always know what it refers to.  If you have a reference to an array,
       you can recover the entire array from it.  If you have a reference to a
       hash, you can recover the entire hash.  But the reference is still an
       easy, compact scalar value.

       You can't have a hash whose values are arrays; hash values can only be
       scalars.  We're stuck with that.  But a single reference can refer to
       an entire array, and references are scalars, so you can have a hash of
       references to arrays, and it'll act a lot like a hash of arrays, and
       it'll be just as useful as a hash of arrays.

       We'll come back to this city-country problem later, after we've seen
       some syntax for managing references.

Syntax

       There are just two ways to make a reference, and just two ways to use
       it once you have it.

       Making References

       Make Rule 1

       If you put a "\" in front of a variable, you get a reference to that
       variable.

           $aref = \@array;         # $aref now holds a reference to @array
           $href = \%hash;          # $href now holds a reference to %hash
           $sref = \$scalar;        # $sref now holds a reference to $scalar

       Once the reference is stored in a variable like $aref or $href, you can
       copy it or store it just the same as any other scalar value:

           $xy = $aref;             # $xy now holds a reference to @array
           $p[3] = $href;           # $p[3] now holds a reference to %hash
           $z = $p[3];              # $z now holds a reference to %hash

       These examples show how to make references to variables with names.
       Sometimes you want to make an array or a hash that doesn't have a name.
       This is analogous to the way you like to be able to use the string "\n"
       or the number 80 without having to store it in a named variable first.

       Make Rule 2

       "[ ITEMS ]" makes a new, anonymous array, and returns a reference to
       that array.  "{ ITEMS }" makes a new, anonymous hash, and returns a
       reference to that hash.

           $aref = [ 1, "foo", undef, 13 ];
           # $aref now holds a reference to an array

           $href = { APR => 4, AUG => 8 };
           # $href now holds a reference to a hash

       The references you get from rule 2 are the same kind of references that
       you get from rule 1:

               # This:
               $aref = [ 1, 2, 3 ];

               # Does the same as this:
               @array = (1, 2, 3);
               $aref = \@array;

       The first line is an abbreviation for the following two lines, except
       that it doesn't create the superfluous array variable @array.

       If you write just "[]", you get a new, empty anonymous array.  If you
       write just "{}", you get a new, empty anonymous hash.

       Using References

       What can you do with a reference once you have it?  It's a scalar
       value, and we've seen that you can store it as a scalar and get it back
       again just like any scalar.  There are just two more ways to use it:

       Use Rule 1

       You can always use an array reference, in curly braces, in place of the
       name of an array.  For example, "@{$aref}" instead of @array.

       Here are some examples of that:

       Arrays:

               @a              @{$aref}                An array
               reverse @a      reverse @{$aref}        Reverse the array
               $a[3]           ${$aref}[3]             An element of the array
               $a[3] = 17;     ${$aref}[3] = 17;       Assigning an element

       On each line are two expressions that do the same thing.  The left-hand
       versions operate on the array @a.  The right-hand versions operate on
       the array that is referred to by $aref.  Once they find the array
       they're operating on, both versions do the same things to the arrays.

       Using a hash reference is exactly the same:

               %h              %{$href}              A hash
               keys %h         keys %{$href}         Get the keys from the hash
               $h{'red'}       ${$href}{'red'}       An element of the hash
               $h{'red'} = 17  ${$href}{'red'} = 17  Assigning an element

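       If you'd like to try the two tables above for yourself, here is a small
       sketch.  The contents of @a and %h are arbitrary sample values chosen
       for this illustration; the point is only that the reference forms
       operate on the very same array and hash as the named forms.

```perl
use strict;
use warnings;

my @a = (10, 20, 30, 40);          # sample data, not from the tutorial
my %h = (red => 1, blue => 2);     # sample data, not from the tutorial

my $aref = \@a;                    # Make Rule 1
my $href = \%h;                    # Make Rule 1

my @reversed = reverse @{$aref};   # same as: reverse @a
${$aref}[3]     = 17;              # same as: $a[3] = 17
${$href}{'red'} = 17;              # same as: $h{'red'} = 17
my @keys = sort keys %{$href};     # same as: sort keys %h

print "@reversed\n";               # 40 30 20 10
print "$a[3] $h{'red'}\n";         # 17 17 (the named variables changed)
print "@keys\n";                   # blue red
```
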
       Whatever you want to do with a reference, Use Rule 1 tells you how to
       do it.  You just write the Perl code that you would have written for
       doing the same thing to a regular array or hash, and then replace the
       array or hash name with "{$reference}".  "How do I loop over an array
       when all I have is a reference?"  Well, to loop over an array, you
       would write

               for my $element (@array) {
                  ...
               }

       so replace the array name, @array, with the reference:

               for my $element (@{$aref}) {
                  ...
               }

       "How do I print out the contents of a hash when all I have is a
       reference?"  First write the code for printing out a hash:

               for my $key (keys %hash) {
                 print "$key => $hash{$key}\n";
               }

       And then replace the hash name with the reference:

               for my $key (keys %{$href}) {
                 print "$key => ${$href}{$key}\n";
               }

       Use Rule 2

       Use Rule 1 is all you really need, because it tells you how to do
       absolutely everything you ever need to do with references.  But the
       most common thing to do with an array or a hash is to extract a single
       element, and the Use Rule 1 notation is cumbersome.  So there is an
       abbreviation.

       "${$aref}[3]" is too hard to read, so you can write "$aref->[3]"
       instead.

       "${$href}{red}" is too hard to read, so you can write "$href->{red}"
       instead.

       If $aref holds a reference to an array, then "$aref->[3]" is the fourth
       element of the array.  Don't confuse this with $aref[3], which is the
       fourth element of a totally different array, one deceptively named
       @aref.  $aref and @aref are unrelated the same way that $item and @item
       are.

       Similarly, "$href->{'red'}" is part of the hash referred to by the
       scalar variable $href, perhaps even one with no name.  $href{'red'} is
       part of the deceptively named %href hash.  It's easy to forget the
       "->", and if you do, you'll get bizarre results when your program gets
       array and hash elements out of totally unexpected hashes and arrays
       that weren't the ones you wanted to use.

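       The difference between "$aref->[3]" and $aref[3] is easier to believe
       once you've seen it happen.  In this sketch the variables @aref and
       %href are declared on purpose, with made-up contents, just so that both
       spellings are legal and you can see that they reach different data.

```perl
use strict;
use warnings;

my @aref = ('w', 'x', 'y', 'z');       # an array deceptively named @aref
my $aref = [ 'a', 'b', 'c', 'd' ];     # a scalar holding an array reference

my %href = (red => 'named hash');      # a hash deceptively named %href
my $href = { red => 'anonymous hash' };# a scalar holding a hash reference

my $via_arrow = $aref->[3];            # fourth element of the anonymous array
my $no_arrow  = $aref[3];              # fourth element of @aref
my $h_arrow   = $href->{'red'};        # looked up through the reference
my $h_plain   = $href{'red'};          # looked up in %href

print "$via_arrow $no_arrow\n";        # d z
print "$h_arrow / $h_plain\n";         # anonymous hash / named hash
```

       In a program that runs under "use strict" and has no @aref declared,
       writing $aref[3] by mistake is a compile-time error rather than a
       silent wrong answer, which is one more reason to turn strict on.
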
       An Example

       Let's see a quick example of how all this is useful.

       First, remember that "[1, 2, 3]" makes an anonymous array containing
       "(1, 2, 3)", and gives you a reference to that array.

       Now think about

               @a = ( [1, 2, 3],
                      [4, 5, 6],
                      [7, 8, 9]
                    );

       @a is an array with three elements, and each one is a reference to
       another array.

       $a[1] is one of these references.  It refers to an array, the array
       containing "(4, 5, 6)", and because it is a reference to an array, Use
       Rule 2 says that we can write $a[1]->[2] to get the third element from
       that array.  $a[1]->[2] is the 6.  Similarly, $a[0]->[1] is the 2.
       What we have here is like a two-dimensional array; you can write
       $a[ROW]->[COLUMN] to get or set the element in any row and any column
       of the array.

       The notation still looks a little cumbersome, so there's one more
       abbreviation:

       Arrow Rule

       In between two subscripts, the arrow is optional.

       Instead of $a[1]->[2], we can write $a[1][2]; it means the same thing.
       Instead of "$a[0]->[1] = 23", we can write "$a[0][1] = 23"; it means
       the same thing.

       Now it really looks like two-dimensional arrays!

       You can see why the arrows are important.  Without them, we would have
       had to write "${$a[1]}[2]" instead of $a[1][2].  For three-dimensional
       arrays, they let us write $x[2][3][5] instead of the unreadable
       "${${$x[2]}[3]}[5]".

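       Here is a short sketch that puts the three spellings side by side on
       the 3x3 array from the example.  The variable $grid is an extra name
       introduced only for this illustration, to show what happens when the
       outer array is itself reached through a reference.

```perl
use strict;
use warnings;

my @a = ( [1, 2, 3],
          [4, 5, 6],
          [7, 8, 9],
        );

my $long  = ${$a[1]}[2];     # Use Rule 1
my $arrow = $a[1]->[2];      # Use Rule 2
my $short = $a[1][2];        # Arrow Rule
print "$long $arrow $short\n";       # 6 6 6

$a[0][1] = 23;               # set row 0, column 1
print "@{$a[0]}\n";                  # 1 23 3

# The Arrow Rule only covers arrows BETWEEN subscripts.  When you start
# from a reference, the first arrow is still required.
my $grid = \@a;
my $from_ref = $grid->[1][2];
print "$from_ref\n";                 # 6
```
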

Solution

       Here's the answer to the problem I posed earlier, of reformatting a
       file of city and country names.

           1   my %table;

           2   while (<>) {
           3     chomp;
           4     my ($city, $country) = split /, /;
           5     $table{$country} = [] unless exists $table{$country};
           6     push @{$table{$country}}, $city;
           7   }

           8   foreach my $country (sort keys %table) {
           9     print "$country: ";
          10     my @cities = @{$table{$country}};
          11     print join ', ', sort @cities;
          12     print ".\n";
          13   }

       The program has two pieces: Lines 2-7 read the input and build a data
       structure, and lines 8-13 analyze the data and print out the report.
       We're going to have a hash, %table, whose keys are country names, and
       whose values are references to arrays of city names.  The data
       structure will look like this:

                  %table
               +-------+---+
               |       |   |   +-----------+--------+
               |Germany| *---->| Frankfurt | Berlin |
               |       |   |   +-----------+--------+
               +-------+---+
               |       |   |   +----------+
               |Finland| *---->| Helsinki |
               |       |   |   +----------+
               +-------+---+
               |       |   |   +---------+------------+----------+
               |  USA  | *---->| Chicago | Washington | New York |
               |       |   |   +---------+------------+----------+
               +-------+---+

       We'll look at output first.  Supposing we already have this structure,
       how do we print it out?

           8   foreach my $country (sort keys %table) {
           9     print "$country: ";
          10     my @cities = @{$table{$country}};
          11     print join ', ', sort @cities;
          12     print ".\n";
          13   }

       %table is an ordinary hash, and we get a list of keys from it, sort the
       keys, and loop over the keys as usual.  The only use of references is
       in line 10.  $table{$country} looks up the key $country in the hash and
       gets the value, which is a reference to an array of cities in that
       country.  Use Rule 1 says that we can recover the array by saying
       "@{$table{$country}}".  Line 10 is just like

               @cities = @array;

       except that the name "array" has been replaced by the reference
       "{$table{$country}}".  The "@" tells Perl to get the entire array.
       Having gotten the list of cities, we sort it, join it, and print it out
       as usual.

       Lines 2-7 are responsible for building the structure in the first
       place.  Here they are again:

           2   while (<>) {
           3     chomp;
           4     my ($city, $country) = split /, /;
           5     $table{$country} = [] unless exists $table{$country};
           6     push @{$table{$country}}, $city;
           7   }

       Lines 2-4 acquire a city and country name.  Line 5 looks to see if the
       country is already present as a key in the hash.  If it's not, the
       program uses the "[]" notation (Make Rule 2) to manufacture a new,
       empty anonymous array of cities, and installs a reference to it into
       the hash under the appropriate key.

       Line 6 installs the city name into the appropriate array.
       $table{$country} now holds a reference to the array of cities seen in
       that country so far.  Line 6 is exactly like

               push @array, $city;

       except that the name "array" has been replaced by the reference
       "{$table{$country}}".  The "push" adds a city name to the end of the
       referred-to array.

       There's one fine point I skipped.  Line 5 is unnecessary, and we can
       get rid of it.

           2   while (<>) {
           3     chomp;
           4     my ($city, $country) = split /, /;
           5   ####  $table{$country} = [] unless exists $table{$country};
           6     push @{$table{$country}}, $city;
           7   }

       If there's already an entry in %table for the current $country, then
       nothing is different.  Line 6 will locate the value in
       $table{$country}, which is a reference to an array, and push $city into
       the array.  But what does it do when $country holds a key, say
       "Greece", that is not yet in %table?

       This is Perl, so it does the exact right thing.  It sees that you want
       to push "Athens" onto an array that doesn't exist, so it helpfully
       makes a new, empty, anonymous array for you, installs it into %table,
       and then pushes "Athens" onto it.  This is called
       `autovivification'--bringing things to life automatically.  Perl saw
       that the key wasn't in the hash, so it created a new hash entry
       automatically.  Perl saw that you wanted to use the hash value as an
       array, so it created a new empty array and installed a reference to it
       in the hash automatically.  And as usual, Perl made the array one
       element longer to hold the new city name.

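       You can watch autovivification happen with a few lines of code.  This
       is only a sketch: the variables $before, $kind and $count are names
       invented for the illustration, and it borrows the "ref" function,
       which returns "ARRAY" when its argument is an array reference.

```perl
use strict;
use warnings;

my %table;

# Nothing is stored under 'Greece' yet.
my $before = (exists $table{'Greece'}) ? 1 : 0;

# No "$table{'Greece'} = []" first: push brings the array to life.
push @{ $table{'Greece'} }, 'Athens';

my $kind  = ref $table{'Greece'};          # what kind of reference is it?
my $count = scalar @{ $table{'Greece'} };  # how many cities so far?

print "$before $kind $count\n";            # 0 ARRAY 1
print "$table{'Greece'}[0]\n";             # Athens
```
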

The Rest

       I promised to give you 90% of the benefit with 10% of the details, and
       that means I left out 90% of the details.  Now that you have an
       overview of the important parts, it should be easier to read the
       perlref manual page, which discusses 100% of the details.

       Some of the highlights of perlref:

       ·   You can make references to anything, including scalars, functions,
           and other references.

       ·   In Use Rule 1, you can omit the curly brackets whenever the thing
           inside them is an atomic scalar variable like $aref.  For example,
           @$aref is the same as "@{$aref}", and $$aref[1] is the same as
           "${$aref}[1]".  If you're just starting out, you may want to adopt
           the habit of always including the curly brackets.

       ·   This doesn't copy the underlying array:

                   $aref2 = $aref1;

           You get two references to the same array.  If you modify
           "$aref1->[23]" and then look at "$aref2->[23]" you'll see the
           change.

           To copy the array, use

                   $aref2 = [@{$aref1}];

           This uses "[...]" notation to create a new anonymous array, and
           $aref2 is assigned a reference to the new array.  The new array is
           initialized with the contents of the array referred to by $aref1.

           Similarly, to copy an anonymous hash, you can use

                   $href2 = {%{$href1}};

       ·   To see if a variable contains a reference, use the "ref" function.
           It returns true if its argument is a reference.  Actually it's a
           little better than that: It returns "HASH" for hash references and
           "ARRAY" for array references.

       ·   If you try to use a reference like a string, you get strings like

                   ARRAY(0x80f5dec)   or    HASH(0x826afc0)

           If you ever see a string that looks like this, you'll know you
           printed out a reference by mistake.

           A side effect of this representation is that you can use "eq" to
           see if two references refer to the same thing.  (But you should
           usually use "==" instead because it's much faster.)

       ·   You can use a string as if it were a reference.  If you use the
           string "foo" as an array reference, it's taken to be a reference to
           the array @foo.  This is called a soft reference or symbolic
           reference.  The declaration "use strict 'refs'" disables this
           feature, which can cause all sorts of trouble if you use it by
           accident.

       You might prefer to go on to perllol instead of perlref; it discusses
       lists of lists and multidimensional arrays in detail.  After that, you
       should move on to perldsc; it's a Data Structure Cookbook that shows
       recipes for using and printing out arrays of hashes, hashes of arrays,
       and other kinds of data.

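       Several of the highlights listed above (sharing versus copying, the
       "ref" function, and comparing references with "==") fit into one small
       sketch.  The sample values and the names $copy, $alias_sees and so on
       are assumptions chosen just for this illustration.

```perl
use strict;
use warnings;

my $aref1 = [ 1, 2, 3 ];
my $aref2 = $aref1;              # a second reference to the SAME array
my $copy  = [ @{$aref1} ];       # a reference to a NEW array, same contents

$aref1->[0] = 99;                # modify through the first reference

my $alias_sees = $aref2->[0];    # the change is visible here
my $copy_sees  = $copy->[0];     # the copy still has the old value
print "$alias_sees $copy_sees\n";            # 99 1

# "==" compares what the references point at, not the contents.
my $same = ($aref1 == $aref2) ? 1 : 0;
my $diff = ($aref1 == $copy)  ? 1 : 0;
print "$same $diff\n";                       # 1 0

# ref() names the kind of reference, or gives '' for a non-reference.
my $href  = { APR => 4, AUG => 8 };
my $plain = 'foo';
my @kinds = (ref $aref1, ref $href, ref $plain);
print join(',', @kinds), "\n";               # ARRAY,HASH,
```

       One caution: "[@{$aref1}]" copies only the top level.  If the elements
       are themselves references, the new array holds the same references, so
       the inner arrays or hashes are still shared.
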

Summary

       Everyone needs compound data structures, and in Perl the way you get
       them is with references.  There are four important rules for managing
       references: Two for making references and two for using them.  Once you
       know these rules you can do most of the important things you need to do
       with references.

Credits

       Author: Mark Jason Dominus, Plover Systems ("mjd-perl-ref+@plover.com")

       This article originally appeared in The Perl Journal (
       http://www.tpj.com/ ) volume 3, #2.  Reprinted with permission.

       The original title was Understand References Today.

       Distribution Conditions

       Copyright 1998 The Perl Journal.

       This documentation is free; you can redistribute it and/or modify it
       under the same terms as Perl itself.

       Irrespective of its distribution, all code examples in these files are
       hereby placed into the public domain.  You are permitted and encouraged
       to use this code in your own programs for fun or for profit as you see
       fit.  A simple comment in the code giving credit would be courteous but
       is not required.

perl v5.8.8                       2006-01-07                     PERLREFTUT(1)