Data::Structure::Util(3)  User Contributed Perl Documentation  Data::Structure::Util(3)

NAME

       Data::Structure::Util - Change nature of data within a structure

SYNOPSIS

         use Data::Structure::Util qw(has_utf8 utf8_off utf8_on
           _utf8_off _utf8_on unbless get_blessed get_refs
           has_circular_ref circular_off signature);

         # get the objects in the data structure
         my $objects_arrayref = get_blessed($data);

         # unbless all objects
         unbless($data);

         if (has_circular_ref($data))
         {
           print "Removing circular ref!\n";
           circular_off($data);
         }

         # convert back to latin1 if needed and possible
         utf8_off($data) if defined has_utf8($data);

DESCRIPTION

       "Data::Structure::Util" is a toolbox to manipulate the data inside a
       data structure.  It can process an entire tree and perform the
       requested operation on each appropriate element.

       For example, it can transform all strings within a data structure to
       utf8 or transform any utf8 string back to the default encoding.  It can
       remove the blessing on any reference.  It can collect all the objects
       or detect if there is a circular reference.

       It is written in C for decent speed.

FUNCTIONS

       All Data::Structure::Util functions operate on a whole tree.  If you
       pass them a simple scalar then they will operate on that one scalar.
       However, if you pass them a reference to a hash, array, or scalar then
       they will iterate through that structure and apply the manipulation to
       all elements, and in turn, if those elements are references to hashes,
       arrays or scalars, to all their elements and so on, recursively.

       For speed reasons all manipulations that alter the data structure work
       in place: rather than returning an altered copy, they modify the data
       structure that was passed in.

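       Because the functions modify their argument, a caller who still needs
       the original has to copy it first.  The sketch below assumes that
       dclone() from the core Storable module is an acceptable way to take
       that copy; it is one option, not something this module requires, and
       the class name 'Foo' is made up for illustration.

```perl
use strict;
use warnings;
use Storable qw(dclone);
use Data::Structure::Util qw(unbless);

# A small structure holding one blessed hash ('Foo' is a made-up class name).
my $original = { obj => bless( { id => 1 }, 'Foo' ) };

# Take a deep copy first, then let unbless() work on the copy in place.
my $plain = dclone($original);
unbless($plain);

print ref( $original->{obj} ), "\n";    # the original is still blessed
print ref( $plain->{obj} ),    "\n";    # the copy is now a plain hash
```

       Copying costs time and memory, so it is only worth doing when both
       versions of the data are really needed.
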
       Manipulating Data Structures

       has_circular_ref($ref)
           This function detects if the passed data structure has a circular
           reference, that is to say if it is possible, by following
           references contained in the structure, to return to a part of the
           data structure you have already visited.  Data structures that have
           circular references will not be automatically reclaimed by Perl's
           garbage collector.

           If a circular reference is detected the function returns a
           reference to an element within the circuit, otherwise the function
           will return a false value.

           If the version of perl that you are using supports weak references
           then any weak references found within the data structure will not
           be traversed, meaning that circular references that have had links
           successfully weakened will not be returned by this function.

       circular_off($ref)
           Detects circular references in $ref (as above) and weakens a link
           in each so that they can be properly garbage collected when no
           external references to the data structure are left.

           This means that one (or more) of the references in the data
           structure will be told that they should not count towards reference
           counting.  You should be aware that if you later modify the data
           structure and leave parts of it only 'accessible' via weakened
           references, those parts of the data structure will be immediately
           garbage collected, as the weakened references will not be strong
           enough to maintain the connection on their own.

           The number of references weakened is returned.

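           The warning about weakened references can be seen with a single
           link.  This sketch uses weaken() from the core Scalar::Util module
           directly rather than circular_off(), because which link
           circular_off() chooses to weaken is not stated here; the variable
           names are illustrative.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Two hashes that point at each other form a cycle.
my $parent = { name => 'parent' };
my $child  = { name => 'child', parent => $parent };
$parent->{child} = $child;

# Weaken the back link so it no longer counts towards the reference count.
weaken( $child->{parent} );
print "back link is weak\n" if isweak( $child->{parent} );

# Drop the only strong reference to the parent hash.
undef $parent;

# The parent was reachable only through the weak link, so it has been freed
# and the weak reference has become undef.
print defined $child->{parent} ? "parent still alive\n" : "parent was freed\n";
```

           Keeping a strong reference to every part of the structure that
           must survive avoids this surprise.
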
       get_refs($ref)
           Examine the data structure and return a reference to a flat array
           that contains one copy of every reference in the data structure you
           passed.

           For example:

             my $foo = {
               first  => [ "inner", "array", { inmost => "hash" } ],
               second => \"refed scalar",
             };

             use Data::Dumper;
             use Data::Structure::Util qw(get_refs);
             # tell Data::Dumper to show nodes multiple times
             $Data::Dumper::Deepcopy = 1;
             print Dumper get_refs($foo);

             $VAR1 = [
                       {
                         'inmost' => 'hash'
                       },
                       [
                         'inner',
                         'array',
                         {
                           'inmost' => 'hash'
                         }
                       ],
                       \'refed scalar',
                       {
                         'first' => [
                                      'inner',
                                      'array',
                                      {
                                        'inmost' => 'hash'
                                      }
                                    ],
                         'second' => \'refed scalar'
                       }
                     ];

           As you can see, the data structure is traversed depth first, so the
           topmost references should be the last elements of the array.  See
           get_blessed($ref) below for a similar function for blessed objects.

       signature($ref)
           Returns an md5 of the passed data structure.  Any change at all to
           the data structure will cause a different md5 to be returned.

           The function examines the structure, addresses, value types and
           flags to generate the signature, meaning that even data structures
           that would look identical when dumped with Data::Dumper produce
           different signatures:

             $ref1 = { key1 => [] };

             # a second, separately allocated structure with the same shape
             $ref2 = { key1 => [] };

             # this produces the same result, as they look the same
             # even though they are different data structures
             use Data::Dumper;
             use Digest::MD5 qw(md5_hex);
             print md5_hex(Dumper($ref1))," ",md5_hex(Dumper($ref2)),"\n";
             # cb55d41da284a5869a0401bb65ab74c1 cb55d41da284a5869a0401bb65ab74c1

             # this produces differing results
             use Data::Structure::Util qw(signature);
             print signature($ref1)," ",signature($ref2),"\n";
             # 5d20c5e81a53b2be90521167aefed9db 8b4cba2cbae0fec4bab263e9866d3911

       Object Blessing

       unbless($ref)
           Remove the blessing from any objects found within the passed data
           structure.  For example:

             my $foo = {
                      'a' => bless({
                               'b' => bless({},"c"),
                             },"d"),
                      'e' => [
                               bless([],"f"),
                               bless([],"g"),
                             ]
                    };

             use Data::Dumper;
             use Data::Structure::Util qw(unbless);
             print Dumper( unbless( $foo ));

             $VAR1 = {
                       'a' => {
                                'b' => {}
                              },
                       'e' => [
                                [],
                                []
                              ]
                     };

           Note that the function looks inside blessed objects for other
           objects to unbless.

       get_blessed($ref)
           Examine the data structure and return a reference to a flat array
           that contains every object in the data structure you passed.  For
           example:

             my $foo = {
                      'a' => bless({
                               'b' => bless({},"c"),
                             },"d"),
                      'e' => [
                               bless([],"f"),
                               bless([],"g"),
                             ]
                    };

             use Data::Dumper;
             # tell Data::Dumper to show nodes multiple times
             $Data::Dumper::Deepcopy = 1;
             use Data::Structure::Util qw(get_blessed);
             print Dumper( get_blessed( $foo ));

             $VAR1 = [
                       bless( {}, 'c' ),
                       bless( {
                                'b' => bless( {}, 'c' )
                              }, 'd' ),
                       bless( [], 'f' ),
                       bless( [], 'g' )
                     ];

           This function is essentially the same as "get_refs" but only
           returns blessed objects rather than all references.  As with that
           function the data structure is traversed depth first, so the
           topmost objects should be the last elements of the array.  Note
           also (as the above example shows) that objects within objects are
           returned.

       utf8 Manipulation Functions

       These functions allow you to manipulate the state of the utf8 flags in
       the scalars contained in the data structure.  Information on the utf8
       flag and its significance can be found in Encode.

       has_utf8($var)
           Returns $var if the utf8 flag is enabled for $var or any scalar
           that a data structure passed in $var contains.

             print "this will be printed"  if defined has_utf8("\x{1234}");
             print "this won't be printed" if defined has_utf8("foo bar");

           Note that you should not check the truth of the return value of
           this function when calling it with a single scalar, as it is
           possible to have a string "0" or "" for which the utf8 flag is set.
           Since "undef" can never have the utf8 flag set, the function will
           never return a defined value if the data structure does not contain
           a utf8 flagged scalar.

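           A minimal sketch of the point above, assuming has_utf8() hands
           back the scalar it was given as described.  The core function
           utf8::upgrade() is used only to put the utf8 flag on the string
           "0"; the variable names are illustrative.

```perl
use strict;
use warnings;
use Data::Structure::Util qw(has_utf8);

# The string "0" is false in boolean context, even with the utf8 flag on.
my $zero = "0";
utf8::upgrade($zero);    # core function; sets the utf8 flag on $zero

my $result = has_utf8($zero);

# A truth test misses the flag, because the returned value is "0".
print "truth test:   ", ( $result ? "flagged" : "not flagged" ), "\n";

# A definedness test reports it, as the text above recommends.
print "defined test: ", ( defined $result ? "flagged" : "not flagged" ), "\n";
```
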
       _utf8_off($var)
           Recursively disables the utf8 flag on all scalars within $var.
           This is the same as the "_utf8_off" function of Encode but applies
           to any string within $var.  The data structure is converted
           in-place, and as a convenience the passed variable is returned from
           the function.

           This function makes no attempt to do any character set conversion
           to the strings stored in any of the scalars in the passed data
           structure.  This means that if perl was internally storing any
           character as a sequence of bytes in the utf8 encoding, each byte in
           that sequence will henceforth be treated as a character in its own
           right.

           For example:

             my $emoticons = { smile => "\x{236a}" };
             use Data::Structure::Util qw(_utf8_off);
             print length($emoticons->{smile}), "\n";  # prints 1
             _utf8_off($emoticons);
             print length($emoticons->{smile}), "\n";  # prints 3

       _utf8_on($var)
           Recursively enables the utf8 flag on all scalars within $var.  This
           is the same as the "_utf8_on" function of Encode but applies to any
           string within $var.  The data structure is converted in-place and,
           as a convenience, the passed variable is returned from the
           function.

           As above, this makes no attempt to do any character set conversion,
           meaning that unless your string contains the valid utf8 byte
           sequences for the characters you want you are in trouble.  In some
           cases incorrect byte sequences can segfault perl.  In particular,
           the regular expression engine has significant problems with invalid
           byte sequences that have been incorrectly marked as utf8.  You
           should know what you are doing if you are using this function.
           Consider using the Encode module as an alternative.

           Contrary example to the above:

             my $emoticons = { smile => "\342\230\272" };
             use Data::Structure::Util qw(_utf8_on);
             print length($emoticons->{smile}), "\n";  # prints 3
             _utf8_on($emoticons);
             print length($emoticons->{smile}), "\n";  # prints 1

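           One way to follow the advice about Encode is to decode the bytes
           instead of flipping the flag, so that invalid input is rejected
           rather than marked as utf8.  This sketch assumes the data is a
           single scalar of UTF-8 bytes; decode(), FB_CROAK and LEAVE_SRC come
           from the core Encode module, and the variable names are
           illustrative.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Three bytes that form the UTF-8 encoding of one character (U+263A).
my $bytes = "\342\230\272";

# decode() validates the bytes and returns a character string.
# LEAVE_SRC stops decode() from modifying $bytes while it checks.
my $chars = decode( 'UTF-8', $bytes, FB_CROAK | LEAVE_SRC );
print length($bytes), " bytes, ", length($chars), " character\n";

# Invalid input makes decode() die instead of producing a broken string.
my $bad = "\xff\xfe";
my $ok  = eval { decode( 'UTF-8', $bad, FB_CROAK | LEAVE_SRC ); 1 };
print "invalid bytes rejected\n" unless $ok;
```

           Unlike _utf8_on, this works on one scalar at a time, so a whole
           data structure would need its own traversal.
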
       utf8_on($var)
           This routine performs a "sv_utf8_upgrade" on each scalar string in
           the passed data structure that does not have the utf8 flag turned
           on.  This will cause perl to change the method it uses internally
           to store the string from the native encoding (normally Latin-1
           unless locales come into effect) into a utf8 encoding and set the
           utf8 flag for that scalar.  This means that single byte letters
           outside ASCII will now be represented by multi-byte sequences.
           However, as long as the "use bytes" pragma is not in effect the
           string will be the same length as before, because as far as perl is
           concerned the string still contains the same number of characters
           (but not bytes).

           This routine is significantly different from "_utf8_on".  That
           routine assumes that your string is encoded in utf8 but was
           (wrongly) marked as being in the native encoding.  This routine
           assumes that your string is encoded in the native encoding and is
           marked that way, but you'd rather it be encoded and marked as utf8.

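           The difference between character length and byte length after an
           upgrade can be checked on a single scalar.  This sketch uses the
           core utf8::upgrade() function, which performs the same kind of
           upgrade on one string, as a stand-in for utf8_on(); the sample
           string and the byte_length() helper are illustrative.

```perl
use strict;
use warnings;

# "caf\xe9" holds four Latin-1 characters, stored as four bytes.
my $word = "caf\xe9";

# Byte length is measured inside a block with the bytes pragma in effect.
sub byte_length { my ($s) = @_; use bytes; return length($s) }

# before: 4 chars, 4 bytes
printf "before: %d chars, %d bytes\n", length($word), byte_length($word);

utf8::upgrade($word);    # switch the internal storage to utf8

# after: 4 chars, 5 bytes (the accented letter now takes two bytes)
printf "after:  %d chars, %d bytes\n", length($word), byte_length($word);
print "utf8 flag is on\n" if utf8::is_utf8($word);
```
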
       utf8_off($var)
           This routine performs a "sv_utf8_downgrade" on each scalar string
           in the passed data structure that has the utf8 flag turned on.
           This will cause perl to change the method it uses internally to
           store the string from the utf8 encoding into the native encoding
           (normally Latin-1 unless locales are used) and disable the utf8
           flag for that scalar.  This means that multiple byte sequences that
           represent a single character will be replaced by one byte per
           character.  However, as long as the "use bytes" pragma is not in
           effect the string will be the same length as before, because as far
           as perl is concerned the string still contains the same number of
           characters (but not bytes).

           Please note that not all strings can be converted from utf8 to the
           native encoding.  If a utf8 character has no corresponding
           character in the native encoding, Perl will die with a "Wide
           character in subroutine entry" exception.

           This routine is significantly different from "_utf8_off".  That
           routine assumes that your string is encoded in utf8 and that you
           want to simply mark it as being in the native encoding, so that
           perl will treat every byte that makes up the character sequences as
           a character in its own right in the native encoding.  This routine
           assumes that your string is encoded in utf8, but you want each
           character that is currently represented by a multi-byte sequence to
           be replaced by the single byte representation of the same
           character.

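           Since the downgrade can die, callers may want to trap the failure.
           This sketch uses the core utf8::downgrade() function on single
           scalars as a stand-in for utf8_off(), with eval to catch the
           exception; the sample strings are illustrative.

```perl
use strict;
use warnings;

# U+00E9 fits in Latin-1; U+263A does not.
my $fits = "caf\x{e9}";
my $wide = "smile \x{263a}";
utf8::upgrade($fits);    # make sure both strings start out utf8 flagged

for my $string ( $fits, $wide ) {
    # utf8::downgrade() dies if a character has no single-byte form.
    my $ok = eval { utf8::downgrade($string); 1 };
    if ($ok) {
        print "downgraded, utf8 flag now ",
            ( utf8::is_utf8($string) ? "on" : "off" ), "\n";
    }
    else {
        print "could not downgrade: $@";
    }
}
```

           The same eval wrapper could be placed around a call to utf8_off()
           when the data may contain characters outside the native encoding.
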

SEE ALSO

       Encode, Scalar::Util, Devel::Leak, Devel::LeakTrace

       See the excellent article
       http://www.perl.com/pub/a/2002/08/07/proxyobject.html from Matt
       Sergeant for more info on circular references.

       The development version of this module and others can be found at
       http://opensource.fotango.com/svn/trunk/Data-Structure-Util/

BUGS

       "signature()" is sensitive to the hash randomisation algorithm.

       This module only recurses through basic hashes, lists and scalar
       references.  It doesn't attempt anything more complicated.

THANKS TO

       James Duncan and Arthur Bergman who helped me and found a name for this
       module.  Leon Brocard and Richard Clamp have provided invaluable help
       to debug this module.  Mark Fowler rewrote large chunks of the
       documentation and patched a few bugs.

AUTHOR

       Pierre Denis <pdenis@fotango.com>

       http://opensource.fotango.com/

       Copyright 2003, 2004 Fotango - All Rights Reserved.

       This module is released under the same license as Perl itself.

perl v5.8.8                       2007-04-17          Data::Structure::Util(3)