Data::Structure::Util(3)  User Contributed Perl Documentation  Data::Structure::Util(3)

NAME

       Data::Structure::Util - Change nature of data within a structure

SYNOPSIS

           use Data::Structure::Util qw(
             has_utf8 utf8_off utf8_on unbless get_blessed get_refs
             has_circular_ref circular_off signature
           );

           # get the objects in the data structure
           my $objects_arrayref = get_blessed( $data );

           # unbless all objects
           unbless( $data );

           if ( has_circular_ref( $data ) ) {
               print "Removing circular ref!\n";
               circular_off( $data );
           }

           # convert back to latin1 if needed and possible
           utf8_off( $data ) if defined has_utf8( $data );

DESCRIPTION

       "Data::Structure::Util" is a toolbox to manipulate the data inside a
       data structure. It can process an entire tree and perform the operation
       requested on each appropriate element.

       For example, it can transform all strings within a data structure to
       utf8 or transform any utf8 string back to the default encoding. It can
       remove the blessing on any reference. It can collect all the objects or
       detect if there is a circular reference.

       It is written in C for decent speed.

FUNCTIONS

       All Data::Structure::Util functions operate on a whole tree. If you
       pass them a simple scalar then they will operate on that one scalar.
       However, if you pass them a reference to a hash, array, or scalar then
       they will iterate through that structure and apply the manipulation to
       all elements, and in turn if they are references to hashes, arrays or
       scalars to all their elements and so on, recursively.

       For speed reasons all manipulations that alter the data structure do
       in-place manipulation, meaning that rather than returning an altered
       copy of the data structure they alter the passed data structure
       itself.

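       As a minimal sketch of the in-place behaviour, assuming the module is
       installed, the following calls unbless() on a small structure and
       checks that the returned reference is the one that was passed in.
       The variable $data and the class name Some::Class are hypothetical
       and used for illustration only.

```perl
use strict;
use warnings;
use Data::Structure::Util qw(unbless);
use Scalar::Util qw(blessed refaddr);

# Hypothetical structure containing one blessed hash.
my $data = { obj => bless( {}, 'Some::Class' ) };

# unbless() alters $data itself and hands the same reference back.
my $returned = unbless($data);

my $same_ref      = refaddr($returned) == refaddr($data);
my $still_blessed = defined blessed( $data->{obj} );

print "same reference: ", ( $same_ref      ? "yes" : "no" ), "\n";
print "still blessed:  ", ( $still_blessed ? "yes" : "no" ), "\n";
```

       If an untouched copy is needed, it has to be made (for example with
       Storable's dclone) before calling any of the altering functions.
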
   Manipulating Data Structures
       has_circular_ref($ref)
           This function detects if the passed data structure has a circular
           reference, that is to say if it is possible by following references
           contained in the structure to return to a part of the data
           structure you have already visited. Data structures that have
           circular references will not be automatically reclaimed by Perl's
           garbage collector.

           If a circular reference is detected the function returns a
           reference to an element within the circuit, otherwise the function
           will return a false value.

           If the version of perl that you are using supports weak references
           then any weak references found within the data structure will not
           be traversed, meaning that circular references that have had links
           successfully weakened will not be returned by this function.

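           The following sketch, assuming the module is installed and a perl
           with weak reference support, builds a hypothetical two-node cycle
           ($parent and $child are illustrative names) and checks it before
           and after one link is weakened with Scalar::Util's weaken().

```perl
use strict;
use warnings;
use Data::Structure::Util qw(has_circular_ref);
use Scalar::Util qw(weaken);

# Hypothetical two-node cycle: $parent -> $child -> $parent.
my $parent = { name => 'parent' };
my $child  = { name => 'child', parent => $parent };
$parent->{child} = $child;

# A reference into the circuit is returned, which is a true value.
my $before = has_circular_ref($parent) ? 1 : 0;

# Weaken the back-link; weak references are not traversed.
weaken( $child->{parent} );
my $after = has_circular_ref($parent) ? 1 : 0;

print "circular before weaken: $before\n";
print "circular after weaken:  $after\n";
```
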
       circular_off($ref)
           Detects circular references in $ref (as above) and weakens a link
           in each so that they can be properly garbage collected when no
           external references to the data structure are left.

           This means that one (or more) of the references in the data
           structure will be told that it should not count towards reference
           counting. You should be aware that if you later modify the data
           structure and leave parts of it only 'accessible' via weakened
           references, those parts of the data structure will be
           immediately garbage collected as the weakened references will not
           be strong enough to maintain the connection on their own.

           The number of references weakened is returned.

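           The caveat about weakened links can be illustrated with core Perl
           alone. This sketch does not call circular_off(); it weakens a
           link by hand with Scalar::Util's weaken(), which is assumed here
           to be a fair stand-in for what happens to a link that
           circular_off() has weakened. The names $parent and $child are
           hypothetical.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical structure: $child points back at $parent.
my $parent = { name => 'parent' };
my $child  = { name => 'child', parent => $parent };

# Weaken the back-link by hand.
weaken( $child->{parent} );

# Drop the only strong reference to the parent hash.
undef $parent;

# The weak link could not keep the hash alive, so it is now undef.
my $parent_gone = !defined $child->{parent};
print "parent collected: ", ( $parent_gone ? "yes" : "no" ), "\n";
```
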
       get_refs($ref)
           Examine the data structure and return a reference to a flat array
           that contains one copy of every reference in the data structure you
           passed.

           For example:

               my $foo = {
                   first  => [ "inner", "array", { inmost => "hash" } ],
                   second => \"refed scalar",
               };

               use Data::Dumper;
               use Data::Structure::Util qw(get_refs);
               # tell Data::Dumper to show nodes multiple times
               $Data::Dumper::Deepcopy = 1;
               print Dumper get_refs( $foo );

               $VAR1 = [
                   { 'inmost' => 'hash' },
                   [ 'inner', 'array', { 'inmost' => 'hash' } ],
                   \'refed scalar',
                   {
                       'first'  => [ 'inner', 'array', { 'inmost' => 'hash' } ],
                       'second' => \'refed scalar'
                   }
               ];

           As you can see, the data structure is traversed depth first, so the
           top most references should be the last elements of the array.  See
           get_blessed($ref) below for a similar function for blessed objects.

       signature($ref)
           Returns an md5 of the passed data structure.  Any change at all to
           the data structure will cause a different md5 to be returned.

           The function examines the structure, addresses, value types and
           flags to generate the signature, meaning that even data structures
           that would look identical when dumped with Data::Dumper produce
           different signatures:

               my $ref1 = { key1 => [] };

               # a separate structure with the same shape and contents
               my $ref2 = { key1 => [] };

               # this produces the same result, as they look the same
               # even though they are different data structures
               use Data::Dumper;
               use Digest::MD5 qw(md5_hex);
               print md5_hex( Dumper( $ref1 ) ), " ", md5_hex( Dumper( $ref2 ) ), "\n";
               # cb55d41da284a5869a0401bb65ab74c1 cb55d41da284a5869a0401bb65ab74c1

               # this produces differing results
               use Data::Structure::Util qw(signature);
               print signature( $ref1 ), " ", signature( $ref2 ), "\n";
               # 5d20c5e81a53b2be90521167aefed9db 8b4cba2cbae0fec4bab263e9866d3911

   Object Blessing
       unbless($ref)
           Remove the blessing from any objects found within the passed data
           structure. For example:

               my $foo = {
                   'a' => bless( { 'b' => bless( {}, "c" ), }, "d" ),
                   'e' => [ bless( [], "f" ), bless( [], "g" ), ]
               };

               use Data::Dumper;
               use Data::Structure::Util qw(unbless);
               print Dumper( unbless( $foo ) );

               $VAR1 = {
                   'a' => { 'b' => {} },
                   'e' => [ [], [] ]
               };

           Note that the function looks inside blessed objects for other
           objects to unbless.

       get_blessed($ref)
           Examine the data structure and return a reference to a flat array
           that contains every object in the data structure you passed.  For
           example:

               my $foo = {
                   'a' => bless( { 'b' => bless( {}, "c" ), }, "d" ),
                   'e' => [ bless( [], "f" ), bless( [], "g" ), ]
               };

               use Data::Dumper;
               # tell Data::Dumper to show nodes multiple times
               $Data::Dumper::Deepcopy = 1;
               use Data::Structure::Util qw(get_blessed);
               print Dumper( get_blessed( $foo ) );

               $VAR1 = [
                   bless( {}, 'c' ),
                   bless( { 'b' => bless( {}, 'c' ) }, 'd' ),
                   bless( [], 'f' ),
                   bless( [], 'g' )
               ];

           This function is essentially the same as "get_refs" but only
           returns blessed objects rather than all references.  As with that
           function the data structure is traversed depth first, so the top
           most objects should be the last elements of the array.  Note also
           (as the above example shows) that objects within objects are
           returned.

   utf8 Manipulation Functions
       These functions allow you to manipulate the state of the utf8 flags in
       the scalars contained in the data structure.  Information on the utf8
       flag and its significance can be found in Encode.

       has_utf8($var)
           Returns $var if the utf8 flag is enabled for $var or any scalar
           that a data structure passed in $var contains.

               print "this will be printed"  if defined has_utf8( "\x{1234}" );
               print "this won't be printed" if defined has_utf8( "foo bar" );

           Note that you should not check the truth of the return value of
           this function when calling it with a single scalar, as it is
           possible to have a string "0" or "" for which the utf8 flag is
           set.  Since "undef" can never have the utf8 flag set, the function
           will never return a defined value if the data structure does not
           contain a utf8 flagged scalar.

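           Why "defined" is the right test can be seen with core Perl alone.
           This sketch does not call has_utf8(); it uses the built-in
           utf8::upgrade() and utf8::is_utf8() on a hypothetical scalar
           $zero to show a value that carries the utf8 flag and is still
           false in boolean context.

```perl
use strict;
use warnings;

# A string that is false in boolean context...
my $zero = "0";

# ...can still carry the utf8 flag once it has been upgraded.
utf8::upgrade($zero);

my $flagged = utf8::is_utf8($zero) ? 1 : 0;
my $truthy  = $zero ? 1 : 0;

print "utf8 flag set: $flagged\n";    # prints 1
print "true value:    $truthy\n";     # prints 0
```
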
       _utf8_off($var)
           Recursively disables the utf8 flag on all scalars within $var.
           This is the same as the "_utf8_off" function of Encode but applies
           to any string within $var.  The data structure is converted
           in-place, and as a convenience the passed variable is returned from
           the function.

           This function makes no attempt to do any character set conversion
           to the strings stored in any of the scalars in the passed data
           structure.  This means that if perl was internally storing any
           character as a sequence of bytes in the utf8 encoding, each byte in
           that sequence will henceforth be treated as a character in its own
           right.

           For example:

               my $emoticons = { smile => "\x{263a}" };
               use Data::Structure::Util qw(_utf8_off);
               print length( $emoticons->{smile} ), "\n";    # prints 1
               _utf8_off( $emoticons );
               print length( $emoticons->{smile} ), "\n";    # prints 3

       _utf8_on($var)
           Recursively enables the utf8 flag on all scalars within $var.  This
           is the same as the "_utf8_on" function of Encode but applies to any
           string within $var. The data structure is converted in-place and as
           a convenience the passed variable is returned from the function.

           As above, this makes no attempt to do any character set conversion,
           meaning that unless your string contains the valid utf8 byte
           sequences for the characters you want you are in trouble.  In some
           cases incorrect byte sequences can segfault perl.  In particular,
           the regular expression engine has significant problems with invalid
           utf8 that has been incorrectly marked as utf8.  You should know
           what you are doing if you are using this function.  Consider using
           the Encode module as an alternative.

           Contrary example to the above:

               my $emoticons = { smile => "\342\230\272" };
               use Data::Structure::Util qw(_utf8_on);
               print length( $emoticons->{smile} ), "\n";    # prints 3
               _utf8_on( $emoticons );
               print length( $emoticons->{smile} ), "\n";    # prints 1

       utf8_on($var)
           This routine performs a "sv_utf8_upgrade" on each scalar string in
           the passed data structure that does not have the utf8 flag turned
           on.  This will cause perl to change the method it uses
           internally to store the string from the native encoding (normally
           Latin-1 unless locales come into effect) into a utf8 encoding and
           set the utf8 flag for that scalar.  This means that characters
           previously stored as a single byte may now be represented by
           multi-byte sequences.  However, as long as the "use bytes" pragma
           is not in effect the string will be the same length as before
           because, as far as perl is concerned, the string still contains
           the same number of characters (but not bytes).

           This routine is significantly different from "_utf8_on".  That
           routine assumes that your string is encoded in utf8 but was marked
           (wrongly) as being in the native encoding.  This routine assumes
           that your string is encoded in the native encoding and is marked
           that way, but you'd rather it be encoded and marked as utf8.

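           The difference between the two routines can be sketched on single
           scalars with core Perl. This example does not use the module; it
           assumes that the built-in utf8::upgrade() and Encode's _utf8_on()
           behave on one scalar the way utf8_on() and _utf8_on() behave on
           each scalar of a structure. All variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode ();

# Case 1: a Latin-1 string, correctly marked as native. Upgrading changes
# the internal storage but not the characters.
my $latin1   = "\xe9";    # e-acute: one character, one byte
my $upgraded = $latin1;
utf8::upgrade($upgraded);

my $same_text    = $upgraded eq $latin1 ? 1 : 0;
my $upgraded_len = length $upgraded;

# Case 2: utf8-encoded bytes wrongly marked as native. Only the flag is
# flipped, so the three bytes are reinterpreted as one character.
my $bytes   = "\xe2\x98\xba";    # utf8 encoding of U+263A
my $flagged = $bytes;
Encode::_utf8_on($flagged);

my $flagged_len = length $flagged;
my $code_point  = sprintf "U+%04X", ord $flagged;

print "upgrade keeps text: $same_text, length $upgraded_len\n";
print "flag flip gives:    $code_point, length $flagged_len\n";
```
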
       utf8_off($var)
           This routine performs a "sv_utf8_downgrade" on each scalar string
           in the passed data structure that has the utf8 flag turned on.
           This will cause perl to change the method it uses internally to
           store the string from the utf8 encoding into the native encoding
           (normally Latin-1 unless locales are used) and disable the utf8
           flag for that scalar.  This means that multiple byte sequences that
           represent a single character will be replaced by one byte per
           character. However, as long as the "use bytes" pragma is not in
           effect the string will be the same length as before because, as
           far as perl is concerned, the string still contains the same
           number of characters (but not bytes).

           Please note that not all strings can be converted from utf8 to the
           native encoding.  If a utf8 character has no corresponding
           character in the native encoding, Perl will die with a
           "Wide character in subroutine entry" exception.

           This routine is significantly different from "_utf8_off".  That
           routine assumes that your string is encoded in utf8 and that you
           want to simply mark it as being in the native encoding so that perl
           will treat every byte that makes up the character sequences as a
           character in its own right in the native encoding.  This routine
           assumes that your string is encoded in utf8, but you want each
           character that is currently represented by a multi-byte sequence
           to be replaced by the single byte representation of the same
           character.

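           The failure mode described above can be reproduced on a single
           scalar with core Perl. This sketch uses the built-in
           utf8::downgrade() rather than utf8_off(), on the assumption that
           it fails in the same way; wrapping the call in eval is one
           possible way to trap the exception. The variable names are
           hypothetical.

```perl
use strict;
use warnings;

# U+263A has no Latin-1 equivalent, so it cannot be downgraded.
my $wide = "\x{263a}";

my $ok    = eval { utf8::downgrade($wide); 1 };
my $error = $ok ? '' : $@;

print $ok ? "downgraded\n" : "downgrade failed: $error";

# A string whose characters all fit in one byte downgrades cleanly.
my $narrow = "\xe9";
utf8::upgrade($narrow);
utf8::downgrade($narrow);

my $still_flagged = utf8::is_utf8($narrow) ? 1 : 0;
print "flag after downgrade: $still_flagged\n";
```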

SEE ALSO

       Encode, Scalar::Util, Devel::Leak, Devel::LeakTrace

       See the excellent article
       http://www.perl.com/pub/a/2002/08/07/proxyobject.html from Matt
       Sergeant for more info on circular references.

REPOSITORY

       https://github.com/AndyA/Data--Structure--Util

BUGS

       "signature()" is sensitive to the hash randomisation algorithm.

       This module only recurses through basic hashes, lists and scalar
       references.  It doesn't attempt anything more complicated.

THANKS TO

       James Duncan and Arthur Bergman who helped me and found a name for this
       module.  Leon Brocard and Richard Clamp have provided invaluable help
       to debug this module.  Mark Fowler rewrote large chunks of the
       documentation and patched a few bugs.

AUTHOR

       This release by Andy Armstrong <andy@hexten.net>

       Originally by Pierre Denis <pdenis@fotango.com>

       http://opensource.fotango.com/

COPYRIGHT

       Copyright 2003, 2004 Fotango - All Rights Reserved.

       This module is released under the same license as Perl itself.

perl v5.32.0                      2020-07-28          Data::Structure::Util(3)