1String::Util(3) User Contributed Perl Documentation String::Util(3)
2
3
4
6 String::Util -- String processing utilities
7
9 use String::Util ':all';
10
11 # "crunch" whitespace and remove leading/trailing whitespace
12 $val = crunch($val);
13
14 # does this value have "content", i.e. it's defined
15 # and has something besides whitespace?
16 if (hascontent $val) {...}
17
18 # format for display in a web page
19 $val = htmlesc($val);
20
21 # format for display in a web page table cell
22 $val = cellfill($val);
23
24 # remove leading/trailing whitespace
25 $val = trim($val);
26
27 # ensure defined value
28 $val = define($val);
29
30 # repeat string x number of times
31 $val = repeat($val, $iterations);
32
33 # remove leading/trailing quotes
34 $val = unquote($val);
35
36 # remove all whitespace
37 $val = no_space($val);
38
39 # remove trailing \r and \n, regardless of what
40 # the OS considers an end-of-line
41 $val = fullchomp($val);
42
43 # or call in void context:
44 fullchomp $val;
45
46 # encrypt string using random seed
47 $val = randcrypt($val);
48
49 # are these two values equal, where two undefs count as "equal"?
50 if (eqq $a, $b) {...}
51
52 # are these two values different, where two undefs count as "equal"?
53 if (neqq $a, $b) {...}
54
55 # get a random string of some specified length
56 $val = randword(10);
57
59 String::Util provides a collection of small, handy utilities for
60 processing strings.
61
63 String::Util can be installed with the usual routine:
64
65 perl Makefile.PL
66 make
67 make test
68 make install
69
71 collapse(string), crunch(string)
72 "collapse()" collapses all whitespace in the string down to single
73 spaces. Also removes all leading and trailing whitespace. Undefined
74 input results in undefined output.
75
76 "crunch()" is the old name for "collapse()". I decided that "crunch"
77 never sounded right. Spaces don't go "crunch", they go "poof" like a
78 collapsing balloon. However, "crunch()" will continue to work as an
79 alias for "collapse()".
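
As a rough illustration of the behavior described above, collapsing can be sketched in plain Perl. This is not the module's source; the name sketch_collapse is hypothetical, and "whitespace" is assumed to mean whatever Perl's \s matches.

```perl
use strict;
use warnings;

# Hypothetical helper: mimics the documented behavior of collapse().
sub sketch_collapse {
    my ($val) = @_;
    return undef unless defined $val;   # undef in, undef out
    $val =~ s/^\s+//;                   # drop leading whitespace
    $val =~ s/\s+\z//;                  # drop trailing whitespace
    $val =~ s/\s+/ /g;                  # squeeze each inner run to one space
    return $val;
}

print sketch_collapse("  Starflower \n\n\t  Miko  "), "\n";   # Starflower Miko
```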
80
81 hascontent(scalar), nocontent(scalar)
82 hascontent() returns true if the given argument is defined and contains
83 something besides whitespace.
84
85 An undefined value returns false. An empty string returns false. A
86 value containing nothing but whitespace (spaces, tabs, carriage
87 returns, newlines, backspace) returns false. A string containing any
88 other characters (including zero) returns true.
89
90 "nocontent()" returns the negation of "hascontent()".
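
The rules above can be approximated with a single regular-expression test. The sketch below is an illustration only: sketch_hascontent is a hypothetical name, and it assumes "whitespace" means whatever Perl's \s matches.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the value is defined and holds at
# least one non-whitespace character.
sub sketch_hascontent {
    my ($val) = @_;
    return (defined $val && $val =~ /\S/) ? 1 : 0;
}

# undef, empty string, whitespace only, zero, ordinary text
print join(',', map { sketch_hascontent($_) } (undef, '', " \t\n", '0', 'x')), "\n";   # 0,0,0,1,1
```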
91
92 trim(string)
93 Returns the string with all leading and trailing whitespace removed.
94 Trim on undef returns undef.
95
96 So, for example, the following code changes " my string " to "my
97 string":
98
99 $var = " my string ";
100 $var = trim($var);
101
102 trim accepts two optional arguments, 'left' and 'right', both of which
103 are true by default. So, to avoid trimming the left side of the
104 string, set the 'left' argument to false:
105
106 $var = trim($var, left=>0);
107
108 To avoid trimming the right side, set 'right' to false:
109
110 $var = trim($var, right=>0);
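
To make the effect of the two options concrete, here is a minimal plain-Perl sketch of the behavior described above. It is not the module's implementation, and sketch_trim is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: trims each side unless its option is set to false.
sub sketch_trim {
    my ($val, %opts) = @_;
    return undef unless defined $val;                      # undef in, undef out
    my $left  = exists $opts{left}  ? $opts{left}  : 1;    # both default to true
    my $right = exists $opts{right} ? $opts{right} : 1;
    $val =~ s/^\s+//  if $left;
    $val =~ s/\s+\z// if $right;
    return $val;
}

print '[', sketch_trim('  my string  '), "]\n";               # [my string]
print '[', sketch_trim('  my string  ', left  => 0), "]\n";   # [  my string]
print '[', sketch_trim('  my string  ', right => 0), "]\n";   # [my string  ]
```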
111
112 ltrim, rtrim
113 ltrim trims leading whitespace. rtrim trims trailing whitespace. They
114 are equivalent, respectively, to
115
116 trim($var, right=>0);
117
118 and
119
120 trim($var, left=>0);
121
122 no_space(string)
123 Removes all whitespace characters from the given string.
124
125 htmlesc(string)
126 Formats a string for literal output in HTML. An undefined value is
127 returned as an empty string.
128
129 htmlesc() is very similar to CGI.pm's escapeHTML. However, there are a
130 few differences. htmlesc() changes an undefined value to an empty
131 string, whereas escapeHTML() returns undefs as undefs.
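
The undef handling is the part worth illustrating. The sketch below is an assumption-laden stand-in, not the module's code: sketch_htmlesc is a hypothetical name, and the set of escaped characters (&, ", <, >) is assumed rather than taken from this documentation.

```perl
use strict;
use warnings;

# Hypothetical helper: escapes a value for literal display in HTML.
sub sketch_htmlesc {
    my ($val) = @_;
    return '' unless defined $val;   # undef becomes an empty string
    $val =~ s/&/&amp;/g;             # ampersand first, so later entities are not re-escaped
    $val =~ s/"/&quot;/g;
    $val =~ s/</&lt;/g;
    $val =~ s/>/&gt;/g;
    return $val;
}

print sketch_htmlesc('<b>Tom & "Jerry"</b>'), "\n";
# &lt;b&gt;Tom &amp; &quot;Jerry&quot;&lt;/b&gt;
```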
132
133 cellfill(string)
134 Formats a string for literal output in an HTML table cell. Works just
135 like htmlesc() except that strings with no content (i.e. undef or
136 only whitespace) are returned as "&nbsp;".
137
138 jsquote($string)
139 Escapes and quotes a string for use in JavaScript. Escapes single
140 quotes and surrounds the string in single quotes. Returns the modified
141 string.
142
143 unquote(string)
144 If the given string starts and ends with quotes, removes them.
145 Recognizes single quotes and double quotes. The value must begin and
146 end with the same type of quote or nothing is done to the value. Undef
147 input results in undef output. Some examples and what they return:
148
149 unquote(q|'Hendrix'|); # Hendrix
150 unquote(q|"Hendrix"|); # Hendrix
151 unquote(q|Hendrix|); # Hendrix
152 unquote(q|"Hendrix'|); # "Hendrix'
153 unquote(q|O'Sullivan|); # O'Sullivan
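
The matching-quote rule can be expressed with a regex backreference. This is a sketch of the documented behavior only, without the braces option; sketch_unquote is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: strips one pair of matching surrounding quotes.
sub sketch_unquote {
    my ($val) = @_;
    return undef unless defined $val;
    # \1 forces the closing quote to be the same character as the opening one
    $val =~ s/^(["'])(.*)\1\z/$2/s;
    return $val;
}

print sketch_unquote(q|'Hendrix'|), "\n";    # Hendrix
print sketch_unquote(q|"Hendrix'|), "\n";    # "Hendrix'
print sketch_unquote(q|O'Sullivan|), "\n";   # O'Sullivan
```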
154
155 option: braces
156
157 If the braces option is true, surrounding braces such as [], {}, and ()
158 are also removed. Some examples:
159
160 unquote(q|[Janis]|, braces=>1); # Janis
161 unquote(q|{Janis}|, braces=>1); # Janis
162 unquote(q|(Janis)|, braces=>1); # Janis
163
164 define(scalar)
165 Takes a single value as input. If the value is defined, it is returned
166 unchanged. If it is not defined, an empty string is returned.
167
168 This subroutine is useful for printing when an undef should simply be
169 represented as an empty string. Perl already treats undefs as empty
170 strings in string context, but this subroutine makes the warnings
171 module <http://perldoc.perl.org/warnings.html> go away. And you ARE
172 using warnings, right?
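
A small sketch of the idea, for illustration: sketch_define is a hypothetical stand-in, not the module's code. On Perl 5.10 and later the defined-or operator gives the same result inline ($val // '').

```perl
use strict;
use warnings;

# Hypothetical helper: returns the value, or '' when it is undef.
sub sketch_define {
    my ($val) = @_;
    return defined $val ? $val : '';
}

my %row = (name => 'Miko', phone => undef);

# Printing $row{phone} directly would emit an "uninitialized value" warning.
print 'Name: ',  sketch_define($row{name}),  "\n";   # Name: Miko
print 'Phone: ', sketch_define($row{phone}), "\n";   # Phone:
```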
173
174 repeat($string, $count)
175 Returns the given string repeated the given number of times. The
176 following command outputs "Fred" three times:
177
178 print repeat('Fred', 3), "\n";
179
180 Note that repeat() was created a long time ago, based on a
181 misunderstanding of how the Perl operator 'x' works. The following
182 command using 'x' performs exactly the same as the above command:
183
184 print 'Fred' x 3, "\n";
185
186 Use whichever you prefer.
187
188 randword(length, %options)
189 Returns a random string of characters. String will not contain any
190 vowels (to avoid distracting dirty words). First argument is the length
191 of the return string. So this code:
192
193 foreach my $idx (1..3) {
194 print randword(4), "\n";
195 }
196
197 would output something like this:
198
199 kBGV
200 NCWB
201 3tHJ
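
One way to picture the vowel-free behavior is to filter the character pool before sampling from it. The sketch below is only an illustration: sketch_randword is a hypothetical name, and the pool (ASCII letters and digits) is an assumption based on the sample output above.

```perl
use strict;
use warnings;

# Hypothetical helper: random string drawn from letters and digits,
# with vowels removed from the pool.
sub sketch_randword {
    my ($length) = @_;
    my @chars = grep { !/[aeiou]/i } ('a' .. 'z', 'A' .. 'Z', '0' .. '9');
    return join '', map { $chars[ int rand @chars ] } 1 .. $length;
}

foreach my $idx (1 .. 3) {
    print sketch_randword(4), "\n";
}
```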
202
203 If the string 'dictionary' is sent instead of an integer, then a word
204 is randomly selected from a dictionary file. By default, the
205 dictionary file is assumed to be at /usr/share/dict/words and the shuf
206 command is used to pull out a word. The hash %String::Util::PATHS sets
207 the paths to the dictionary file and the shuf executable. Modify that
208 hash to change the paths. So this code:
209
210 foreach my $idx (1..3) {
211 print randword('dictionary'), "\n";
212 }
213
214 would output something like this:
215
216 mustache
217 fronds
218 browning
219
220 option: alpha
221
222 If the alpha option is true, only alphabetic characters are returned,
223 no numerals. For example, this code:
224
225 foreach my $idx (1..3) {
226 print randword(4, alpha=>1), "\n";
227 }
228
229 would output something like this:
230
231 qrML
232 wmWf
233 QGvF
234
235 option: numerals
236
237 If the numerals option is true, only numerals are returned, no
238 alphabetic characters. So this code:
239
240 foreach my $idx (1..3) {
241 print randword(4, numerals=>1), "\n";
242 }
243
244 would output something like this:
245
246 3981
247 4734
248 2657
249
250 option: strip_vowels
251
252 This option is true by default. If true, vowels are not included in
253 the returned random string. So this code:
254
255 foreach my $idx (1..3) {
256 print randword(4, strip_vowels=>1), "\n";
257 }
258
259 would output something like this:
260
261 Sk3v
262 pV5z
263 XhSX
264
265 eqq($val1, $val2)
266 Returns true if the two given values are equal. Also returns true if
267 both are undef. If only one is undef, or if they are both defined but
268 different, returns false. Here are some examples and what they return.
269
270 print eqq('x', 'x'), "\n"; # 1
271 print eqq('x', undef), "\n"; # 0
272 print eqq(undef, undef), "\n"; # 1
273
274 neqq($str1, $str2)
275 The opposite of eqq, returns true if the two values are *not* the
276 same. Here are some examples and what they return.
277
278 print neqq('x', 'x'), "\n"; # 0
279 print neqq('x', undef), "\n"; # 1
280 print neqq(undef, undef), "\n"; # 0
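
The undef-aware comparison described for eqq() and neqq() can be sketched as follows. The helper names are hypothetical, and the use of string comparison (eq) for defined values is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: equality where two undefs count as equal.
sub sketch_eqq {
    my ($x, $y) = @_;
    return 1 if !defined $x && !defined $y;   # both undef: equal
    return 0 if !defined $x || !defined $y;   # only one undef: different
    return $x eq $y ? 1 : 0;                  # both defined: compare as strings
}

# Hypothetical helper: simple negation of the above.
sub sketch_neqq { return sketch_eqq(@_) ? 0 : 1 }

print join(' ', sketch_eqq('x', 'x'),  sketch_eqq('x', undef),  sketch_eqq(undef, undef)),  "\n";   # 1 0 1
print join(' ', sketch_neqq('x', 'x'), sketch_neqq('x', undef), sketch_neqq(undef, undef)), "\n";   # 0 1 0
```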
281
282 equndef(), neundef()
283 equndef() has been renamed to eqq(). neundef() has been renamed to
284 neqq(). Those old names have been kept as aliases.
285
286 fullchomp(string)
287 Works like chomp, but is a little more thorough about removing \n's and
288 \r's even if they aren't part of the OS's standard end-of-line.
289
290 Undefs are returned as undefs.
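
The difference from chomp can be shown with a short sketch. It is an illustration only: sketch_fullchomp is a hypothetical name, and unlike fullchomp it always returns a copy rather than modifying its argument in void context.

```perl
use strict;
use warnings;

# Hypothetical helper: strips any trailing run of \r and \n,
# whatever the platform's end-of-line convention is.
sub sketch_fullchomp {
    my ($val) = @_;
    return undef unless defined $val;
    $val =~ s/[\r\n]+\z//;
    return $val;
}

print '[', sketch_fullchomp("line\r\n"), "]\n";     # [line]
print '[', sketch_fullchomp("line\n\r\n"), "]\n";   # [line]
```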
291
292 randcrypt(string)
293 Crypts the given string, seeding the encryption with a random two
294 character seed.
295
296 randpost(%opts)
297 Returns a string that sorta looks like one or more paragraphs.
298
299 option: word_count
300
301 Sets how many words should be in the post. By default a random number
302 from 1 to 250 is used.
303
304 option: par_odds
305
306 Sets the odds of starting a new paragraph after any given word. By
307 default the value is .05, which means paragraphs will have an average
308 of about twenty words.
309
310 option: par
311
312 Sets the string to put at the end, or at the start and end, of a
313 paragraph. Defaults to two newlines for the end of a paragraph.
314
315 If this option is a single scalar, that string is added to the end of
316 each paragraph.
317
318 To set both the start and end string, use an array reference. The
319 first element should be the string to put at the start of a paragraph,
320 the second should be the string to put at the end of a paragraph.
321
322 option: max_length
323
324 Sets the maximum length of the returned string, including paragraph
325 delimiters.
326
327 ords($string)
328 Returns the given string represented as the ASCII value of each
329 character.
330
331 For example, this code:
332
333 ords('Hendrix')
334
335 returns this string:
336
337 {72}{101}{110}{100}{114}{105}{120}
338
339 options
340
341 · convert_spaces=>[true|false]
342
343 If convert_spaces is true (which is the default) then spaces are
344 converted to their matching ord values. So, for example, this code:
345
346 ords('a b', convert_spaces=>1)
347
348 returns this:
349
350 {97}{32}{98}
351
352 This code returns the same thing:
353
354 ords('a b')
355
356 If convert_spaces is false, then spaces are just returned as
357 spaces. So this code:
358
359 ords('a b', convert_spaces=>0);
360
361 returns
362
363 {97} {98}
364
365 · alpha_nums
366
367 If the alpha_nums option is false, then characters 0-9, a-z, and
368 A-Z are not converted. For example, this code:
369
370 ords('a=b', alpha_nums=>0)
371
372 returns this:
373
374 a{61}b
375
376 deords($string)
377 Takes the output from ords() and returns the string that originally
378 created that output.
379
380 For example, this command:
381
382 deords('{72}{101}{110}{100}{114}{105}{120}')
383
384 returns this string:
385 Hendrix
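
The round trip between ords() and deords() can be sketched in plain Perl. This is an illustration of the {N} format shown above, not the module's code; both helper names are hypothetical and the options are not covered.

```perl
use strict;
use warnings;

# Hypothetical helper: every character becomes {ordinal}.
sub sketch_ords {
    my ($str) = @_;
    return join '', map { '{' . ord($_) . '}' } split //, $str;
}

# Hypothetical helper: every {ordinal} becomes its character again.
sub sketch_deords {
    my ($str) = @_;
    $str =~ s/\{(\d+)\}/chr($1)/ge;
    return $str;
}

my $encoded = sketch_ords('Hendrix');
print $encoded, "\n";                  # {72}{101}{110}{100}{114}{105}{120}
print sketch_deords($encoded), "\n";   # Hendrix
```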
386
387 crunchlines($str)
388 Compacts contiguous newlines into single newlines. Whitespace between
389 newlines is ignored, so that two newlines separated by whitespace are
390 compacted down to a single newline.
391
392 For example, this code:
393
394 crunchlines("x\n\n\nx")
395
396 outputs two x's separated by a single newline:
397
398 x
399 x
401
402 spacepad
404 Copyright (c) 2012-2016 by Miko O'Sullivan. All rights reserved. This
405 program is free software; you can redistribute it and/or modify it
406 under the same terms as Perl itself. This software comes with NO
407 WARRANTY of any kind.
408
410 Miko O'Sullivan miko@idocs.com
411
413 Version 0.10, December 1, 2005
414 Initial release
415
416 Version 0.11, December 22, 2005
417 This is a non-backwards compatible version.
418
419 urldecode, urlencode were removed entirely. All of the subs that
420 used to modify values in place were changed so that they do not do
421 so anymore, except for fullchomp.
422
423 See
424 http://www.xray.mpe.mpg.de/mailing-lists/modules/2005-12/msg00112.html
425 for why these changes were made.
426
427 Version 1.01, November 7, 2010
428 Decided it was time to upload five years' worth of changes.
429
430 Version 1.20, July, 2012
431 Properly listing prerequisites.
432
433 Version 1.21, July 18, 2012
434 Fixed error in POD.
435
436 Version 1.22, July 20, 2012
437 Fix in documentation for randpost().
438
439 Clarified documentation for hascontent() and nocontent().
440
441 Version 1.23, Sep 1, 2012
442 Fixed error in META.yml.
443
444 Version 1.24, December 31, 2014
445 Cleaned up POD formatting.
446
447 Changed file to use Unixish style newlines. I hadn't realized
448 until now that it was using Windowish newlines. How embarrassing.
449
450 Added some features to ords().
451
452 Version 1.25, January 4, 2015
453 Added parentheses to braces option for unquote. Cleaned up and
454 added to POD. Minor fixes to comments.
455
456 Renamed equndef to eqq, and neundef to neqq. However, the old names
457 have been kept as aliases.
458
459 Minor cleanup of formatting.
460
461 Version 1.26, Aug 29, 2016
462 Fixed tests. No significant changes to module.
463
464
465
466perl v5.30.1 2020-01-30 String::Util(3)