1IntSpan(3) User Contributed Perl Documentation IntSpan(3)
2
3
4
6 Set::IntSpan - Manages sets of integers
7
9 use Set::IntSpan qw(grep_set map_set grep_spans map_spans);
10
11 $Set::IntSpan::Empty_String = $string;
12
13 $set = new Set::IntSpan $set_spec;
14 $set = new Set::IntSpan @set_specs;
15 $valid = valid Set::IntSpan $run_list;
16 $set = copy $set $set_spec;
17
18 $run_list = run_list $set;
19 @elements = elements $set;
20 @sets = sets $set;
21 @spans = spans $set;
22
23 $u_set = union $set $set_spec;
24 $i_set = intersect $set $set_spec;
25 $x_set = xor $set $set_spec;
26 $d_set = diff $set $set_spec;
27 $c_set = complement $set;
28
29 $set->U($set_spec); # Union
30 $set->I($set_spec); # Intersect
31 $set->X($set_spec); # Xor
32 $set->D($set_spec); # Diff
33 $set->C; # Complement
34
35 equal $set $set_spec
36 equivalent $set $set_spec
37 superset $set $set_spec
38 subset $set $set_spec
39
40 $n = cardinality $set;
41 $n = size $set;
42
43 empty $set
44 finite $set
45 neg_inf $set
46 pos_inf $set
47 infinite $set
48 universal $set
49
50 member $set $n;
51 insert $set $n;
52 remove $set $n;
53
54 $min = min $set;
55 $max = max $set;
56
57 $holes = holes $set;
58 $cover = cover $set;
59 $inset = inset $set $n;
60 $smaller = trim $set $n;
61 $bigger = pad $set $n;
62
63 $subset = grep_set { ... } $set;
64 $mapset = map_set { ... } $set;
65
66 $subset = grep_spans { ... } $set;
67 $mapset = map_spans { ... } $set;
68
69 for ($element=$set->first; defined $element; $element=$set->next) { ... }
70 for ($element=$set->last ; defined $element; $element=$set->prev) { ... }
71
72 $element = $set->start($n);
73 $element = $set->current;
74
75 $n = $set->at($i);
76 $slice = $set->slice($from, $to);
77 $i = $set->ord($n);
78 $i = $set->span_ord($n);
79
80 Operator overloads
81 $u_set = $set + $set_spec; # union
82 $i_set = $set * $set_spec; # intersect
83 $x_set = $set ^ $set_spec; # xor
84 $d_set = $set - $set_spec; # diff
85 $c_set = ~$set; # complement
86
87 $set += $set_spec; # union
88 $set *= $set_spec; # intersect
89 $set ^= $set_spec; # xor
90 $set -= $set_spec; # diff
91
92 $set eq $set_spec # equal
93 $set ne $set_spec # not equal
94 $set le $set_spec # subset
95 $set lt $set_spec # proper subset
96 $set ge $set_spec # superset
97 $set gt $set_spec # proper superset
98
99 # compare sets by cardinality
100 $set1 == $set2
101 $set1 != $set2
102 $set1 <= $set2
103 $set1 < $set2
104 $set1 >= $set2
105 $set1 > $set2
106 $set1 <=> $set2
107
108 # compare cardinality of set to an integer
109 $set1 == $n
110 $set1 != $n
111 $set1 <= $n
112 $set1 < $n
113 $set1 >= $n
114 $set1 > $n
115 $set1 <=> $n
116
117 @sorted = sort @sets; # sort sets by cardinality
118
119 if ($set) { ... } # true if $set is not empty
120
121 print "$set\n"; # stringizes to the run list
122
124 @EXPORT
125 Nothing
126
127 @EXPORT_OK
128 "grep_set", "map_set", "grep_spans", "map_spans"
129
131 "Set::IntSpan" manages sets of integers. It is optimized for sets that
132 have long runs of consecutive integers. These arise, for example, in
133 .newsrc files, which maintain lists of articles:
134
135 alt.foo: 1-21,28,31
136 alt.bar: 1-14192,14194,14196-14221
137
138 A run of consecutive integers is sometimes called a span.
139
140 Sets are stored internally in a run-length coded form. This provides
141 for both compact storage and efficient computation. In particular, set
142 operations can be performed directly on the encoded representation.
143
144 "Set::IntSpan" is designed to manage finite sets. However, it can also
145 represent some simple infinite sets, such as { x | x>n }. This allows
146 operations involving complements to be carried out consistently,
147 without having to worry about the actual value of INT_MAX on your
148 machine.
149
151 A span is a run of consecutive integers. A span may be represented by
152 an array reference, in any of 5 forms:
153
154 Finite forms
155 Span Set
156 [ $n, $n ] { n }
157 [ $a, $b ] { x | a<=x && x<=b}
158
159 Infinite forms
160 Span Set
161 [ undef, $b ] { x | x<=b }
162 [ $a , undef ] { x | x>=a }
163 [ undef, undef ] The set of all integers
164
165 Some methods operate directly on spans.
166
168 Many of the methods take a set specification. There are four kinds of
169 set specifications.
170
171 Empty
172 If a set specification is omitted, then the empty set is assumed.
173 Thus,
174
175 $set = new Set::IntSpan;
176
177 creates a new, empty set. Similarly,
178
179 copy $set;
180
181 removes all elements from $set.
182
183 Object reference
184 If an object reference is given, it is taken to be a "Set::IntSpan"
185 object.
186
187 Run list
188 If a string is given, it is taken to be a run list. A run list
189 specifies a set using a syntax similar to that in newsrc files.
190
191 A run list is a comma-separated list of runs. Each run specifies a set
192 of consecutive integers. The set is the union of all the runs.
193
194 Runs may be written in any of 5 forms.
195
196 Finite forms
197 n { n }
198
199 a-b { x | a<=x && x<=b }
200
201 Infinite forms
202 (-n { x | x<=n }
203
204 n-) { x | x>=n }
205
206 (-) The set of all integers
207
208 Empty forms
209 The empty set is consistently written as '' (the null string). It is
210 also denoted by the special form '-' (a single dash).
211
212 Restrictions
213 The runs in a run list must be disjoint, and must be listed in
214 increasing order.
215
216 Valid characters in a run list are 0-9, '(', ')', '-' and ','. White
217 space and underscore (_) are ignored. Other characters are not
218 allowed.
219
220 Examples
221 Run list Set
222 "-" { }
223 "1" { 1 }
224 "1-2" { 1, 2 }
225 "-5--1" { -5, -4, -3, -2, -1 }
226 "(-)" the integers
227 "(--1" the negative integers
228 "1-3, 4, 18-21" { 1, 2, 3, 4, 18, 19, 20, 21 }
229
230 Array reference
231 If an array reference is given, then the elements of the array specify
232 the elements of the set. The array may contain
233
234 · integers
235
236 · spans
237
238 The set is the union of all the integers and spans in the array. The
239 integers and spans need not be disjoint. The integers and spans may be
240 in any order.
241
242 Examples
243 Array ref Set
244 [ ] { }
245 [ 1, 1 ] { 1 }
246 [ 1, 3, 2 ] { 1, 2, 3 }
247 [ 1, [ 5, 8 ], 5, [ 7, 9 ], 2 ] { 1, 2, 5, 6, 7, 8, 9 }
248 [ undef, undef ] the integers
249 [ undef, -1 ] the negative integers
250
252 Each set has a single iterator, which is shared by all calls to
253 "first", "last", "start", "next", "prev", and "current". At all times,
254 the iterator is either an element of the set, or "undef".
255
256 "first", "last", and "start" set the iterator; "next", and "prev" move
257 it; and "current" returns it. Calls to these methods may be freely
258 intermixed.
259
260 Using "next" and "prev", a single loop can move both forwards and
261 backwards through a set. Using "start", a loop can iterate over
262 portions of an infinite set.
263
265 Creation
266 $set = "new" "Set::IntSpan" $set_spec
267 $set = "new" "Set::IntSpan" @set_specs
268 Creates and returns a "Set::IntSpan" object.
269
270 The initial contents of the set are given by $set_spec, or by the
271 union of all the @set_specs.
272
273 $ok = "valid" "Set::IntSpan" $run_list
274 Returns true if $run_list is a valid run list. Otherwise, returns
275 false and leaves an error message in $@.
276
277 $set = "copy" $set $set_spec
278 Copies $set_spec into $set. The previous contents of $set are
279 lost. For convenience, "copy" returns $set.
280
281 $run_list = "run_list" $set
282 Returns a run list that represents $set. The run list will not
283 contain white space. $set is not affected.
284
285 By default, the empty set is formatted as '-'; a different string
286 may be specified in $Set::IntSpan::Empty_String.
287
288 @elements = "elements" $set
289 Returns an array containing the elements of $set. The elements
290 will be sorted in numerical order. In scalar context, returns an
291 array reference. $set is not affected.
292
293 @sets = "sets" $set
294 Returns the runs in $set, as a list of "Set::IntSpan" objects. The
295 sets in the list are in order.
296
297 @spans = "spans" $set
298 Returns the runs in $set, as a list of the form
299
300 ([$a1, $b1],
301 [$a2, $b2],
302 ...
303 [$aN, $bN])
304
305 If a run contains only a single integer, then the upper and lower
306 bounds of the corresponding span will be equal.
307
308 If the set has no lower bound, then $a1 will be "undef".
309 Similarly, if the set has no upper bound, then $bN will be "undef".
310
311 The runs in the list are in order.
312
313 Set operations
314 For these operations, a new "Set::IntSpan" object is created and
315 returned. The operands are not affected.
316
317 $u_set = "union" $set $set_spec
318 Returns the set of integers in either $set or $set_spec.
319
320 $i_set = "intersect" $set $set_spec
321 Returns the set of integers in both $set and $set_spec.
322
323 $x_set = "xor" $set $set_spec
324 Returns the set of integers in $set or $set_spec, but not both.
325
326 $d_set = "diff" $set $set_spec
327 Returns the set of integers in $set but not in $set_spec.
328
329 $c_set = "complement" $set
330 Returns the set of integers that are not in $set.
331
332 Mutators
333 By popular demand, "Set::IntSpan" now has mutating forms of the binary
334 set operations. These methods alter the object on which they are
335 called.
336
337 $set->"U"($set_spec)
338 Makes $set the union of $set and $set_spec. Returns $set.
339
340 $set->"I"($set_spec)
341 Makes $set the intersection of $set and $set_spec. Returns $set.
342
343 $set->"X"($set_spec)
344 Makes $set the symmetric difference of $set and $set_spec. Returns
345 $set.
346
347 $set->"D"($set_spec)
348 Makes $set the difference of $set and $set_spec. Returns $set.
349
350 $set->"C"
351 Converts $set to its own complement. Returns $set.
352
353 Comparison
354 "equal" $set $set_spec
355 Returns true iff $set and $set_spec contain the same elements.
356
357 "equivalent" $set $set_spec
358 Returns true iff $set and $set_spec contain the same number of
359 elements. All infinite sets are equivalent.
360
361 "superset" $set $set_spec
362 Returns true iff $set is a superset of $set_spec.
363
364 "subset" $set $set_spec
365 Returns true iff $set is a subset of $set_spec.
366
367 Cardinality
368 $n = "cardinality" $set
369 $n = "size" $set
370 Returns the number of elements in $set. Returns -1 for infinite
371 sets. "size" is provided as an alias for "cardinality".
372
373 "empty" $set
374 Returns true iff $set is empty.
375
376 "finite" $set
377 Returns true iff $set is finite.
378
379 "neg_inf" $set
380 Returns true iff $set contains {x | x<n} for some n.
381
382 "pos_inf" $set
383 Returns true iff $set contains {x | x>n} for some n.
384
385 "infinite" $set
386 Returns true iff $set is infinite.
387
388 "universal" $set
389 Returns true iff $set contains all integers.
390
391 Membership
392 "member" $set $n
393 Returns true iff the integer $n is a member of $set.
394
395 "insert" $set $n
396 Inserts the integer $n into $set. Does nothing if $n is already a
397 member of $set.
398
399 "remove" $set $n
400 Removes the integer $n from $set. Does nothing if $n is not a
401 member of $set.
402
403 Extrema
404 "min" $set
405 Returns the smallest element of $set, or "undef" if there is none.
406
407 "max" $set
408 Returns the largest element of $set, or "undef" if there is none.
409
410 Spans
411 $holes = "holes" $set
412 Returns a set containing all the holes in $set, that is, all the
413 integers that are in-between spans of $set.
414
415 "holes" is always a finite set.
416
417 $cover = "cover" $set
418 Returns a set consisting of a single span from $set->"min" to
419 $set->"max". This is the same as
420
421 union $set $set->holes
422
423 $inset = "inset" $set $n
424 $smaller = "trim" $set $n
425 $bigger = "pad" $set $n
426 "inset" returns a set constructed by removing $n integers from each
427 end of each span of $set. If $n is negative, then -$n integers are
428 added to each end of each span.
429
430 In the first case, spans may vanish from the set; in the second
431 case, holes may vanish.
432
433 "trim" is provided as a synonym for "inset".
434
435 "pad" $set $n is the same as "inset" $set -$n.
436
437 Iterators
438 $set->"first"
439 Sets the iterator for $set to the smallest element of $set. If
440 there is no smallest element, sets the iterator to "undef".
441 Returns the iterator.
442
443 $set->"last"
444 Sets the iterator for $set to the largest element of $set. If
445 there is no largest element, sets the iterator to "undef". Returns
446 the iterator.
447
448 $set->"start"($n)
449 Sets the iterator for $set to $n. If $n is not an element of $set,
450 sets the iterator to "undef". Returns the iterator.
451
452 $set->"next"
453 Sets the iterator for $set to the next element of $set. If there
454 is no next element, sets the iterator to "undef". Returns the
455 iterator.
456
457 "next" will return "undef" only once; the next call to "next" will
458 reset the iterator to the smallest element of $set.
459
460 $set->"prev"
461 Sets the iterator for $set to the previous element of $set. If
462 there is no previous element, sets the iterator to "undef".
463 Returns the iterator.
464
465 "prev" will return "undef" only once; the next call to "prev" will
466 reset the iterator to the largest element of $set.
467
468 $set->"current"
469 Returns the iterator for $set.
470
471 Indexing
472 The elements of a set are kept in numerical order. These methods index
473 into the set based on this ordering.
474
475 $n = $set->"at"($i)
476 Returns the $ith element of $set, or "undef" if there is no $ith
477 element. Negative indices count backwards from the end of the set.
478
479 Dies if
480
481 · $i is non-negative and $set is "neg_inf"
482
483 · $i is negative and $set is "pos_inf"
484
485 $slice = $set->"slice"($from, $to)
486 Returns a "Set::IntSpan" object containing the elements of $set at
487 indices $from..$to. Negative indices count backwards from the end
488 of the set.
489
490 Dies if
491
492 · $from is non-negative and $set is "neg_inf"
493
494 · $from is negative and $set is "pos_inf"
495
496 $i = $set->"ord"($n)
497 The inverse of "at".
498
499 Returns the index $i of the integer $n in $set, or "undef" if $n if
500 not an element of $set.
501
502 Dies if $set is "neg_inf".
503
504 $i = $set->"span_ord"($n)
505 Returns the index $i of the span containing the integer $n, or
506 "undef" if $n if not an element of $set.
507
508 To recover the span containing $n, write
509
510 ($set->spans)[$i]
511
513 For convenience, some operators are overloaded on "Set::IntSpan"
514 objects.
515
516 set operations
517 One operand must be a "Set::IntSpan" object. The other operand may be
518 a "Set::IntSpan" object or a set specification.
519
520 $u_set = $set + $set_spec; # union
521 $i_set = $set * $set_spec; # intersect
522 $x_set = $set ^ $set_spec; # xor
523 $d_set = $set - $set_spec; # diff
524 $c_set = ~$set; # complement
525
526 $set += $set_spec; # union
527 $set *= $set_spec; # intersect
528 $set ^= $set_spec; # xor
529 $set -= $set_spec; # diff
530
531 equality
532 The string comparison operations are overloaded to compare sets for
533 equality and containment. One operand must be a "Set::IntSpan" object.
534 The other operand may be a "Set::IntSpan" object or a set
535 specification.
536
537 $set eq $set_spec # equal
538 $set ne $set_spec # not equal
539 $set le $set_spec # subset
540 $set lt $set_spec # proper subset
541 $set ge $set_spec # superset
542 $set gt $set_spec # proper superset
543
544 equivalence
545 The numerical comparison operations are overloaded to compare sets by
546 cardinality. One operand must be a "Set::IntSpan" object. The other
547 operand may be a "Set::IntSpan" object or an integer.
548
549 $set1 == $set2
550 $set1 != $set2
551 $set1 <= $set2
552 $set1 < $set2
553 $set1 >= $set2
554 $set1 > $set2
555 $set1 <=> $set2
556 $set1 cmp $set2
557
558 $set1 == $n
559 $set1 != $n
560 $set1 <= $n
561 $set1 < $n
562 $set1 >= $n
563 $set1 > $n
564 $set1 <=> $n
565 $set1 cmp $n
566
567 N.B. The "cmp" operator is overloaded to compare sets by cardinality,
568 not containment. This is done so that
569
570 sort @sets
571
572 will sort a list of sets by cardinality.
573
574 conversion
575 In boolean context, a "Set::IntSpan" object evaluates to true if it is
576 not empty.
577
578 A "Set::IntSpan" object stringizes to its run list.
579
581 $sub_set = "grep_set" { ... } $set
582 Evaluates the BLOCK for each integer in $set (locally setting $_ to
583 each integer) and returns a "Set::IntSpan" object containing those
584 integers for which the BLOCK returns TRUE.
585
586 Returns "undef" if $set is infinite.
587
588 $map_set = "map_set" { ... } $set
589 Evaluates the BLOCK for each integer in $set (locally setting $_ to
590 each integer) and returns a "Set::IntSpan" object containing all
591 the integers returned as results of all those evaluations.
592
593 Evaluates the BLOCK in list context, so each element of $set may
594 produce zero, one, or more elements in the returned set. The
595 elements may be returned in any order, and need not be disjoint.
596
597 Returns "undef" if $set is infinite.
598
599 $sub_set = "grep_spans" { ... } $set
600 Evaluates the BLOCK for each span in $set and returns a
601 "Set::IntSpan" object containing those spans for which the BLOCK
602 returns TRUE.
603
604 Within BLOCK, $_ is locally set to an array ref of the form
605
606 [ $lower, $upper ]
607
608 where $lower and $upper are the bounds of the span. If the span
609 contains only one integer, then $lower and $upper will be equal.
610 If the span is unbounded, then the corresponding element(s) of the
611 array will be "undef".
612
613 $map_set = "map_spans" { ... } $set
614 Evaluates the BLOCK for each span in $set, and returns a
615 "Set::IntSpan" object consisting of the union of all the spans
616 returned as results of all those evaluations.
617
618 Within BLOCK, $_ is locally set to an array ref of the form
619
620 [ $lower, $upper ]
621
622 as described above for "grep_spans". Each evaluation of BLOCK must
623 return a list of spans. Each returned list may contain zero, one,
624 or more spans. Spans may be returned in any order, and need not be
625 disjoint. However, for each bounded span, the constraint
626
627 $lower <= $upper
628
629 must hold.
630
632 $Set::IntSpan::Empty_String
633 $Set::IntSpan::Empty_String contains the string that is returned
634 when "run_list" is called on the empty set. $Empty_String is
635 initially '-'; alternatively, it may be set to ''. Other values
636 should be avoided, to ensure that "run_list" always returns a valid
637 run list.
638
639 "run_list" accesses $Empty_String through a reference stored in
640 $set->{"empty_string"}. Subclasses that wish to override the value
641 of $Empty_String can reassign this reference.
642
644 Any method (except "valid") will "die" if it is passed an invalid run
645 list.
646
647 "Set::IntSpan::_copy_run_list: Bad syntax:" $runList
648 (F) $run_list has bad syntax
649
650 "Set::IntSpan::_copy_run_list: Bad order:" $runList
651 (F) $run_list has overlapping runs or runs that are out of order.
652
653 "Set::IntSpan::elements: infinite set"
654 (F) An infinite set was passed to "elements".
655
656 "Set::IntSpan::at: negative infinite set"
657 (F) "at" was called with a non-negative index on a negative
658 infinite set.
659
660 "Set::IntSpan::at: positive infinite set"
661 (F) "at" was called with a negative index on a positive infinite
662 set.
663
664 "Set::IntSpan::slice: negative infinite set"
665 (F) "slice" was called with $from non-negative on a negative
666 infinite set.
667
668 "Set::IntSpan::slice: positive infinite set"
669 (F) "slice" was called with $from negative on a positive infinite
670 set.
671
672 "Set::IntSpan::ord: negative infinite set"
673 (F) "ord" was called on a negative infinite set.
674
675 Out of memory!
676 (X) "elements" $set can generate an "Out of memory!" message on
677 sufficiently large finite sets.
678
680 Traps
681 Beware of forms like
682
683 union $set [1..5];
684
685 This passes an element of @set to union, which is probably not what you
686 want. To force interpretation of $set and [1..5] as separate
687 arguments, use forms like
688
689 union $set +[1..5];
690
691 or
692
693 $set->union([1..5]);
694
695 grep_set and map_set
696 "grep_set" and "map_set" make it easy to construct sets for which the
697 internal representation used by "Set::IntSpan" is not small. Consider:
698
699 $billion = new Set::IntSpan '0-1_000_000_000'; # OK
700 $odd = grep_set { $_ & 1 } $billion; # trouble
701 $even = map_set { $_ * 2 } $billion; # double trouble
702
703 Error handling
704 There are two common approaches to error handling: exceptions and
705 return codes. There seems to be some religion on the topic, so
706 "Set::IntSpan" provides support for both.
707
708 To catch exceptions, protect method calls with an eval:
709
710 $run_list = <STDIN>;
711 eval { $set = new Set::IntSpan $run_list };
712 $@ and print "$@: try again\n";
713
714 To check return codes, use an appropriate method call to validate
715 arguments:
716
717 $run_list = <STDIN>;
718 if (valid Set::IntSpan $run_list)
719 { $set = new Set::IntSpan $run_list }
720 else
721 { print "$@ try again\n" }
722
723 Similarly, use "finite" to protect calls to "elements":
724
725 finite $set and @elements = elements $set;
726
727 Calling "elements" on a large, finite set can generate an "Out of
728 memory!" message, which cannot (easily) be trapped. Applications that
729 must retain control after an error can use "intersect" to protect calls
730 to "elements":
731
732 @elements = elements { intersect $set "-1_000_000 - 1_000_000" };
733
734 or check the size of $set first:
735
736 finite $set and cardinality $set < 2_000_000 and @elements = elements $set;
737
738 Limitations
739 Although "Set::IntSpan" can represent some infinite sets, it does not
740 perform infinite-precision arithmetic. Therefore, finite elements are
741 restricted to the range of integers on your machine.
742
743 Extensions
744 Users report that you can construct Set::IntSpan objects on anything
745 that behaves like an integer. For example:
746
747 $x = new Math::BigInt ...;
748 $set = new Set::Intspan [ [ $x, $x+5 ] ];
749
750 I'm not documenting this as supported behavior, because I don't have
751 the resources to test it, but I'll try not to break it. If anyone
752 finds problems with it, let me know.
753
754 Roots
755 The sets implemented here are based on a Macintosh data structure
756 called a region. See Inside Macintosh for more information.
757
758 "Set::IntSpan" was originally written to manage run lists for the
759 "News::Newsrc" module.
760
762 Steven McDougall <swmcd@world.std.com>
763
765 · Malcolm Cook <mec@stowers-institute.org>
766
767 · David Hawthorne <dsrthorne@hotmail.com>
768
769 · Martin Krzywinski <martink@bcgsc.ca>
770
771 · Marc Lehmann <schmorp@schmorp.de>
772
773 · Andrew Olson <aolson@me.com>
774
776 Copyright (c) 1996-2010 by Steven McDougall. This module is free
777 software; you can redistribute it and/or modify it under the same terms
778 as Perl itself.
779
780
781
782perl v5.12.2 2010-11-11 IntSpan(3)