1IntSpan(3)            User Contributed Perl Documentation           IntSpan(3)
2
3
4

NAME

6       Set::IntSpan - Manages sets of integers
7

SYNOPSIS

9         use Set::IntSpan qw(grep_set map_set grep_spans map_spans);
10
11         $Set::IntSpan::Empty_String = $string;
12
13         $set    = new   Set::IntSpan $set_spec;
14         $set    = new   Set::IntSpan @set_specs;
15         $valid  = valid Set::IntSpan $run_list;
16         $set    = copy  $set $set_spec;
17
18         $run_list = run_list $set;
19         @elements = elements $set;
20         @sets     = sets     $set;
21         @spans    = spans    $set;
22
23         $u_set = union      $set $set_spec;
24         $i_set = intersect  $set $set_spec;
25         $x_set = xor        $set $set_spec;
26         $d_set = diff       $set $set_spec;
27         $c_set = complement $set;
28
29         $set->U($set_spec);   # Union
30         $set->I($set_spec);   # Intersect
31         $set->X($set_spec);   # Xor
32         $set->D($set_spec);   # Diff
33         $set->C;              # Complement
34
35         equal      $set $set_spec
36         equivalent $set $set_spec
37         superset   $set $set_spec
38         subset     $set $set_spec
39
40         $n = cardinality $set;
41         $n = size        $set;
42
43         empty      $set
44         finite     $set
45         neg_inf    $set
46         pos_inf    $set
47         infinite   $set
48         universal  $set
49
50         member     $set $n;
51         insert     $set $n;
52         remove     $set $n;
53
54         $min = min $set;
55         $max = max $set;
56
57         $holes   = holes $set;
58         $cover   = cover $set;
59         $inset   = inset $set $n;
60         $smaller = trim  $set $n;
61         $bigger  = pad   $set $n;
62
63         $subset  = grep_set   { ... } $set;
64         $mapset  = map_set    { ... } $set;
65
66         $subset  = grep_spans { ... } $set;
67         $mapset  = map_spans  { ... } $set;
68
69         for ($element=$set->first; defined $element; $element=$set->next) { ... }
70         for ($element=$set->last ; defined $element; $element=$set->prev) { ... }
71
72         $element = $set->start($n);
73         $element = $set->current;
74
75         $n       = $set->at($i);
76         $slice   = $set->slice($from, $to);
77
78       Operator overloads
79
80         $u_set =  $set + $set_spec;   # union
81         $i_set =  $set * $set_spec;   # intersect
82         $x_set =  $set ^ $set_spec;   # xor
83         $d_set =  $set - $set_spec;   # diff
84         $c_set = ~$set;               # complement
85
86         $set += $set_spec;            # union
87         $set *= $set_spec;            # intersect
88         $set ^= $set_spec;            # xor
89         $set -= $set_spec;            # diff
90
91         $set eq $set_spec             # equal
92         $set ne $set_spec             # not equal
93         $set le $set_spec             # subset
94         $set lt $set_spec             # proper subset
95         $set ge $set_spec             # superset
96         $set gt $set_spec             # proper superset
97
98         # compare sets by cardinality
99         $set1 ==  $set2
100         $set1 !=  $set2
101         $set1 <=  $set2
102         $set1 <   $set2
103         $set1 >=  $set2
104         $set1 >   $set2
105         $set1 <=> $set2
106
107         # compare cardinality of set to an integer
108         $set1 ==  $n
109         $set1 !=  $n
110         $set1 <=  $n
111         $set1 <   $n
112         $set1 >=  $n
113         $set1 >   $n
114         $set1 <=> $n
115
116         @sorted = sort @sets;         # sort sets by cardinality
117
118         if ($set) { ... }             # true if $set is not empty
119
120         print "$set\n";               # stringizes to the run list
121

EXPORTS

123       @EXPORT
124
125       Nothing
126
127       @EXPORT_OK
128
129       "grep_set", "map_set", "grep_spans", "map_spans"
130

DESCRIPTION

132       "Set::IntSpan" manages sets of integers.  It is optimized for sets that
133       have long runs of consecutive integers.  These arise, for example, in
134       .newsrc files, which maintain lists of articles:
135
136         alt.foo: 1-21,28,31
137         alt.bar: 1-14192,14194,14196-14221
138
139       A run of consecutive integers is sometimes called a span.
140
141       Sets are stored internally in a run-length coded form.  This provides
142       for both compact storage and efficient computation.  In particular, set
143       operations can be performed directly on the encoded representation.
144
145       "Set::IntSpan" is designed to manage finite sets.  However, it can also
146       represent some simple infinite sets, such as { x ⎪ x>n }.  This allows
147       operations involving complements to be carried out consistently, with‐
148       out having to worry about the actual value of INT_MAX on your machine.
149

SPANS

151       A span is a run of consecutive integers.  A span may be represented by
152       an array reference, in any of 5 forms:
153
154       Finite forms
155
156           Span                Set
157         [ $n,    $n    ]      { n }
158         [ $a,    $b    ]      { x ⎪ a<=x && x<=b}
159
160       Infinite forms
161
162           Span                Set
163         [ undef, $b    ]      { x ⎪ x<=b }
164         [ $a   , undef ]      { x ⎪ x>=a }
165         [ undef, undef ]      The set of all integers
166
167       Some methods operate directly on spans.
168

SET SPECIFICATIONS

170       Many of the methods take a set specification.  There are four kinds of
171       set specifications.
172
173       Empty
174
175       If a set specification is omitted, then the empty set is assumed.
176       Thus,
177
178         $set = new Set::IntSpan;
179
180       creates a new, empty set.  Similarly,
181
182         copy $set;
183
184       removes all elements from $set.
185
186       Object reference
187
188       If an object reference is given, it is taken to be a "Set::IntSpan"
189       object.
190
191       Run list
192
193       If a string is given, it is taken to be a run list.  A run list speci‐
194       fies a set using a syntax similar to that in newsrc files.
195
196       A run list is a comma-separated list of runs.  Each run specifies a set
197       of consecutive integers.  The set is the union of all the runs.
198
199       Runs may be written in any of 5 forms.
200
201       Finite forms
202
203       n       { n }
204
205       a-b     { x ⎪ a<=x && x<=b }
206
207       Infinite forms
208
209       (-n     { x ⎪ x<=n }
210
211       n-)     { x ⎪ x>=n }
212
213       (-)     The set of all integers
214
215       Empty forms
216
217       The empty set is consistently written as '' (the null string).  It is
218       also denoted by the special form '-' (a single dash).
219
220       Restrictions
221
222       The runs in a run list must be disjoint, and must be listed in increas‐
223       ing order.
224
225       Valid characters in a run list are 0-9, '(', ')', '-' and ','.  White
226       space and underscore (_) are ignored.  Other characters are not
227       allowed.
228
229       Examples
230
231         Run list          Set
232         "-"               { }
233         "1"               { 1 }
234         "1-2"             { 1, 2 }
235         "-5--1"           { -5, -4, -3, -2, -1 }
236         "(-)"             the integers
237         "(--1"            the negative integers
238         "1-3, 4, 18-21"   { 1, 2, 3, 4, 18, 19, 20, 21 }
239
240       Array reference
241
242       If an array reference is given, then the elements of the array specify
243       the elements of the set.  The array may contain
244
245       ·   integers
246
247       ·   spans
248
249       The set is the union of all the integers and spans in the array.  The
250       integers and spans need not be disjoint.  The integers and spans may be
251       in any order.
252
253       Examples
254
255         Array ref                         Set
256         [ ]                               { }
257         [ 1, 1 ]                          { 1 }
258         [ 1, 3, 2 ]                       { 1, 2, 3 }
259         [ 1, [ 5, 8 ], 5, [ 7, 9 ], 2 ]   { 1, 2, 5, 6, 7, 8, 9 }
260         [ undef, undef ]                  the integers
261         [ undef, -1 ]                     the negative integers
262

ITERATORS

264       Each set has a single iterator, which is shared by all calls to
265       "first", "last", "start", "next", "prev", and "current".  At all times,
266       the iterator is either an element of the set, or "undef".
267
268       "first", "last", and "start" set the iterator; "next", and "prev" move
269       it; and "current" returns it.  Calls to these methods may be freely
270       intermixed.
271
272       Using "next" and "prev", a single loop can move both forwards and back‐
273       wards through a set.  Using "start", a loop can iterate over portions
274       of an infinite set.
275

METHODS

277       Creation
278
279       $set = "new" "Set::IntSpan" $set_spec
280       $set = "new" "Set::IntSpan" @set_specs
281           Creates and returns a "Set::IntSpan" object.
282
283           The initial contents of the set are given by $set_spec, or by the
284           union of all the @set_specs.
285
286       $ok = "valid" "Set::IntSpan" $run_list
287           Returns true if $run_list is a valid run list.  Otherwise, returns
288           false and leaves an error message in $@.
289
290       $set = "copy" $set $set_spec
291           Copies $set_spec into $set.  The previous contents of $set are
292           lost.  For convenience, "copy" returns $set.
293
294       $run_list = "run_list" $set
295           Returns a run list that represents $set.  The run list will not
296           contain white space.  $set is not affected.
297
298           By default, the empty set is formatted as '-'; a different string
299           may be specified in $Set::IntSpan::Empty_String.
300
301       @elements = "elements" $set
302           Returns an array containing the elements of $set.  The elements
303           will be sorted in numerical order.  In scalar context, returns an
304           array reference.  $set is not affected.
305
306       @sets = "sets" $set
307           Returns the runs in $set, as a list of "Set::IntSpan" objects.  The
308           sets in the list are in order.
309
310       @spans = "spans" $set
311           Returns the runs in $set, as a list of the form
312
313             ([$a1, $b1],
314              [$a2, $b2],
315              ...
316              [$aN, $bN])
317
318           If a run contains only a single integer, then the upper and lower
319           bounds of the corresponding span will be equal.
320
321           If the set has no lower bound, then $a1 will be "undef".  Simi‐
322           larly, if the set has no upper bound, then $bN will be "undef".
323
324           The runs in the list are in order.
325
326       Set operations
327
328       For these operations, a new "Set::IntSpan" object is created and
329       returned.  The operands are not affected.
330
331       $u_set = "union" $set $set_spec
332           Returns the set of integers in either $set or $set_spec.
333
334       $i_set = "intersect" $set $set_spec
335           Returns the set of integers in both $set and $set_spec.
336
337       $x_set = "xor" $set $set_spec
338           Returns the set of integers in $set or $set_spec, but not both.
339
340       $d_set = "diff" $set $set_spec
341           Returns the set of integers in $set but not in $set_spec.
342
343       $c_set = "complement" $set
344           Returns the set of integers that are not in $set.
345
346       Mutators
347
348       By popular demand, "Set::IntSpan" now has mutating forms of the binary
349       set operations.  These methods alter the object on which they are
350       called.
351
352       $set->"U"($set_spec)
353           Makes $set the union of $set and $set_spec.  Returns $set.
354
355       $set->"I"($set_spec)
356           Makes $set the intersection of $set and $set_spec.  Returns $set.
357
358       $set->"X"($set_spec)
359           Makes $set the symmetric difference of $set and $set_spec.  Returns
360           $set.
361
362       $set->"D"($set_spec)
363           Makes $set the difference of $set and $set_spec.  Returns $set.
364
365       $set->"C"
366           Converts $set to its own complement.  Returns $set.
367
368       Comparison
369
370       "equal" $set $set_spec
371           Returns true iff $set and $set_spec contain the same elements.
372
373       "equivalent" $set $set_spec
374           Returns true iff $set and $set_spec contain the same number of ele‐
375           ments.  All infinite sets are equivalent.
376
377       "superset" $set $set_spec
378           Returns true iff $set is a superset of $set_spec.
379
380       "subset" $set $set_spec
381           Returns true iff $set is a subset of $set_spec.
382
383       Cardinality
384
385       $n = "cardinality" $set
386       $n = "size" $set
387           Returns the number of elements in $set.  Returns -1 for infinite
388           sets.  "size" is provided as an alias for "cardinality".
389
390       "empty" $set
391           Returns true iff $set is empty.
392
393       "finite" $set
394           Returns true iff $set is finite.
395
396       "neg_inf" $set
397           Returns true iff $set contains {x ⎪ x<n} for some n.
398
399       "pos_inf" $set
400           Returns true iff $set contains {x ⎪ x>n} for some n.
401
402       "infinite" $set
403           Returns true iff $set is infinite.
404
405       "universal" $set
406           Returns true iff $set contains all integers.
407
408       Membership
409
410       "member" $set $n
411           Returns true iff the integer $n is a member of $set.
412
413       "insert" $set $n
414           Inserts the integer $n into $set.  Does nothing if $n is already a
415           member of $set.
416
417       "remove" $set $n
418           Removes the integer $n from $set.  Does nothing if $n is not a mem‐
419           ber of $set.
420
421       Extrema
422
423       "min" $set
424           Returns the smallest element of $set, or "undef" if there is none.
425
426       "max" $set
427           Returns the largest element of $set, or "undef" if there is none.
428
429       Spans
430
431       $holes = "holes" $set
432           Returns a set containing all the holes in $set, that is, all the
433           integers that are in-between spans of $set.
434
435           "holes" is always a finite set.
436
437       $cover = "cover" $set
438           Returns a set consisting of a single span from $set->"min" to
439           $set->"max". This is the same as
440
441             union $set $set->holes
442
443       $inset = "inset" $set $n
444       $smaller = "trim" $set $n
445       $bigger = "pad" $set $n
446           "inset" returns a set constructed by removing $n integers from each
447           end of each span of $set. If $n is negative, then -$n integers are
448           added to each end of each span.
449
450           In the first case, spans may vanish from the set; in the second
451           case, holes may vanish.
452
453           "trim" is provided as a synonym for "inset".
454
455           "pad" $set $n is the same as "inset" $set -$n.
456
457       Iterators
458
459       $set->"first"
460           Sets the iterator for $set to the smallest element of $set.  If
461           there is no smallest element, sets the iterator to "undef".
462           Returns the iterator.
463
464       $set->"last"
465           Sets the iterator for $set to the largest element of $set.  If
466           there is no largest element, sets the iterator to "undef".  Returns
467           the iterator.
468
469       $set->"start"($n)
470           Sets the iterator for $set to $n.  If $n is not an element of $set,
471           sets the iterator to "undef".  Returns the iterator.
472
473       $set->"next"
474           Sets the iterator for $set to the next element of $set.  If there
475           is no next element, sets the iterator to "undef".  Returns the
476           iterator.
477
478           "next" will return "undef" only once; the next call to "next" will
479           reset the iterator to the smallest element of $set.
480
481       $set->"prev"
482           Sets the iterator for $set to the previous element of $set.  If
483           there is no previous element, sets the iterator to "undef".
484           Returns the iterator.
485
486           "prev" will return "undef" only once; the next call to "prev" will
487           reset the iterator to the largest element of $set.
488
489       $set->"current"
490           Returns the iterator for $set.
491
492       Indexing
493
494       The elements of a set are kept in numerical order.  These methods index
495       into the set based on this ordering.
496
497       $n = $set->"at"($i)
498           Returns the $ith element of $set, or "undef" if there is no $ith
499           element.  Negative indices count backwards from the end of the set.
500
501           Dies if
502
503           *   $i is non-negative and $set is "neg_inf"
504
505           *   $i is negative and $set is "pos_inf"
506
507       $slice = $set->"slice"($from, $to)
508           Returns a "Set::IntSpan" object containing the elements of $set at
509           indices $from..$to.  Negative indices count backwards from the end
510           of the set.
511
512           Dies if
513
514           *   $from is non-negative and $set is "neg_inf"
515
516           *   $from is negative and $set is "pos_inf"
517

OPERATOR OVERLOADS

519       For convenience, some operators are overloaded on "Set::IntSpan"
520       objects.
521
522       set operations
523
524       One operand must be a "Set::IntSpan" object.  The other operand may be
525       a "Set::IntSpan" object or a set specification.
526
527         $u_set =  $set + $set_spec;   # union
528         $i_set =  $set * $set_spec;   # intersect
529         $x_set =  $set ^ $set_spec;   # xor
530         $d_set =  $set - $set_spec;   # diff
531         $c_set = ~$set;               # complement
532
533         $set += $set_spec;            # union
534         $set *= $set_spec;            # intersect
535         $set ^= $set_spec;            # xor
536         $set -= $set_spec;            # diff
537
538       equality
539
540       The string comparison operations are overloaded to compare sets for
541       equality and containment.  One operand must be a "Set::IntSpan" object.
542       The other operand may be a "Set::IntSpan" object or a set specifica‐
543       tion.
544
545         $set eq $set_spec             # equal
546         $set ne $set_spec             # not equal
547         $set le $set_spec             # subset
548         $set lt $set_spec             # proper subset
549         $set ge $set_spec             # superset
550         $set gt $set_spec             # proper superset
551
552       equivalence
553
554       The numerical comparison operations are overloaded to compare sets by
555       cardinality.  One operand must be a "Set::IntSpan" object.  The other
556       operand may be a "Set::IntSpan" object or an integer.
557
558         $set1 ==  $set2
559         $set1 !=  $set2
560         $set1 <=  $set2
561         $set1 <   $set2
562         $set1 >=  $set2
563         $set1 >   $set2
564         $set1 <=> $set2
565         $set1 cmp $set2
566
567         $set1 ==  $n
568         $set1 !=  $n
569         $set1 <=  $n
570         $set1 <   $n
571         $set1 >=  $n
572         $set1 >   $n
573         $set1 <=> $n
574         $set1 cmp $n
575
576       N.B. The "cmp" operator is overloaded to compare sets by cardinality,
577       not containment.  This is done so that
578
579         sort @sets
580
581       will sort a list of sets by cardinality.
582
583       conversion
584
585       In boolean context, a $Set::IntSpan object evaluates to true if it is
586       not empty.
587
588       A $Set::IntSpan object stringizes to its run list.
589

FUNCTIONS

591       $sub_set = "grep_set" { ... } $set
592           Evaluates the BLOCK for each integer in $set (locally setting $_ to
593           each integer) and returns a "Set::IntSpan" object containing those
594           integers for which the BLOCK returns TRUE.
595
596           Returns "undef" if $set is infinite.
597
598       $map_set = "map_set" { ... } $set
599           Evaluates the BLOCK for each integer in $set (locally setting $_ to
600           each integer) and returns a "Set::IntSpan" object containing all
601           the integers returned as results of all those evaluations.
602
603           Evaluates the BLOCK in list context, so each element of $set may
604           produce zero, one, or more elements in the returned set.  The ele‐
605           ments may be returned in any order, and need not be disjoint.
606
607           Returns "undef" if $set is infinite.
608
609       $sub_set = "grep_spans" { ... } $set
610           Evaluates the BLOCK for each span in $set and returns a
611           "Set::IntSpan" object containing those spans for which the BLOCK
612           returns TRUE.
613
614           Within BLOCK, $_ is locally set to an array ref of the form
615
616             [ $lower, $upper ]
617
618           where $lower and $upper are the bounds of the span.  If the span
619           contains only one integer, then $lower and $upper will be equal.
620           If the span is unbounded, then the corresponding element(s) of the
621           array will be "undef".
622
623       $map_set = "map_spans" { ... } $set
624           Evaluates the BLOCK for each span in $set, and returns a
625           "Set::IntSpan" object consisting of the union of all the spans
626           returned as results of all those evaluations.
627
628           Within BLOCK, $_ is locally set to an array ref of the form
629
630             [ $lower, $upper ]
631
632           as described above for "grep_spans".  Each evaluation of BLOCK must
633           return a list of spans.  Each returned list may contain zero, one,
634           or more spans.  Spans may be returned in any order, and need not be
635           disjoint.  However, for each bounded span, the constraint
636
637             $lower <= $upper
638
639           must hold.
640

CLASS VARIABLES

642       $Set::IntSpan::Empty_String
643           $Set::IntSpan::Empty_String contains the string that is returned
644           when "run_list" is called on the empty set.  $Empty_String is ini‐
645           tially '-'; alternatively, it may be set to ''.  Other values
646           should be avoided, to ensure that "run_list" always returns a valid
647           run list.
648
649           "run_list" accesses $Empty_String through a reference stored in
650           $set->{"empty_string"}.  Subclasses that wish to override the value
651           of $Empty_String can reassign this reference.
652

DIAGNOSTICS

654       Any method (except "valid") will "die" if it is passed an invalid run
655       list.
656
657       "Set::IntSpan::_copy_run_list: Bad syntax:" $runList
658           (F) $run_list has bad syntax
659
660       "Set::IntSpan::_copy_run_list: Bad order:" $runList
661           (F) $run_list has overlapping runs or runs that are out of order.
662
663       "Set::IntSpan::elements: infinite set"
664           (F) An infinite set was passed to "elements".
665
666       "Set::IntSpan::at: negative infinite set"
667           (F) "at" was called with a non-negative index on a negative infi‐
668           nite set.
669
670       "Set::IntSpan::at: positive infinite set"
671           (F) "at" was called with a negative index on a positive infinite
672           set.
673
674       "Set::IntSpan::slice: negative infinite set"
675           (F) "slice" was called with $from non-negative on a negative infi‐
676           nite set.
677
678       "Set::IntSpan::slice: positive infinite set"
679           (F) "slice" was called with $from negative on a positive infinite
680           set.
681
682       Out of memory!
683           (X) "elements" $set can generate an "Out of memory!"  message on
684           sufficiently large finite sets.
685

NOTES

687       Traps
688
689       Beware of forms like
690
691         union $set [1..5];
692
693       This passes an element of @set to union, which is probably not what you
694       want.  To force interpretation of $set and [1..5] as separate argu‐
695       ments, use forms like
696
697           union $set +[1..5];
698
699       or
700
701           $set->union([1..5]);
702
703       grep_set and map_set
704
705       "grep_set" and "map_set" make it easy to construct sets for which the
706       internal representation used by "Set::IntSpan" is not small. Consider:
707
708         $billion = new Set::IntSpan '0-1_000_000_000';   # OK
709         $odd     = grep_set { $_ & 1 } $billion;         # trouble
710         $even    = map_set  { $_ * 2 } $billion;         # double trouble
711
712       Error handling
713
714       There are two common approaches to error handling: exceptions and
715       return codes.  There seems to be some religion on the topic, so
716       "Set::IntSpan" provides support for both.
717
718       To catch exceptions, protect method calls with an eval:
719
720           $run_list = <STDIN>;
721           eval { $set = new Set::IntSpan $run_list };
722           $@ and print "$@: try again\n";
723
724       To check return codes, use an appropriate method call to validate argu‐
725       ments:
726
727           $run_list = <STDIN>;
728           if (valid Set::IntSpan $run_list)
729              { $set = new Set::IntSpan $run_list }
730           else
731              { print "$@ try again\n" }
732
733       Similarly, use "finite" to protect calls to "elements":
734
735           finite $set and @elements = elements $set;
736
737       Calling "elements" on a large, finite set can generate an "Out of mem‐
738       ory!" message, which cannot (easily) be trapped.  Applications that
739       must retain control after an error can use "intersect" to protect calls
740       to "elements":
741
742           @elements = elements { intersect $set "-1_000_000 - 1_000_000" };
743
744       or check the size of $set first:
745
746           finite $set and cardinality $set < 2_000_000 and @elements = elements $set;
747
748       Limitations
749
750       Although "Set::IntSpan" can represent some infinite sets, it does not
751       perform infinite-precision arithmetic.  Therefore, finite elements are
752       restricted to the range of integers on your machine.
753
754       Extensions
755
756       Users report that you can construct Set::IntSpan objects on anything
757       that behaves like an integer. For example:
758
759           $x   = new Math::BigInt ...;
760           $set = new Set::Intspan [ [ $x, $x+5 ] ];
761
762       I'm not documenting this as supported behavior, because I don't have
763       the resources to test it, but I'll try not to break it.  If anyone
764       finds problems with it, let me know.
765
766       Roots
767
768       The sets implemented here are based on a Macintosh data structure
769       called a region. See Inside Macintosh for more information.
770
771       "Set::IntSpan" was originally written to manage run lists for the
772       "News::Newsrc" module.
773

AUTHOR

775       Steven McDougall <swmcd@world.std.com>
776

ACKNOWLEDGMENTS

778       ·   Malcolm Cook <mec@stowers-institute.org>
779
780       ·   David Hawthorne <dsrthorne@hotmail.com>
781
782       ·   Martin Krzywinski <martink@bcgsc.ca>
783
784       ·   Marc Lehmann <schmorp@schmorp.de>
785
787       Copyright (c) 1996-2007 by Steven McDougall. This module is free soft‐
788       ware; you can redistribute it and/or modify it under the same terms as
789       Perl itself.
790
791
792
793perl v5.8.8                       2007-10-27                        IntSpan(3)
Impressum