Data::Stream::Bulk(3) User Contributed Perl Documentation Data::Stream::Bulk(3)

NAME

       Data::Stream::Bulk - N at a time iteration API

VERSION

       version 0.11

SYNOPSIS

           # get a bulk stream from somewhere
           my $s = Data::Stream::Bulk::Foo->new( ... );

           # can be used like this:
           until ( $s->is_done ) {
               foreach my $item ( $s->items ) {
                   process($item);
               }
           }

           # or like this:
           while ( my $block = $s->next ) {
               foreach my $item ( @$block ) {
                   process($item);
               }
           }

DESCRIPTION

       This module tries to find middle ground between one at a time and all
       at once processing of data sets.

       The purpose of this module is to avoid the overhead of implementing an
       iterative API when one isn't necessary, without breaking forward
       compatibility in case it becomes necessary later on.

       The API is optimized for the case where a data set typically fits in
       memory and is returned as an array, but the consumer cannot assume that
       the data set is bounded.

       The API is destructive in order to minimize the chance that result sets
       are leaked due to improper usage.

API

   Required Methods
       The API requires two methods to be implemented:

       is_done
           Should return true if the stream is exhausted.

           As long as this method returns a false value (not done) "next"
           could potentially return another block.

       next
           Returns the next block.

           Note that "next" is not guaranteed to return an array reference,
           even if "is_done" returned false prior to calling it.

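       The two required methods can be sketched with a small hand-written
       class. This is only an illustration of the contract described above,
       not code from the distribution: the class name My::CountingStream and
       its "limit" and "block_size" arguments are hypothetical. The consumer
       loop checks the value returned by "next" before dereferencing it,
       since a false "is_done" does not guarantee a block.

```perl
use strict;
use warnings;

# Hypothetical stream class (not part of the distribution): yields the
# integers 1..limit in blocks of at most block_size items.
package My::CountingStream;

sub new {
    my ( $class, %args ) = @_;
    return bless {
        next_value => 1,
        limit      => $args{limit},
        block_size => $args{block_size},
    }, $class;
}

# True once every value has been handed out.
sub is_done {
    my $self = shift;
    return $self->{next_value} > $self->{limit};
}

# Returns the next block as an array reference, or nothing when exhausted.
sub next {
    my $self = shift;
    return if $self->is_done;

    my $first = $self->{next_value};
    my $last  = $first + $self->{block_size} - 1;
    $last = $self->{limit} if $last > $self->{limit};

    $self->{next_value} = $last + 1;
    return [ $first .. $last ];
}

package main;

my $stream = My::CountingStream->new( limit => 7, block_size => 3 );

my @seen;
until ( $stream->is_done ) {
    # "next" may return a false value even though is_done was false,
    # so check the result before dereferencing it.
    my $block = $stream->next or next;
    push @seen, @$block;
}

print "@seen\n";    # prints "1 2 3 4 5 6 7"
```

       Because the stream is consumed as it is read, a second pass over
       $stream would yield nothing; a fresh object is needed to iterate
       again.
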
   Convenience Methods
       items
           This method calls "next" and dereferences the result if there are
           pending items.

       all
           Force evaluation of the entire result set.

           Note that for large data sets this might cause swap thrashing or
           various other undesired effects. Use with caution.

       cat @streams
           Concatenates this stream with @streams, returning a single stream.

       list_cat @tail
           Returns a possibly cleaned up list of streams.

           Used by "cat".

           Overridden by Data::Stream::Bulk::Array, Data::Stream::Bulk::Cat
           and Data::Stream::Bulk::Nil to implement some simple short
           circuiting.

       filter $filter
           Applies a per-block filter to the stream.

           Returns a possibly new stream with the filtering layered.

           $filter is invoked once per block and should return an array
           reference to the filtered block.

       chunk $chunk_size
           Chunks the input stream so that each block returned by "next" will
           have at least $chunk_size items.

       loaded
           Should be overridden to return true if all the items are already
           realized (e.g. in the case of Data::Stream::Bulk::Array).

           Returns false by default.

           When true, calling "all" is supposed to be safe (memory usage should
           be of the same order of magnitude as the stream's own usage).

           This is typically useful when transforming an array is easier than
           transforming a stream (e.g. optional duplicate filtering).

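       The semantics of "all", "filter" and "chunk" can be illustrated with
       plain functions over any object that provides "is_done" and "next".
       This is a rough sketch under stated assumptions, not the module's
       implementation: My::BlockStream, drain_all, filtered_next and
       chunked_next are hypothetical names, and the sketch assumes that the
       final chunk may hold fewer than $chunk_size items when the stream runs
       out.

```perl
use strict;
use warnings;

# Hypothetical in-memory stream used only for this illustration: it hands
# out pre-built blocks one at a time.
package My::BlockStream;

sub new     { my ( $class, @blocks ) = @_; return bless { blocks => [@blocks] }, $class }
sub is_done { my $self = shift; return !@{ $self->{blocks} } }
sub next    { my $self = shift; return shift @{ $self->{blocks} } }

package main;

# In the spirit of "all": drain the stream and return every item as a list.
sub drain_all {
    my ($stream) = @_;
    my @items;
    until ( $stream->is_done ) {
        my $block = $stream->next or next;
        push @items, @$block;
    }
    return @items;
}

# In the spirit of "filter": fetch one block and pass it through $filter,
# which receives the block and returns an array reference.
sub filtered_next {
    my ( $stream, $filter ) = @_;
    my $block = $stream->next or return;
    return $filter->($block);
}

# In the spirit of "chunk": accumulate blocks until at least $chunk_size
# items are buffered or the stream is exhausted.
sub chunked_next {
    my ( $stream, $chunk_size ) = @_;
    my @buffer;
    until ( @buffer >= $chunk_size or $stream->is_done ) {
        my $block = $stream->next or next;
        push @buffer, @$block;
    }
    return unless @buffer;
    return \@buffer;
}

my $s = My::BlockStream->new( [ 1, 2 ], [3], [ 4, 5, 6 ], [7] );

# Two small blocks are merged to reach the requested size.
my $first = chunked_next( $s, 3 );

# The next block, [4, 5, 6], is reduced to its even members.
my $evens = filtered_next( $s, sub { [ grep { $_ % 2 == 0 } @{ $_[0] } ] } );

# Whatever is left is realized in one go.
my @rest = drain_all($s);

print "chunk: @$first\n";    # prints "chunk: 1 2 3"
print "evens: @$evens\n";    # prints "evens: 4 6"
print "rest: @rest\n";       # prints "rest: 7"
```

       Each helper consumes the stream it is given, which mirrors the
       destructive nature of the API: items handed out once are not seen
       again by later calls.
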

CLASSES

       Data::Stream::Bulk::Array
           This class is not a stream at all, but just one block. When the
           data set easily fits in memory this class can be used, while
           retaining forward compatibility with larger data sets.

       Data::Stream::Bulk::Callback
           Callback driven iteration.

       Data::Stream::Bulk::Chunked
           Wrapper to return larger blocks from an existing stream.

       Data::Stream::Bulk::DBI
           Bulk fetching of data from DBI statement handles.

       Data::Stream::Bulk::DBIC
           DBIx::Class::ResultSet iteration.

       Data::Stream::Bulk::FileHandle
           Iterates over lines in a text file.

       Data::Stream::Bulk::Nil
           An empty result set.

       Data::Stream::Bulk::Cat
           A concatenation of several streams.

       Data::Stream::Bulk::Filter
           A filter wrapping a stream.

SEE ALSO

       HOP::Stream, Iterator, Class::Iterator etc. for one by one iteration

       DBI, DBIx::Class::ResultSet

       POE::Filter

       Data::Page

       Parallel::Iterator

       <http://en.wikipedia.org/wiki/MapReduce>, LISP, and all that other kool
       aid

TODO

       Sorted streams
           Add a hint for sorted streams (like "loaded" but as an attribute in
           the base role).

           Introduce a "merge" operation for merging of sorted streams.

           Optimize "unique" to make use of sorting hints for constant space
           uniquing.

       More utility functions
           To assist in processing and creating streams.

       Coercion tables
           Moose::Util::TypeConstraints

VERSION CONTROL

       This module is maintained using git. You can get the latest version
       from <http://github.com/nothingmuch/data-stream-bulk/>.

AUTHOR

       Yuval Kogman <nothingmuch@woobling.org>

COPYRIGHT AND LICENSE

       This software is copyright (c) 2012 by Yuval Kogman.

       This is free software; you can redistribute it and/or modify it under
       the same terms as the Perl 5 programming language system itself.

perl v5.38.0                      2023-07-20             Data::Stream::Bulk(3)