Data::Stream::Bulk(3)  User Contributed Perl Documentation  Data::Stream::Bulk(3)

NAME
       Data::Stream::Bulk - N at a time iteration API

VERSION
       version 0.11

SYNOPSIS
           # get a bulk stream from somewhere
           my $s = Data::Stream::Bulk::Foo->new( ... );

           # can be used like this:
           until ( $s->is_done ) {
               foreach my $item ( $s->items ) {
                   process($item);
               }
           }

           # or like this:
           while( my $block = $s->next ) {
               foreach my $item ( @$block ) {
                   process($item);
               }
           }

DESCRIPTION
       This module tries to find a middle ground between one-at-a-time and
       all-at-once processing of data sets.

       The purpose of this module is to avoid the overhead of implementing an
       iterative API when this isn't necessary, without breaking forward
       compatibility in case that becomes necessary later on.

       The API is optimized for the case where a data set typically fits in
       memory and is returned as an array, but the consumer cannot assume that
       the data set is bounded.

       The API is destructive (blocks are consumed as they are read) in order
       to minimize the chance that result sets are leaked due to improper
       usage.

API
   Required Methods
       The API requires two methods to be implemented:

       is_done
           Should return true if the stream is exhausted.

           As long as this method returns a false value (not done), "next"
           could potentially return another block.

       next
           Returns the next block.

           Note that "next" is not guaranteed to return an array reference,
           even if "is_done" returned false prior to calling it.
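
The contract of the two required methods can be sketched with a toy class.
Everything here is hypothetical: My::ToyStream and collect are names made up
for illustration, and the class only mirrors the method names instead of
consuming the real role. The point is the guard in the consumer loop, which
assumes "next" may return something other than an array reference while
"is_done" is still false.

```perl
use strict;
use warnings;

{
    # Hypothetical stand-in, NOT part of Data::Stream::Bulk.
    package My::ToyStream;

    sub new {
        my ( $class, %args ) = @_;
        # Copy the list of pending blocks so the caller's array is untouched.
        return bless { blocks => [ @{ $args{blocks} } ] }, $class;
    }

    # True once no pending blocks remain.
    sub is_done { my $self = shift; return !@{ $self->{blocks} } }

    # Destructive: hands out the next pending block and forgets it.
    sub next { my $self = shift; return shift @{ $self->{blocks} } }
}

# Drain a stream, skipping anything that is not an array reference.
sub collect {
    my ($stream) = @_;
    my @seen;
    until ( $stream->is_done ) {
        my $block = $stream->next;
        next unless ref($block) eq 'ARRAY';    # tolerate an empty "next"
        push @seen, @$block;
    }
    return @seen;
}

# The undef entry simulates a "next" call that yields no block.
my @items = collect( My::ToyStream->new( blocks => [ [ 1, 2 ], undef, [3] ] ) );
print "@items\n";    # prints "1 2 3"
```

Because the toy "next" shifts blocks off its internal list, a second pass over
the same object yields nothing, which is the destructive behaviour described
in the DESCRIPTION.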

   Convenience Methods
       items
           This method calls "next" and dereferences the result if there are
           pending items.

       all Force evaluation of the entire result set.

           Note that for large data sets this might cause swap thrashing or
           various other undesired effects. Use with caution.

       cat @streams
           Concatenates this stream with @streams, returning a single stream.

       list_cat @tail
           Returns a possibly cleaned up list of streams.

           Used by "cat".

           Overridden by Data::Stream::Bulk::Array, Data::Stream::Bulk::Cat
           and Data::Stream::Bulk::Nil to implement some simple short
           circuiting.

       filter $filter
           Applies a per-block filter to the stream.

           Returns a possibly new stream with the filtering layered.

           $filter is invoked once per block and should return an array
           reference to the filtered block.

       chunk $chunk_size
           Chunks the input stream so that each block returned by "next" will
           have at least $chunk_size items.

       loaded
           Should be overridden to return true if all the items are already
           realized (e.g. in the case of Data::Stream::Bulk::Array).

           Returns false by default.

           When true, calling "all" is supposed to be safe (memory usage
           should be in the same order of magnitude as the stream's own
           usage).

           This is typically useful when transforming an array is easier than
           transforming a stream (e.g. optional duplicate filtering).
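
The block-level behaviour described for "filter" and "chunk" can be
illustrated with plain closures. This is a sketch under stated assumptions,
not the module's implementation: source_from_blocks, filtered and chunked are
hypothetical helpers, a "source" is just a code reference that returns the
next block or undef when exhausted, and the final chunk is assumed to be
allowed to fall short of the requested size.

```perl
use strict;
use warnings;

# Hypothetical helper: a source is a code ref that returns the next block
# (an array reference) or undef once the blocks run out.
sub source_from_blocks {
    my @blocks = @_;
    return sub { return shift @blocks };
}

# Per-block filter: $filter receives one block and must return an array
# reference holding the filtered block.
sub filtered {
    my ( $source, $filter ) = @_;
    return sub {
        my $block = $source->();
        return unless defined $block;
        return $filter->($block);
    };
}

# Re-chunk: gather input blocks until at least $chunk_size items are
# buffered. Assumption: the last chunk may be smaller than $chunk_size.
sub chunked {
    my ( $source, $chunk_size ) = @_;
    my $exhausted = 0;
    return sub {
        return if $exhausted;
        my @buffer;
        while ( @buffer < $chunk_size ) {
            my $block = $source->();
            unless ( defined $block ) { $exhausted = 1; last }
            push @buffer, @$block;
        }
        return @buffer ? \@buffer : undef;
    };
}

# Keep odd numbers, then regroup them into blocks of at least two items.
my $next_block = chunked(
    filtered(
        source_from_blocks( [ 1, 2 ], [3], [ 4, 5, 6 ], [7] ),
        sub { [ grep { $_ % 2 } @{ $_[0] } ] },
    ),
    2,
);
while ( my $block = $next_block->() ) {
    print "@$block\n";    # prints "1 3", then "5 7"
}
```

Note that the filter sees whole blocks rather than single items, so a filter
that needs to compare neighbouring items can only do so within one block.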

CLASSES
       Data::Stream::Bulk::Array
           This class is not a stream at all, but just one block. When the
           data set easily fits in memory this class can be used, while
           retaining forward compatibility with larger data sets.

       Data::Stream::Bulk::Callback
           Callback driven iteration.

       Data::Stream::Bulk::Chunked
           Wrapper to return larger blocks from an existing stream.

       Data::Stream::Bulk::DBI
           Bulk fetching of data from DBI statement handles.

       Data::Stream::Bulk::DBIC
           DBIx::Class::ResultSet iteration.

       Data::Stream::Bulk::FileHandle
           Iterates over lines in a text file.

       Data::Stream::Bulk::Nil
           An empty result set.

       Data::Stream::Bulk::Cat
           A concatenation of several streams.

       Data::Stream::Bulk::Filter
           A filter wrapping a stream.

SEE ALSO
       HOP::Stream, Iterator, Class::Iterator etc. for one by one iteration

       DBI, DBIx::Class::ResultSet

       POE::Filter

       Data::Page

       Parallel::Iterator

       <http://en.wikipedia.org/wiki/MapReduce>, LISP, and all that other kool
       aid

TODO
       Sorted streams
           Add a hint for sorted streams (like "loaded" but as an attribute in
           the base role).

           Introduce a "merge" operation for merging of sorted streams.

           Optimize "unique" to make use of sorting hints for constant space
           uniquing.

       More utility functions
           To assist in processing and creating streams.

       Coercion tables
           Moose::Util::TypeConstraints

VERSION CONTROL
       This module is maintained using git. You can get the latest version
       from <http://github.com/nothingmuch/data-stream-bulk/>.

AUTHOR
       Yuval Kogman <nothingmuch@woobling.org>

COPYRIGHT AND LICENSE
       This software is copyright (c) 2012 by Yuval Kogman.

       This is free software; you can redistribute it and/or modify it under
       the same terms as the Perl 5 programming language system itself.

perl v5.34.0                      2022-01-21             Data::Stream::Bulk(3)