Search::Elasticsearch::Client::6_0::Scroll(3)  User Contributed Perl Documentation  Search::Elasticsearch::Client::6_0::Scroll(3)

NAME

       Search::Elasticsearch::Client::6_0::Scroll - A helper module for
       scrolled searches

VERSION

       version 6.00

SYNOPSIS

           use feature 'say';    # needed for say() below
           use Search::Elasticsearch;

           my $es     = Search::Elasticsearch->new;

           my $scroll = $es->scroll_helper(
               index => 'my_index',
               body  => {
                   query   => {...},
                   size    => 1000,
                   sort    => '_doc'
               }
           );

           say "Total hits: ". $scroll->total;

           while (my $doc = $scroll->next) {
               # do something
           }

DESCRIPTION

       A scrolled search is a search that allows you to keep pulling results
       until there are no more matching results, much like a cursor in an SQL
       database.

       Unlike paginating through results (with the "from" parameter in
       search()), scrolled searches take a snapshot of the current state of
       the index. Even if you keep adding new documents to the index or
       updating existing documents, a scrolled search will only see the index
       as it was when the search began.

       This module is a helper utility that wraps the functionality of the
       search() and scroll() methods to make them easier to use.
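
       As a rough illustration of what the helper saves you from writing, the
       sketch below drives search() and scroll() by hand.  The function name
       scroll_all_by_hand() is hypothetical, $es is assumed to be a
       Search::Elasticsearch client, and the response layout ("_scroll_id",
       "hits.hits") is assumed to follow the standard search response.  It is
       not the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every hit by calling search() and scroll()
# directly. $es is assumed to be a Search::Elasticsearch client object.
sub scroll_all_by_hand {
    my ( $es, %search_params ) = @_;
    my $keep_alive = '1m';

    # The initial search opens the scroll and returns the first batch.
    my $response = $es->search( scroll => $keep_alive, %search_params );
    my @docs;

    # Keep pulling batches until one comes back empty.
    while ( my @hits = @{ $response->{hits}{hits} || [] } ) {
        push @docs, @hits;

        # Each scroll() call returns the next batch and renews the expiry.
        $response = $es->scroll(
            scroll => $keep_alive,
            body   => { scroll_id => $response->{_scroll_id} },
        );
    }

    # Release the server-side scroll context; failures are ignored here.
    eval {
        $es->clear_scroll( body => { scroll_id => $response->{_scroll_id} } );
        1;
    };

    return @docs;
}
```

       It could be called as scroll_all_by_hand($es, index => 'my_index').
       Unlike the helper, this sketch holds every document in memory at once.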

       This class does Search::Elasticsearch::Client::6_0::Role::Scroll and
       Search::Elasticsearch::Role::Is_Sync.

USE CASES

       There are two primary use cases:

   Pulling enough results
       Perhaps you want to group your results by some field, and you don't
       know exactly how many results you will need in order to return 10
       grouped results.  With a scrolled search you can keep pulling more
       results until you have enough.  For instance, you can search emails in
       a mailing list, and return results grouped by "thread_id":

           my (%groups,@results);

           my $scroll = $es->scroll_helper(
               index => 'my_emails',
               type  => 'email',
               body  => { query => {... some query ... }}
           );

           my $doc;
           while (@results < 10 and $doc = $scroll->next) {

               my $thread = $doc->{_source}{thread_id};

               # First time this thread is seen: start a new group
               unless ($groups{$thread}) {
                   $groups{$thread} = [];
                   push @results, $groups{$thread};
               }
               push @{$groups{$thread}},$doc;

           }

   Extracting all documents
       Often you will want to extract all (or a subset of) documents in an
       index.  If you want to change your type mappings, you will need to
       reindex all of your data. Or perhaps you want to move a subset of the
       data in one index into a new dedicated index. In these cases, you don't
       care about sort order; you just want to retrieve all documents that
       match a query and do something with them. For instance, to retrieve
       all the docs for a particular "client_id":

           my $scroll = $es->scroll_helper(
               index       => 'my_index',
               size        => 1000,
               body        => {
                   query => {
                       match => {
                           client_id => 123
                       }
                   },
                   sort => '_doc'
               }
           );

           while (my $doc = $scroll->next) {
               # do something
           }

       Very often, what you will want to do with these results is bulk-index
       them into a new index. The easiest way to do this is to use the reindex
       functionality built into Elasticsearch, which is available as
       "reindex()" in Search::Elasticsearch::Client::6_0::Direct.
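
       A minimal sketch of such a reindex() call follows.  The function name
       copy_client_docs() and the index names used in the usage note are
       hypothetical, and the "source"/"dest" body layout is assumed to match
       the Elasticsearch reindex API.

```perl
use strict;
use warnings;

# Hypothetical helper: copy all docs for one client_id into another index
# using the server-side reindex API instead of scrolling on the client.
# $es is assumed to be a Search::Elasticsearch client object.
sub copy_client_docs {
    my ( $es, $client_id, $source_index, $dest_index ) = @_;

    return $es->reindex(
        body => {
            source => {
                index => $source_index,
                query => { match => { client_id => $client_id } },
            },
            dest => { index => $dest_index },
        }
    );
}
```

       For example, copy_client_docs($es, 123, 'my_index', 'client_123')
       would ask the cluster to copy the matching documents itself, so no
       documents pass through the Perl process.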

METHODS

   "new()"
           use Search::Elasticsearch;

           my $es = Search::Elasticsearch->new(...);
           my $scroll = $es->scroll_helper(
               scroll         => '1m',            # optional
               scroll_in_qs   => 0|1,             # optional
               %search_params
           );

       The "scroll_helper()" method in
       Search::Elasticsearch::Client::6_0::Direct loads the
       Search::Elasticsearch::Client::6_0::Scroll class and calls "new()",
       passing in any arguments.

       You can specify a "scroll" duration (which defaults to "1m") and
       "scroll_in_qs" (which defaults to "false"). Any other parameters are
       passed directly to "search()" in
       Search::Elasticsearch::Client::6_0::Direct.

       The "scroll" duration tells Elasticsearch how long it should keep the
       scroll alive.  Note: this duration doesn't need to be long enough to
       process all results, just long enough to process a single batch of
       results.  The expiry gets renewed for another "scroll" period every
       time a new batch of results is retrieved from the cluster.

       By default, the "scroll_id" is passed as the "body" to the scroll
       request.  To send it in the query string instead, set "scroll_in_qs" to
       a true value, but be aware: when querying very many indices, the scroll
       ID can become too long for intervening proxies.

       The "scroll" request uses "GET" by default.  To use "POST" instead, set
       "send_get_body_as" to "POST".

   "next()"
           $doc  = $scroll->next;
           @docs = $scroll->next($num);

       The "next()" method returns the next result, or the next $num results
       (pulling more results if required).  If all results have been
       exhausted, it returns an empty list.
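
       The list form is convenient for processing results in fixed-size
       chunks.  The sketch below assumes only the behaviour described above;
       the function name for_each_chunk() and the callback argument are
       hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: hand documents to $callback in chunks of at most
# $chunk_size, and return the total number of documents processed.
# $scroll is assumed to be a scroll helper object.
sub for_each_chunk {
    my ( $scroll, $chunk_size, $callback ) = @_;
    my $count = 0;

    # next($num) returns an empty list once all results are exhausted,
    # which ends the loop. The final chunk may be smaller than $chunk_size.
    while ( my @docs = $scroll->next($chunk_size) ) {
        $callback->(@docs);
        $count += @docs;
    }
    return $count;
}
```

       For example, for_each_chunk($scroll, 100, sub { ... }) would invoke
       the callback with up to 100 documents at a time.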

   "drain_buffer()"
           @docs = $scroll->drain_buffer;

       The "drain_buffer()" method returns all of the documents currently in
       the buffer, without fetching any more from the cluster.

   "refill_buffer()"
           $total = $scroll->refill_buffer;

       The "refill_buffer()" method fetches the next batch of results from the
       cluster, stores them in the buffer, and returns the total number of
       docs currently in the buffer.

   "buffer_size()"
           $total = $scroll->buffer_size;

       The "buffer_size()" method returns the total number of docs currently
       in the buffer.
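
       Taken together, these three methods allow documents to be processed
       one buffer-load at a time.  The following is a sketch under the
       semantics described above; the function name process_by_batch() and
       the callback are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: pass each buffer-load of documents to $callback and
# return the number of buffer-loads processed.
# $scroll is assumed to be a scroll helper object.
sub process_by_batch {
    my ( $scroll, $callback ) = @_;
    my $batches = 0;

    # Use whatever is already buffered; only when the buffer is empty ask
    # the cluster for more. Stop when a refill leaves the buffer empty.
    while ( $scroll->buffer_size || $scroll->refill_buffer ) {
        my @docs = $scroll->drain_buffer;
        $callback->(@docs);
        $batches++;
    }
    return $batches;
}
```

       Compared with calling "next()" once per document, this keeps each
       callback invocation aligned with the batches fetched from the cluster.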

   "finish()"
           $scroll->finish;

       The "finish()" method clears out the buffer, sets "is_finished()" to
       "true" and tries to clear the "scroll_id" on Elasticsearch.  The
       clear-scroll API has only been supported since v0.90.6, but the call
       to "clear_scroll" is wrapped in an "eval", so the "finish()" method
       can be safely called with any version of Elasticsearch.

       When the $scroll instance goes out of scope, "finish()" is called
       automatically if required.
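
       Calling "finish()" explicitly can be useful when you stop reading
       early but the $scroll object stays in scope for a while.  A minimal
       sketch, in which the function name first_matching() and the predicate
       callback are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: return the first document accepted by $predicate,
# then release the scroll without reading the remaining results.
# $scroll is assumed to be a scroll helper object.
sub first_matching {
    my ( $scroll, $predicate ) = @_;
    my $found;

    while ( my $doc = $scroll->next ) {
        next unless $predicate->($doc);
        $found = $doc;
        last;
    }

    # Clear the buffer and the server-side scroll now, rather than waiting
    # for $scroll to go out of scope.
    $scroll->finish unless $scroll->is_finished;
    return $found;
}
```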

   "is_finished()"
           $bool = $scroll->is_finished;

       A flag which returns "true" if all results have been processed or
       "finish()" has been called.

INFO ACCESSORS

       The information from the original search is returned via the following
       accessors:

   "total"
       The total number of documents that matched your query.

   "max_score"
       The maximum score of any document that matched your query.

   "aggregations"
       Any aggregations that were specified, or "undef".

   "facets"
       Any facets that were specified, or "undef".

   "suggest"
       Any suggestions that were specified, or "undef".

   "took"
       How long the original search took, in milliseconds.

   "took_total"
       How long the original search plus all subsequent batches took, in
       milliseconds.
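
       These accessors can be combined into a short progress or log line.
       The sketch below is illustrative only: the function name
       scroll_summary() and the output format are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build a one-line summary from the info accessors.
# $scroll is assumed to be a scroll helper object.
sub scroll_summary {
    my ($scroll) = @_;

    # took covers only the original search; took_total also includes
    # every batch fetched since then.
    return sprintf 'matched=%d first_search_ms=%d all_batches_ms=%d',
        $scroll->total, $scroll->took, $scroll->took_total;
}
```

       Calling it again after more batches have been fetched would be
       expected to report a larger "took_total" while "total" and "took"
       stay the same.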

SEE ALSO

       ·   "search()" in Search::Elasticsearch::Client::6_0::Direct

       ·   "scroll()" in Search::Elasticsearch::Client::6_0::Direct

       ·   "reindex()" in Search::Elasticsearch::Client::6_0::Direct

AUTHOR

       Clinton Gormley <drtech@cpan.org>

COPYRIGHT AND LICENSE

       This software is Copyright (c) 2017 by Elasticsearch BV.

       This is free software, licensed under:

         The Apache License, Version 2.0, January 2004


perl v5.30.0                      2019-07-26      Search::Elasticsearch::Client::6_0::Scroll(3)