Search::Elasticsearch::Client::8_0::Scroll(3)  User Contributed Perl Documentation  Search::Elasticsearch::Client::8_0::Scroll(3)

NAME

       Search::Elasticsearch::Client::8_0::Scroll - A helper module for
       scrolled searches

VERSION

       version 8.00

SYNOPSIS

           use feature 'say';    # needed for say() below
           use Search::Elasticsearch;

           my $es     = Search::Elasticsearch->new;

           my $scroll = $es->scroll_helper(
               index => 'my_index',
               body  => {
                   query => {...},
                   size  => 1000,
                   sort  => '_doc'
               }
           );

           say "Total hits: " . $scroll->total;

           while (my $doc = $scroll->next) {
               # do something
           }

DESCRIPTION

       A scrolled search is a search that allows you to keep pulling results
       until there are no more matching results, much like a cursor in an SQL
       database.

       Unlike paginating through results (with the "from" parameter in
       search()), scrolled searches take a snapshot of the current state of
       the index. Even if you keep adding new documents to the index or
       updating existing documents, a scrolled search will only see the index
       as it was when the search began.

       This module is a helper utility that wraps the functionality of the
       search() and scroll() methods to make them easier to use.

       This class does Search::Elasticsearch::Client::8_0::Role::Scroll and
       Search::Elasticsearch::Role::Is_Sync.

USE CASES

       There are two primary use cases:

   Pulling enough results
       Perhaps you want to group your results by some field, and you don't
       know exactly how many results you will need in order to return 10
       grouped results.  With a scrolled search you can keep pulling more
       results until you have enough.  For instance, you can search emails in
       a mailing list, and return results grouped by "thread_id":

           my (%groups,@results);

           my $scroll = $es->scroll_helper(
               index => 'my_emails',
               body  => { query => {... some query ... }}
           );

           my $doc;
           while (@results < 10 and $doc = $scroll->next) {

               my $thread = $doc->{_source}{thread_id};

               unless ($groups{$thread}) {
                   $groups{$thread} = [];
                   push @results, $groups{$thread};
               }
               push @{$groups{$thread}},$doc;

           }

   Extracting all documents
       Often you will want to extract all (or a subset of) documents in an
       index.  If you want to change your type mappings, you will need to
       reindex all of your data. Or perhaps you want to move a subset of the
       data in one index into a new dedicated index. In these cases, you don't
       care about sort order, you just want to retrieve all documents which
       match a query, and do something with them. For instance, to retrieve
       all the docs for a particular "client_id":

           my $scroll = $es->scroll_helper(
               index       => 'my_index',
               size        => 1000,
               body        => {
                   query => {
                       match => {
                           client_id => 123
                       }
                   },
                   sort => '_doc'
               }
           );

           while (my $doc = $scroll->next) {
               # do something
           }

       Very often the something that you will want to do with these results
       involves bulk-indexing them into a new index. The easiest way to do
       this is to use the built-in reindex functionality provided by
       Elasticsearch, available as "reindex()" in
       Search::Elasticsearch::Client::8_0::Direct.

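       The reindex approach mentioned above might look like the following
       minimal sketch.  It assumes $es is a client created with
       Search::Elasticsearch->new; the helper name reindex_subset() and the
       index names in the usage note are hypothetical, and the "body" layout
       follows the "source"/"dest" shape of the Elasticsearch reindex API.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the documents matching $query from one index
# into another with the server-side reindex API, instead of pulling them
# through a scroll loop and bulk-indexing them by hand.
# $es is assumed to be a Search::Elasticsearch client object.
sub reindex_subset {
    my ( $es, $source_index, $dest_index, $query ) = @_;

    return $es->reindex(
        body => {
            source => { index => $source_index, query => $query },
            dest   => { index => $dest_index },
        }
    );
}
```

       For example, reindex_subset($es, 'my_index', 'client_123', { match =>
       { client_id => 123 } }) would ask the cluster to copy the documents
       for one client into a dedicated index.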

METHODS

   new()
           use Search::Elasticsearch;

           my $es = Search::Elasticsearch->new(...);
           my $scroll = $es->scroll_helper(
               scroll         => '1m',            # optional
               %search_params
           );

       The "scroll_helper()" method in
       Search::Elasticsearch::Client::8_0::Direct loads the
       Search::Elasticsearch::Client::8_0::Scroll class and calls "new()",
       passing in any arguments.

       You can specify a "scroll" duration (which defaults to "1m").  Any
       other parameters are passed directly to "search()" in
       Search::Elasticsearch::Client::8_0::Direct.

       The "scroll" duration tells Elasticsearch how long it should keep the
       scroll alive.  Note: this duration doesn't need to be long enough to
       process all results, just long enough to process a single batch of
       results.  The expiry gets renewed for another "scroll" period every
       time a new batch of results is retrieved from the cluster.

       By default, the "scroll_id" is passed as the "body" to the scroll
       request.

       The "scroll" request uses "GET" by default.  To use "POST" instead, set
       send_get_body_as to "POST".

   next()
           $doc  = $scroll->next;
           @docs = $scroll->next($num);

       The next() method returns the next result, or the next $num results
       (pulling more results if required).  If all results have been
       exhausted, it returns an empty list.

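       The list form of next() lends itself to fixed-size batches.  The
       following is a minimal sketch, not part of this module: the helper
       name process_in_batches() is hypothetical, and $scroll is assumed to
       behave as described above, returning up to $num docs per call and an
       empty list once the results are exhausted.

```perl
use strict;
use warnings;

# Hypothetical helper: pass documents to $callback in batches of up to
# $batch_size, and return the number of documents seen.
# $scroll is assumed to be a scroll helper object whose next($num)
# returns an empty list when no results are left.
sub process_in_batches {
    my ( $scroll, $batch_size, $callback ) = @_;

    my $seen = 0;

    # A list assignment in boolean context is true while docs come back.
    while ( my @docs = $scroll->next($batch_size) ) {
        $callback->( \@docs );
        $seen += @docs;
    }
    return $seen;
}
```

       The last batch may hold fewer than $batch_size documents, so the
       callback should not rely on a fixed batch length.
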
   drain_buffer()
           @docs = $scroll->drain_buffer;

       The drain_buffer() method returns all of the documents currently in the
       buffer, without fetching any more from the cluster.

   refill_buffer()
           $total = $scroll->refill_buffer;

       The refill_buffer() method fetches the next batch of results from the
       cluster, stores them in the buffer, and returns the total number of
       docs currently in the buffer.

   buffer_size()
           $total = $scroll->buffer_size;

       The buffer_size() method returns the total number of docs currently in
       the buffer.

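       The three buffer methods can be combined to work through results one
       cluster batch at a time.  This is a minimal sketch under stated
       assumptions: the helper name process_by_batch() is hypothetical, and
       refill_buffer() is assumed to return 0 once nothing more can be
       fetched, as its documented return value (the number of docs in the
       buffer) suggests.

```perl
use strict;
use warnings;

# Hypothetical helper: call $callback once per buffered batch and return
# the number of batches processed.
# Assumption: refill_buffer() returns 0 when no more results are left.
sub process_by_batch {
    my ( $scroll, $callback ) = @_;

    my $batches = 0;

    # Use whatever is already buffered first; only go back to the
    # cluster when the buffer is empty.
    while ( $scroll->buffer_size || $scroll->refill_buffer ) {
        my @docs = $scroll->drain_buffer;
        $callback->(@docs);
        $batches++;
    }
    return $batches;
}
```

       Checking buffer_size() before refill_buffer() avoids fetching a
       further batch while documents are still waiting in the buffer.
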
   finish()
           $scroll->finish;

       The finish() method clears out the buffer, sets "is_finished()" to
       "true" and tries to clear the "scroll_id" on Elasticsearch.  The
       clear-scroll API has only been supported since Elasticsearch v0.90.6,
       but the call to "clear_scroll" is wrapped in an "eval", so the finish()
       method can safely be called with any version of Elasticsearch.

       When the $scroll instance goes out of scope, "finish()" is called
       automatically if required.

   is_finished()
           $bool = $scroll->is_finished;

       A flag which returns "true" if all results have been processed or
       "finish()" has been called.

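       When only part of the result set is needed, finish() can release the
       scroll early.  The following minimal sketch assumes a scroll helper
       object with the next(), finish() and is_finished() methods described
       above; the helper name first_match() is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first document accepted by $wanted
# (or undef), then release the scroll instead of reading the rest.
sub first_match {
    my ( $scroll, $wanted ) = @_;

    my $found;
    while ( my $doc = $scroll->next ) {
        if ( $wanted->($doc) ) {
            $found = $doc;
            last;
        }
    }

    # finish() would also run when $scroll goes out of scope; calling it
    # here frees the scroll on the cluster as soon as we are done.
    $scroll->finish unless $scroll->is_finished;

    return $found;
}
```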

INFO ACCESSORS

       The information from the original search is returned via the following
       accessors:

   "total"
       The total number of documents that matched your query.

   "max_score"
       The maximum score of any document that matched your query.

   "aggregations"
       Any aggregations that were specified, or "undef".

   "facets"
       Any facets that were specified, or "undef".

   "suggest"
       Any suggestions that were specified, or "undef".

   "took"
       How long the original search took, in milliseconds.

   "took_total"
       How long the original search plus all subsequent batches took, in
       milliseconds.

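       These accessors can be combined into a short progress or log line.
       This is a minimal sketch: the helper name scroll_summary() is
       hypothetical, "total" and "took" are assumed to return plain values,
       and "max_score" is treated as possibly undefined (for example when
       sorting by "_doc").

```perl
use strict;
use warnings;

# Hypothetical helper: build a one-line summary from the info accessors
# of a scroll helper object.
sub scroll_summary {
    my ($scroll) = @_;

    my $max_score = $scroll->max_score;

    return sprintf(
        "%s hits, max score %s, first batch took %s ms",
        $scroll->total,
        defined $max_score ? $max_score : 'n/a',
        $scroll->took,
    );
}
```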

SEE ALSO

       •   "search()" in Search::Elasticsearch::Client::8_0::Direct

       •   "scroll()" in Search::Elasticsearch::Client::8_0::Direct

       •   "reindex()" in Search::Elasticsearch::Client::8_0::Direct

AUTHOR

       Enrico Zimuel <enrico.zimuel@elastic.co>

COPYRIGHT AND LICENSE

       This software is Copyright (c) 2022 by Elasticsearch BV.

       This is free software, licensed under:

         The Apache License, Version 2.0, January 2004

perl v5.36.0                      2023-01-20  Search::Elasticsearch::Client::8_0::Scroll(3)