ElasticSearch::SearchBuilder(3)  User Contributed Perl Documentation  ElasticSearch::SearchBuilder(3)

NAME
6 ElasticSearch::SearchBuilder - A Perlish compact query language for
7 ElasticSearch
8
10 Version 0.16
11
12 Compatible with ElasticSearch version 0.19.11
13
15 The 'text' queries have been renamed 'match' queries in elasticsearch
16 0.19.9. If you need support for an older version of elasticsearch,
17 please use
18 <https://metacpan.org/release/DRTECH/ElasticSearch-SearchBuilder-0.15/>.
19
21 The Query DSL for ElasticSearch (see Query DSL
22 <http://www.elasticsearch.org/guide/reference/query-dsl>), which is
23 used to write queries and filters, is simple but verbose, which can
24 make it difficult to write and understand large queries.
25
26 ElasticSearch::SearchBuilder is an SQL::Abstract-like query language
27 which exposes the full power of the query DSL, but in a more compact,
28 Perlish way.
29
This module is considered stable. If you have suggestions for
improvements to the API or the documentation, please contact me.
32
    my $sb    = ElasticSearch::SearchBuilder->new();
    my $query = $sb->query({
        body    => 'interesting keywords',
        -filter => {
            status  => 'active',
            tags    => ['perl','python','ruby'],
            created => {
                '>=' => '2010-01-01',
                '<'  => '2011-01-01'
            },
        }
    });
46
47 NOTE: "ElasticSearch::SearchBuilder" is fully integrated with the
48 ElasticSearch API. Wherever you can specify "query", "filter" or
49 "facet_filter" in ElasticSearch, you can automatically use
50 SearchBuilder by specifying "queryb", "filterb", "facet_filterb"
51 instead.
52
53 $es->search( queryb => { body => 'interesting keywords' } )
54
56 new()
57 my $sb = ElasticSearch::SearchBuilder->new()
58
59 Creates a new instance of the SearchBuilder - takes no parameters.
60
61 query()
62 my $es_query = $sb->query($compact_query)
63
64 Returns a query in the ElasticSearch query DSL.
65
66 $compact_query can be a scalar, a hash ref or an array ref.
67
68 $sb->query('foo')
69 # { "query" : { "match" : { "_all" : "foo" }}}
70
71 $sb->query({ ... }) or $sb->query([ ... ])
72 # { "query" : { ... }}
73
74 filter()
75 my $es_filter = $sb->filter($compact_filter)
76
77 Returns a filter in the ElasticSearch query DSL.
78
79 $compact_filter can be a scalar, a hash ref or an array ref.
80
81 $sb->filter('foo')
82 # { "filter" : { "term" : { "_all" : "foo" }}}
83
84 $sb->filter({ ... }) or $sb->filter([ ... ])
85 # { "filter" : { ... }}
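As a minimal sketch of the two methods above, the following script prints what "query()" and "filter()" return for a plain scalar. It assumes ElasticSearch::SearchBuilder is installed; if it is not, the script falls back to printing the shapes documented above. The %expected hash and the use of JSON::PP for display are illustrative choices, not part of the module.

```perl
use strict;
use warnings;
use JSON::PP;

# A scalar search term, as in the method examples above.
my $term = 'foo';

# Shapes documented above for a scalar argument (copied from the text).
my %expected = (
    query  => { query  => { match => { _all => $term } } },
    filter => { filter => { term  => { _all => $term } } },
);

my $json = JSON::PP->new->canonical->pretty;

# Only call the module if it can be loaded; otherwise show the
# documented shapes so the script still runs.
if ( eval { require ElasticSearch::SearchBuilder; 1 } ) {
    my $sb = ElasticSearch::SearchBuilder->new;
    print "query:  ", $json->encode( $sb->query($term) );
    print "filter: ", $json->encode( $sb->filter($term) );
}
else {
    print "$_ (documented): ", $json->encode( $expected{$_} )
        for sort keys %expected;
}
```

The same scalar produces a "match" clause in query context and a "term" clause in filter context, which is the distinction the rest of this document builds on.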
86
88 IMPORTANT: If you are not familiar with ElasticSearch then you should
89 read "ELASTICSEARCH CONCEPTS" before continuing.
90
91 This module was inspired by SQL::Abstract but they are not compatible
92 with each other.
93
94 The easiest way to explain how the syntax works is to give examples:
95
96 QUERY / FILTER CONTEXT
97 There are two contexts:
98
99 • "filter" context
100
Filters are fast and cacheable. They should be used to
include/exclude docs, based on simple term values. For instance,
exclude all docs that have neither tag "perl" nor "python".
104
105 Typically, most of your clauses should be filters, which reduce the
106 number of docs that need to be passed to the query.
107
108 • "query" context
109
110 Queries are smarter than filters, but more expensive, as they have
111 to calculate search relevance (ie "_score").
112
113 They should be used where:
114
115 • relevance is important, eg: in a search for tags "perl" or
116 "python", a doc that has BOTH tags is more relevant than a doc
117 that has only one
118
119 • where search terms need to be analyzed as full text, eg: find
120 me all docs where the "content" field includes the words "Perl
121 is GREAT", no matter how those words are capitalized.
122
123 The available operators (and the query/filter clauses that are
124 generated) differ according to which context you are in.
125
126 The initial context depends upon which method you use: "query()" puts
127 you into "query" context, and "filter()" into "filter" context.
128
129 However, you can switch from one context to another as follows:
130
131 $sb->query({
132
133 # query context
134 foo => 1,
135 bar => 2,
136
137 -filter => {
138 # filter context
139 foo => 1,
140 bar => 2,
141
142 -query => {
143 # query context
144 foo => 1
145 }
146 }
147 })
148
149 -filter | -not_filter
150
151 Switch from query context to filter context:
152
153 # query field content for 'brown cow', and filter documents
154 # where status is 'active' and tags contains the term 'perl'
155 {
156 content => 'brown cow',
157 -filter => {
158 status => 'active',
159 tags => 'perl'
160 }
161 }
162
163
164 # no query, just a filter:
165 { -filter => { status => 'active' }}
166
167 See Filtered Query <http://www.elasticsearch.org/guide/reference/query-
168 dsl/filtered-query.html> and Constant Score Query
169 <http://www.elasticsearch.org/guide/reference/query-dsl/constant-score-
170 query.html>
171
172 -query | -not_query
173
174 Use a query as a filter:
175
176 # query field content for 'brown cow', and filter documents
177 # where status is 'active', tags contains the term 'perl'
178 # and a match query on field title contains 'important'
179 {
180 content => 'brown cow',
181 -filter => {
182 status => 'active',
183 tags => 'perl',
184 -query => {
185 title => 'important'
186 }
187 }
188 }
189
190 See Query Filter <http://www.elasticsearch.org/guide/reference/query-
191 dsl/query-filter.html>
192
193 KEY-VALUE PAIRS
194 Key-value pairs are equivalent to the "=" operator, discussed below.
195 They are converted to "match" queries or "term" filters:
196
197 # Field 'foo' contains term 'bar'
198 # equiv: { foo => { '=' => 'bar' }}
199 { foo => 'bar' }
200
201
202
203 # Field 'foo' contains 'bar' or 'baz'
204 # equiv: { foo => { '=' => ['bar','baz'] }}
205 { foo => ['bar','baz']}
206
207
208 # Field 'foo' contains terms 'bar' AND 'baz'
209 # equiv: { foo => { '-and' => [ {'=' => 'bar'}, {'=' => 'baz'}] }}
210 { foo => ['-and','bar','baz']}
211
212
213 ### FILTER ONLY ###
214
215 # Field 'foo' is missing ie has no value
216 # equiv: { -missing => 'foo' }
217 { foo => undef }
218
219 AND|OR LOGIC
220 Arrays are OR'ed, hashes are AND'ed:
221
    # tags = 'perl' AND status = 'active':
    {
        tags   => 'perl',
        status => 'active'
    }

    # tags = 'perl' OR status = 'active':
    [
        tags   => 'perl',
        status => 'active'
    ]

    # tags = 'perl' or tags = 'python':
    { tags => [ 'perl','python' ]}
    { tags => { '=' => [ 'perl','python' ] }}

    # tags begins with prefix 'p' or 'r'
    { tags => { '^' => [ 'p','r' ] }}
240
The logic in an array can be changed from "OR" to "AND" by making the
first element of the array ref "-and":
243
244 # tags has term 'perl' AND 'python'
245
246 { tags => ['-and','perl','python']}
247
248 {
249 tags => [
250 -and => { '=' => 'perl'},
251 { '=' => 'python'}
252 ]
253 }
254
255 However, the first element in an array ref which is used as the value
256 for a field operator (see "FIELD OPERATORS") is not special:
257
258 # WRONG
259 { tags => { '=' => [ '-and','perl','python' ] }}
260
261 # RIGHT
262 { tags => ['-and' => [ {'=' => 'perl'}, {'=' => 'python'} ] ]}
263
264 ...otherwise you would never be able to search for the term "-and". So
265 if you might possibly have the terms "-and" or "-or" in your data, use:
266
267 { foo => {'=' => [....] }}
268
269 instead of:
270
271 { foo => [....]}
272
273 -and | -or | -not
274
These unary operators allow you to apply "and", "or" and "not" logic to
nested queries or filters.

    # Field foo has both terms 'bar' and 'baz'
    { -and => [
        foo => 'bar',
        foo => 'baz'
    ]}

    # Field 'name' contains 'john smith', or the name field is missing
    # and the 'desc' field contains 'john smith'

    { -or => [
        { name => 'John Smith' },
        {
            desc    => 'John Smith',
            -filter => { -missing => 'name' },
        }
    ]}
294
295 The "-and", "-or" and "-not" constructs emit "bool" queries when in
296 query context, and "and", "or" and "not" clauses when in filter
297 context.
298
299 See also: "NAMED FILTERS", Bool Query
300 <http://www.elasticsearch.org/guide/reference/query-dsl/bool-
301 query.html>, And Filter
302 <http://www.elasticsearch.org/guide/reference/query-dsl/and-
303 filter.html>, Or Filter
304 <http://www.elasticsearch.org/guide/reference/query-dsl/or-filter.html>
305 and Not Filter <http://www.elasticsearch.org/guide/reference/query-
306 dsl/not-filter.html>
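To see the context dependence described above, one option is to pass the same compact clause to both "query()" and "filter()" and compare the results. This is a sketch only: the field names and values are made up, and the script prints nothing useful unless ElasticSearch::SearchBuilder is installed. No output is shown here because the exact structures depend on the installed version.

```perl
use strict;
use warnings;
use JSON::PP;

# One compact clause, to be interpreted in both contexts.
# Field names and values are hypothetical.
my $clause = {
    -or => [
        { tags => 'perl' },
        { -not => { status => 'inactive' } },
    ],
};

my $json = JSON::PP->new->canonical->pretty;

if ( eval { require ElasticSearch::SearchBuilder; 1 } ) {
    my $sb = ElasticSearch::SearchBuilder->new;

    # Query context: per the text above, expect "bool" queries.
    print "query context:\n", $json->encode( $sb->query($clause) );

    # Filter context: per the text above, expect "or"/"not" clauses.
    print "filter context:\n", $json->encode( $sb->filter($clause) );
}
else {
    print "ElasticSearch::SearchBuilder is not installed; nothing to compare\n";
}
```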
307
308 FIELD OPERATORS
309 Most operators (eg "=", "gt", "geo_distance" etc) are applied to a
310 particular field. These are known as "Field Operators". For example:
311
312 # Field foo contains the term 'bar'
313 { foo => 'bar' }
314 { foo => {'=' => 'bar' }}
315
316 # Field created is between Jan 1 and Dec 31 2010
317 { created => {
318 '>=' => '2010-01-01',
319 '<' => '2011-01-01'
320 }}
321
322 # Field foo contains terms which begin with prefix 'a' or 'b' or 'c'
323 { foo => { '^' => ['a','b','c' ]}}
324
325 Some field operators are available as symbols (eg "=", "*", "^", "gt")
326 and others as words (eg "geo_distance" or "-geo_distance" - the dash is
327 optional).
328
329 Multiple field operators can be applied to a single field. Use "{}" to
330 imply "this AND that":
331
332 # Field foo has any value from 100 to 200
333 { foo => { gte => 100, lte => 200 }}
334
335 # Field foo begins with 'p' but is not python
336 { foo => {
337 '^' => 'p',
338 '!=' => 'python'
339 }}
340
341 Or "[]" to imply "this OR that"
342
343 # foo is 5 or foo greater than 10
344 { foo => [
345 { '=' => 5 },
346 { 'gt' => 10 }
347 ]}
348
349 All word operators may be negated by adding "not_" to the beginning,
350 eg:
351
352 # Field foo does NOT contain a term beginning with 'bar' or 'baz'
353 { foo => { not_prefix => ['bar','baz'] }}
354
355 UNARY OPERATORS
356 There are other operators which don't fit this "{ field => { op =>
357 value }}" model.
358
359 For instance:
360
361 • An operator might apply to multiple fields:
362
363 # Search fields 'title' and 'content' for text 'brown cow'
364 {
365 -match => {
366 query => 'brown cow',
367 fields => ['title','content']
368 }
369 }
370
371 • The field might BE the value:
372
373 # Find documents where the field 'foo' is blank or undefined
374 { -missing => 'foo' }
375
376 # Find documents where the field 'foo' exists and has a value
377 { -exists => 'foo' }
378
379 • For combining other queries or filters:
380
381 # Field foo has terms 'bar' and 'baz' but not 'balloo'
382 {
383 -and => [
384 foo => 'bar',
385 foo => 'baz',
386 -not => { foo => 'balloo' }
387 ]
388 }
389
390 • Other:
391
392 # Script query
393 { -script => "doc['num1'].value > 1" }
394
395 These operators are called "unary operators" and ALWAYS begin with a
396 dash "-" to distinguish them from field names.
397
398 Unary operators may also be prefixed with "not_" to negate their
399 meaning.
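For illustration, here is a hedged sketch of a negated unary operator, using the "-not_match" form listed later under "Multi-field -match | -not_match". The search text and field names are invented, and the script only calls the module if it can be loaded.

```perl
use strict;
use warnings;
use JSON::PP;

# Exclude docs whose 'title' or 'content' matches 'brown cow'.
# Same parameters as the -match example above, with the operator negated.
my $compact = {
    -not_match => {
        query  => 'brown cow',
        fields => [ 'title', 'content' ],
    },
};

if ( eval { require ElasticSearch::SearchBuilder; 1 } ) {
    my $sb = ElasticSearch::SearchBuilder->new;
    print JSON::PP->new->canonical->pretty->encode( $sb->query($compact) );
}
else {
    print "ElasticSearch::SearchBuilder is not installed; compact form only\n";
}
```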
400
402 -all
403 The "-all" operator matches all documents:
404
405 # match all
406 { -all => 1 }
407 { -all => 0 }
408 { -all => {} }
409
410 In query context, the "match_all" query usually scores all docs as 1
411 (ie having the same relevance). By specifying a "norms_field", the
412 relevance can be read from that field (at the cost of a slower
413 execution time):
414
415 # Query context only
416 { -all =>{
417 boost => 1,
418 norms_field => 'doc_boost'
419 }}
420
422 These operators answer the question: "Does this field contain this
423 term?"
424
425 Filter equality operators work only with exact terms, while query
426 equality operators (the "match" family of queries) will "do the right
427 thing", ie work with terms for "not_analyzed" fields and with analyzed
428 text for "analyzed" fields.
429
430 EQUALITY (QUERIES)
431 = | -match | != | <> | -not_match
432
433 These operators all generate "match" queries:
434
435 # Analyzed field 'title' contains the terms 'Perl is GREAT'
436 # (which is analyzed to the terms 'perl','great')
437 { title => 'Perl is GREAT' }
438 { title => { '=' => 'Perl is GREAT' }}
439 { title => { match => 'Perl is GREAT' }}
440
441 # Not_analyzed field 'status' contains the EXACT term 'ACTIVE'
442 { status => 'ACTIVE' }
443 { status => { '=' => 'ACTIVE' }}
444 { status => { match => 'ACTIVE' }}
445
446 # Same as above but with extra parameters:
447 { title => {
448 match => {
449 query => 'Perl is GREAT',
450 boost => 2.0,
451 operator => 'and',
452 analyzer => 'default',
453 fuzziness => 0.5,
454 fuzzy_rewrite => 'constant_score_default',
455 lenient => 1,
456 max_expansions => 100,
457 minimum_should_match => 2,
458 prefix_length => 2,
459 }
460 }}
461
462 Operators "<>", "!=" and "not_match" are synonyms for each other and
463 just wrap the operator in a "not" clause.
464
465 See Match Query <http://www.elasticsearch.org/guide/reference/query-
466 dsl/match-query.html>
467
468 == | -phrase | -not_phrase
469
470 These operators look for a complete phrase.
471
472 For instance, given the text
473
474 The quick brown fox jumped over the lazy dog.
475
476 # matches
477 { content => { '==' => 'Quick Brown' }}
478
479 # doesn't match
480 { content => { '==' => 'Brown Quick' }}
481 { content => { '==' => 'Quick Fox' }}
482
483 The "slop" parameter can be used to allow the phrase to match words in
484 the same order, but further apart:
485
    # with other parameters
    { content => {
        phrase => {
            query    => 'Quick Fox',
            slop     => 3,
            analyzer => 'default',
            boost    => 1,
            lenient  => 1,
        }
    }}
495
496 See Match Query <http://www.elasticsearch.org/guide/reference/query-
497 dsl/match-query.html>
498
499 Multi-field -match | -not_match
500
501 To run a "match" | "=", "phrase" or "phrase_prefix" query against
502 multiple fields, you can use the "-match" unary operator:
503
504 {
505 -match => {
506 query => "Quick Fox",
507 type => 'boolean',
508 fields => ['content','title'],
509
510 use_dis_max => 1,
511 tie_breaker => 0.7,
512
513 boost => 2.0,
514 operator => 'and',
515 analyzer => 'default',
516 fuzziness => 0.5,
517 fuzzy_rewrite => 'constant_score_default',
518 lenient => 1,
519 max_expansions => 100,
520 minimum_should_match => 2,
521 prefix_length => 2,
522 }
523 }
524
525 The "type" parameter can be "boolean" (equivalent of "match" | "=")
526 which is the default, "phrase" or "phrase_prefix".
527
528 See Multi-match Query
529 <http://www.elasticsearch.org/guide/reference/query-dsl/multi-match-
530 query.html>.
531
532 -term | -terms | -not_term | -not_terms
533
534 The "term"/"terms" operators are provided for completeness. You should
535 almost always use the "match"/"=" operator instead.
536
537 There are only two use cases:
538
539 • To find the exact (ie not analyzed) term 'foo' in an analyzed
540 field:
541
542 { title => { term => 'foo' }}
543
544 • To match a list of possible terms, where more than 1 value must
545 match:
546
547 # match 2 or more of these tags
548 { tags => {
549 terms => {
550 value => ['perl','python','php'],
551 minimum_match => 2,
552 boost => 1,
553 }
554 }}
555
556 The above can also be achieved with the "-bool" operator.
557
558 "term" and "terms" are synonyms, as are "not_term" and "not_terms".
559
560 EQUALITY (FILTERS)
561 = | -term | -terms | <> | != | -not_term | -not_terms
562
563 These operators result in "term" or "terms" filters, which look for
564 fields which contain exactly the terms specified:
565
566 # Field foo has the term 'bar':
567 { foo => 'bar' }
568 { foo => { '=' => 'bar' }}
569 { foo => { 'term' => 'bar' }}
570
571 # Field foo has the term 'bar' or 'baz'
572 { foo => ['bar','baz'] }
573 { foo => { '=' => ['bar','baz'] }}
574 { foo => { 'term' => ['bar','baz'] }}
575
576 "<>" and "!=" are synonyms:
577
578 # Field foo does not contain the term 'bar':
579 { foo => { '!=' => 'bar' }}
580 { foo => { '<>' => 'bar' }}
581
582 # Field foo contains neither 'bar' nor 'baz'
583 { foo => { '!=' => ['bar','baz'] }}
584 { foo => { '<>' => ['bar','baz'] }}
585
586 The "terms" filter can take an "execution" parameter which affects how
587 the filter of multiple terms is executed and cached.
588
589 For instance:
590
591 { foo => {
592 -terms => {
593 value => ['foo','bar'],
594 execution => 'bool'
595 }
596 }}
597
598 See Term Filter <http://www.elasticsearch.org/guide/reference/query-
599 dsl/term-filter.html> and Terms Filter
600 <http://www.elasticsearch.org/guide/reference/query-dsl/terms-
601 filter.html>
602
604 lt | gt | lte | gte | < | <= | >= | > | -range | -not_range
605 These operators imply a range query or filter, which can be numeric or
606 alphabetical.
607
608 # Field foo contains terms between 'alpha' and 'beta'
609 { foo => {
610 'gte' => 'alpha',
611 'lte' => 'beta'
612 }}
613
614 # Field foo contains numbers between 10 and 20
615 { foo => {
616 'gte' => '10',
617 'lte' => '20'
618 }}
619
620 # boost a range *** query only ***
621 { foo => {
622 range => {
623 gt => 5,
624 gte => 5,
625 lt => 10,
626 lte => 10,
627 boost => 2.0
628 }
629 }}
630
631 For queries, "<" is a synonym for "lt", ">" for "gt" etc.
632
633 See Range Query <http://www.elasticsearch.org/guide/reference/query-
634 dsl/range-query.html>
635
636 Note: for filter clauses, the "gt","gte","lt" and "lte" operators imply
637 a "range" filter, while the "<", "<=", ">" and ">=" operators imply a
638 "numeric_range" filter.
639
640 This does not mean that you should use the "numeric_range" version for
641 any field which contains numbers!
642
643 The "numeric_range" filter should be used for numbers/datetimes which
644 have many distinct values, eg "ID" or "last_modified". If you have a
645 numeric field with few distinct values, eg "number_of_fingers" then it
646 is better to use a "range" filter.
647
648 See Range Filter <http://www.elasticsearch.org/guide/reference/query-
649 dsl/range-filter.html> and Numeric Range Filter
650 <http://www.elasticsearch.org/guide/reference/query-dsl/numeric-range-
651 filter.html>.
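The choice between word and symbol operators in filter context can be sketched as follows. The field names come from the paragraph above; the values and date strings are invented for the example. The script prints the generated filter only if ElasticSearch::SearchBuilder is installed.

```perl
use strict;
use warnings;
use JSON::PP;

my $filter = {
    # Few distinct values: word operators, which the note above says
    # imply a "range" filter.
    number_of_fingers => { gte => 8, lte => 10 },

    # Many distinct values: symbol operators, which the note above says
    # imply a "numeric_range" filter.
    last_modified => { '>=' => '2012-01-01', '<' => '2013-01-01' },
};

if ( eval { require ElasticSearch::SearchBuilder; 1 } ) {
    my $sb = ElasticSearch::SearchBuilder->new;
    print JSON::PP->new->canonical->pretty->encode( $sb->filter($filter) );
}
else {
    print "ElasticSearch::SearchBuilder is not installed; compact form only\n";
}
```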
652
654 *** Filter context only ***
655
656 -missing | -exists
657 You can use a "missing" or "exists" filter to select only docs where a
658 particular field exists and has a value, or is undefined or has no
659 value:
660
661 # Field 'foo' has a value:
662 { foo => { exists => 1 }}
663 { foo => { missing => 0 }}
664 { -exists => 'foo' }
665
666 # Field 'foo' is undefined or has no value:
667 { foo => { missing => 1 }}
668 { foo => { exists => 0 }}
669 { -missing => 'foo' }
670 { foo => undef }
671
672 The "missing" filter also supports the "null_value" and "existence"
673 parameters:
674
675 {
676 foo => {
677 missing => {
678 null_value => 1,
679 existence => 1,
680 }
681 }
682 }
683
684 OR
685
686 { -missing => {
687 field => 'foo',
688 null_value => 1,
689 existence => 1,
690 }}
691
692 See Missing Filter <http://www.elasticsearch.org/guide/reference/query-
693 dsl/missing-filter.html> and Exists Filter
694 <http://www.elasticsearch.org/guide/reference/query-dsl/exists-
695 filter.html>
696
698 *** Query context only ***
699
700 For most full text search queries, the "match" queries are what you
701 want. These analyze the search terms, and look for documents that
702 contain one or more of those terms. (See "EQUALITY (QUERIES)").
703
704 -qs | -query_string | -not_qs | -not_query_string
705 However, there is a more advanced query string syntax (see Lucene Query
706 Parser Syntax
707 <http://lucene.apache.org/core/old_versioned_docs/versions/3_5_0/queryparsersyntax.html>)
708 which understands search terms like:
709
710 perl AND python tag:recent "this exact phrase" -apple
711
712 It is useful for "power" users, but has the disadvantage that, if the
713 syntax is incorrect, ES throws an error. You can use
714 ElasticSearch::QueryParser to fix any syntax errors.
715
716 # find docs whose 'title' field matches 'this AND that'
717 { title => { qs => 'this AND that' }}
718 { title => { query_string => 'this AND that' }}
719
720 # With other parameters
721 { title => {
722 field => {
723 query => 'this that ',
724 default_operator => 'AND',
725 analyzer => 'default',
726 allow_leading_wildcard => 0,
727 lowercase_expanded_terms => 1,
728 enable_position_increments => 1,
729 fuzzy_min_sim => 0.5,
730 fuzzy_prefix_length => 2,
731 fuzzy_rewrite => 'constant_score_default',
732 fuzzy_max_expansions => 1024,
733 lenient => 1,
734 phrase_slop => 10,
735 boost => 2,
736 analyze_wildcard => 1,
737 auto_generate_phrase_queries => 0,
738 rewrite => 'constant_score_default',
739 minimum_should_match => 3,
740 quote_analyzer => 'standard',
741 quote_field_suffix => '.unstemmed'
742 }
743 }}
744
745 The unary form "-qs" or "-query_string" can be used when matching
746 against multiple fields:
747
748 { -qs => {
749 query => 'this AND that ',
750 fields => ['title','content'],
751 default_operator => 'AND',
752 analyzer => 'default',
753 allow_leading_wildcard => 0,
754 lowercase_expanded_terms => 1,
755 enable_position_increments => 1,
756 fuzzy_min_sim => 0.5,
757 fuzzy_prefix_length => 2,
758 fuzzy_rewrite => 'constant_score_default',
759 fuzzy_max_expansions => 1024,
760 lenient => 1,
761 phrase_slop => 10,
762 boost => 2,
763 analyze_wildcard => 1,
764 auto_generate_phrase_queries => 0,
765 use_dis_max => 1,
766 tie_breaker => 0.7,
767 minimum_should_match => 3,
768 quote_analyzer => 'standard',
769 quote_field_suffix => '.unstemmed'
770 }}
771
772 See Query-string Query
773 <http://www.elasticsearch.org/guide/reference/query-dsl/query-string-
774 query.html>
775
776 -mlt | -not_mlt
777 An "mlt" or "more_like_this" query finds documents that are "like" the
778 specified text, where "like" means that it contains some or all of the
779 specified terms.
780
781 # Field foo is like "brown cow"
782 { foo => { mlt => "brown cow" }}
783
    # With other parameters:
    { foo => {
        mlt => {
            like_text              => 'brown cow',
            percent_terms_to_match => 0.3,
            min_term_freq          => 2,
            max_query_terms        => 25,
            stop_words             => ['the','and'],
            min_doc_freq           => 5,
            max_doc_freq           => 1000,
            min_word_len           => 0,
            max_word_len           => 20,
            boost_terms            => 2,
            boost                  => 2.0,
            analyzer               => 'default'
        }
    }}

    # multi fields
    { -mlt => {
        like_text              => 'brown cow',
        fields                 => ['title','content'],
        percent_terms_to_match => 0.3,
        min_term_freq          => 2,
        max_query_terms        => 25,
        stop_words             => ['the','and'],
        min_doc_freq           => 5,
        max_doc_freq           => 1000,
        min_word_len           => 0,
        max_word_len           => 20,
        boost_terms            => 2,
        boost                  => 2.0,
        analyzer               => 'default'
    }}
818
819 See MLT Field Query
820 <http://www.elasticsearch.org/guide/reference/query-dsl/mlt-field-
821 query.html> and MLT Query
822 <http://www.elasticsearch.org/guide/reference/query-dsl/mlt-query.html>
823
824 -flt | -not_flt
825 An "flt" or "fuzzy_like_this" query fuzzifies all specified terms, then
826 picks the best "max_query_terms" differentiating terms. It is a
827 combination of "fuzzy" with "more_like_this".
828
    # Field foo is fuzzily similar to "brown cow"
    { foo => { flt => 'brown cow' }}

    # With other parameters:
    { foo => {
        flt => {
            like_text       => 'brown cow',
            ignore_tf       => 0,
            max_query_terms => 10,
            min_similarity  => 0.5,
            prefix_length   => 3,
            boost           => 2.0,
            analyzer        => 'default'
        }
    }}

    # Multi-field
    { -flt => {
        like_text       => 'brown cow',
        fields          => ['title','content'],
        ignore_tf       => 0,
        max_query_terms => 10,
        min_similarity  => 0.5,
        prefix_length   => 3,
        boost           => 2.0,
        analyzer        => 'default'
    }}
856
857 See FLT Field Query
858 <http://www.elasticsearch.org/guide/reference/query-dsl/flt-field-
859 query.html> and FLT Query
860 <http://www.elasticsearch.org/guide/reference/query-dsl/flt-query.html>
861
863 PREFIX (QUERIES)
864 ^ | -phrase_prefix | -not_phrase_prefix
865
866 These operators use the "match_phrase_prefix" query.
867
868 For "analyzed" fields, it analyzes the search terms, and does a
869 "match_phrase" query, with a "prefix" query on the last term. Think
870 "auto-complete".
871
872 For "not_analyzed" fields, this behaves the same as the term-based
873 "prefix" query.
874
875 For instance, given the phrase "The quick brown fox jumped over the
876 lazy dog":
877
878 # matches
879 { content => { '^' => 'qui'}}
880 { content => { '^' => 'quick br'}}
881 { content => { 'phrase_prefix' => 'quick brown f'}}
882
883 # doesn't match
884 { content => { '^' => 'quick fo' }}
885 { content => { 'phrase_prefix' => 'fox brow'}}
886
887 With extra options
888
889 { content => {
890 phrase_prefix => {
891 query => "Brown Fo",
892 slop => 3,
893 analyzer => 'default',
894 boost => 3.0,
895 max_expansions => 100,
896 }
897 }}
898
See Match Query
<http://www.elasticsearch.org/guide/reference/query-dsl/match-query.html>
901
902 -prefix | -not_prefix
903
904 The "prefix" query is a term-based query - no analysis takes place,
905 even on analyzed fields. Generally you should use "^" instead.
906
907 # Field 'lang' contains terms beginning with 'p'
908 { lang => { prefix => 'p' }}
909
910 # With extra options
911 { lang => {
912 'prefix' => {
913 value => 'p',
914 boost => 2,
915 rewrite => 'constant_score_default',
916
917 }
918 }}
919
920 See Prefix Query <http://www.elasticsearch.org/guide/reference/query-
921 dsl/prefix-query.html>.
922
923 PREFIX (FILTERS)
924 ^ | -prefix | -not_prefix
925
926 # Field foo contains a term which begins with 'bar'
927 { foo => { '^' => 'bar' }}
928 { foo => { 'prefix' => 'bar' }}
929
930 # Field foo contains a term which begins with 'bar' or 'baz'
931 { foo => { '^' => ['bar','baz'] }}
932 { foo => { 'prefix' => ['bar','baz'] }}
933
934 # Field foo contains a term which begins with neither 'bar' nor 'baz'
935 { foo => { 'not_prefix' => ['bar','baz'] }}
936
937 See Prefix Filter <http://www.elasticsearch.org/guide/reference/query-
938 dsl/prefix-filter.html>
939
941 *** Query context only ***
942
943 * | -wildcard | -not_wildcard
944 A "wildcard" is a term-based query (no analysis is applied), which does
945 shell globbing to find matching terms. In other words "?" represents
946 any single character, while "*" represents zero or more characters.
947
948 # Field foo matches 'f?ob*'
949 { foo => { '*' => 'f?ob*' }}
950 { foo => { 'wildcard' => 'f?ob*' }}
951
952 # with a boost:
953 { foo => {
954 '*' => { value => 'f?ob*', boost => 2.0 }
955 }}
956 { foo => {
957 'wildcard' => {
958 value => 'f?ob*',
959 boost => 2.0,
960 rewrite => 'constant_score_default',
961 }
962 }}
963
964 See Wildcard Query <http://www.elasticsearch.org/guide/reference/query-
965 dsl/wildcard-query.html>
966
967 -fuzzy | -not_fuzzy
A "fuzzy" query is a term-based query (ie no analysis is done) which
looks for terms that are similar to the provided terms, where
similarity is based on the Levenshtein (edit distance) algorithm:
971
972 # Field foo is similar to 'fonbaz'
973 { foo => { fuzzy => 'fonbaz' }}
974
975 # With other parameters:
976 { foo => {
977 fuzzy => {
978 value => 'fonbaz',
979 boost => 2.0,
980 min_similarity => 0.2,
981 max_expansions => 10,
982 rewrite => 'constant_score_default',
983 }
984 }}
985
Normally, you should prefer either the "EQUALITY" queries with the
"fuzziness" parameter, or the "-flt" queries.
988
989 See Fuzzy Query <http://www.elasticsearch.org/guide/reference/query-
990 dsl/fuzzy-query.html>.
991
993 *** Query context only ***
994
995 These constructs allow you to combine multiple queries.
996
997 -dis_max | -dismax
998 While a "bool" query adds together the scores of the nested queries, a
999 "dis_max" query uses the highest score of any matching queries.
1000
1001 # Run the two queries and use the best score
1002 { -dismax => [
1003 { foo => 'bar' },
1004 { foo => 'baz' }
1005 ] }
1006
    # With other parameters
    { -dismax => {
        queries => [
            { foo => 'bar' },
            { foo => 'baz' }
        ],
        tie_breaker => 0.5,
        boost       => 2.0
    } }
1016
1017 See DisMax Query <http://www.elasticsearch.org/guide/reference/query-
1018 dsl/dis-max-query.html>
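The difference between the two scoring styles can be shown with a little arithmetic. This is a simplified sketch, not ElasticSearch's actual scoring code: the per-clause scores are invented, the "bool" side ignores coordination and normalization factors, and the "dis_max" side uses the usual rule of best score plus "tie_breaker" times the sum of the other scores.

```perl
use strict;
use warnings;
use List::Util qw(sum max);

# Hypothetical scores of two sub-queries for a single document.
my @scores = ( 0.5, 0.25 );

# Simplified "bool"-style combination: add the scores together.
my $bool_score = sum(@scores);

# "dis_max"-style combination: take the best score, then add
# tie_breaker times the remaining scores.
sub dis_max_score {
    my ( $tie_breaker, @s ) = @_;
    my $best = max(@s);
    return $best + $tie_breaker * ( sum(@s) - $best );
}

printf "bool-style sum:           %.3f\n", $bool_score;
printf "dis_max, tie_breaker 0:   %.3f\n", dis_max_score( 0,   @scores );
printf "dis_max, tie_breaker 0.5: %.3f\n", dis_max_score( 0.5, @scores );
```

With a "tie_breaker" of 0 only the best clause counts; raising it moves the result towards the plain sum.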
1019
1020 -bool
1021 Normally, there should be no need to use a "bool" query directly, as
1022 these are autogenerated from eg "-and", "-or" and "-not" constructs.
1023 However, if you need to pass any of the other parameters to a "bool"
1024 query, then you can do the following:
1025
1026 {
1027 -bool => {
1028 must => [{ foo => 'bar' }],
1029 must_not => { status => 'inactive' },
1030 should => [
1031 { tag => 'perl' },
1032 { tag => 'python' },
1033 { tag => 'ruby' },
1034 ],
1035 minimum_number_should_match => 2,
1036 disable_coord => 1,
1037 boost => 2
1038 }
1039 }
1040
1041 See Bool Query <http://www.elasticsearch.org/guide/reference/query-
1042 dsl/bool-query.html>
1043
1044 -boosting
1045 The "boosting" query can be used to "demote" results that match a given
1046 query. Unlike the "must_not" clause of a "bool" query, the query still
1047 matches, but the results are "less relevant".
1048
1049 { -boosting => {
1050 positive => { title => 'apple pear' },
1051 negative => { title => 'apple computer' },
1052 negative_boost => 0.2
1053 }}
1054
1055 See Boosting Query <http://www.elasticsearch.org/guide/reference/query-
1056 dsl/boosting-query.html>
1057
1058 -custom_boost
1059 The "custom_boost" query allows you to multiply the scores of another
1060 query by the specified boost factor. This is a bit different from a
1061 standard "boost", which is normalized.
1062
1063 {
1064 -custom_boost => {
1065 query => { title => 'foo' },
1066 boost_factor => 3
1067 }
1068 }
1069
1070 See Custom Boost Factor Query
1071 <http://www.elasticsearch.org/guide/reference/query-dsl/custom-boost-
1072 factor-query.html>.
1073
1075 Nested queries/filters allow you to run queries/filters on nested docs.
1076
1077 Normally, a doc like this would not allow you to associate the name
1078 "perl" with the number 5
1079
1080 {
1081 title: "my title",
1082 tags: [
1083 { name: "perl", num: 5},
1084 { name: "python", num: 2}
1085 ]
1086 }
1087
1088 However, if "tags" is mapped as a "nested" field, then you can run
1089 queries or filters on each sub-doc individually.
1090
1091 See Nested Type
1092 <http://www.elasticsearch.org/guide/reference/mapping/nested-
1093 type.html>, Nested Query
1094 <http://www.elasticsearch.org/guide/reference/query-dsl/nested-
1095 query.html> and Nested Filter
1096 <http://www.elasticsearch.org/guide/reference/query-dsl/nested-
1097 filter.html>
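Why the association is lost can be simulated in plain Perl, without ElasticSearch. The sketch below is a toy model under stated assumptions: it treats a non-nested mapping as one flat list of values per field, and a nested mapping as a list of separate sub-docs. The helper names flat_match() and nested_match() are made up for the example.

```perl
use strict;
use warnings;

my $doc = {
    title => 'my title',
    tags  => [
        { name => 'perl',   num => 5 },
        { name => 'python', num => 2 },
    ],
};

# Toy model of a non-nested mapping: each field becomes one flat
# list of values, so the name/num pairing is gone.
my %flat = (
    'tags.name' => [ map { $_->{name} } @{ $doc->{tags} } ],
    'tags.num'  => [ map { $_->{num} } @{ $doc->{tags} } ],
);

# Matches if the name appears anywhere AND the number appears anywhere.
sub flat_match {
    my ( $name, $num ) = @_;
    my $has_name = grep { $_ eq $name } @{ $flat{'tags.name'} };
    my $has_num  = grep { $_ == $num } @{ $flat{'tags.num'} };
    return ( $has_name && $has_num ) ? 1 : 0;
}

# Toy model of a nested mapping: both conditions must hold within
# the same sub-doc.
sub nested_match {
    my ( $name, $num ) = @_;
    my $hits = grep { $_->{name} eq $name && $_->{num} == $num }
        @{ $doc->{tags} };
    return $hits ? 1 : 0;
}

printf "flat   python/5: %d\n", flat_match( 'python', 5 );
printf "nested python/5: %d\n", nested_match( 'python', 5 );
printf "nested perl/5:   %d\n", nested_match( 'perl', 5 );
```

The flat model reports a match for "python" with 5 even though no single tag has both values, which is the false positive that a "nested" mapping avoids.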
1098
1099 -nested (QUERY)
1100 {
1101 -nested => {
1102 path => 'tags',
1103 score_mode => 'avg',
1104 _scope => 'my_tags',
1105 query => {
1106 "tags.name" => 'perl',
1107 "tags.num" => { gt => 2 },
1108 }
1109 }
1110 }
1111
1112 See Nested Query <http://www.elasticsearch.org/guide/reference/query-
1113 dsl/nested-query.html>
1114
1115 -nested (FILTER)
    {
        -nested => {
            path       => 'tags',
            score_mode => 'avg',
            _cache     => 1,
            _name      => 'my_filter',
            filter     => {
                "tags.name" => 'perl',
                "tags.num"  => { gt => 2 },
            }
        }
    }
1128
1129 See Nested Filter <http://www.elasticsearch.org/guide/reference/query-
1130 dsl/nested-filter.html>
1131
ElasticSearch supports the use of scripts to customise query or filter
behaviour. By default the scripting language is "mvel" but javascript,
groovy, python and native java scripts are also supported.
1136
1137 See Scripting
1138 <http://www.elasticsearch.org/guide/reference/modules/scripting.html>
1139 for more on scripting.
1140
1141 -custom_score
1142 *** Query context only ***
1143
1144 The "-custom_score" query allows you to customise the "_score" or
1145 relevance (and thus the order) of docs returned from a query.
1146
    {
        -custom_score => {
            query  => { foo => 'bar' },
            lang   => 'mvel',
            script => "_score * doc['my_numeric_field'].value / pow(param1, param2)",
            params => {
                param1 => 2,
                param2 => 3.1
            },
        }
    }
1158
1159 See Custom Score Query
1160 <http://www.elasticsearch.org/guide/reference/query-dsl/custom-score-
1161 query.html>
1162
1163 -custom_filters_score
1164 *** Query context only ***
1165
1166 The "-custom_filters_score" query allows you to boost documents that
1167 match a filter, either with a "boost" parameter, or with a custom
1168 "script".
1169
1170 This is a very powerful and efficient way to boost results which depend
1171 on matching unanalyzed fields, eg a "tag" or a "date". Also, these
1172 filters can be cached.
1173
1174 {
1175 -custom_filters_score => {
1176 query => { foo => 'bar' },
1177 score_mode => 'first|max|total|avg|min|multiply', # default 'first'
1178 max_boost => 10,
1179 filters => [
1180 {
1181 filter => { tag => 'perl' },
1182 boost => 2,
1183 },
1184 {
1185 filter => { tag => 'python' },
1186 script => '_score * my_boost',
1187 params => { my_boost => 2},
1188 lang => 'mvel'
1189 },
1190 ]
1191 }
1192 }
1193
1194 See Custom Filters Score Query
1195 <http://www.elasticsearch.org/guide/reference/query-dsl/custom-filters-
1196 score-query.html>
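
How "score_mode" and "max_boost" interact can be sketched in plain Perl. This is an illustration under assumed semantics (the boosts of all matching filters are combined according to "score_mode", and the result is capped at "max_boost"), not ElasticSearch's implementation; the helper "combined_boost" is hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(min max sum product);

# Hypothetical helper: combine the boosts of the filters that matched a
# doc according to score_mode, then cap the result at max_boost.
sub combined_boost {
    my ( $score_mode, $max_boost, @boosts ) = @_;
    return 1 unless @boosts;    # no filter matched: score unchanged

    my %combine = (
        first    => sub { $_[0] },
        max      => sub { max @_ },
        min      => sub { min @_ },
        total    => sub { sum @_ },
        avg      => sub { sum(@_) / @_ },
        multiply => sub { product @_ },
    );
    my $boost = $combine{$score_mode}->(@boosts);
    return $boost > $max_boost ? $max_boost : $boost;
}

# A doc that matches two filters with boosts 2 and 3:
for my $mode (qw(first max min total avg multiply)) {
    printf "%-8s => %s\n", $mode, combined_boost( $mode, 10, 2, 3 );
}
```

Under these assumptions the query "_score" would then be multiplied by the combined boost.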
1197
1198 -script
1199 *** Filter context only ***
1200
1201 The "-script" filter allows you to use a script as a filter. Return a
1202 true value to indicate that the filter matches.
1203
1204 # Filter docs whose field 'foo' is greater than 5
1205 { -script => "doc['foo'].value > 5 " }
1206
1207 # With other params
1208 {
1209 -script => {
1210 script => "doc['foo'].value > minimum ",
1211 params => { minimum => 5 },
1212 lang => 'mvel'
1213 }
1214 }
1215
1216 See Script Filter <http://www.elasticsearch.org/guide/reference/query-
1217 dsl/script-filter.html>
1218
1220 Documents stored in ElasticSearch can be configured to have
1221 parent/child relationships.
1222
1223 See Parent Field
1224 <http://www.elasticsearch.org/guide/reference/mapping/parent-
1225 field.html> for more.
1226
1227 -has_parent | -not_has_parent
1228 Find child documents that have a parent document which matches a query.
1229
1230 # Find child docs whose parent docs of type 'comment' have the tag 'perl'
1231 {
1232 -has_parent => {
1233 type => 'comment',
1234 query => { tag => 'perl' },
1235 _scope => 'my_scope',
1236 boost => 1, # Query context only
1237 score_type => 'max' # Query context only
1238 }
1239 }
1240
1241 See Has Parent Query
1242 <http://www.elasticsearch.org/guide/reference/query-dsl/has-parent-
1243 query.html> and Has Parent Filter
1244 <http://www.elasticsearch.org/guide/reference/query-dsl/has-parent-
1245 filter.html>.
1246
1247 -has_child | -not_has_child
1248 Find parent documents that have child documents which match a query.
1249
1250 # Find parent docs whose children of type 'comment' have the tag 'perl'
1251 {
1252 -has_child => {
1253 type => 'comment',
1254 query => { tag => 'perl' },
1255 _scope => 'my_scope',
1256 boost => 1, # Query context only
1257 score_type => 'max' # Query context only
1258 }
1259 }
1260
1261 See Has Child Query
1262 <http://www.elasticsearch.org/guide/reference/query-dsl/has-child-
1263 query.html> and Has Child Filter
1264 <http://www.elasticsearch.org/guide/reference/query-dsl/has-child-
1265 filter.html>.
1266
1267 -top_children
1268 *** Query context only ***
1269
1270 The "top_children" query runs a query against the child docs, and
1271 aggregates the scores to find the parent docs whose children best
1272 match.
1273
1274 {
1275 -top_children => {
1276 type => 'blog_tag',
1277 query => { tag => 'perl' },
1278 score => 'max',
1279 factor => 5,
1280 incremental_factor => 2,
1281 _scope => 'my_scope'
1282 }
1283 }
1284
1285 See Top Children Query
1286 <http://www.elasticsearch.org/guide/reference/query-dsl/top-children-
1287 query.html>
1288
1290 For all the geo filters, the "normalize" parameter defaults to "true",
1291 meaning that the longitude value will be normalized to the range -180
1292 to 180 and the latitude value to the range -90 to 90.
1293
1294 -geo_distance | -not_geo_distance
1295 *** Filter context only ***
1296
1297 The "geo_distance" filter will find locations within a certain distance
1298 of a given point:
1299
1300 {
1301 my_location => {
1302 -geo_distance => {
1303 location => { lat => 10, lon => 5 },
1304 distance => '5km',
1305 normalize => 1 | 0,
1306 optimize_bbox => memory | indexed | none,
1307 }
1308 }
1309 }
1310
1311 See Geo Distance Filter
1312 <http://www.elasticsearch.org/guide/reference/query-dsl/geo-distance-
1313 filter.html>
1314
1315 -geo_distance_range | -not_geo_distance_range
1316 *** Filter context only ***
1317
1318 The "geo_distance_range" filter is similar to the -geo_distance filter,
1319 but expressed as a range:
1320
1321 {
1322 my_location => {
1323 -geo_distance_range => {
1324 location => { lat => 10, lon => 5 },
1325 from => '5km',
1326 to => '10km',
1327 include_lower => 1 | 0,
1328 include_upper => 0 | 1,
1329 normalize => 1 | 0,
1330 optimize_bbox => memory | indexed | none,
1331 }
1332 }
1333 }
1334
1335 or instead of "from", "to", "include_lower" and "include_upper" you can
1336 use "gt", "gte", "lt", "lte".
1337
1338 See Geo Distance Range Filter
1339 <http://www.elasticsearch.org/guide/reference/query-dsl/geo-distance-
1340 range-filter.html>
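
What a distance range check amounts to can be sketched in plain Perl. This is an illustration only: "distance_km" uses the haversine formula with an approximate mean Earth radius, which is not necessarily how ElasticSearch computes distances, and both helper names are hypothetical. The range is expressed with "gte" and "lt", in kilometres.

```perl
use strict;
use warnings;

my $EARTH_RADIUS_KM = 6371;                  # mean radius, an approximation
my $DEG             = atan2( 1, 1 ) / 45;    # radians per degree

# Great-circle distance in km between two { lat, lon } points (haversine).
sub distance_km {
    my ( $p, $q ) = @_;
    my $dlat = ( $q->{lat} - $p->{lat} ) * $DEG;
    my $dlon = ( $q->{lon} - $p->{lon} ) * $DEG;
    my $h    = sin( $dlat / 2 )**2
        + cos( $p->{lat} * $DEG ) * cos( $q->{lat} * $DEG )
        * sin( $dlon / 2 )**2;
    return 2 * $EARTH_RADIUS_KM * atan2( sqrt($h), sqrt( 1 - $h ) );
}

# Hypothetical range check accepting any of gt, gte, lt, lte (in km).
sub in_distance_range {
    my ( $dist, %range ) = @_;
    return 0 if defined $range{gt}  && !( $dist > $range{gt} );
    return 0 if defined $range{gte} && !( $dist >= $range{gte} );
    return 0 if defined $range{lt}  && !( $dist < $range{lt} );
    return 0 if defined $range{lte} && !( $dist <= $range{lte} );
    return 1;
}

my $centre = { lat => 10,    lon => 5 };
my $point  = { lat => 10.06, lon => 5 };    # a few km north of $centre
my $dist   = distance_km( $centre, $point );

printf "%.1fkm, in 5-10km range: %d\n",
    $dist, in_distance_range( $dist, gte => 5, lt => 10 );
```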
1341
1342 -geo_bounding_box | -geo_bbox | -not_geo_bounding_box | -not_geo_bbox
1343 *** Filter context only ***
1344
1345 The "geo_bounding_box" filter finds points which lie within the given
1346 rectangle:
1347
1348 {
1349 my_location => {
1350 -geo_bbox => {
1351 top_left => { lat => 10, lon => 4 },
1352 bottom_right => { lat => 9, lon => 5 },
1353 normalize => 1 | 0,
1354 type => memory | indexed
1355 }
1356 }
1357 }
1358
1359 See Geo Bounding Box Filter
1360 <http://www.elasticsearch.org/guide/reference/query-dsl/geo-bounding-
1361 box-filter.html>
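
The rectangle test itself is simple, as this plain Perl sketch shows. It assumes a box that does not cross the 180 degree meridian, and the helper "in_bbox" is hypothetical rather than part of this module. Note that the top-left corner has the larger latitude and the smaller longitude.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $pt lies inside the rectangle described
# by its top-left and bottom-right corners (no meridian wrap-around).
sub in_bbox {
    my ( $pt, $top_left, $bottom_right ) = @_;
    return (   $pt->{lat} <= $top_left->{lat}
            && $pt->{lat} >= $bottom_right->{lat}
            && $pt->{lon} >= $top_left->{lon}
            && $pt->{lon} <= $bottom_right->{lon} ) ? 1 : 0;
}

my $top_left     = { lat => 10, lon => 4 };
my $bottom_right = { lat => 9,  lon => 5 };

for my $pt ( { lat => 9.5, lon => 4.5 }, { lat => 11, lon => 4.5 } ) {
    printf "lat %s, lon %s: %s\n", $pt->{lat}, $pt->{lon},
        in_bbox( $pt, $top_left, $bottom_right ) ? 'inside' : 'outside';
}
```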
1362
1363 -geo_polygon | -not_geo_polygon
1364 *** Filter context only ***
1365
1366 The "geo_polygon" filter is similar to the -geo_bounding_box filter,
1367 except that it allows you to specify a polygon instead of a rectangle:
1368
1369 {
1370 my_location => {
1371 -geo_polygon => [
1372 { lat => 40, lon => -70 },
1373 { lat => 30, lon => -80 },
1374 { lat => 20, lon => -90 },
1375 ]
1376 }
1377 }
1378
1379 or:
1380
1381 {
1382 my_location => {
1383 -geo_polygon => {
1384 points => [
1385 { lat => 40, lon => -70 },
1386 { lat => 30, lon => -80 },
1387 { lat => 20, lon => -90 },
1388 ],
1389 normalize => 1 | 0,
1390 }
1391 }
1392 }
1393
1394 See Geo Polygon Filter
1395 <http://www.elasticsearch.org/guide/reference/query-dsl/geo-polygon-
1396 filter.html>
1397
1399 -indices
1400 *** Query context only ***
1401
1402 To run a different query depending on the index name, you can use the
1403 "-indices" query:
1404
1405 {
1406 -indices => {
1407 indices => 'one' | ['one','two'],
1408 query => { status => 'active' },
1409 no_match_query => 'all' | 'none' | { another => query }
1410 }
1411 }
1412
1413 The "no_match_query" will be run on any indices which don't appear in
1414 the specified list. It defaults to "all", but can be set to "none" or
1415 to a full query.
1416
1417 See Indices Query <http://www.elasticsearch.org/guide/reference/query-
1418 dsl/indices-query.html>.
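
The selection rule described above can be sketched in plain Perl. The helper "query_for_index" is hypothetical and only mirrors the documented behaviour (a listed index gets "query", any other index gets "no_match_query", which defaults to "all"); it is not how ElasticSearch or this module implements it.

```perl
use strict;
use warnings;

# Hypothetical helper: decide which query applies to a given index.
sub query_for_index {
    my ( $index, $spec ) = @_;
    my $indices = $spec->{indices};
    my @listed  = ref $indices eq 'ARRAY' ? @$indices : ($indices);

    return $spec->{query} if grep { $_ eq $index } @listed;
    return exists $spec->{no_match_query} ? $spec->{no_match_query} : 'all';
}

my $spec = {
    indices        => [ 'one', 'two' ],
    query          => { status => 'active' },
    no_match_query => 'none',
};

for my $index (qw(one three)) {
    my $q = query_for_index( $index, $spec );
    printf "%-5s => %s\n", $index, ref $q ? 'the main query' : $q;
}
```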
1419
1420 *** Filter context only ***
1421
1422 To run a different filter depending on the index name, you can use the
1423 "-indices" filter:
1424
1425 {
1426 -indices => {
1427 indices => 'one' | ['one','two'],
1428 filter => { status => 'active' },
1429 no_match_filter => 'all' | 'none' | { another => filter }
1430 }
1431 }
1432
1433 The "no_match_filter" will be run on any indices which don't appear in
1434 the specified list. It defaults to "all", but can be set to "none" or
1435 to a full filter.
1436
1437 See Indices Filter
1438 <https://github.com/elasticsearch/elasticsearch/issues/1787>.
1439
1440 -ids
1441 The "_id" field is not indexed by default, and thus isn't available for
1442 normal queries or filters. Use "-ids" instead.
1443
1444 Returns docs with the matching "_id" or "_type"/"_id" combination:
1445
1446 # doc with ID 123
1447 { -ids => 123 }
1448
1449 # docs with IDs 123 or 124
1450 { -ids => [123,124] }
1451
1452 # docs of types 'blog' or 'comment' with IDs 123 or 124
1453 {
1454 -ids => {
1455 type => ['blog','comment'],
1456 values => [123,124]
1458 }
1459 }
1460
1461 See IDs Query <http://www.elasticsearch.org/guide/reference/query-
1462 dsl/ids-query.html> and IDs Filter
1463 <http://www.elasticsearch.org/guide/reference/query-dsl/ids-
1464 filter.html>
1465
1466 -type
1467 *** Filter context only ***
1468
1469 Filters docs with matching "_type" fields.
1470
1471 While the "_type" field is indexed by default, ElasticSearch provides
1472 the "type" filter which will work even if indexing of the "_type" field
1473 is disabled.
1474
1475 # Filter docs of type 'comment'
1476 { -type => 'comment' }
1477
1478 # Filter docs of type 'comment' or 'blog'
1479 { -type => ['blog','comment' ]}
1480
1481 See Type Filter <http://www.elasticsearch.org/guide/reference/query-
1482 dsl/type-filter.html>
1483
1485 *** Filter context only ***
1486
1487 The "limit" filter limits the number of documents (per shard) to
1488 execute on:
1489
1490 {
1491 name => "Joe Bloggs",
1492 -filter => { -limit => 100 }
1493 }
1494
1495 See Limit Filter <http://www.elasticsearch.org/guide/reference/query-
1496 dsl/limit-filter.html>
1497
1499 ElasticSearch allows you to name filters, in which case each search result
1500 will include a "matched_filters" array containing the names of all
1501 filters that matched.
1502
1503 -name | -not_name
1504 *** Filter context only ***
1505
1506 { -name => {
1507 popular => { user_rank => { 'gte' => 10 }},
1508 unpopular => { user_rank => { 'lt' => 10 }},
1509 }}
1510
1511 Multiple filters are joined with an "or" filter (as it doesn't make
1512 sense to join them with "and").
1513
1514 See Named Filters
1515 <http://www.elasticsearch.org/guide/reference/api/search/named-
1516 filters.html> and "-and | -or | -not".
1517
1519 Part of the performance boost that you get when using filters comes
1520 from the ability to cache the results of those filters. However, it
1521 doesn't make sense to cache all filters by default.
1522
1523 -cache | -nocache
1524 *** Filter context only ***
1525
1526 If you would like to override the default caching, then you can use
1527 "-cache" or "-nocache":
1528
1529 # Don't cache the term filter for 'status'
1530 {
1531 content => 'interesting post',
1532 -filter => {
1533 -nocache => { status => 'active' }
1534 }
1535 }
1536
1537 # Do cache the numeric range filter:
1538 {
1539 content => 'interesting post',
1540 -filter => {
1541 -cache => { created => {'>' => '2010-01-01' } }
1542 }
1543 }
1544
1545 See Query DSL <http://www.elasticsearch.org/guide/reference/query-dsl/>
1546 for more details about what is cached by default and what is not.
1547
1548 -cache_key
1549 It is also possible to use a name to identify a cached filter. For
1550 instance:
1551
1552 {
1553 -cache_key => {
1554 friends => { person_id => [1,2,3] },
1555 enemies => { person_id => [4,5,6] },
1556 }
1557 }
1558
1559 In the above example, the two filters will be joined by an "and"
1560 filter. The following example will have the two filters joined by an
1561 "or" filter:
1562
1563 {
1564 -cache_key => [
1565 friends => { person_id => [1,2,3] },
1566 enemies => { person_id => [4,5,6] },
1567 ]
1568 }
1569
1570 See _cache_key <http://www.elasticsearch.org/guide/reference/query-
1571 dsl/index.html> for more details.
1572
1574 Sometimes, instead of using the SearchBuilder syntax, you may want to
1575 revert to the raw Query DSL that ElasticSearch uses.
1576
1577 You can do this by passing a reference to a HASH ref, for instance:
1578
1579 $sb->query({
1580 foo => 1,
1581 -filter => \{ term => { bar => 2 }}
1582 })
1583
1584 Would result in:
1585
1586 {
1587 query => {
1588 filtered => {
1589 query => {
1590 match => { foo => 1 }
1591 },
1592 filter => {
1593 term => { bar => 2 }
1594 }
1595 }
1596 }
1597 }
1598
1599 An example with OR'ed filters:
1600
1601 $sb->filter([
1602 foo => 1,
1603 \{ term => { bar => 2 }}
1604 ])
1605
1606 Would result in:
1607
1608 {
1609 filter => {
1610 or => [
1611 { term => { foo => 1 }},
1612 { term => { bar => 2 }}
1613 ]
1614 }
1615 }
1616
1617 An example with AND'ed filters:
1618
1619 $sb->filter({
1620 -and => [
1621 foo => 1 ,
1622 \{ term => { bar => 2 }}
1623 ]
1624 })
1625
1626 Would result in:
1627
1628 {
1629 filter => {
1630 and => [
1631 { term => { foo => 1 }},
1632 { term => { bar => 2 }}
1633 ]
1634 }
1635 }
1636
1637 Wherever a filter or query is expected, passing a reference to a HASH-
1638 ref is accepted.
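
A reference to a hash ref is easy to tell apart from an ordinary hash ref, which is what makes this convention workable. The sketch below shows the distinction in plain Perl; the helper "is_raw_clause" is hypothetical and is not necessarily how the module detects raw clauses internally.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $clause is a reference to a hash ref,
# ie the form used above to pass raw Query DSL.
sub is_raw_clause {
    my ($clause) = @_;
    return ( ref $clause eq 'REF' && ref $$clause eq 'HASH' ) ? 1 : 0;
}

my $compact = { bar => 2 };                   # SearchBuilder syntax
my $raw     = \{ term => { bar => 2 } };      # raw Query DSL

printf "compact: %d, raw: %d\n",
    is_raw_clause($compact), is_raw_clause($raw);

# Dereferencing once yields the raw Query DSL hash unchanged.
my $dsl = $$raw;
print join( ',', keys %$dsl ), "\n";
```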
1639
1641 FILTERS VS QUERIES
1642 ElasticSearch supports filters and queries:
1643
1644 • A filter just answers the question: "Does this field match?
1645 Yes/No", eg:
1646
1647 • Does this document have the tag "beta"?
1648
1649 • Was this document published in 2011?
1650
1651 • A query is used to calculate relevance (known in ElasticSearch as
1652 "_score"):
1653
1654 • Give me all documents that include the keywords "Foo" and "Bar"
1655 and rank them in order of relevance.
1656
1657 • Give me all documents whose "tag" field contains "perl" or
1658 "ruby" and rank documents that contain BOTH tags more highly.
1659
1660 Filters are lighter and faster, and the results can often be cached,
1661 but they don't contribute to the "_score" in any way.
1662
1663 Typically, most of your clauses will be filters, and just a few will be
1664 queries.
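
The distinction can be illustrated with a toy model in plain Perl: a filter is a yes/no test, while a query yields a score that is used for ranking. Everything here (the sample docs, "published_in_2011", "tag_score" and the scoring rule itself) is hypothetical and far simpler than real relevance scoring.

```perl
use strict;
use warnings;

my @docs = (
    { id => 1, tags => [ 'perl', 'ruby' ], year => 2011 },
    { id => 2, tags => ['perl'],           year => 2011 },
    { id => 3, tags => ['python'],         year => 2011 },
    { id => 4, tags => [ 'perl', 'ruby' ], year => 2010 },
);

# Filter: a yes/no answer, no score.
sub published_in_2011 {
    my ($doc) = @_;
    return $doc->{year} == 2011 ? 1 : 0;
}

# Query: a toy score, the number of wanted tags that the doc contains.
sub tag_score {
    my ( $doc, @wanted ) = @_;
    my %has = map { $_ => 1 } @{ $doc->{tags} };
    return scalar grep { $has{$_} } @wanted;
}

# Filter first, then score and rank what is left.
my @hits =
    sort { $b->{score} <=> $a->{score} || $a->{id} <=> $b->{id} }
    grep { $_->{score} > 0 }
    map  { +{ id => $_->{id}, score => tag_score( $_, 'perl', 'ruby' ) } }
    grep { published_in_2011($_) } @docs;

printf "doc %d (score %d)\n", $_->{id}, $_->{score} for @hits;
```

The doc from 2010 is excluded by the filter regardless of its tags, and among the remaining docs the one containing both tags ranks first.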
1665
1666 TERMS VS TEXT
1667 All data is stored in ElasticSearch as a "term", which is an exact
1668 value. The term "Foo" is not the same as "foo".
1669
1670 While this is useful for fields that have discrete values (eg "active",
1671 "inactive"), it is not sufficient to support full text search.
1672
1673 ElasticSearch has to analyze text to convert it into terms. This
1674 applies both to the text that the stored document contains, and to the
1675 text that the user tries to search on.
1676
1677 The default analyzer will:
1678
1679 • split the text on (most) punctuation and remove that punctuation
1680
1681 • lowercase each word
1682
1683 • remove English stopwords
1684
1685 For instance, "The 2 GREATEST widgets are foo-bar and fizz_buzz" would
1686 result in the terms
1687 "[2,'greatest','widgets','foo','bar','fizz_buzz']".
1688
1689 It is important that the same analyzer is used both for the stored text
1690 and for the search terms, otherwise the resulting terms may be
1691 different, and the query won't succeed.
1692
1693 For instance, a "term" query for "GREATEST" wouldn't work, but
1694 "greatest" would work. However, a "match" query for "GREATEST" would
1695 work, because the search text would be analyzed to produce the same
1696 terms that are stored in the index.
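
The behaviour described above can be mimicked with a rough stand-in for the default analyzer in plain Perl. This is a sketch only: "analyze" and its short stopword list are hypothetical, and the real analyzer is considerably more sophisticated.

```perl
use strict;
use warnings;

# Hypothetical stopword list; the real list is longer.
my %STOP = map { $_ => 1 } qw(a an and are is of the to);

# Rough stand-in for the default analyzer: split on anything that is
# not a word character, lowercase each token, drop stopwords.
sub analyze {
    my ($text) = @_;
    return grep { length($_) && !$STOP{$_} }
           map  { lc($_) }
           split /\W+/, $text;
}

my @terms = analyze('The 2 GREATEST widgets are foo-bar and fizz_buzz');
my %index = map { $_ => 1 } @terms;

# A "term" lookup uses the search text exactly as given.
my $term_hit = exists $index{'GREATEST'} ? 1 : 0;

# A "match" lookup analyzes the search text first.
my $match_hit = ( grep { $index{$_} } analyze('GREATEST') ) ? 1 : 0;

print join( ',', @terms ), "\n";
print "term: $term_hit, match: $match_hit\n";
```

The unanalyzed lookup for "GREATEST" finds nothing because only the lowercased term is stored, while the analyzed lookup succeeds.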
1697
1698 See Analysis <http://www.elasticsearch.org/guide/reference/index-
1699 modules/analysis/> for the list of supported analyzers.
1700
1701 "match" QUERIES
1702 ElasticSearch has a family of DWIM queries called "match" queries.
1703
1704 Their action depends upon how the field has been defined. If a field is
1705 "analyzed" (the default for string fields) then the "match" queries
1706 analyze the search terms before doing the search:
1707
1708 # Convert "Perl is GREAT" to the terms 'perl','great' and search
1709 # the 'content' field for those terms
1710
1711 { match: { content: "Perl is GREAT" }}
1712
1713 If a field is "not_analyzed", then it treats the search terms as a
1714 single term:
1715
1716 # Find all docs where the 'status' field contains EXACTLY the term 'ACTIVE'
1717 { match: { status: "ACTIVE" }}
1718
1719 Filters, on the other hand, don't have full text queries - filters
1720 operate on simple terms instead.
1721
1722 See Match Query <http://www.elasticsearch.org/guide/reference/query-
1723 dsl/match-query.html> for more about match queries.
1724
1726 Clinton Gormley, "<drtech at cpan.org>"
1727
1729 If you have any suggestions for improvements, or find any bugs, please
1730 report them to
1731 <https://github.com/clintongormley/ElasticSearch-SearchBuilder/issues>.
1732 I will be notified, and then you'll automatically be notified of
1733 progress on your bug as I make changes.
1734
1736 Add support for "span" queries.
1737
1739 You can find documentation for this module with the perldoc command.
1740
1741 perldoc ElasticSearch::SearchBuilder
1742
1743 You can also look for information at: <http://www.elasticsearch.org>
1744
1746 Thanks to SQL::Abstract for providing the inspiration and some of the
1747 internals.
1748
1750 Copyright 2011 Clinton Gormley.
1751
1752 This program is free software; you can redistribute it and/or modify it
1753 under the terms of either: the GNU General Public License as published
1754 by the Free Software Foundation; or the Artistic License.
1755
1756 See <http://dev.perl.org/licenses/> for more information.
1757
1758
1759
1760perl v5.34.0 2021-05-21 ElasticSearch::SearchBuilder(3)