[[query-dsl-mlt-field-query]]
=== More Like This Field Query

The `more_like_this_field` query is the same as the `more_like_this`
query, except that it runs against a single field. It provides a nicer query
DSL than the generic `more_like_this` query and supports typed field
queries (it automatically wraps typed fields with a type filter so that
only the specific type is matched).

[source,js]
--------------------------------------------------
{
    "more_like_this_field" : {
        "name.first" : {
            "like_text" : "text like this one",
            "min_term_freq" : 1,
            "max_query_terms" : 12
        }
    }
}
--------------------------------------------------

`more_like_this_field` can be shortened to `mlt_field`.

The `more_like_this_field` top level parameters include:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`like_text` |The text to find similar documents for. *Required*.
|`minimum_should_match` |coming[1.5.0] The number of terms from the generated
query that must match, following the <<query-dsl-minimum-should-match,minimum
should match syntax>>. Defaults to `"30%"`.
|`percent_terms_to_match` |deprecated[1.5.0,Replaced by `minimum_should_match`]
The percentage of terms from the generated query that must match (a float
value between 0 and 1). Defaults to `0.3` (30 percent).
|`min_term_freq` |The minimum term frequency. Terms that occur fewer times
than this in the source doc are ignored. Defaults to `2`.
|`max_query_terms` |The maximum number of query terms that will be
included in any generated query. Defaults to `25`.
|`stop_words` |An array of stop words. Any word in this set is
considered "uninteresting" and ignored. Even if your analyzer allows
stop words, you might want to tell the MoreLikeThis code to ignore them,
since for the purposes of document similarity it seems reasonable to assume
that "a stop word is never interesting".
|`min_doc_freq` |The minimum document frequency. Words that do not occur
in at least this many docs are ignored. Defaults to `5`.
|`max_doc_freq` |The maximum document frequency. Words that appear in more
than this many docs are ignored. Defaults to unbounded.
|`min_word_length` |The minimum word length below which words will be
ignored. Defaults to `0`. (The old name `min_word_len` is deprecated.)
|`max_word_length` |The maximum word length above which words will be
ignored. Defaults to unbounded (`0`). (The old name `max_word_len` is
deprecated.)
|`boost_terms` |Sets the boost factor to use when boosting terms.
Defaults to deactivated (`0`). Any other value activates boosting with the
given boost factor.
|`boost` |Sets the boost value of the query. Defaults to `1.0`.
|`analyzer` |The analyzer that will be used to analyze the text.
Defaults to the analyzer associated with the field.
|=======================================================================
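
Several of these parameters can be combined in one query. The following is a
minimal sketch, not a recommended configuration: it uses the `mlt_field` short
form and reuses the `name.first` field from the first example, while the stop
words and numeric values are illustrative assumptions chosen only to show the
syntax. It also assumes a version in which `minimum_should_match` is available
(1.5.0 or later). On earlier versions, `percent_terms_to_match` with a float
such as `0.5` would take its place.

[source,js]
--------------------------------------------------
{
    "mlt_field" : {
        "name.first" : {
            "like_text" : "text like this one",
            "min_term_freq" : 1,
            "min_doc_freq" : 2,
            "max_query_terms" : 12,
            "stop_words" : ["the", "and", "of"], <1>
            "minimum_should_match" : "50%", <2>
            "boost_terms" : 2 <3>
        }
    }
}
--------------------------------------------------
<1> Illustrative stop words. These terms are dropped from the generated query.
<2> Illustrative value, stricter than the `"30%"` default.
<3> Any value other than `0` activates term boosting with that factor. The value `2` is an assumption.

Lowering `min_term_freq` and `min_doc_freq` below their defaults, as sketched
here, tends to help when `like_text` is short or the index is small, because
fewer terms are discarded before the query is generated.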