[[analysis-stop-analyzer]]
=== Stop Analyzer

The `stop` analyzer is the same as the <<analysis-simple-analyzer,`simple` analyzer>>
but adds support for removing stop words. It defaults to using the
`_english_` stop words.

[float]
=== Definition

It consists of:

Tokenizer::
* <<analysis-lowercase-tokenizer,Lower Case Tokenizer>>

Token filters::
* <<analysis-stop-tokenfilter,Stop Token Filter>>
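
The definition above can be approximated with a `custom` analyzer, which may be
a useful starting point when the `stop` analyzer's own parameters are not
enough. This is a minimal sketch rather than the built-in definition itself;
the index name `stop_rebuilt_example`, the filter name `english_stop`, and the
analyzer name `rebuilt_stop` are illustrative assumptions:

[source,js]
----------------------------
PUT stop_rebuilt_example
{
  "settings": {
    "analysis": {
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        }
      },
      "analyzer": {
        "rebuilt_stop": {
          "type": "custom",
          "tokenizer": "lowercase",
          "filter": ["english_stop"]
        }
      }
    }
  }
}
----------------------------
// CONSOLE

Further token filters could be appended after `english_stop` in the `filter`
array.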

[float]
=== Example output

[source,js]
---------------------------
POST _analyze
{
  "analyzer": "stop",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
---------------------------
// CONSOLE

The above sentence would produce the following terms:

[source,text]
---------------------------
[ quick, brown, foxes, jumped, over, lazy, dog, s, bone ]
---------------------------

[float]
=== Configuration

The `stop` analyzer accepts the following parameters:

[horizontal]
`stopwords`::

    A pre-defined stop words list like `_english_` or an array containing a
    list of stop words. Defaults to `_english_`.

`stopwords_path`::

    The path to a file containing stop words.

See the <<analysis-stop-tokenfilter,Stop Token Filter>> for more information
about stop word configuration.
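
As a sketch of the two parameters above, the following request defines one
analyzer that uses a different pre-defined list and another that reads its stop
words from a file. The index name, both analyzer names, and the file path
`analysis/my_stopwords.txt` are hypothetical; the file is assumed to sit under
the Elasticsearch config directory, be UTF-8 encoded, and contain one stop word
per line:

[source,js]
----------------------------
PUT stop_params_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "french_stop_analyzer": {
          "type": "stop",
          "stopwords": "_french_"
        },
        "file_stop_analyzer": {
          "type": "stop",
          "stopwords_path": "analysis/my_stopwords.txt"
        }
      }
    }
  }
}
----------------------------
// NOTCONSOLE

Because the file is read on the node itself, it would need to be present on
every node that may hold the index.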

[float]
=== Example configuration

In this example, we configure the `stop` analyzer to use a specified list of
words as stop words:

[source,js]
----------------------------
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_stop_analyzer": {
          "type": "stop",
          "stopwords": ["the", "over"]
        }
      }
    }
  }
}

GET _cluster/health?wait_for_status=yellow

POST my_index/_analyze
{
  "analyzer": "my_stop_analyzer",
  "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
}
----------------------------
// CONSOLE

The above example produces the following terms:

[source,text]
---------------------------
[ quick, brown, foxes, jumped, lazy, dog, s, bone ]
---------------------------
  71. ---------------------------