
[[indices-analyze]]
== Analyze

Performs the analysis process on a text and returns the token breakdown
of the text.

Can be used without specifying an index against one of the many built-in
analyzers:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze' -d '
{
  "analyzer" : "standard",
  "text" : "this is a test"
}'
--------------------------------------------------
coming[2.0.0-beta1, body based parameters were added in 2.0.0]
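The response contains a `tokens` array describing each emitted token. A minimal Python sketch of pulling the terms out of such a response; the body below is a hypothetical example (exact offsets, types, and position numbering vary by Elasticsearch version):

[source,python]
```python
import json

# Hypothetical response body for the request above; exact offsets,
# types, and position numbering vary by Elasticsearch version.
response_body = '''
{
  "tokens": [
    {"token": "this", "start_offset": 0,  "end_offset": 4,  "type": "<ALPHANUM>", "position": 1},
    {"token": "is",   "start_offset": 5,  "end_offset": 7,  "type": "<ALPHANUM>", "position": 2},
    {"token": "a",    "start_offset": 8,  "end_offset": 9,  "type": "<ALPHANUM>", "position": 3},
    {"token": "test", "start_offset": 10, "end_offset": 14, "type": "<ALPHANUM>", "position": 4}
  ]
}
'''

# Extract just the terms, in order.
tokens = [t["token"] for t in json.loads(response_body)["tokens"]]
print(tokens)  # ['this', 'is', 'a', 'test']
```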

If the `text` parameter is provided as an array of strings, it is analyzed as a multi-valued field.

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze' -d '
{
  "analyzer" : "standard",
  "text" : ["this is a test", "the second text"]
}'
--------------------------------------------------
coming[2.0.0-beta1, body based parameters were added in 2.0.0]

Or by building a custom transient analyzer out of tokenizers,
token filters, and char filters. Token filters can use the shorter `filters`
parameter name:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze' -d '
{
  "tokenizer" : "keyword",
  "filters" : ["lowercase"],
  "text" : "this is a test"
}'

curl -XGET 'localhost:9200/_analyze' -d '
{
  "tokenizer" : "keyword",
  "token_filters" : ["lowercase"],
  "char_filters" : ["html_strip"],
  "text" : "this is a <b>test</b>"
}'
--------------------------------------------------
coming[2.0.0-beta1, body based parameters were added in 2.0.0]
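From a client, these request bodies are just serialized JSON. A minimal Python sketch building the second body above (parameter names exactly as in the curl examples):

[source,python]
```python
import json

# Build the same request body as the second curl example above:
# a transient analyzer from a tokenizer, token filters, and char filters.
body = {
    "tokenizer": "keyword",
    "token_filters": ["lowercase"],
    "char_filters": ["html_strip"],
    "text": "this is a <b>test</b>",
}
payload = json.dumps(body)
print(payload)
```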

It can also run against a specific index:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/test/_analyze' -d '
{
  "text" : "this is a test"
}'
--------------------------------------------------

The above will run an analysis on the "this is a test" text, using the
default index analyzer associated with the `test` index. An `analyzer`
can also be provided to use a different analyzer:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/test/_analyze' -d '
{
  "analyzer" : "whitespace",
  "text" : "this is a test"
}'
--------------------------------------------------
coming[2.0.0-beta1, body based parameters were added in 2.0.0]

Also, the analyzer can be derived based on a field mapping, for example:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/test/_analyze' -d '
{
  "field" : "obj1.field1",
  "text" : "this is a test"
}'
--------------------------------------------------
coming[2.0.0-beta1, body based parameters were added in 2.0.0]

This will cause the analysis to happen based on the analyzer configured in the
mapping for `obj1.field1` (or, failing that, the default index analyzer).

All parameters can also be supplied as request parameters. For example:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&filters=lowercase&text=this+is+a+test'
--------------------------------------------------
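When passing parameters on the query string, the values must be URL-encoded. A minimal Python sketch reconstructing the URL above (standard form encoding turns spaces into `+`):

[source,python]
```python
from urllib.parse import urlencode

# Same parameters as the curl example above; urlencode
# percent-encodes values and encodes spaces as '+'.
params = {
    "tokenizer": "keyword",
    "filters": "lowercase",
    "text": "this is a test",
}
url = "localhost:9200/_analyze?" + urlencode(params)
print(url)  # localhost:9200/_analyze?tokenizer=keyword&filters=lowercase&text=this+is+a+test
```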

For backwards compatibility, we also accept the text parameter as the body of the request,
provided it doesn't start with `{`:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&token_filters=lowercase&char_filters=html_strip' -d 'this is a <b>test</b>'
--------------------------------------------------