[[indices-analyze]]
== Analyze

Performs the analysis process on a text and returns the tokens breakdown
of the text.

Can be used without specifying an index against one of the many built-in
analyzers:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze' -d '
{
  "analyzer" : "standard",
  "text" : "this is a test"
}'
--------------------------------------------------
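
The response is a list of the tokens produced by the analyzer. As a rough
sketch, the request above returns something along the following lines (the
exact offsets, positions and token types can vary by Elasticsearch version):

[source,js]
--------------------------------------------------
{
  "tokens" : [
    { "token" : "this", "start_offset" : 0,  "end_offset" : 4,  "type" : "<ALPHANUM>", "position" : 1 },
    { "token" : "is",   "start_offset" : 5,  "end_offset" : 7,  "type" : "<ALPHANUM>", "position" : 2 },
    { "token" : "a",    "start_offset" : 8,  "end_offset" : 9,  "type" : "<ALPHANUM>", "position" : 3 },
    { "token" : "test", "start_offset" : 10, "end_offset" : 14, "type" : "<ALPHANUM>", "position" : 4 }
  ]
}
--------------------------------------------------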

If the `text` parameter is provided as an array of strings, it is analyzed
as a multi-valued field.

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze' -d '
{
  "analyzer" : "standard",
  "text" : ["this is a test", "the second text"]
}'
--------------------------------------------------

Or by building a custom transient analyzer out of tokenizers,
token filters and char filters. Token filters can use the shorter `filters`
parameter name:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze' -d '
{
  "tokenizer" : "keyword",
  "filters" : ["lowercase"],
  "text" : "this is a test"
}'

curl -XGET 'localhost:9200/_analyze' -d '
{
  "tokenizer" : "keyword",
  "token_filters" : ["lowercase"],
  "char_filters" : ["html_strip"],
  "text" : "this is a <b>test</b>"
}'
--------------------------------------------------
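
The second request shows that char filters run before the tokenizer: `html_strip`
removes the markup, so the `keyword` tokenizer emits the cleaned-up text as a
single token. A sketch of the kind of response to expect (offsets refer to the
original input and are approximate here):

[source,js]
--------------------------------------------------
{
  "tokens" : [
    { "token" : "this is a test", "start_offset" : 0, "end_offset" : 21, "type" : "word", "position" : 1 }
  ]
}
--------------------------------------------------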

It can also run against a specific index:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/test/_analyze' -d '
{
  "text" : "this is a test"
}'
--------------------------------------------------

The above will run an analysis on the "this is a test" text, using the
default index analyzer associated with the `test` index. An `analyzer`
can also be provided to use a different analyzer:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/test/_analyze' -d '
{
  "analyzer" : "whitespace",
  "text" : "this is a test"
}'
--------------------------------------------------

Also, the analyzer can be derived based on a field mapping, for example:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/test/_analyze' -d '
{
  "field" : "obj1.field1",
  "text" : "this is a test"
}'
--------------------------------------------------

This will cause the analysis to happen based on the analyzer configured in the
mapping for `obj1.field1` (and if not, the default index analyzer).
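
For reference, a mapping that would make the request above pick up a specific
analyzer for `obj1.field1` could look like the following sketch (the type name
`type1` and the `whitespace` analyzer are illustrative assumptions, not part of
the original example):

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/test' -d '
{
  "mappings" : {
    "type1" : {
      "properties" : {
        "obj1" : {
          "properties" : {
            "field1" : { "type" : "string", "analyzer" : "whitespace" }
          }
        }
      }
    }
  }
}'
--------------------------------------------------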

All parameters can also be supplied as request parameters. For example:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&filters=lowercase&text=this+is+a+test'
--------------------------------------------------

For backwards compatibility, we also accept the text parameter as the body of
the request, provided it doesn't start with `{`:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&token_filters=lowercase&char_filters=html_strip' -d 'this is a <b>test</b>'
--------------------------------------------------