[[indices-analyze]]
== Analyze

Performs the analysis process on a text and returns the token breakdown
of the text.

Can be used without specifying an index, against one of the many built-in
analyzers:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze?analyzer=standard' -d 'this is a test'
--------------------------------------------------

Or by building a custom transient analyzer out of tokenizers and
filters:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&filters=lowercase' -d 'this is a test'
--------------------------------------------------

It can also run against a specific index:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/test/_analyze?text=this+is+a+test'
--------------------------------------------------

The above will run an analysis on the "this is a test" text, using the
default index analyzer associated with the `test` index. An `analyzer`
can also be provided to use a different analyzer:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/test/_analyze?analyzer=whitespace' -d 'this is a test'
--------------------------------------------------

The analyzer can also be derived from a field mapping, for example:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/test/_analyze?field=obj1.field1' -d 'this is a test'
--------------------------------------------------

This runs the analysis with the analyzer configured in the mapping for
`obj1.field1` (or, if none is configured, with the default index analyzer).

The text can be provided either as part of the request body or as the
`text` parameter.
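
The two ways of passing the text can be sketched side by side. This is a minimal illustration, not part of the API: `encode_text` is a hypothetical helper that only turns spaces into `+` (matching the `text=this+is+a+test` example earlier), and the commands are printed rather than executed so the snippet runs without a cluster.

```shell
# Hypothetical helper: encode spaces as '+' for use in a query string.
# Sketch only: other reserved characters are not handled.
encode_text() {
  printf '%s' "$1" | tr ' ' '+'
}

TEXT='this is a test'

# Form 1: text passed as the `text` query parameter.
echo "curl -XGET 'localhost:9200/test/_analyze?text=$(encode_text "$TEXT")'"

# Form 2: text passed as the request body.
echo "curl -XGET 'localhost:9200/test/_analyze' -d '$TEXT'"
```

Both printed commands target the same `test` index and should analyze the same text; the body form avoids having to encode the text at all.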

[float]
=== Format

By default, tokens are returned as JSON, in a format called `detailed`.
The `text` format value provides the analyzed data in a text stream that
is a bit more readable.
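
As a hedged sketch of requesting the text output: the snippet below assumes the value is passed through a query parameter named `format`, which the section heading suggests but the text does not spell out. `analyze_text_url` is a hypothetical helper, and the command is printed rather than run so the example works without a cluster.

```shell
# Assumed host, taken from the examples in this section.
HOST='localhost:9200'

# Hypothetical helper: build an _analyze URL that asks for the `text`
# format instead of the default `detailed` JSON.
# Assumption: the parameter is named `format`.
analyze_text_url() {
  analyzer="$1"
  printf '%s/_analyze?analyzer=%s&format=text' "$HOST" "$analyzer"
}

# Print the resulting curl command for the standard analyzer.
echo "curl -XGET '$(analyze_text_url standard)' -d 'this is a test'"
```

Dropping `format=text` from the URL falls back to the default `detailed` output described above.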