[[test-analyzer]]
=== Test an analyzer

The <<indices-analyze,`analyze` API>> is an invaluable tool for viewing the
terms produced by an analyzer. A built-in analyzer (or combination of built-in
tokenizer, token filters, and character filters) can be specified inline in
the request:

[source,console]
-------------------------------------
POST _analyze
{
  "analyzer": "whitespace",
  "text": "The quick brown fox."
}

POST _analyze
{
  "tokenizer": "standard",
  "filter": [ "lowercase", "asciifolding" ],
  "text": "Is this déjà vu?"
}
-------------------------------------
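
To build intuition for the second request, the following Python sketch roughly imitates a `standard` tokenizer followed by the `lowercase` and `asciifolding` token filters. It is only an approximation under stated assumptions: a word-character regular expression and Unicode normalization stand in for the real tokenizer and folding filter, and the function name `approx_std_folded` is hypothetical.

```python
import re
import unicodedata


def approx_std_folded(text):
    """Roughly imitate: standard tokenizer, then lowercase, then asciifolding.

    Assumption: a simple word-character regex stands in for the real
    tokenizer, and stripping combining marks stands in for the real filter.
    """
    # Compose characters first so an accented letter is one word character.
    text = unicodedata.normalize("NFC", text)
    terms = []
    for match in re.finditer(r"\w+", text):
        term = match.group().lower()
        # Decompose accented letters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", term)
        folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
        terms.append(folded)
    return terms


print(approx_std_folded("Is this déjà vu?"))
# prints ['is', 'this', 'deja', 'vu']
```

The sketch only shows the shape of the transformation. For authoritative results, run the request itself against a cluster.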

.Positions and character offsets
*********************************************************

As can be seen from the output of the `analyze` API, analyzers not only
convert words into terms, they also record the order or relative _positions_
of each term (used for phrase queries or word proximity queries), and the
start and end _character offsets_ of each term in the original text (used for
highlighting search snippets).

*********************************************************
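
As an illustration of positions and character offsets, here is a minimal Python sketch that splits text on whitespace and records both values for each term. It is not Elasticsearch's implementation: the function name `whitespace_tokens` is hypothetical, and the dictionary keys are chosen only to resemble the fields that the `analyze` API reports.

```python
import re


def whitespace_tokens(text):
    """Split on whitespace and record each term's position and offsets."""
    tokens = []
    for position, match in enumerate(re.finditer(r"\S+", text)):
        tokens.append({
            "token": match.group(),
            "start_offset": match.start(),  # index of the term's first character
            "end_offset": match.end(),      # index just past the term's last character
            "position": position,           # order of the term in the token stream
        })
    return tokens


for tok in whitespace_tokens("The quick brown fox."):
    print(tok)
```

Slicing the original text with a term's offsets, for example `text[16:20]` for the last term above, recovers the exact span a highlighter would mark, while consecutive `position` values are what a phrase query would compare.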

Alternatively, a <<analysis-custom-analyzer,`custom` analyzer>> can be
referred to when running the `analyze` API on a specific index:

[source,console]
-------------------------------------
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "std_folded": { <1>
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_text": {
        "type": "text",
        "analyzer": "std_folded" <2>
      }
    }
  }
}

GET my_index/_analyze <3>
{
  "analyzer": "std_folded", <4>
  "text": "Is this déjà vu?"
}

GET my_index/_analyze <3>
{
  "field": "my_text", <5>
  "text": "Is this déjà vu?"
}
-------------------------------------

<1> Define a `custom` analyzer called `std_folded`.
<2> The field `my_text` uses the `std_folded` analyzer.
<3> To refer to this analyzer, the `analyze` API must specify the index name.
<4> Refer to the analyzer by name.
<5> Refer to the analyzer used by field `my_text`.
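
The difference between the index-scoped and inline forms of the request can be sketched in Python. The helper below only assembles a request path and JSON body; it sends nothing, and its name `build_analyze_request` and its signature are hypothetical, not part of any client library. It assumes, as the requests above suggest, that a field can only be resolved when an index is named.

```python
import json


def build_analyze_request(text, index=None, analyzer=None, field=None):
    """Return (path, body) for an analyze request; nothing is sent."""
    # A field's analyzer comes from an index mapping, so an index is needed.
    if field is not None and index is None:
        raise ValueError("a field can only be resolved against an index")
    path = f"/{index}/_analyze" if index else "/_analyze"
    body = {"text": text}
    if analyzer is not None:
        body["analyzer"] = analyzer
    if field is not None:
        body["field"] = field
    return path, body


# Refer to the analyzer by name, then via the field that uses it.
for kwargs in ({"analyzer": "std_folded"}, {"field": "my_text"}):
    path, body = build_analyze_request("Is this déjà vu?", index="my_index", **kwargs)
    print("GET", path, json.dumps(body, ensure_ascii=False))
```

Keeping request construction separate from transport makes the two variants easy to compare and to check in a unit test before any cluster is involved.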