== Testing analyzers

The <<indices-analyze,`analyze` API>> is an invaluable tool for viewing the
terms produced by an analyzer. A built-in analyzer (or a combination of a
built-in tokenizer, token filters, and character filters) can be specified
inline in the request:

[source,js]
-------------------------------------
POST _analyze
{
  "analyzer": "whitespace",
  "text": "The quick brown fox."
}

POST _analyze
{
  "tokenizer": "standard",
  "filter": [ "lowercase", "asciifolding" ],
  "text": "Is this déjà vu?"
}
-------------------------------------
// CONSOLE

.Positions and character offsets
*********************************************************

As can be seen from the output of the `analyze` API, analyzers not only
convert words into terms, but also record the order, or relative _position_,
of each term (used for phrase queries or word proximity queries), and the
start and end _character offsets_ of each term in the original text (used for
highlighting search snippets).

*********************************************************
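
The `analyze` API response lists these values for every token. As a rough
illustration only (plain Python, not Elasticsearch code), the sketch below
mimics whitespace tokenization of the first sample text and records the same
pieces of information. The function name `whitespace_analyze` is hypothetical;
the dictionary keys follow the `token`, `start_offset`, `end_offset`, and
`position` fields of the API response.

[source,python]
-------------------------------------
import re


def whitespace_analyze(text):
    """Split on whitespace, recording each token's position and offsets."""
    tokens = []
    for position, match in enumerate(re.finditer(r"\S+", text)):
        tokens.append({
            "token": match.group(),
            "start_offset": match.start(),  # index of the first character
            "end_offset": match.end(),      # index one past the last character
            "position": position,           # order of the term in the stream
        })
    return tokens


for tok in whitespace_analyze("The quick brown fox."):
    print(tok)
-------------------------------------

The last token, `fox.`, comes out with position 3 and offsets 16 to 20. The
trailing period stays attached because only whitespace separates tokens here.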

Alternatively, a <<analysis-custom-analyzer,`custom` analyzer>> can be
referred to when running the `analyze` API on a specific index:

[source,js]
-------------------------------------
PUT my_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "std_folded": { <1>
          "type": "custom",
          "tokenizer": "standard",
          "filter": [
            "lowercase",
            "asciifolding"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "my_text": {
        "type": "text",
        "analyzer": "std_folded" <2>
      }
    }
  }
}

GET my_index/_analyze <3>
{
  "analyzer": "std_folded", <4>
  "text": "Is this déjà vu?"
}

GET my_index/_analyze <3>
{
  "field": "my_text", <5>
  "text": "Is this déjà vu?"
}
-------------------------------------
// CONSOLE

<1> Define a `custom` analyzer called `std_folded`.
<2> The field `my_text` uses the `std_folded` analyzer.
<3> To refer to this analyzer, the `analyze` API must specify the index name.
<4> Refer to the analyzer by name.
<5> Refer to the analyzer used by field `my_text`.
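
To get a feel for the terms a chain like `std_folded` might produce, the
following Python sketch approximates its three steps: tokenize, lowercase,
fold accents. It is an illustration under loud assumptions, not the
Elasticsearch implementation: the `\w+` regular expression is a crude stand-in
for the `standard` tokenizer, Unicode decomposition is a stand-in for the
`asciifolding` filter (which also handles characters that have no Unicode
decomposition), and the function names are hypothetical.

[source,python]
-------------------------------------
import re
import unicodedata


def fold_to_ascii(term):
    """Drop combining accents after decomposition (rough asciifolding stand-in)."""
    decomposed = unicodedata.normalize("NFKD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def std_folded(text):
    """Approximate the chain: word tokenizer, then lowercase, then folding."""
    # Compose accents first so that each accented letter is one word character.
    text = unicodedata.normalize("NFC", text)
    terms = re.findall(r"\w+", text)  # crude substitute for the standard tokenizer
    return [fold_to_ascii(term.lower()) for term in terms]


print(std_folded("Is this déjà vu?"))
-------------------------------------

For the sample text this prints `['is', 'this', 'deja', 'vu']`. Running the
`analyze` API against the index remains the authoritative way to see what the
real analyzer emits.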