[[query-dsl-regexp-query]]
=== Regexp Query

The `regexp` query allows you to use regular expression term queries.
See <<regexp-syntax>> for details of the supported regular expression language.
"Term queries" here means that Elasticsearch applies the regexp to the terms
produced by the tokenizer for that field, and not to the original text of the
field.
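
To see which terms a `regexp` query would be matched against, you can run
sample text through the `_analyze` API. The sketch below assumes the field uses
the `standard` analyzer and uses a made-up value, `Sally Smith`. Under that
assumption the text is split into the terms `sally` and `smith`, so a pattern
such as `s.*y` can match the term `sally`, while a pattern that spans both
words, such as `sally s.*`, cannot match any single term.

[source,js]
--------------------------------------------------
GET /_analyze
{
    "analyzer": "standard",
    "text": "Sally Smith"
}
--------------------------------------------------
// CONSOLE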

NOTE: The performance of a `regexp` query depends heavily on the
regular expression chosen. Patterns that match everything, like `.*`, are
very slow, as are lookaround regular expressions. If possible, use a long
literal prefix before the variable part of your regular expression. Wildcard
matchers like `.*?+` will mostly lower performance.

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "regexp":{
            "name.first": "s.*y"
        }
    }
}
--------------------------------------------------
// CONSOLE
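
As a rough illustration of the advice about prefixes, the variant below puts a
longer literal prefix in front of the wildcard part. The pattern `sall.*` is
only an example value; how much faster it runs than `s.*y` depends on the terms
actually present in the index.

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "regexp":{
            "name.first": "sall.*"
        }
    }
}
--------------------------------------------------
// CONSOLE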

Boosting is also supported:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "regexp":{
            "name.first":{
                "value":"s.*y",
                "boost":1.2
            }
        }
    }
}
--------------------------------------------------
// CONSOLE

You can also use special flags:

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "regexp":{
            "name.first": {
                "value": "s.*y",
                "flags" : "INTERSECTION|COMPLEMENT|EMPTY"
            }
        }
    }
}
--------------------------------------------------
// CONSOLE

Possible flags are `ALL` (default), `ANYSTRING`, `COMPLEMENT`,
`EMPTY`, `INTERSECTION`, `INTERVAL`, or `NONE`. See the
http://lucene.apache.org/core/4_9_0/core/org/apache/lucene/util/automaton/RegExp.html[Lucene
documentation] for their meaning.
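
As a sketch of how a single flag changes what the pattern may contain:
according to the Lucene documentation, `INTERVAL` enables numeric ranges
written as `<n-m>`. The request below enables only that flag. The pattern
`foo<1-100>` is an illustrative value (it would match a term such as `foo80`),
not one taken from a real index.

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "regexp":{
            "name.first": {
                "value": "foo<1-100>",
                "flags" : "INTERVAL"
            }
        }
    }
}
--------------------------------------------------
// CONSOLE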

Regular expressions are dangerous because it's easy to accidentally
create an innocuous-looking one that requires an exponential number of
internal determinized automaton states (and corresponding RAM and CPU)
for Lucene to execute. Lucene guards against this with the
`max_determinized_states` setting (defaults to 10000). You can raise
this limit to allow more complex regular expressions to execute.

[source,js]
--------------------------------------------------
GET /_search
{
    "query": {
        "regexp":{
            "name.first": {
                "value": "s.*y",
                "flags" : "INTERSECTION|COMPLEMENT|EMPTY",
                "max_determinized_states": 20000
            }
        }
    }
}
--------------------------------------------------
// CONSOLE

NOTE: By default, the maximum length of the regex string allowed in a `regexp`
query is 1000. You can change the `index.max_regex_length` index setting
to raise this limit.
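
A minimal sketch of raising that limit through the update index settings API
follows. The index name `my_index` and the value `2000` are hypothetical, and
the request assumes the setting can be changed on an existing index.

[source,js]
--------------------------------------------------
PUT /my_index/_settings
{
    "index.max_regex_length": 2000
}
--------------------------------------------------
// CONSOLE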

include::regexp-syntax.asciidoc[]