[[ignore-above]]
=== `ignore_above`

Strings longer than the `ignore_above` setting will not be indexed or stored.
For arrays of strings, `ignore_above` is applied to each array element separately, and string elements longer than `ignore_above` will not be indexed or stored.

NOTE: All strings and array elements will still be present in the `_source` field if it is enabled, which is the default in Elasticsearch.

[source,console]
--------------------------------------------------
PUT my_index
{
  "mappings": {
    "properties": {
      "message": {
        "type": "keyword",
        "ignore_above": 20 <1>
      }
    }
  }
}

PUT my_index/_doc/1 <2>
{
  "message": "Syntax error"
}

PUT my_index/_doc/2 <3>
{
  "message": "Syntax error with some long stacktrace"
}

GET my_index/_search <4>
{
  "aggs": {
    "messages": {
      "terms": {
        "field": "message"
      }
    }
  }
}
--------------------------------------------------

<1> This field will ignore any string longer than 20 characters.
<2> This document is indexed successfully.
<3> This document will be indexed, but without indexing the `message` field.
<4> Search returns both documents, but only the first is present in the terms aggregation.

TIP: The `ignore_above` setting can be updated on
existing fields using the <<indices-put-mapping,PUT mapping API>>.
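
As a minimal sketch of such an update, the request below raises the limit on the `message` field of `my_index` from the earlier example. The new value of `100` is purely illustrative and is not a recommendation from this page.

[source,console]
--------------------------------------------------
PUT my_index/_mapping
{
  "properties": {
    "message": {
      "type": "keyword",
      "ignore_above": 100 <1>
    }
  }
}
--------------------------------------------------

<1> Hypothetical new limit; the field's `type` is restated and left unchanged.

A mapping update of this kind should only affect values indexed after the change. Documents that were indexed earlier are presumably not re-evaluated unless they are reindexed.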

This option is also useful for protecting against Lucene's term byte-length
limit of `32766`.

NOTE: The value for `ignore_above` is the _character count_, but Lucene counts
bytes. If you use UTF-8 text with many non-ASCII characters, you may want to
set the limit to `32766 / 4 = 8191` (rounded down), since a UTF-8 character may
occupy at most 4 bytes.
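
The limit suggested above could be applied as in the following sketch. The index name `my_utf8_index` and the field name `message` are hypothetical. In the worst case, 8191 characters at 4 bytes each take `8191 * 4 = 32764` bytes, which stays under the `32766` byte limit.

[source,console]
--------------------------------------------------
PUT my_utf8_index
{
  "mappings": {
    "properties": {
      "message": {
        "type": "keyword",
        "ignore_above": 8191 <1>
      }
    }
  }
}
--------------------------------------------------

<1> Conservative limit that assumes every character may need 4 bytes in UTF-8.

For text that is mostly ASCII this is more restrictive than necessary, so a higher value may be reasonable if the character mix of the data is known.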