[[mapper-murmur3]]
=== Mapper Murmur3 Plugin

The mapper-murmur3 plugin provides the ability to compute hashes of field
values at index time and store them in the index. This can sometimes be helpful
when running cardinality aggregations on large string fields with a high
cardinality.

:plugin_name: mapper-murmur3
include::install_remove.asciidoc[]

[[mapper-murmur3-usage]]
==== Using the `murmur3` field

The `murmur3` field is typically used within a multi-field, so that both the
original value and its hash are stored in the index:

[source,js]
--------------------------
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "my_field": {
          "type": "keyword",
          "fields": {
            "hash": {
              "type": "murmur3"
            }
          }
        }
      }
    }
  }
}
--------------------------
// CONSOLE

With this mapping, `my_field.hash` refers to the hashes of the values of the
`my_field` field. These hashes are only useful for running `cardinality`
aggregations:

[source,js]
--------------------------
# Example documents
PUT my_index/_doc/1
{
  "my_field": "This is a document"
}

PUT my_index/_doc/2
{
  "my_field": "This is another document"
}

GET my_index/_search
{
  "aggs": {
    "my_field_cardinality": {
      "cardinality": {
        "field": "my_field.hash" <1>
      }
    }
  }
}
--------------------------
// CONSOLE

<1> Counting unique values on the `my_field.hash` field

Running a `cardinality` aggregation directly on the `my_field` field would
yield the same result. However, using `my_field.hash` instead might result in
a speed-up if the field has a high cardinality. On the other hand, using the
`murmur3` field is discouraged on numeric fields and on string fields that are
not almost unique: it is unlikely to bring significant speed-ups there, while
it increases the amount of disk space required to store the index.
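
To compare the two approaches, both aggregations can be run side by side in a
single request. The following is a minimal sketch that assumes the `my_index`
mapping and the example documents shown earlier. The aggregation names
`by_value` and `by_hash` are illustrative only, and `"size": 0` merely
suppresses the search hits in the response:

[source,js]
--------------------------
GET my_index/_search
{
  "size": 0,
  "aggs": {
    "by_value": {
      "cardinality": {
        "field": "my_field" <1>
      }
    },
    "by_hash": {
      "cardinality": {
        "field": "my_field.hash" <2>
      }
    }
  }
}
--------------------------
// CONSOLE

<1> Counting unique values on the `my_field` field directly
<2> Counting unique values on the hashes stored at index time

Both aggregations should report the same count. Any difference in speed is
likely to be noticeable only on large indices whose values are almost unique.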