[[mapper-murmur3]]
=== Mapper Murmur3 Plugin

The mapper-murmur3 plugin computes hashes of field values at index time and
stores them in the index. This can be helpful when running `cardinality`
aggregations on large, high-cardinality string fields.

:plugin_name: mapper-murmur3
include::install_remove.asciidoc[]
[[mapper-murmur3-usage]]
==== Using the `murmur3` field

The `murmur3` field type is typically used within a multi-field, so that both
the original value and its hash are stored in the index:

[source,js]
--------------------------
PUT my_index
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "keyword",
        "fields": {
          "hash": {
            "type": "murmur3"
          }
        }
      }
    }
  }
}
--------------------------
// CONSOLE
With this mapping, `my_field.hash` refers to the hashes of the values of the
`my_field` field. These hashes are only useful for running `cardinality`
aggregations:

[source,js]
--------------------------
# Example documents
PUT my_index/_doc/1
{
  "my_field": "This is a document"
}

PUT my_index/_doc/2
{
  "my_field": "This is another document"
}

GET my_index/_search
{
  "aggs": {
    "my_field_cardinality": {
      "cardinality": {
        "field": "my_field.hash" <1>
      }
    }
  }
}
--------------------------
// CONSOLE

<1> Counting unique values on the `my_field.hash` field

Running a `cardinality` aggregation on the `my_field` field directly would
yield the same result; however, using `my_field.hash` instead might be faster
if the field has high cardinality. On the other hand, using a `murmur3` field
is discouraged for numeric fields and for string fields whose values are not
almost unique: it is unlikely to bring a significant speed-up, and it increases
the amount of disk space required to store the index.
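
As a minimal sketch of the comparison described above, the following request
runs both aggregations side by side against the example documents. The
aggregation names `by_value` and `by_hash` are illustrative only, and
`"size": 0` is included just to leave the search hits out of the response:

[source,js]
--------------------------
GET my_index/_search
{
  "size": 0,
  "aggs": {
    "by_value": {
      "cardinality": {
        "field": "my_field" <1>
      }
    },
    "by_hash": {
      "cardinality": {
        "field": "my_field.hash" <2>
      }
    }
  }
}
--------------------------
// CONSOLE

<1> Counts unique values using the original `keyword` field
<2> Counts unique values using the hashes stored at index time

Both aggregations should report the same count for the same set of documents.
Any difference in speed is only likely to show up on large indices with
high-cardinality values.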