[[mapper-murmur3]]
=== Mapper Murmur3 Plugin

The mapper-murmur3 plugin provides the ability to compute hashes of field values
at index-time and store them in the index. This can sometimes be helpful when
running cardinality aggregations on high-cardinality and large string fields.

[[mapper-murmur3-install]]
[float]
==== Installation

This plugin can be installed using the plugin manager:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin install mapper-murmur3
----------------------------------------------------------------

The plugin must be installed on every node in the cluster, and each node must
be restarted after installation.

[[mapper-murmur3-remove]]
[float]
==== Removal

The plugin can be removed with the following command:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin remove mapper-murmur3
----------------------------------------------------------------

The node must be stopped before removing the plugin.

[[mapper-murmur3-usage]]
==== Using the `murmur3` field

The `murmur3` field is typically used within a multi-field, so that both the
original value and its hash are stored in the index:

[source,js]
--------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "my_field": {
          "type": "keyword",
          "fields": {
            "hash": {
              "type": "murmur3"
            }
          }
        }
      }
    }
  }
}
--------------------------
// CONSOLE

Such a mapping makes it possible to refer to `my_field.hash` in order to get
hashes of the values of the `my_field` field. This is only useful for running
`cardinality` aggregations:

[source,js]
--------------------------
# Example documents
PUT my_index/my_type/1
{
  "my_field": "This is a document"
}

PUT my_index/my_type/2
{
  "my_field": "This is another document"
}

GET my_index/_search
{
  "aggs": {
    "my_field_cardinality": {
      "cardinality": {
        "field": "my_field.hash" <1>
      }
    }
  }
}
--------------------------
// CONSOLE

<1> Counting unique values on the `my_field.hash` field
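
As a rough illustration of the idea only, the Python sketch below hashes each
value once at "index time" and counts distinct hashes at "query time". It is
not the plugin's implementation: the 32-bit MurmurHash3 variant is used here
purely for brevity, the `murmur3_32` helper and the in-memory `indexed` list
are hypothetical names, and a real `cardinality` aggregation computes an
approximate count rather than building an exact set.

```python
def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit variant); returns an unsigned 32-bit integer."""
    c1, c2, mask = 0xCC9E2D51, 0x1B873593, 0xFFFFFFFF
    h = seed & mask
    n_blocks = len(data) // 4

    def rotl(x: int, r: int) -> int:
        return ((x << r) | (x >> (32 - r))) & mask

    # Body: mix each 4-byte little-endian block into the state.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = rotl((k * c1) & mask, 15)
        k = (k * c2) & mask
        h = rotl(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & mask

    # Tail: mix the remaining 0 to 3 bytes.
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = rotl((k * c1) & mask, 15)
        k = (k * c2) & mask
        h ^= k

    # Finalization: spread every input bit over the whole result.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


# Example documents (the third one repeats the first value).
docs = [
    {"my_field": "This is a document"},
    {"my_field": "This is another document"},
    {"my_field": "This is a document"},
]

# "Index time": hash each value once and keep it next to the original.
indexed = [
    {"my_field": d["my_field"],
     "my_field.hash": murmur3_32(d["my_field"].encode("utf-8"))}
    for d in docs
]

# "Query time": count distinct precomputed hashes instead of rehashing strings.
distinct_hashes = len({d["my_field.hash"] for d in indexed})
distinct_values = len({d["my_field"] for d in indexed})
print(distinct_hashes, distinct_values)
```

The point of the sketch is that the per-value hashing cost is paid once when
the document is stored, so later counts only touch fixed-width integers.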

Running a `cardinality` aggregation on the `my_field` field directly would
yield the same result; however, using `my_field.hash` instead might result in
a speed-up if the field has a high cardinality. On the other hand, using the
`murmur3` field on numeric fields, or on string fields that are not almost
unique, is discouraged: a `murmur3` field is unlikely to bring significant
speed-ups in those cases, while it increases the amount of disk space required
to store the index.
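
The guidance above can be turned into a quick check on a sample of field
values before adding a `murmur3` sub-field. The sketch below is only a
heuristic: the helper names and the `min_ratio` threshold of `0.9` are
arbitrary assumptions for illustration, not values defined by the plugin.

```python
def distinct_ratio(values):
    """Fraction of sampled values that are distinct (1.0 means all unique)."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def hash_subfield_may_help(values, min_ratio=0.9):
    """Return True only for string samples that are almost unique.

    min_ratio is a hypothetical cut-off for "almost unique"; tune it or
    replace it with a measurement on real data.
    """
    values = list(values)
    # Numeric fields are discouraged regardless of their cardinality.
    if any(isinstance(v, (int, float)) for v in values):
        return False
    return distinct_ratio(values) >= min_ratio


# Almost-unique strings: a candidate for a murmur3 sub-field.
session_ids = [f"session-{i:08d}" for i in range(1000)]
# Few distinct strings: the hash would mostly add disk usage.
log_levels = ["INFO", "WARN", "ERROR"] * 300

print(hash_subfield_may_help(session_ids))
print(hash_subfield_may_help(log_levels))
```

A check like this only narrows down candidates; whether the extra disk space
is worth it still depends on how long the strings are and how often
`cardinality` aggregations run on the field.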