[[mapper-murmur3]]
=== Mapper Murmur3 Plugin

The mapper-murmur3 plugin provides the ability to compute hashes of field values
at index-time and store them in the index. This can sometimes be helpful when
running cardinality aggregations on high-cardinality and large string fields.

[[mapper-murmur3-install]]
[float]
==== Installation

This plugin can be installed using the plugin manager:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin install mapper-murmur3
----------------------------------------------------------------

The plugin must be installed on every node in the cluster, and each node must
be restarted after installation.

This plugin can be downloaded for <<plugin-management-custom-url,offline install>> from
{plugin_url}/mapper-murmur3/mapper-murmur3-{version}.zip.
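
To check that the plugin is present on every node after the restart, one option
is the cat plugins API. The request below is a minimal sketch, assuming that API
is available in your Elasticsearch version; the `h` column selection is only an
illustrative choice:

[source,js]
--------------------------
GET _cat/plugins?v&h=name,component,version
--------------------------
// CONSOLE

Each node that has the plugin installed should be listed with the
`mapper-murmur3` component.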

[[mapper-murmur3-remove]]
[float]
==== Removal

The plugin can be removed with the following command:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin remove mapper-murmur3
----------------------------------------------------------------

The node must be stopped before removing the plugin.

[[mapper-murmur3-usage]]
==== Using the `murmur3` field

The `murmur3` field is typically used within a multi-field, so that both the
original value and its hash are stored in the index:

[source,js]
--------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "my_field": {
          "type": "keyword",
          "fields": {
            "hash": {
              "type": "murmur3"
            }
          }
        }
      }
    }
  }
}
--------------------------
// CONSOLE

Such a mapping allows you to refer to `my_field.hash` in order to get hashes
of the values of the `my_field` field. This is only useful for running
`cardinality` aggregations:

[source,js]
--------------------------
# Example documents
PUT my_index/my_type/1
{
  "my_field": "This is a document"
}

PUT my_index/my_type/2
{
  "my_field": "This is another document"
}

GET my_index/_search
{
  "aggs": {
    "my_field_cardinality": {
      "cardinality": {
        "field": "my_field.hash" <1>
      }
    }
  }
}
--------------------------
// CONSOLE
<1> Counting unique values on the `my_field.hash` field

Running a `cardinality` aggregation directly on the `my_field` field would
yield the same result. However, using `my_field.hash` instead might result in
a speed-up if the field has a high cardinality. On the other hand, using the
`murmur3` field is discouraged on numeric fields and on string fields that are
not almost unique: it is unlikely to bring significant speed-ups, and it
increases the amount of disk space required to store the index.
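
For comparison, the aggregation can also be pointed at the original field. The
request below is a sketch that reuses the `my_index` example above; the
aggregation name `my_field_cardinality_direct` is an arbitrary label chosen for
illustration:

[source,js]
--------------------------
GET my_index/_search
{
  "size": 0,
  "aggs": {
    "my_field_cardinality_direct": {
      "cardinality": {
        "field": "my_field" <1>
      }
    }
  }
}
--------------------------
// CONSOLE
<1> Counting unique values on `my_field` itself, so the values are hashed at search time instead of being read from `my_field.hash`

Setting `"size": 0` omits the search hits, so the response contains only the
aggregation result. Timing both variants on your own data is the most direct
way to judge whether the extra disk space for the `murmur3` sub-field is
worthwhile.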