[[mapper-murmur3]]
=== Mapper Murmur3 Plugin

The mapper-murmur3 plugin computes hashes of field values at index time and
stores them in the index. This can sometimes speed up `cardinality`
aggregations on large, high-cardinality string fields.

[[mapper-murmur3-install]]
[float]
==== Installation

This plugin can be installed using the plugin manager:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin install mapper-murmur3
----------------------------------------------------------------

The plugin must be installed on every node in the cluster, and each node must
be restarted after installation.
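
One way to confirm that every node has picked up the plugin after its restart
is the cat plugins API, which lists the plugins loaded on each node. The
request below is a minimal sketch; the `h` column selection is only one
convenient choice and can be dropped:

[source,js]
--------------------------
GET _cat/plugins?v&h=name,component,version
--------------------------
// NOTCONSOLE

Each node in the cluster should appear with a `mapper-murmur3` row. A node
missing from the output most likely still needs the plugin installed or has
not been restarted yet.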

This plugin can be downloaded for offline install from
{plugin_url}/mapper-murmur3/{version}/mapper-murmur3-{version}.zip[elastic download service].
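
As a sketch of the offline case, the plugin manager also accepts a `file://`
URL pointing at a local copy of the zip. The path and the `X.Y.Z` version
below are placeholders, not real values; substitute the location where the
archive was saved and the version that matches the node:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin install file:///path/to/mapper-murmur3-X.Y.Z.zip
----------------------------------------------------------------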

[[mapper-murmur3-remove]]
[float]
==== Removal

The plugin can be removed with the following command:

[source,sh]
----------------------------------------------------------------
sudo bin/elasticsearch-plugin remove mapper-murmur3
----------------------------------------------------------------

The node must be stopped before removing the plugin.

[[mapper-murmur3-usage]]
==== Using the `murmur3` field

The `murmur3` field type is typically used within a multi-field, so that both
the original value and its hash are stored in the index:

[source,js]
--------------------------
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "my_field": {
          "type": "keyword",
          "fields": {
            "hash": {
              "type": "murmur3"
            }
          }
        }
      }
    }
  }
}
--------------------------
// CONSOLE

With this mapping, you can refer to `my_field.hash` to get the hashes of the
values of the `my_field` field. This is only useful for running `cardinality`
aggregations:

[source,js]
--------------------------
# Example documents
PUT my_index/my_type/1
{
  "my_field": "This is a document"
}

PUT my_index/my_type/2
{
  "my_field": "This is another document"
}

GET my_index/_search
{
  "aggs": {
    "my_field_cardinality": {
      "cardinality": {
        "field": "my_field.hash" <1>
      }
    }
  }
}
--------------------------
// CONSOLE

<1> Counting unique values on the `my_field.hash` field

Running a `cardinality` aggregation directly on the `my_field` field would
yield the same result. However, using `my_field.hash` instead might be faster
if the field has a high cardinality. On the other hand, using the `murmur3`
field is discouraged for numeric fields and for string fields that are not
almost unique: it is unlikely to bring a significant speed-up, while it
increases the amount of disk space required to store the index.
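
To check whether the hash field pays off for a particular data set, one
option is to run both aggregations in the same request and compare them. The
following is a sketch only: the aggregation names `cardinality_on_values` and
`cardinality_on_hashes` are made up for illustration, and `"size": 0` is used
so that only the aggregation results are returned:

[source,js]
--------------------------
GET my_index/_search
{
  "size": 0,
  "aggs": {
    "cardinality_on_values": {
      "cardinality": {
        "field": "my_field" <1>
      }
    },
    "cardinality_on_hashes": {
      "cardinality": {
        "field": "my_field.hash" <2>
      }
    }
  }
}
--------------------------
// NOTCONSOLE

<1> Hashes of the original values are computed at search time
<2> Hashes stored at index time by the `murmur3` field are reused

Both aggregations should report the same count. To get a rough idea of the
speed difference, run each aggregation in its own request on a realistic
index and compare the `took` value of the responses; on a handful of
documents, the difference is unlikely to be measurable.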