[[mapper-murmur3]]
=== Mapper murmur3 plugin

The mapper-murmur3 plugin provides the ability to compute hashes of field values
at index time and store them in the index. This can sometimes be helpful when
running cardinality aggregations on high-cardinality and large string fields.

:plugin_name: mapper-murmur3
include::install_remove.asciidoc[]

[[mapper-murmur3-usage]]
==== Using the `murmur3` field

The `murmur3` field is typically used within a multi-field, so that both the
original value and its hash are stored in the index:

[source,console]
--------------------------
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "my_field": {
        "type": "keyword",
        "fields": {
          "hash": {
            "type": "murmur3"
          }
        }
      }
    }
  }
}
--------------------------

With this mapping, you can refer to `my_field.hash` to get hashes of the
values of the `my_field` field. This is only useful for running
`cardinality` aggregations:

[source,console]
--------------------------
# Example documents
PUT my-index-000001/_doc/1
{
  "my_field": "This is a document"
}

PUT my-index-000001/_doc/2
{
  "my_field": "This is another document"
}

GET my-index-000001/_search
{
  "aggs": {
    "my_field_cardinality": {
      "cardinality": {
        "field": "my_field.hash" <1>
      }
    }
  }
}
--------------------------
<1> Counting unique values on the `my_field.hash` field

Running a `cardinality` aggregation on the `my_field` field directly would
yield the same result. However, using `my_field.hash` instead might result in
a speed-up if the field has a high cardinality. On the other hand, using the
`murmur3` field on numeric fields, or on string fields that are not almost
unique, is discouraged: it is unlikely to bring significant speed-ups, while
it increases the amount of disk space required to store the index.
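
The trade-off described above can be illustrated with a small, self-contained Python sketch. It is a simplified model, not the plugin's implementation: `hash64` uses truncated BLAKE2b from the standard library as a stand-in for MurmurHash3, distinct values are counted exactly with a set rather than with the approximate algorithm that the `cardinality` aggregation uses, and every function name here is hypothetical. The sketch assumes each stored hash costs 8 bytes.

```python
import hashlib


def hash64(value: str) -> int:
    """Stand-in 64-bit hash (BLAKE2b truncated to 8 bytes), NOT MurmurHash3."""
    digest = hashlib.blake2b(value.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def index_documents(values):
    """Simulate index time: keep each raw value together with its precomputed hash."""
    return [{"my_field": v, "my_field.hash": hash64(v)} for v in values]


def cardinality_raw(docs):
    """Simulate a query on `my_field`: every value is hashed again per query."""
    return len({hash64(d["my_field"]) for d in docs})


def cardinality_prehashed(docs):
    """Simulate a query on `my_field.hash`: the stored hashes are reused as-is."""
    return len({d["my_field.hash"] for d in docs})


def extra_bytes_ratio(docs):
    """Assumed storage overhead: 8 bytes per hash, relative to the raw value bytes."""
    raw_bytes = sum(len(d["my_field"].encode("utf-8")) for d in docs)
    return 8 * len(docs) / raw_bytes


# Long, almost-unique strings: the case where a hash sub-field may pay off.
long_docs = index_documents([f"session-{i:06d}-" + "x" * 200 for i in range(1000)])

# Short, heavily repeated strings: the case the text above discourages.
short_docs = index_documents(["red", "green", "blue"] * 1000)

for name, docs in (("long", long_docs), ("short", short_docs)):
    print(name, cardinality_raw(docs), cardinality_prehashed(docs),
          round(extra_bytes_ratio(docs), 3))
```

In this model both counting functions return the same number for a given data set, mirroring the statement that the two aggregations yield the same result. The difference lies in where the hashing work happens and in the relative storage overhead, which is small for long, unique strings and large for short, repeated values.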