[[ingest-user-agent]]
=== Ingest user agent processor plugin

The `user_agent` processor extracts details from the user agent string that a browser sends with its web requests.
By default, the processor adds this information under the `user_agent` field.

The ingest-user-agent plugin ships by default with the `regexes.yaml` made available by uap-java under an Apache 2.0 license. For more details, see https://github.com/ua-parser/uap-core.

:plugin_name: ingest-user-agent
include::install_remove.asciidoc[]

[[using-ingest-user-agent]]
==== Using the user_agent Processor in a Pipeline

[[ingest-user-agent-options]]
.User-agent options
[options="header"]
|======
| Name | Required | Default | Description
| `field` | yes | - | The field containing the user agent string.
| `target_field` | no | `user_agent` | The field that will be filled with the user agent details.
| `regex_file` | no | - | The name of the file in the `config/ingest-user-agent` directory containing the regular expressions for parsing the user agent string. Both the directory and the file have to be created before starting Elasticsearch. If not specified, ingest-user-agent will use the `regexes.yaml` from uap-core it ships with (see below).
| `properties` | no | [`name`, `major`, `minor`, `patch`, `build`, `os`, `os_name`, `os_major`, `os_minor`, `device`] | Controls which properties are added to `target_field`.
| `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document.
|======

Here is an example that adds the user agent details to the `user_agent` field based on the `agent` field:

[source,js]
--------------------------------------------------
PUT _ingest/pipeline/user_agent
{
  "description" : "Add user agent information",
  "processors" : [
    {
      "user_agent" : {
        "field" : "agent"
      }
    }
  ]
}
PUT my_index/_doc/my_id?pipeline=user_agent
{
  "agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"
}
GET my_index/_doc/my_id
--------------------------------------------------
// CONSOLE

Which returns:

[source,js]
--------------------------------------------------
{
  "found": true,
  "_index": "my_index",
  "_type": "_doc",
  "_id": "my_id",
  "_version": 1,
  "_source": {
    "agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36",
    "user_agent": {
      "name": "Chrome",
      "major": "51",
      "minor": "0",
      "patch": "2704",
      "os_name": "Mac OS X",
      "os": "Mac OS X 10.10.5",
      "os_major": "10",
      "os_minor": "10",
      "device": "Other"
    }
  }
}
--------------------------------------------------
// TESTRESPONSE

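
The optional settings from the options table can be combined. The following is a minimal sketch rather than an example taken from the plugin itself: it uses the simulate pipeline API with an inline pipeline so that nothing has to be stored, and the `ua` target field, the document IDs, and the `message` field are hypothetical names chosen for illustration.

[source,js]
--------------------------------------------------
POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    "description" : "Sketch: restricted properties in a custom target field",
    "processors" : [
      {
        "user_agent" : {
          "field" : "agent",
          "target_field" : "ua", <1>
          "properties" : ["name", "os_name", "device"], <2>
          "ignore_missing" : true <3>
        }
      }
    ]
  },
  "docs" : [
    {
      "_index" : "my_index",
      "_id" : "with_agent",
      "_source" : {
        "agent" : "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36"
      }
    },
    {
      "_index" : "my_index",
      "_id" : "without_agent",
      "_source" : {
        "message" : "this document has no agent field"
      }
    }
  ]
}
--------------------------------------------------
// CONSOLE
<1> Hypothetical target field name, used here instead of the default `user_agent`.
<2> Only these properties are requested; the others listed in the options table are left out.
<3> Assumed here so that the second document, which has no `agent` field, passes through unmodified instead of causing an error.

Going by the response shown for the same user agent string, the first simulated document would be expected to carry only `name`, `os_name`, and `device` under `ua`, although that output is not reproduced here.
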
===== Using a custom regex file

To use a custom regex file for parsing the user agents, put the file into the `config/ingest-user-agent` directory and
give it a `.yaml` filename extension. The file has to be present at node startup; any changes to it, or any new files added
while the node is running, will not have any effect.

In practice, it makes most sense for a custom regex file to be a variant of the default file, either a more recent version
or a customised version.

The default file included in `ingest-user-agent` is the `regexes.yaml` from uap-core: https://github.com/ua-parser/uap-core/blob/master/regexes.yaml
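
As a rough sketch of how a custom file is referenced from a pipeline, the request below assumes that a file named `custom-regexes.yaml` was placed in `config/ingest-user-agent` before the node started. Both the file name and the pipeline name `user_agent_custom` are hypothetical.

[source,js]
--------------------------------------------------
PUT _ingest/pipeline/user_agent_custom
{
  "description" : "Add user agent information using a custom regex file",
  "processors" : [
    {
      "user_agent" : {
        "field" : "agent",
        "regex_file" : "custom-regexes.yaml" <1>
      }
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE
<1> Hypothetical file name. The value is the name of the file inside `config/ingest-user-agent`, as described for the `regex_file` option.

Because the file is only read at node startup, a pipeline like this one would need the file to be in place, and the node restarted, before it can be used.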