[[dissect-processor]]
=== Dissect processor
++++
<titleabbrev>Dissect</titleabbrev>
++++

Similar to the <<grok-processor,Grok Processor>>, dissect also extracts structured fields out of a single text field
within a document. However, unlike the <<grok-processor,Grok Processor>>, dissect does not use
{wikipedia}/Regular_expression[Regular Expressions]. This keeps dissect's syntax simple and, in
some cases, makes it faster than the <<grok-processor,Grok Processor>>.

Dissect matches a single text field against a defined pattern.

For example, the following pattern:

[source,txt]
--------------------------------------------------
%{clientip} %{ident} %{auth} [%{@timestamp}] "%{verb} %{request} HTTP/%{httpversion}" %{status} %{size}
--------------------------------------------------

will match a log line of this format:

[source,txt]
--------------------------------------------------
1.2.3.4 - - [30/Apr/1998:22:00:52 +0000] "GET /english/venues/cities/images/montpellier/18.gif HTTP/1.0" 200 3171
--------------------------------------------------

and result in a document with the following fields:

[source,js]
--------------------------------------------------
"doc": {
  "_index": "_index",
  "_type": "_type",
  "_id": "_id",
  "_source": {
    "request": "/english/venues/cities/images/montpellier/18.gif",
    "auth": "-",
    "ident": "-",
    "verb": "GET",
    "@timestamp": "30/Apr/1998:22:00:52 +0000",
    "size": "3171",
    "clientip": "1.2.3.4",
    "httpversion": "1.0",
    "status": "200"
  }
}
--------------------------------------------------
// NOTCONSOLE

// tag::intro-example-explanation[]
A dissect pattern is defined by the parts of the string that will be discarded. In the previous example, the first part
to be discarded is a single space. Dissect finds this space, then assigns to `clientip` everything up
to that space.
Next, dissect matches the `[` and then the `]`, and assigns to `@timestamp` everything between `[` and `]`.
Paying special attention to the parts of the string to discard will help build successful dissect patterns.
// end::intro-example-explanation[]

Successful matches require all keys in a pattern to have a value. If any `%{keyname}` defined in the pattern does
not have a value, an exception is thrown, which may be handled by the <<handling-pipeline-failures,`on_failure`>> directive.
An empty key `%{}` or a <<dissect-modifier-named-skip-key, named skip key>> can be used to match values but exclude them from
the final document. All matched values are represented as string data types. The <<convert-processor, convert processor>>
may be used to convert them to the expected data type.

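
As a minimal sketch of how these pieces can fit together, the following pipeline definition dissects the log line from the earlier example, records the failure message if dissection fails, and then converts `status` and `size` to integers. The `message` source field, the `error.message` target field, and the `description` text are assumptions made for illustration.

[source,js]
--------------------------------------------------
{
  "description": "Illustrative sketch: dissect an access log line, then convert numeric fields",
  "processors": [
    {
      "dissect": {
        "field": "message",
        "pattern": "%{clientip} %{ident} %{auth} [%{@timestamp}] \"%{verb} %{request} HTTP/%{httpversion}\" %{status} %{size}",
        "on_failure": [
          {
            "set": {
              "field": "error.message",
              "value": "{{ _ingest.on_failure_message }}"
            }
          }
        ]
      }
    },
    {
      "convert": {
        "field": "status",
        "type": "integer",
        "ignore_missing": true
      }
    },
    {
      "convert": {
        "field": "size",
        "type": "integer",
        "ignore_missing": true
      }
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE

Setting `ignore_missing` on the convert processors keeps them from failing when the dissect `on_failure` handler has run and `status` or `size` was never set.
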
Dissect also supports <<dissect-key-modifiers,key modifiers>> that can change dissect's default
behavior. For example, you can instruct dissect to ignore certain fields, append fields, or skip over padding.
See <<dissect-key-modifiers, below>> for more information.

[[dissect-options]]
.Dissect Options
[options="header"]
|======
| Name               | Required  | Default             | Description
| `field`            | yes       | -                   | The field to dissect.
| `pattern`          | yes       | -                   | The pattern to apply to the field.
| `append_separator` | no        | "" (empty string)   | The character(s) that separate the appended fields.
| `ignore_missing`   | no        | false               | If `true` and `field` does not exist or is `null`, the processor quietly exits without modifying the document.
include::common-options.asciidoc[]
|======

[source,js]
--------------------------------------------------
{
  "dissect": {
    "field": "message",
    "pattern" : "%{clientip} %{ident} %{auth} [%{@timestamp}] \"%{verb} %{request} HTTP/%{httpversion}\" %{status} %{size}"
  }
}
--------------------------------------------------
// NOTCONSOLE

[[dissect-key-modifiers]]
==== Dissect key modifiers
// tag::dissect-key-modifiers[]
Key modifiers can change the default behavior for dissection. A key modifier is placed to the left or right
of the key name, always inside the `%{` and `}`. For example, `%{+keyname ->}` has the append and right padding
modifiers.
// end::dissect-key-modifiers[]

[[dissect-key-modifiers-table]]
.Dissect Key Modifiers
[options="header"]
|======
| Modifier      | Name               | Position       | Example                       | Description                                                   | Details
| `->`          | Skip right padding | (far) right    | `%{keyname1->}`               | Skips any repeated characters to the right                    | <<dissect-modifier-skip-right-padding,link>>
| `+`           | Append             | left           | `%{+keyname} %{+keyname}`     | Appends two or more fields together                           | <<dissect-modifier-append-key,link>>
| `+` with `/n` | Append with order  | left and right | `%{+keyname/2} %{+keyname/1}` | Appends two or more fields together in the order specified    | <<dissect-modifier-append-key-with-order,link>>
| `?`           | Named skip key     | left           | `%{?ignoreme}`                | Skips the matched value in the output. Same behavior as `%{}` | <<dissect-modifier-named-skip-key,link>>
| `*` and `&`   | Reference keys     | left           | `%{*r1} %{&r1}`               | Uses the value matched by `*` as the output key and the value matched by `&` as the output value | <<dissect-modifier-reference-keys,link>>
|======

[[dissect-modifier-skip-right-padding]]
===== Right padding modifier (`->`)
// tag::dissect-modifier-skip-right-padding[]
The algorithm that performs the dissection is very strict in that it requires all characters in the pattern to match
the source string. For example, the pattern `%{fookey} %{barkey}` (1 space) will match the string "foo{nbsp}bar"
(1 space), but will not match the string "foo{nbsp}{nbsp}bar" (2 spaces), since the pattern has only 1 space and the
source string has 2 spaces.

The right padding modifier helps with this case. With the right padding modifier added, the pattern `%{fookey->} %{barkey}`
will match "foo{nbsp}bar" (1 space), "foo{nbsp}{nbsp}bar" (2 spaces),
and even "foo{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}bar" (10 spaces).

Use the right padding modifier to allow for repetition of the characters after a `%{keyname->}`.

The right padding modifier may be placed on any key with any other modifiers. It should always be the furthest right
modifier. For example: `%{+keyname/1->}` and `%{->}`.

Right padding modifier example
|======
| *Pattern* | `%{ts->} %{level}`
| *Input* | 1998-08-10T17:15:42,466{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}WARN
| *Result* a|
* ts = 1998-08-10T17:15:42,466
* level = WARN
|======

The right padding modifier may be used with an empty key to help skip unwanted data. For example, the same input string
wrapped in brackets requires an empty right-padded key to achieve the same result.

Right padding modifier with empty key example
|======
| *Pattern* | `[%{ts}]%{->}[%{level}]`
| *Input* | [1998-08-10T17:15:42,466]{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}{nbsp}[WARN]
| *Result* a|
* ts = 1998-08-10T17:15:42,466
* level = WARN
|======
// end::dissect-modifier-skip-right-padding[]

[[append-modifier]]
===== Append modifier (`+`)
[[dissect-modifier-append-key]]
// tag::append-modifier[]
Dissect supports appending two or more results together for the output.
Values are appended left to right. An append separator can be specified.
In this example, the `append_separator` is defined as a space.

Append modifier example
|======
| *Pattern* | `%{+name} %{+name} %{+name} %{+name}`
| *Input* | john jacob jingleheimer schmidt
| *Result* a|
* name = john jacob jingleheimer schmidt
|======
// end::append-modifier[]

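
A processor definition for the append example might look like the following sketch. The `message` source field is an assumption; `append_separator` is set explicitly because it defaults to an empty string, which would join the names with nothing between them.

[source,js]
--------------------------------------------------
{
  "dissect": {
    "field": "message",
    "pattern": "%{+name} %{+name} %{+name} %{+name}",
    "append_separator": " "
  }
}
--------------------------------------------------
// NOTCONSOLE
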
[[append-order-modifier]]
===== Append with order modifier (`+` and `/n`)
[[dissect-modifier-append-key-with-order]]
// tag::append-order-modifier[]
Dissect supports appending two or more results together for the output.
Values are appended based on the order defined (`/n`). An append separator can be specified.
In this example, the `append_separator` is defined as a comma.

Append with order modifier example
|======
| *Pattern* | `%{+name/2} %{+name/4} %{+name/3} %{+name/1}`
| *Input* | john jacob jingleheimer schmidt
| *Result* a|
* name = schmidt,john,jingleheimer,jacob
|======
// end::append-order-modifier[]

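
The ordered variant can be sketched the same way, here with a comma as the separator to match the result shown in the table. As before, the `message` source field is an assumption.

[source,js]
--------------------------------------------------
{
  "dissect": {
    "field": "message",
    "pattern": "%{+name/2} %{+name/4} %{+name/3} %{+name/1}",
    "append_separator": ","
  }
}
--------------------------------------------------
// NOTCONSOLE
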
[[named-skip-key]]
===== Named skip key (`?`)
[[dissect-modifier-named-skip-key]]
// tag::named-skip-key[]
Dissect supports ignoring matches in the final result. This can be done with an empty key `%{}`, but for readability
you may want to give that empty key a name.

Named skip key modifier example
|======
| *Pattern* | `%{clientip} %{?ident} %{?auth} [%{@timestamp}]`
| *Input* | 1.2.3.4 - - [30/Apr/1998:22:00:52 +0000]
| *Result* a|
* clientip = 1.2.3.4
* @timestamp = 30/Apr/1998:22:00:52 +0000
|======
// end::named-skip-key[]

[[reference-keys]]
===== Reference keys (`*` and `&`)
[[dissect-modifier-reference-keys]]
// tag::reference-keys[]
Dissect supports using parsed values as the key/value pairings for the structured content. Imagine a system that
partially logs in key/value pairs. Reference keys allow you to maintain that key/value relationship.

Reference key modifier example
|======
| *Pattern* | `[%{ts}] [%{level}] %{*p1}:%{&p1} %{*p2}:%{&p2}`
| *Input* | [2018-08-10T17:15:42,466] [ERR] ip:1.2.3.4 error:REFUSED
| *Result* a|
* ts = 2018-08-10T17:15:42,466
* level = ERR
* ip = 1.2.3.4
* error = REFUSED
|======
// end::reference-keys[]