[[common-log-format-example]]
== Example: Parse logs in the Common Log Format
++++
<titleabbrev>Example: Parse logs</titleabbrev>
++++

In this tutorial, you’ll use an <<ingest,ingest pipeline>> to parse
server logs in the {wikipedia}/Common_Log_Format[Common Log Format] before
indexing. Before starting, check the <<ingest-prerequisites,prerequisites>> for
ingest pipelines.

The logs you want to parse look similar to this:

[source,log]
----
212.87.37.154 - - [30/May/2099:16:21:15 +0000] "GET /favicon.ico HTTP/1.1"
200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
----
// NOTCONSOLE

These logs contain a timestamp, IP address, and user agent. You want to give
each of these three items its own field in {es} for faster searches and
visualizations. You also want to know the geographic location each request
comes from.
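
As a rough illustration of what the pipeline needs to extract, the following
Python sketch pulls the same pieces out of the sample line with a regular
expression. The regex and its group names are assumptions made for this
illustration only. The tutorial itself uses a grok processor, which the steps
below configure.

[source,python]
----
import re

# Sample line from this tutorial, joined onto one line with plain double quotes.
LINE = (
    '212.87.37.154 - - [30/May/2099:16:21:15 +0000] "GET /favicon.ico HTTP/1.1" '
    '200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) '
    'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"'
)

# Hypothetical regex for illustration; it is not the grok pattern itself.
LOG_RE = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<url>\S+) HTTP/(?P<http_version>[\d.]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line):
    """Return the captured fields as a dict, or None if the line does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


fields = parse_line(LINE)
print(fields["timestamp"])
print(fields["ip"])
print(fields["user_agent"])
----

Every captured value is still a string at this point. Turning the timestamp
into a date and enriching the IP address and user agent is the job of the
additional processors added later in this tutorial.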

. In {kib}, open the main menu and click **Stack Management** > **Ingest
Pipelines**.
+
[role="screenshot"]
image::images/ingest/ingest-pipeline-list.png[Kibana's Ingest Pipelines list view,align="center"]

. Click **Create pipeline**.
. Provide a name and description for the pipeline.
. Add a <<grok-processor,grok processor>> to parse the log message:

.. Click **Add a processor** and select the **Grok** processor type.
.. Set **Field** to `message` and **Patterns** to the following
<<grok-basics,grok pattern>>:
+
[source,grok]
----
%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}
----
// NOTCONSOLE
+
.. Click **Add** to save the processor.
.. Set the processor description to `Extract fields from 'message'`.

. Add processors for the timestamp, IP address, and user agent fields. Configure
the processors as follows:
+
--

[options="header"]
|====
| Processor type | Field | Additional options | Description

| <<date-processor,**Date**>>
| `@timestamp`
| **Formats**: `dd/MMM/yyyy:HH:mm:ss Z`
| `Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'`

| <<geoip-processor,**GeoIP**>>
| `source.ip`
| **Target field**: `source.geo`
| `Add 'source.geo' GeoIP data for 'source.ip'`

| <<user-agent-processor,**User agent**>>
| `user_agent`
|
| `Extract fields from 'user_agent'`
|====
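
The **Formats** value is a Java-style date pattern. As a sanity check on what
it matches, here is a minimal Python sketch that parses the same timestamp
text. The `strptime` directives are an assumed hand translation of
`dd/MMM/yyyy:HH:mm:ss Z`, not something the date processor uses, and `%b`
assumes English month names.

[source,python]
----
from datetime import datetime, timezone

# Assumed strptime equivalent of the Java-style pattern dd/MMM/yyyy:HH:mm:ss Z.
PY_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_clf_timestamp(text):
    """Parse a Common Log Format timestamp and normalize it to UTC."""
    return datetime.strptime(text, PY_FORMAT).astimezone(timezone.utc)


parsed = parse_clf_timestamp("05/May/2099:16:21:15 +0000")
print(parsed.isoformat())
----

For the sample value this prints `2099-05-05T16:21:15+00:00`, which is the same
instant as the `@timestamp` value `2099-05-05T16:21:15.000Z` in the search
response at the end of this tutorial.
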

Your form should look similar to this:

[role="screenshot"]
image::images/ingest/ingest-pipeline-processor.png[Processors for Ingest Pipelines,align="center"]

The four processors will run sequentially: +
Grok > Date > GeoIP > User agent +
You can reorder processors using the arrow icons.

Alternatively, you can click the **Import processors** link and define the
processors as JSON:

[source,js]
----
{
include::common-log-format-example.asciidoc[tag=common-log-pipeline]
}
----
// NOTCONSOLE
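
If you would rather generate that JSON than type it, the following Python
sketch builds the four processors from the table above and serializes them. It
is only an illustration, and the variable names are made up here. The grok
pattern is written as a raw string, so `json.dumps` adds the backslash and
double-quote escaping that JSON requires.

[source,python]
----
import json

# Grok pattern as the processor should receive it, before any JSON escaping.
GROK_PATTERN = (
    r'%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} '
    r'\[%{HTTPDATE:@timestamp}\] '
    r'"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}" '
    r'%{NUMBER:http.response.status_code:int} '
    r'(?:-|%{NUMBER:http.response.body.bytes:int}) '
    r'%{QS:http.request.referrer} %{QS:user_agent}'
)

# The same four processors, in the same order, as configured in the form.
pipeline = {
    "processors": [
        {"grok": {
            "description": "Extract fields from 'message'",
            "field": "message",
            "patterns": [GROK_PATTERN],
        }},
        {"date": {
            "description": "Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'",
            "field": "@timestamp",
            "formats": ["dd/MMM/yyyy:HH:mm:ss Z"],
        }},
        {"geoip": {
            "description": "Add 'source.geo' GeoIP data for 'source.ip'",
            "field": "source.ip",
            "target_field": "source.geo",
        }},
        {"user_agent": {
            "description": "Extract fields from 'user_agent'",
            "field": "user_agent",
        }},
    ]
}

pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
----

In the printed JSON, each backslash in the pattern is doubled and each double
quote is preceded by a backslash. Those are the only differences between the
pattern as the grok processor sees it and the pattern as it appears inside a
JSON string.
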

////
[source,console]
----
PUT _ingest/pipeline/my-pipeline
{
  // tag::common-log-pipeline[]
  "processors": [
    {
      "grok": {
        "description": "Extract fields from 'message'",
        "field": "message",
        "patterns": ["%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}"]
      }
    },
    {
      "date": {
        "description": "Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'",
        "field": "@timestamp",
        "formats": [ "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
    },
    {
      "geoip": {
        "description": "Add 'source.geo' GeoIP data for 'source.ip'",
        "field": "source.ip",
        "target_field": "source.geo"
      }
    },
    {
      "user_agent": {
        "description": "Extract fields from 'user_agent'",
        "field": "user_agent"
      }
    }
  ]
  // end::common-log-pipeline[]
}
----
////

--

. To test the pipeline, click **Add documents**.

. In the **Documents** tab, provide a sample document for testing:
+
[source,js]
----
[
  {
    "_source": {
      "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
    }
  }
]
----
// NOTCONSOLE

. Click **Run the pipeline** and verify the pipeline worked as expected.

. If everything looks correct, close the panel, and then click **Create
pipeline**.
+
You’re now ready to index the log data to a <<data-streams,data stream>>.

. Create an <<index-templates,index template>> with
<<create-index-template,data stream enabled>>.
+
[source,console]
----
PUT _index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "priority": 500
}
----
// TEST[continued]

. Index a document with the pipeline you created.
+
[source,console]
----
POST my-data-stream/_doc?pipeline=my-pipeline
{
  "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
}
----
// TEST[s/my-pipeline/my-pipeline&refresh=wait_for/]
// TEST[continued]

. To verify, search the data stream to retrieve the document. The following
search uses <<common-options-response-filtering,`filter_path`>> to return only
the <<mapping-source-field,document source>>.
+
--
[source,console]
----
GET my-data-stream/_search?filter_path=hits.hits._source
----
// TEST[continued]

The API returns:

[source,console-result]
----
{
  "hits": {
    "hits": [
      {
        "_source": {
          "@timestamp": "2099-05-05T16:21:15.000Z",
          "http": {
            "request": {
              "referrer": "\"-\"",
              "method": "GET"
            },
            "response": {
              "status_code": 200,
              "body": {
                "bytes": 3638
              }
            },
            "version": "1.1"
          },
          "source": {
            "ip": "89.160.20.128",
            "geo": {
              "continent_name" : "Europe",
              "country_name" : "Sweden",
              "country_iso_code" : "SE",
              "city_name" : "Linköping",
              "region_iso_code" : "SE-E",
              "region_name" : "Östergötland County",
              "location" : {
                "lon" : 15.6167,
                "lat" : 58.4167
              }
            }
          },
          "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
          "url": {
            "original": "/favicon.ico"
          },
          "user": {
            "name": "-",
            "id": "-"
          },
          "user_agent": {
            "original": "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
            "os": {
              "name": "Mac OS X",
              "version": "10.11.6",
              "full": "Mac OS X 10.11.6"
            },
            "name": "Chrome",
            "device": {
              "name": "Mac"
            },
            "version": "52.0.2743.116"
          }
        }
      }
    ]
  }
}
----
--
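
Note that the dotted capture names in the grok pattern, such as
`http.request.method`, show up in the response as nested objects. The following
Python sketch mimics that expansion for a few fields from the response above.
The `set_path` helper is hypothetical and only illustrates the resulting shape.
It is not how {es} implements it.

[source,python]
----
def set_path(document, dotted_name, value):
    """Store value under a dotted field name, creating nested dicts as needed."""
    *parents, leaf = dotted_name.split(".")
    node = document
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
    return document


# A few dotted field names from the grok pattern, with values from the response.
captures = {
    "source.ip": "89.160.20.128",
    "http.request.method": "GET",
    "http.response.status_code": 200,
    "http.response.body.bytes": 3638,
    "url.original": "/favicon.ico",
}

doc = {}
for name, value in captures.items():
    set_path(doc, name, value)

print(doc["http"])
----

The printed `http` object has the same `request` and `response` nesting as the
`http` object in the response, minus the `referrer` and `version` fields that
this sketch leaves out.
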

////
[source,console]
----
DELETE _data_stream/*
DELETE _index_template/*
----
// TEST[continued]
////