[[common-log-format-example]]
== Example: Parse logs in the Common Log Format
++++
<titleabbrev>Example: Parse logs</titleabbrev>
++++

In this tutorial, you’ll use an <<ingest,ingest pipeline>> to parse
server logs in the {wikipedia}/Common_Log_Format[Common Log Format] before
indexing. Before starting, check the <<ingest-prerequisites,prerequisites>> for
ingest pipelines.

The logs you want to parse look similar to this:

[source,log]
----
212.87.37.154 - - [05/May/2099:16:21:15 +0000] "GET /favicon.ico HTTP/1.1" 200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
----
// NOTCONSOLE

These logs contain a timestamp, IP address, and user agent. You want to give
each of these three items its own field in {es} for faster searches and
visualizations. You also want to know where each request comes from.

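
As a rough illustration of the fields you are after, the following Python
sketch splits the sample line with a regular expression. The regex only
approximates the grok pattern used later in this tutorial and is not how {es}
implements grok. The `LOG_PATTERN` name and the group names are made up for
this example, and unlike the grok `QS` pattern, the sketch drops the
surrounding quotes from the referrer and user agent.

[source,python]
----
import re

# Hypothetical regex that loosely mirrors the layout of the sample log line.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<url>\S+) HTTP/(?P<version>[\d.]+)" '
    r'(?P<status>\d{3}) (?P<bytes>-|\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)

line = (
    '212.87.37.154 - - [05/May/2099:16:21:15 +0000] '
    '"GET /favicon.ico HTTP/1.1" 200 3638 "-" '
    '"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"'
)

# Fall back to an empty dict if the line does not match the expected layout.
match = LOG_PATTERN.match(line)
fields = match.groupdict() if match else {}

for name in ("ip", "timestamp", "user_agent"):
    print(name, "=", fields.get(name))
----
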
. In {kib}, open the main menu and click **Stack Management** > **Ingest
Pipelines**.
+
[role="screenshot"]
image::images/ingest/ingest-pipeline-list.png[Kibana's Ingest Pipelines list view,align="center"]

. Click **Create pipeline > New pipeline**.

. Set **Name** to `my-pipeline` and optionally add a description for the
pipeline.

. Add a <<grok-processor,grok processor>> to parse the log message:

.. Click **Add a processor** and select the **Grok** processor type.
.. Set **Field** to `message` and **Patterns** to the following
<<grok,grok pattern>>:
+
[source,grok]
----
%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \[%{HTTPDATE:@timestamp}\] "%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}
----
// NOTCONSOLE
+
.. Set the processor description to `Extract fields from 'message'`.
.. Click **Add** to save the processor.

. Add processors for the timestamp, IP address, and user agent fields. Configure
the processors as follows:
+
--
[options="header"]
|====
| Processor type | Field | Additional options | Description

| <<date-processor,**Date**>>
| `@timestamp`
| **Formats**: `dd/MMM/yyyy:HH:mm:ss Z`
| `Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'`

| <<geoip-processor,**GeoIP**>>
| `source.ip`
| **Target field**: `source.geo`
| `Add 'source.geo' GeoIP data for 'source.ip'`

| <<user-agent-processor,**User agent**>>
| `user_agent`
|
| `Extract fields from 'user_agent'`
|====

Your form should look similar to this:

[role="screenshot"]
image::images/ingest/ingest-pipeline-processor.png[Processors for Ingest Pipelines,align="center"]

The four processors will run sequentially: +
Grok > Date > GeoIP > User agent +
You can reorder processors using the arrow icons.

Alternatively, you can click the **Import processors** link and define the
processors as JSON:

[source,js]
----
{
include::common-log-format-example.asciidoc[tag=common-log-pipeline]
}
----
// NOTCONSOLE

////
[source,console]
----
PUT _ingest/pipeline/my-pipeline
{
  // tag::common-log-pipeline[]
  "processors": [
    {
      "grok": {
        "description": "Extract fields from 'message'",
        "field": "message",
        "patterns": ["%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}"]
      }
    },
    {
      "date": {
        "description": "Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'",
        "field": "@timestamp",
        "formats": [ "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
    },
    {
      "geoip": {
        "description": "Add 'source.geo' GeoIP data for 'source.ip'",
        "field": "source.ip",
        "target_field": "source.geo"
      }
    },
    {
      "user_agent": {
        "description": "Extract fields from 'user_agent'",
        "field": "user_agent"
      }
    }
  ]
  // end::common-log-pipeline[]
}
----
////
--

. To test the pipeline, click **Add documents**.

. In the **Documents** tab, provide a sample document for testing:
+
[source,js]
----
[
  {
    "_source": {
      "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
    }
  }
]
----
// NOTCONSOLE

. Click **Run the pipeline** and verify the pipeline worked as expected.

. If everything looks correct, close the panel, and then click **Create
pipeline**.
+
You’re now ready to index the log data into a <<data-streams,data stream>>.

. Create an <<index-templates,index template>> with
<<create-index-template,data stream enabled>>.
+
[source,console]
----
PUT _index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "priority": 500
}
----
// TEST[continued]

. Index a document with the pipeline you created.
+
[source,console]
----
POST my-data-stream/_doc?pipeline=my-pipeline
{
  "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
}
----
// TEST[s/my-pipeline/my-pipeline&refresh=wait_for/]
// TEST[continued]

. To verify, search the data stream to retrieve the document. The following
search uses <<common-options-response-filtering,`filter_path`>> to return only
the <<mapping-source-field,document source>>.
+
--
[source,console]
----
GET my-data-stream/_search?filter_path=hits.hits._source
----
// TEST[continued]

The API returns:

[source,console-result]
----
{
  "hits": {
    "hits": [
      {
        "_source": {
          "@timestamp": "2099-05-05T16:21:15.000Z",
          "http": {
            "request": {
              "referrer": "\"-\"",
              "method": "GET"
            },
            "response": {
              "status_code": 200,
              "body": {
                "bytes": 3638
              }
            },
            "version": "1.1"
          },
          "source": {
            "ip": "89.160.20.128",
            "geo": {
              "continent_name" : "Europe",
              "country_name" : "Sweden",
              "country_iso_code" : "SE",
              "city_name" : "Linköping",
              "region_iso_code" : "SE-E",
              "region_name" : "Östergötland County",
              "location" : {
                "lon" : 15.6167,
                "lat" : 58.4167
              }
            }
          },
          "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
          "url": {
            "original": "/favicon.ico"
          },
          "user": {
            "name": "-",
            "id": "-"
          },
          "user_agent": {
            "original": "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
            "os": {
              "name": "Mac OS X",
              "version": "10.11.6",
              "full": "Mac OS X 10.11.6"
            },
            "name": "Chrome",
            "device": {
              "name": "Mac"
            },
            "version": "52.0.2743.116"
          }
        }
      }
    ]
  }
}
----
--

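
The `@timestamp` value in the response is the result of the date processor
applying the `dd/MMM/yyyy:HH:mm:ss Z` format to the raw log timestamp. As a
rough cross-check outside {es}, the following Python sketch parses the same raw
value. The `strptime` directives are an approximate translation of the
Java-style date format, not something {es} uses internally, and `%b` assumes an
English locale for month names.

[source,python]
----
from datetime import datetime, timezone

# Raw timestamp as it appears between the square brackets of the log line.
raw = "05/May/2099:16:21:15 +0000"

# Approximate strptime equivalent of 'dd/MMM/yyyy:HH:mm:ss Z'.
parsed = datetime.strptime(raw, "%d/%b/%Y:%H:%M:%S %z")

# Normalize to UTC and render in the ISO 8601 style shown in the response.
iso = parsed.astimezone(timezone.utc).isoformat(timespec="milliseconds")
iso = iso.replace("+00:00", "Z")
print(iso)
----
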
////
[source,console]
----
DELETE _data_stream/*
DELETE _index_template/*
----
// TEST[continued]
////
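
If you would rather send the indexing request from a script than from the
console, something like the following Python sketch could serve as a starting
point. It uses only the standard library and assumes an unsecured cluster
reachable at `http://localhost:9200`; that URL and the `ES_URL` name are
assumptions for illustration, and a secured cluster would also need
authentication. The request is only built here, and the lines that send it are
left commented out so the sketch runs without a cluster.

[source,python]
----
import json
import urllib.request

# Assumption: a local, unsecured cluster. Adjust the URL and add auth as needed.
ES_URL = "http://localhost:9200"

doc = {
    "message": (
        '89.160.20.128 - - [05/May/2099:16:21:15 +0000] '
        '"GET /favicon.ico HTTP/1.1" 200 3638 "-" '
        '"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 '
        '(KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"'
    )
}

# Same target as the console request: the data stream plus the pipeline parameter.
request = urllib.request.Request(
    f"{ES_URL}/my-data-stream/_doc?pipeline=my-pipeline",
    data=json.dumps(doc).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Uncomment to send the request to a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
print(request.get_method(), request.full_url)
----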