[[common-log-format-example]]
== Example: Parse logs in the Common Log Format
++++
<titleabbrev>Example: Parse logs</titleabbrev>
++++

In this example tutorial, you’ll use an <<ingest,ingest pipeline>> to parse
server logs in the {wikipedia}/Common_Log_Format[Common Log Format] before
indexing. Before starting, check the <<ingest-prerequisites,prerequisites>> for
ingest pipelines.

The logs you want to parse look similar to this:

[source,log]
----
212.87.37.154 - - [05/May/2099:16:21:15 +0000] "GET /favicon.ico HTTP/1.1" 200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
----
// NOTCONSOLE
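
Each part of the line carries a distinct piece of information. As a rough
breakdown of the sample (the labels are descriptive only, not the final {es}
field names):

[source,txt]
----
212.87.37.154                    client IP address
- -                              user identity and user name (unset here)
[05/May/2099:16:21:15 +0000]     request timestamp
"GET /favicon.ico HTTP/1.1"      HTTP method, path, and version
200 3638                         response status code and body size in bytes
"-"                              referrer (unset here)
"Mozilla/5.0 (Macintosh; ...)"   user agent
----
// NOTCONSOLE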

These logs contain a timestamp, IP address, and user agent. You want to give
these three items their own field in {es} for faster searches and
visualizations. You also want to know where the request is coming from.

. In {kib}, open the main menu and click **Stack Management** > **Ingest
Pipelines**.
+
[role="screenshot"]
image::images/ingest/ingest-pipeline-list.png[Kibana's Ingest Pipelines list view,align="center"]

. Click **Create pipeline > New pipeline**.

. Set **Name** to `my-pipeline` and optionally add a description for the
pipeline.

. Add a <<grok-processor,grok processor>> to parse the log message:

.. Click **Add a processor** and select the **Grok** processor type.
.. Set **Field** to `message` and **Patterns** to the following
<<grok,grok pattern>>:
+
[source,grok]
----
%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \[%{HTTPDATE:@timestamp}\] "%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}
----
// NOTCONSOLE
+
.. Click **Add** to save the processor.
.. Set the processor description to `Extract fields from 'message'`.
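+
If you want to check the grok pattern before continuing in {kib}, one option is
the simulate pipeline API with only the grok processor defined. This optional
sketch echoes the parsed fields back without indexing anything:
+
[source,console]
----
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
      }
    }
  ]
}
----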

. Add processors for the timestamp, IP address, and user agent fields. Configure
the processors as follows:
+
--
[options="header"]
|====
| Processor type | Field | Additional options | Description

| <<date-processor,**Date**>>
| `@timestamp`
| **Formats**: `dd/MMM/yyyy:HH:mm:ss Z`
| `Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'`

| <<geoip-processor,**GeoIP**>>
| `source.ip`
| **Target field**: `source.geo`
| `Add 'source.geo' GeoIP data for 'source.ip'`

| <<user-agent-processor,**User agent**>>
| `user_agent`
|
| `Extract fields from 'user_agent'`
|====

Your form should look similar to this:

[role="screenshot"]
image::images/ingest/ingest-pipeline-processor.png[Processors for Ingest Pipelines,align="center"]

The four processors will run sequentially: +
Grok > Date > GeoIP > User agent +
You can reorder processors using the arrow icons.

Alternatively, you can click the **Import processors** link and define the
processors as JSON:

[source,js]
----
{
include::common-log-format-example.asciidoc[tag=common-log-pipeline]
}
----
// NOTCONSOLE

////
[source,console]
----
PUT _ingest/pipeline/my-pipeline
{
// tag::common-log-pipeline[]
  "processors": [
    {
      "grok": {
        "description": "Extract fields from 'message'",
        "field": "message",
        "patterns": ["%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}"]
      }
    },
    {
      "date": {
        "description": "Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'",
        "field": "@timestamp",
        "formats": [ "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
    },
    {
      "geoip": {
        "description": "Add 'source.geo' GeoIP data for 'source.ip'",
        "field": "source.ip",
        "target_field": "source.geo"
      }
    },
    {
      "user_agent": {
        "description": "Extract fields from 'user_agent'",
        "field": "user_agent"
      }
    }
  ]
// end::common-log-pipeline[]
}
----
// TEST[skip:This can output a warning, and asciidoc doesn't support allowed_warnings]
////
--

. To test the pipeline, click **Add documents**.

. In the **Documents** tab, provide a sample document for testing:
+
[source,js]
----
[
  {
    "_source": {
      "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
    }
  }
]
----
// NOTCONSOLE

. Click **Run the pipeline** and verify the pipeline worked as expected.

. If everything looks correct, close the panel, and then click **Create
pipeline**.
+
You’re now ready to index the logs data to a <<data-streams,data stream>>.
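+
Optionally, you can confirm that the pipeline was saved by retrieving it by
name:
+
[source,console]
----
GET _ingest/pipeline/my-pipeline
----
// TEST[continued]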

. Create an <<index-templates,index template>> with
<<create-index-template,data stream enabled>>.
+
[source,console]
----
PUT _index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "priority": 500
}
----
// TEST[continued]
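+
As an optional variant, you could add the `index.default_pipeline` setting to
the template so that documents are run through `my-pipeline` automatically,
without the `pipeline` query parameter used in the next step. A sketch of that
variant:
+
[source,console]
----
PUT _index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "priority": 500,
  "template": {
    "settings": {
      "index.default_pipeline": "my-pipeline"
    }
  }
}
----
// TEST[skip:optional variant of the template above]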

. Index a document with the pipeline you created.
+
[source,console]
----
POST my-data-stream/_doc?pipeline=my-pipeline
{
  "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
}
----
// TEST[s/my-pipeline/my-pipeline&refresh=wait_for/]
// TEST[continued]
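+
In a real ingestion scenario you’d typically send many log lines at once. The
bulk API accepts the same `pipeline` query parameter; data streams require the
`create` action. This sketch, which reuses the two sample lines from this
tutorial, is not part of the tutorial’s flow (the next step assumes a single
indexed document):
+
[source,console]
----
POST my-data-stream/_bulk?pipeline=my-pipeline
{ "create": { } }
{ "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"" }
{ "create": { } }
{ "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"" }
----
// TEST[skip:optional aside, not part of the tutorial flow]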

. To verify, search the data stream to retrieve the document. The following
search uses <<common-options-response-filtering,`filter_path`>> to return only
the <<mapping-source-field,document source>>.
+
--
[source,console]
----
GET my-data-stream/_search?filter_path=hits.hits._source
----
// TEST[continued]

The API returns:

[source,console-result]
----
{
  "hits": {
    "hits": [
      {
        "_source": {
          "@timestamp": "2099-05-05T16:21:15.000Z",
          "http": {
            "request": {
              "referrer": "\"-\"",
              "method": "GET"
            },
            "response": {
              "status_code": 200,
              "body": {
                "bytes": 3638
              }
            },
            "version": "1.1"
          },
          "source": {
            "ip": "89.160.20.128",
            "geo": {
              "continent_name" : "Europe",
              "country_name" : "Sweden",
              "country_iso_code" : "SE",
              "city_name" : "Linköping",
              "region_iso_code" : "SE-E",
              "region_name" : "Östergötland County",
              "location" : {
                "lon" : 15.6167,
                "lat" : 58.4167
              }
            }
          },
          "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
          "url": {
            "original": "/favicon.ico"
          },
          "user": {
            "name": "-",
            "id": "-"
          },
          "user_agent": {
            "original": "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
            "os": {
              "name": "Mac OS X",
              "version": "10.11.6",
              "full": "Mac OS X 10.11.6"
            },
            "name": "Chrome",
            "device": {
              "name": "Mac"
            },
            "version": "52.0.2743.116"
          }
        }
      }
    ]
  }
}
----
--
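+
Because the log line is now parsed into structured fields, you can search and
aggregate on those fields directly instead of the raw `message` text. For
example, this sketch finds requests whose GeoIP lookup resolved to Sweden,
using the `source.geo.country_name` field that the GeoIP processor added:
+
[source,console]
----
GET my-data-stream/_search
{
  "query": {
    "match": {
      "source.geo.country_name": "Sweden"
    }
  }
}
----
// TEST[continued]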

////
[source,console]
----
DELETE _data_stream/*
DELETE _index_template/*
----
// TEST[continued]
////