[[common-log-format-example]]
== Example: Parse logs in the Common Log Format
++++
<titleabbrev>Example: Parse logs</titleabbrev>
++++

In this example tutorial, you’ll use an <<ingest,ingest pipeline>> to parse
server logs in the {wikipedia}/Common_Log_Format[Common Log Format] before
indexing. Before starting, check the <<ingest-prerequisites,prerequisites>> for
ingest pipelines.

The logs you want to parse look similar to this:

[source,js]
----
212.87.37.154 - - [30/May/2099:16:21:15 +0000] "GET /favicon.ico HTTP/1.1"
200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
----
// NOTCONSOLE

These logs contain an IP address, timestamp, and user agent. You want to give
each of these items its own field in {es} for faster searches and
visualizations. You also want to know where the request is coming from.

. In {kib}, open the main menu and click **Stack Management** > **Ingest Node
Pipelines**.
+
[role="screenshot"]
image::images/ingest/ingest-pipeline-list.png[Kibana's Ingest Node Pipelines list view,align="center"]

. Click **Create a pipeline**.
. Provide a name and description for the pipeline.
. Add a <<grok-processor,grok processor>> to parse the log message:

.. Click **Add a processor** and select the **Grok** processor type.
.. Set the field input to `message` and enter the following <<grok-basics,grok
pattern>>:
+
[source,js]
----
%{IPORHOST:client.ip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:@timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:user_agent}
----
// NOTCONSOLE
+
.. Click **Add** to save the processor.

. Add processors to map the date, IP, and user agent fields. Map the appropriate
field to each processor type:
+
--
* <<date-processor,**Date**>>: `@timestamp`
* <<geoip-processor,**GeoIP**>>: `client.ip`
* <<user-agent-processor,**User agent**>>: `user_agent`

In the **Date** processor, specify the date format used in the logs:
`dd/MMM/yyyy:HH:mm:ss Z`. Use lowercase `yyyy` for the calendar year. Uppercase
`YYYY` is the week-based year and can produce an incorrect date.
--
Your form should look similar to this:
+
[role="screenshot"]
image::images/ingest/ingest-pipeline-processor.png[Processors for Ingest Node Pipelines,align="center"]
+
The four processors will run sequentially: +
Grok > Date > GeoIP > User agent +
You can reorder processors using the arrow icons.
+
Alternatively, you can click the **Import processors** link and define the
processors as JSON:
+
[source,console]
----
{
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{IPORHOST:client.ip} %{USER:ident} %{USER:auth} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:user_agent}"]
      }
    },
    {
      "date": {
        "field": "@timestamp",
        "formats": [ "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
    },
    {
      "geoip": {
        "field": "client.ip"
      }
    },
    {
      "user_agent": {
        "field": "user_agent"
      }
    }
  ]
}
----
// TEST[s/^/PUT _ingest\/pipeline\/my-pipeline\n/]

. To test the pipeline, click **Add documents**.

. In the **Documents** tab, provide a sample document for testing:
+
[source,js]
----
[
  {
    "_source": {
      "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
    }
  }
]
----
// NOTCONSOLE

. Click **Run the pipeline** and verify that the pipeline worked as expected.

. If everything looks correct, close the panel, and then click **Create
pipeline**.
+
You’re now ready to load the log data using the <<docs-index_,index API>>.

. Index a document with the pipeline you created.
+
[source,console]
----
PUT my-index/_doc/1?pipeline=my-pipeline
{
  "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
}
----
// TEST[continued]

. To verify, run:
+
[source,console]
----
GET my-index/_doc/1
----
// TEST[continued]

////
[source,console-result]
----
{
  "_index": "my-index",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "request": "/favicon.ico",
    "geoip": {
      "continent_name": "Europe",
      "region_iso_code": "DE-BE",
      "city_name": "Berlin",
      "country_iso_code": "DE",
      "country_name": "Germany",
      "region_name": "Land Berlin",
      "location": {
        "lon": 13.4978,
        "lat": 52.411
      }
    },
    "auth": "-",
    "ident": "-",
    "verb": "GET",
    "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
    "referrer": "\"-\"",
    "@timestamp": "2099-05-05T16:21:15.000Z",
    "response": 200,
    "bytes": 3638,
    "client": {
      "ip": "212.87.37.154"
    },
    "httpversion": "1.1",
    "user_agent": {
      "original": "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
      "os": {
        "name": "Mac OS X",
        "version": "10.11.6",
        "full": "Mac OS X 10.11.6"
      },
      "name": "Chrome",
      "device": {
        "name": "Mac"
      },
      "version": "52.0.2743.116"
    }
  }
}
----
////
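
The test you ran in {kib} can likely also be reproduced from the console with
the simulate pipeline API, which runs a pipeline against sample documents
without indexing them. The following request is a minimal sketch. It assumes
the pipeline was saved under the name `my-pipeline`, the same name used in the
index request above, and it reuses the sample document from the **Documents**
tab.

[source,console]
----
POST _ingest/pipeline/my-pipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
      }
    }
  ]
}
----
// TEST[continued]

The response should contain one entry per sample document, showing the
document as it would look after the processors run. Nothing is written to
`my-index`, so you can adjust the pipeline and rerun the request as often as
needed.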