[[common-log-format-example]]
== Example: Parse logs in the Common Log Format
++++
<titleabbrev>Example: Parse logs</titleabbrev>
++++

In this example tutorial, you'll use an <<ingest,ingest pipeline>> to parse
server logs in the {wikipedia}/Common_Log_Format[Common Log Format] before
indexing. Before starting, check the <<ingest-prerequisites,prerequisites>> for
ingest pipelines.

The logs you want to parse look similar to this:

[source,log]
----
212.87.37.154 - - [05/May/2099:16:21:15 +0000] "GET /favicon.ico HTTP/1.1" 200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
----
// NOTCONSOLE

These logs contain a timestamp, IP address, and user agent. You want to give
these three items their own field in {es} for faster searches and
visualizations. You also want to know where the request is coming from.

. In {kib}, open the main menu and click **Stack Management** > **Ingest
Pipelines**.
+
[role="screenshot"]
image::images/ingest/ingest-pipeline-list.png[Kibana's Ingest Pipelines list view,align="center"]

. Click **Create pipeline > New pipeline**.

. Set **Name** to `my-pipeline` and optionally add a description for the
pipeline.

. Add a <<grok-processor,grok processor>> to parse the log message:

.. Click **Add a processor** and select the **Grok** processor type.
.. Set **Field** to `message` and **Patterns** to the following
<<grok-basics,grok pattern>>:
+
[source,grok]
----
%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \[%{HTTPDATE:@timestamp}\] "%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}
----
// NOTCONSOLE
+
.. Click **Add** to save the processor.
.. Set the processor description to `Extract fields from 'message'`.
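+
If you want to check how this pattern extracts fields before building out the
rest of the pipeline, you can do so outside the {kib} UI. The following sketch
calls the simulate pipeline API with an inline grok processor and one of the
sample log lines; it is the same pattern as above, with JSON escaping applied:
+
[source,console]
----
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
      }
    }
  ]
}
----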
. Add processors for the timestamp, IP address, and user agent fields.
Configure the processors as follows:
+
--
[options="header"]
|====
| Processor type | Field  | Additional options | Description

| <<date-processor,**Date**>>
| `@timestamp`
| **Formats**: `dd/MMM/yyyy:HH:mm:ss Z`
| `Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'`

| <<geoip-processor,**GeoIP**>>
| `source.ip`
| **Target field**: `source.geo`
| `Add 'source.geo' GeoIP data for 'source.ip'`

| <<user-agent-processor,**User agent**>>
| `user_agent`
|
| `Extract fields from 'user_agent'`
|====

Your form should look similar to this:

[role="screenshot"]
image::images/ingest/ingest-pipeline-processor.png[Processors for Ingest Pipelines,align="center"]

The four processors will run sequentially: +
Grok > Date > GeoIP > User agent +
You can reorder processors using the arrow icons.

Alternatively, you can click the **Import processors** link and define the
processors as JSON:

[source,js]
----
{
include::common-log-format-example.asciidoc[tag=common-log-pipeline]
}
----
// NOTCONSOLE

////
[source,console]
----
PUT _ingest/pipeline/my-pipeline
{
// tag::common-log-pipeline[]
  "processors": [
    {
      "grok": {
        "description": "Extract fields from 'message'",
        "field": "message",
        "patterns": ["%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}"]
      }
    },
    {
      "date": {
        "description": "Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'",
        "field": "@timestamp",
        "formats": [ "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
    },
    {
      "geoip": {
        "description": "Add 'source.geo' GeoIP data for 'source.ip'",
        "field": "source.ip",
        "target_field": "source.geo"
      }
    },
    {
      "user_agent": {
        "description": "Extract fields from 'user_agent'",
        "field": "user_agent"
      }
    }
  ]
// end::common-log-pipeline[]
}
----
////
--

. To test the pipeline, click **Add documents**.

. In the **Documents** tab, provide a sample document for testing:
+
[source,js]
----
[
  {
    "_source": {
      "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
    }
  }
]
----
// NOTCONSOLE

. Click **Run the pipeline** and verify the pipeline worked as expected.

. If everything looks correct, close the panel, and then click **Create
pipeline**.
+
You're now ready to index the logs data to a <<data-streams,data stream>>.

. Create an <<index-templates,index template>> with
<<create-index-template,data stream enabled>>.
+
[source,console]
----
PUT _index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "priority": 500
}
----
// TEST[continued]

. Index a document with the pipeline you created.
+
[source,console]
----
POST my-data-stream/_doc?pipeline=my-pipeline
{
  "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
}
----
// TEST[s/my-pipeline/my-pipeline&refresh=wait_for/]
// TEST[continued]
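+
If you'd rather not pass `?pipeline=my-pipeline` on every indexing request, one
option is to make the pipeline the default for the data stream's backing
indices. The following sketch extends the index template created above with the
`index.default_pipeline` setting; the template name and index pattern are
reused from this tutorial:
+
[source,console]
----
PUT _index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "priority": 500,
  "template": {
    "settings": {
      "index.default_pipeline": "my-pipeline"
    }
  }
}
----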
. To verify, search the data stream to retrieve the document. The following
search uses <<common-options-response-filtering,`filter_path`>> to return only
the <<mapping-source-field,document source>>.
+
--
[source,console]
----
GET my-data-stream/_search?filter_path=hits.hits._source
----
// TEST[continued]

The API returns:

[source,console-result]
----
{
  "hits": {
    "hits": [
      {
        "_source": {
          "@timestamp": "2099-05-05T16:21:15.000Z",
          "http": {
            "request": {
              "referrer": "\"-\"",
              "method": "GET"
            },
            "response": {
              "status_code": 200,
              "body": {
                "bytes": 3638
              }
            },
            "version": "1.1"
          },
          "source": {
            "ip": "89.160.20.128",
            "geo": {
              "continent_name" : "Europe",
              "country_name" : "Sweden",
              "country_iso_code" : "SE",
              "city_name" : "Linköping",
              "region_iso_code" : "SE-E",
              "region_name" : "Östergötland County",
              "location" : {
                "lon" : 15.6167,
                "lat" : 58.4167
              }
            }
          },
          "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
          "url": {
            "original": "/favicon.ico"
          },
          "user": {
            "name": "-",
            "id": "-"
          },
          "user_agent": {
            "original": "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
            "os": {
              "name": "Mac OS X",
              "version": "10.11.6",
              "full": "Mac OS X 10.11.6"
            },
            "name": "Chrome",
            "device": {
              "name": "Mac"
            },
            "version": "52.0.2743.116"
          }
        }
      }
    ]
  }
}
----
--

////
[source,console]
----
DELETE _data_stream/*
DELETE _index_template/*
----
// TEST[continued]
////
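With the log line parsed into discrete fields, you can query those fields
directly instead of running full-text searches over the raw `message`. As a
quick illustration, assuming the document indexed above, the following search
matches on the extracted HTTP method:

[source,console]
----
GET my-data-stream/_search
{
  "query": {
    "match": {
      "http.request.method": "GET"
    }
  }
}
----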