[[common-log-format-example]]
== Example: Parse logs in the Common Log Format
++++
<titleabbrev>Example: Parse logs</titleabbrev>
++++

In this example tutorial, you’ll use an <<ingest,ingest pipeline>> to parse
server logs in the {wikipedia}/Common_Log_Format[Common Log Format] before
indexing. Before starting, check the <<ingest-prerequisites,prerequisites>> for
ingest pipelines.

The logs you want to parse look similar to this:

[source,log]
----
212.87.37.154 - - [30/May/2099:16:21:15 +0000] "GET /favicon.ico HTTP/1.1"
200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"
----
// NOTCONSOLE

These logs contain a timestamp, IP address, and user agent. You want to give
these three items their own field in {es} for faster searches and
visualizations. You also want to know where the request is coming from.

. In {kib}, open the main menu and click **Stack Management** > **Ingest
Pipelines**.
+
[role="screenshot"]
image::images/ingest/ingest-pipeline-list.png[Kibana's Ingest Pipelines list view,align="center"]

. Click **Create pipeline**.

. Provide a name and description for the pipeline.

. Add a <<grok-processor,grok processor>> to parse the log message:

.. Click **Add a processor** and select the **Grok** processor type.
.. Set **Field** to `message` and **Patterns** to the following
<<grok-basics,grok pattern>>:
+
[source,grok]
----
%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \[%{HTTPDATE:@timestamp}\] "%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}
----
// NOTCONSOLE
+
.. Set the processor description to `Extract fields from 'message'`.
.. Click **Add** to save the processor.
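+
If you want to sanity-check the grok pattern outside the UI, you can run it
through the simulate pipeline API with an inline pipeline definition. The
following is a minimal sketch that exercises only the grok processor against
one of the sample log lines; nothing is saved to the cluster:
+
[source,console]
----
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}"]
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "message": "212.87.37.154 - - [30/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
      }
    }
  ]
}
----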

. Add processors for the timestamp, IP address, and user agent fields. Configure
the processors as follows:
+
--
[options="header"]
|====
| Processor type | Field | Additional options | Description

| <<date-processor,**Date**>>
| `@timestamp`
| **Formats**: `dd/MMM/yyyy:HH:mm:ss Z`
| `Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'`

| <<geoip-processor,**GeoIP**>>
| `source.ip`
| **Target field**: `source.geo`
| `Add 'source.geo' GeoIP data for 'source.ip'`

| <<user-agent-processor,**User agent**>>
| `user_agent`
|
| `Extract fields from 'user_agent'`
|====

Your form should look similar to this:

[role="screenshot"]
image::images/ingest/ingest-pipeline-processor.png[Processors for Ingest Pipelines,align="center"]

The four processors will run sequentially: +
Grok > Date > GeoIP > User agent +
You can reorder processors using the arrow icons.

Alternatively, you can click the **Import processors** link and define the
processors as JSON:

[source,js]
----
{
include::common-log-format-example.asciidoc[tag=common-log-pipeline]
}
----
// NOTCONSOLE

////
[source,console]
----
PUT _ingest/pipeline/my-pipeline
{
// tag::common-log-pipeline[]
  "processors": [
    {
      "grok": {
        "description": "Extract fields from 'message'",
        "field": "message",
        "patterns": ["%{IPORHOST:source.ip} %{USER:user.id} %{USER:user.name} \\[%{HTTPDATE:@timestamp}\\] \"%{WORD:http.request.method} %{DATA:url.original} HTTP/%{NUMBER:http.version}\" %{NUMBER:http.response.status_code:int} (?:-|%{NUMBER:http.response.body.bytes:int}) %{QS:http.request.referrer} %{QS:user_agent}"]
      }
    },
    {
      "date": {
        "description": "Format '@timestamp' as 'dd/MMM/yyyy:HH:mm:ss Z'",
        "field": "@timestamp",
        "formats": [ "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
    },
    {
      "geoip": {
        "description": "Add 'source.geo' GeoIP data for 'source.ip'",
        "field": "source.ip",
        "target_field": "source.geo"
      }
    },
    {
      "user_agent": {
        "description": "Extract fields from 'user_agent'",
        "field": "user_agent"
      }
    }
  ]
// end::common-log-pipeline[]
}
----
////
--

. To test the pipeline, click **Add documents**.

. In the **Documents** tab, provide a sample document for testing:
+
[source,js]
----
[
  {
    "_source": {
      "message": "212.87.37.154 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
    }
  }
]
----
// NOTCONSOLE

. Click **Run the pipeline** and verify the pipeline worked as expected.

. If everything looks correct, close the panel, and then click **Create
pipeline**.
+
You’re now ready to index the log data to a <<data-streams,data stream>>.

. Create an <<index-templates,index template>> with
<<create-index-template,data stream enabled>>.
+
[source,console]
----
PUT _index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "priority": 500
}
----
// TEST[continued]

. Index a document with the pipeline you created.
+
[source,console]
----
POST my-data-stream/_doc?pipeline=my-pipeline
{
  "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
}
----
// TEST[s/my-pipeline/my-pipeline&refresh=wait_for/]
// TEST[continued]

. To verify, search the data stream to retrieve the document. The following
search uses <<common-options-response-filtering,`filter_path`>> to return only
the <<mapping-source-field,document source>>.
+
--
[source,console]
----
GET my-data-stream/_search?filter_path=hits.hits._source
----
// TEST[continued]

The API returns:

[source,console-result]
----
{
  "hits": {
    "hits": [
      {
        "_source": {
          "@timestamp": "2099-05-05T16:21:15.000Z",
          "http": {
            "request": {
              "referrer": "\"-\"",
              "method": "GET"
            },
            "response": {
              "status_code": 200,
              "body": {
                "bytes": 3638
              }
            },
            "version": "1.1"
          },
          "source": {
            "ip": "89.160.20.128",
            "geo": {
              "continent_name" : "Europe",
              "country_name" : "Sweden",
              "country_iso_code" : "SE",
              "city_name" : "Linköping",
              "region_iso_code" : "SE-E",
              "region_name" : "Östergötland County",
              "location" : {
                "lon" : 15.6167,
                "lat" : 58.4167
              }
            }
          },
          "message": "89.160.20.128 - - [05/May/2099:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
          "url": {
            "original": "/favicon.ico"
          },
          "user": {
            "name": "-",
            "id": "-"
          },
          "user_agent": {
            "original": "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\"",
            "os": {
              "name": "Mac OS X",
              "version": "10.11.6",
              "full": "Mac OS X 10.11.6"
            },
            "name": "Chrome",
            "device": {
              "name": "Mac"
            },
            "version": "52.0.2743.116"
          }
        }
      }
    ]
  }
}
----
--
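
Rather than passing `?pipeline=my-pipeline` on every index request, you can
make the pipeline the default for the data stream by setting
`index.default_pipeline` in the index template. The following is a minimal
sketch that extends the template created above; the new setting only applies
to backing indices created after the template update:

[source,console]
----
PUT _index_template/my-data-stream-template
{
  "index_patterns": [ "my-data-stream*" ],
  "data_stream": { },
  "priority": 500,
  "template": {
    "settings": {
      "index.default_pipeline": "my-pipeline"
    }
  }
}
----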

////
[source,console]
----
DELETE _data_stream/*
DELETE _index_template/*
----
// TEST[continued]
////