- [[geoip-processor]]
- === GeoIP processor
- ++++
- <titleabbrev>GeoIP</titleabbrev>
- ++++
- The `geoip` processor adds information about the geographical location of an
- IPv4 or IPv6 address.
- [[geoip-automatic-updates]]
- By default, the processor uses the GeoLite2 City, GeoLite2 Country, and GeoLite2
- ASN GeoIP2 databases from
- http://dev.maxmind.com/geoip/geoip2/geolite2/[MaxMind], shared under the
- CC BY-SA 4.0 license. {es} automatically downloads updates for
- these databases from the Elastic GeoIP endpoint:
- https://geoip.elastic.co/v1/database. To get download statistics for these
- updates, use the <<geoip-stats-api,GeoIP stats API>>.
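As a brief illustration, the stats request takes no body. The fields in the response can vary between versions, so no sample output is shown here.

```console
# Retrieve download statistics for the automatically updated GeoIP2 databases
GET _ingest/geoip/stats
```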
- If your cluster can't connect to the Elastic GeoIP endpoint or you want to
- manage your own updates, see <<manage-geoip-database-updates>>.
- If {es} can't connect to the endpoint for 30 days, all updated databases become
- invalid. {es} then stops enriching documents with geoip data and adds a
- `tags: ["_geoip_expired_database"]` field instead.
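One way to spot documents that were ingested after the databases expired is to search for that tag. The request below is a minimal sketch: the index name `my-index-000001` is borrowed from the examples in this page, and it assumes the `tags` field was mapped dynamically.

```console
# Assumption: documents were indexed into my-index-000001 through a geoip pipeline
GET my-index-000001/_search
{
  "query": {
    "match": {
      "tags": "_geoip_expired_database"
    }
  }
}
```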
- [[using-ingest-geoip]]
- ==== Using the `geoip` Processor in a Pipeline
- [[ingest-geoip-options]]
- .`geoip` options
- [options="header"]
- |======
- | Name | Required | Default | Description
- | `field` | yes | - | The field to get the IP address from for the geographical lookup.
- | `target_field` | no | geoip | The field that will hold the geographical information looked up from the MaxMind database.
- | `database_file` | no | GeoLite2-City.mmdb | The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the `ingest-geoip` config directory.
- | `properties` | no | [`continent_name`, `country_iso_code`, `country_name`, `region_iso_code`, `region_name`, `city_name`, `location`] * | Controls what properties are added to the `target_field` based on the geoip lookup.
- | `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document.
- | `first_only` | no | `true` | If `true`, only the first geoip data found is returned, even if `field` contains an array.
- |======
- *Depends on what is available in `database_file`:
- * If the GeoLite2 City database is used, then the following fields may be added under the `target_field`: `ip`,
- `country_iso_code`, `country_name`, `continent_name`, `region_iso_code`, `region_name`, `city_name`, `timezone`, `latitude`, `longitude`
- and `location`. The fields actually added depend on what has been found and which properties were configured in `properties`.
- * If the GeoLite2 Country database is used, then the following fields may be added under the `target_field`: `ip`,
- `country_iso_code`, `country_name` and `continent_name`. The fields actually added depend on what has been found and which properties
- were configured in `properties`.
- * If the GeoLite2 ASN database is used, then the following fields may be added under the `target_field`: `ip`,
- `asn`, `organization_name` and `network`. The fields actually added depend on what has been found and which properties were configured
- in `properties`.
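The options above can be tried without indexing anything by using the simulate pipeline API. The following is a sketch rather than a recommended configuration: the chosen `properties` subset and the second test document (which has no `ip` field) are illustrative assumptions.

```console
# Simulate a geoip processor that keeps only three properties.
# The second document lacks the `ip` field; with `ignore_missing` set to true
# the processor exits quietly instead of failing.
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "processors": [
      {
        "geoip": {
          "field": "ip",
          "properties": ["country_iso_code", "city_name", "location"],
          "ignore_missing": true
        }
      }
    ]
  },
  "docs": [
    { "_source": { "ip": "89.160.20.128" } },
    { "_source": { "message": "no ip field here" } }
  ]
}
```

Which properties actually appear in the result still depends on what the database contains for the address.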
- Here is an example that uses the default city database and adds the geographical information to the `geoip` field based on the `ip` field:
- [source,console]
- --------------------------------------------------
- PUT _ingest/pipeline/geoip
- {
-   "description" : "Add geoip info",
-   "processors" : [
-     {
-       "geoip" : {
-         "field" : "ip"
-       }
-     }
-   ]
- }
- PUT my-index-000001/_doc/my_id?pipeline=geoip
- {
-   "ip": "89.160.20.128"
- }
- GET my-index-000001/_doc/my_id
- --------------------------------------------------
- Which returns:
- [source,console-result]
- --------------------------------------------------
- {
-   "found": true,
-   "_index": "my-index-000001",
-   "_id": "my_id",
-   "_version": 1,
-   "_seq_no": 55,
-   "_primary_term": 1,
-   "_source": {
-     "ip": "89.160.20.128",
-     "geoip": {
-       "continent_name": "Europe",
-       "country_name": "Sweden",
-       "country_iso_code": "SE",
-       "city_name" : "Linköping",
-       "region_iso_code" : "SE-E",
-       "region_name" : "Östergötland County",
-       "location": { "lat": 58.4167, "lon": 15.6167 }
-     }
-   }
- }
- --------------------------------------------------
- // TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term":1/"_primary_term" : $body._primary_term/]
- Here is an example that uses the default country database and adds the
- geographical information to the `geo` field based on the `ip` field. Note that
- this database is included in the module. So this:
- [source,console]
- --------------------------------------------------
- PUT _ingest/pipeline/geoip
- {
-   "description" : "Add geoip info",
-   "processors" : [
-     {
-       "geoip" : {
-         "field" : "ip",
-         "target_field" : "geo",
-         "database_file" : "GeoLite2-Country.mmdb"
-       }
-     }
-   ]
- }
- PUT my-index-000001/_doc/my_id?pipeline=geoip
- {
-   "ip": "89.160.20.128"
- }
- GET my-index-000001/_doc/my_id
- --------------------------------------------------
- returns this:
- [source,console-result]
- --------------------------------------------------
- {
-   "found": true,
-   "_index": "my-index-000001",
-   "_id": "my_id",
-   "_version": 1,
-   "_seq_no": 65,
-   "_primary_term": 1,
-   "_source": {
-     "ip": "89.160.20.128",
-     "geo": {
-       "continent_name": "Europe",
-       "country_name": "Sweden",
-       "country_iso_code": "SE"
-     }
-   }
- }
- --------------------------------------------------
- // TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]
- Not all IP addresses have geo information in the database. When this
- occurs, no `target_field` is inserted into the document.
- Here is an example of how a document is indexed when information for "80.231.5.0"
- cannot be found:
- [source,console]
- --------------------------------------------------
- PUT _ingest/pipeline/geoip
- {
-   "description" : "Add geoip info",
-   "processors" : [
-     {
-       "geoip" : {
-         "field" : "ip"
-       }
-     }
-   ]
- }
- PUT my-index-000001/_doc/my_id?pipeline=geoip
- {
-   "ip": "80.231.5.0"
- }
- GET my-index-000001/_doc/my_id
- --------------------------------------------------
- Which returns:
- [source,console-result]
- --------------------------------------------------
- {
-   "_index" : "my-index-000001",
-   "_id" : "my_id",
-   "_version" : 1,
-   "_seq_no" : 71,
-   "_primary_term": 1,
-   "found" : true,
-   "_source" : {
-     "ip" : "80.231.5.0"
-   }
- }
- --------------------------------------------------
- // TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]
- [[ingest-geoip-mappings-note]]
- ===== Recognizing Location as a Geopoint
- Although this processor enriches your document with a `location` field containing
- the estimated latitude and longitude of the IP address, this field will not be
- indexed as a {ref}/geo-point.html[`geo_point`] type in Elasticsearch without explicitly defining it
- as such in the mapping.
- For example, the following request creates a `my_ip_locations` index that maps the `geoip.location` field as a `geo_point`:
- [source,console]
- --------------------------------------------------
- PUT my_ip_locations
- {
-   "mappings": {
-     "properties": {
-       "geoip": {
-         "properties": {
-           "location": { "type": "geo_point" }
-         }
-       }
-     }
-   }
- }
- --------------------------------------------------
- ////
- [source,console]
- --------------------------------------------------
- PUT _ingest/pipeline/geoip
- {
- "description" : "Add geoip info",
- "processors" : [
- {
- "geoip" : {
- "field" : "ip"
- }
- }
- ]
- }
- PUT my_ip_locations/_doc/1?refresh=true&pipeline=geoip
- {
- "ip": "89.160.20.128"
- }
- GET /my_ip_locations/_search
- {
- "query": {
- "bool": {
- "must": {
- "match_all": {}
- },
- "filter": {
- "geo_distance": {
- "distance": "1m",
- "geoip.location": {
- "lon": 15.6167,
- "lat": 58.4167
- }
- }
- }
- }
- }
- }
- --------------------------------------------------
- // TEST[continued]
- [source,console-result]
- --------------------------------------------------
- {
- "took" : 3,
- "timed_out" : false,
- "_shards" : {
- "total" : 1,
- "successful" : 1,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value": 1,
- "relation": "eq"
- },
- "max_score" : 1.0,
- "hits" : [
- {
- "_index" : "my_ip_locations",
- "_id" : "1",
- "_score" : 1.0,
- "_source" : {
- "geoip" : {
- "continent_name" : "Europe",
- "country_name" : "Sweden",
- "country_iso_code" : "SE",
- "city_name" : "Linköping",
- "region_iso_code" : "SE-E",
- "region_name" : "Östergötland County",
- "location" : {
- "lon" : 15.6167,
- "lat" : 58.4167
- }
- },
- "ip" : "89.160.20.128"
- }
- }
- ]
- }
- }
- --------------------------------------------------
- // TESTRESPONSE[s/"took" : 3/"took" : $body.took/]
- ////
- [[manage-geoip-database-updates]]
- ==== Manage your own GeoIP2 database updates
- If you can't <<geoip-automatic-updates,automatically update>> your GeoIP2
- databases from the Elastic endpoint, you have a few other options:
- * <<use-proxy-geoip-endpoint,Use a proxy endpoint>>
- * <<use-custom-geoip-endpoint,Use a custom endpoint>>
- * <<manually-update-geoip-databases,Manually update your GeoIP2 databases>>
- [[use-proxy-geoip-endpoint]]
- **Use a proxy endpoint**
- If you can't connect directly to the Elastic GeoIP endpoint, consider setting up
- a secure proxy. You can then specify the proxy endpoint URL in the
- <<ingest-geoip-downloader-endpoint,`ingest.geoip.downloader.endpoint`>> setting
- of each node’s `elasticsearch.yml` file.
- [IMPORTANT]
- ====
- In air-gapped environments, the {es} nodes require access to `https://geoip.elastic.co`
- and `https://storage.googleapis.com/`.
- ====
- [[use-custom-geoip-endpoint]]
- **Use a custom endpoint**
- You can create a service that mimics the Elastic GeoIP endpoint. You can then
- get automatic updates from this service.
- . Download your `.mmdb` database files from the
- http://dev.maxmind.com/geoip/geoip2/geolite2[MaxMind site].
- . Copy your database files to a single directory.
- . From your {es} directory, run:
- +
- [source,sh]
- ----
- ./bin/elasticsearch-geoip -s my/source/dir [-t target/directory]
- ----
- . Serve the static database files from your directory. For example, you can use
- Docker to serve the files from an nginx server:
- +
- [source,sh]
- ----
- docker run -v my/source/dir:/usr/share/nginx/html:ro nginx
- ----
- . Specify the service's endpoint URL in the
- <<ingest-geoip-downloader-endpoint,`ingest.geoip.downloader.endpoint`>> setting
- of each node’s `elasticsearch.yml` file.
- +
- By default, {es} checks the endpoint for updates every three days. To use
- another polling interval, use the <<cluster-update-settings,cluster update
- settings API>> to set
- <<ingest-geoip-downloader-poll-interval,`ingest.geoip.downloader.poll.interval`>>.
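Changing the polling interval through the cluster update settings API could look like the sketch below. The value `2d` is an arbitrary illustration chosen to satisfy the documented minimum, and `persistent` is one of the two scopes the API accepts.

```console
# Check the GeoIP endpoint every two days instead of the default three
PUT _cluster/settings
{
  "persistent": {
    "ingest.geoip.downloader.poll.interval": "2d"
  }
}
```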
- [[manually-update-geoip-databases]]
- **Manually update your GeoIP2 databases**
- . Use the <<cluster-update-settings,cluster update settings API>> to set
- `ingest.geoip.downloader.enabled` to `false`. This disables automatic updates
- that may overwrite your database changes. This also deletes all downloaded
- databases.
- . Download your `.mmdb` database files from the
- http://dev.maxmind.com/geoip/geoip2/geolite2[MaxMind site].
- +
- You can also use custom city, country, and ASN `.mmdb` files. These files must
- be uncompressed and use the respective `-City.mmdb`, `-Country.mmdb`, or
- `-ASN.mmdb` extensions.
- . On {ess} deployments, upload the database files using
- a {cloud}/ec-custom-bundles.html[custom bundle].
- . On self-managed deployments, copy the database files to `$ES_CONFIG/ingest-geoip`.
- . In your `geoip` processors, configure the `database_file` parameter to use a
- custom database file.
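The first and last steps above can be sketched as two requests. The pipeline name `geoip-custom` and the file name `my-custom-City.mmdb` are hypothetical placeholders; substitute the name of the file you actually copied into place.

```console
# Step 1: disable automatic updates (this also deletes downloaded databases)
PUT _cluster/settings
{
  "persistent": {
    "ingest.geoip.downloader.enabled": false
  }
}

# Last step: point a geoip processor at the custom database file.
# "my-custom-City.mmdb" is a hypothetical file name used for illustration.
PUT _ingest/pipeline/geoip-custom
{
  "description": "Add geoip info from a manually managed database",
  "processors": [
    {
      "geoip": {
        "field": "ip",
        "database_file": "my-custom-City.mmdb"
      }
    }
  ]
}
```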
- [[ingest-geoip-settings]]
- ===== Node Settings
- The `geoip` processor supports the following setting:
- `ingest.geoip.cache_size`::
- The maximum number of results that should be cached. Defaults to `1000`.
- Note that this is a node setting and applies to all `geoip` processors; that is, there is one cache shared by all defined `geoip` processors.
- [[geoip-cluster-settings]]
- ===== Cluster settings
- [[ingest-geoip-downloader-enabled]]
- `ingest.geoip.downloader.enabled`::
- (<<dynamic-cluster-setting,Dynamic>>, Boolean)
- If `true`, {es} automatically downloads and manages updates for GeoIP2 databases
- from the `ingest.geoip.downloader.endpoint`. If `false`, {es} does not download
- updates and deletes all downloaded databases. Defaults to `true`.
- [[ingest-geoip-downloader-endpoint]]
- `ingest.geoip.downloader.endpoint`::
- (<<static-cluster-setting,Static>>, string)
- Endpoint URL used to download updates for GeoIP2 databases. Defaults to
- `https://geoip.elastic.co/v1/database`. {es} stores downloaded database files in
- each node's <<es-tmpdir,temporary directory>> at
- `$ES_TMPDIR/geoip-databases/<node_id>`.
- [[ingest-geoip-downloader-poll-interval]]
- `ingest.geoip.downloader.poll.interval`::
- (<<dynamic-cluster-setting,Dynamic>>, <<time-units,time value>>)
- How often {es} checks for GeoIP2 database updates at the
- `ingest.geoip.downloader.endpoint`. Must be greater than `1d` (one day). Defaults
- to `3d` (three days).