[[geoip-processor]]
=== GeoIP processor
++++
<titleabbrev>GeoIP</titleabbrev>
++++

The `geoip` processor adds information about the geographical location of an
IPv4 or IPv6 address.

[[geoip-automatic-updates]]
By default, the processor uses the GeoLite2 City, GeoLite2 Country, and GeoLite2
ASN GeoIP2 databases from
http://dev.maxmind.com/geoip/geoip2/geolite2/[MaxMind], shared under the
CCA-ShareAlike 4.0 license. {es} automatically downloads updates for
these databases from the Elastic GeoIP endpoint:
https://geoip.elastic.co/v1/database. To get download statistics for these
updates, use the <<geoip-stats-api,GeoIP stats API>>.
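
As a minimal sketch, the following request calls the GeoIP stats API to check
the download statistics mentioned above. The response is not shown here because
its contents depend on your cluster.

[source,console]
--------------------------------------------------
GET _ingest/geoip/stats
--------------------------------------------------
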

If your cluster can't connect to the Elastic GeoIP endpoint or you want to
manage your own updates, see <<manage-geoip-database-updates>>.

If {es} can't connect to the endpoint for 30 days, all updated databases become
invalid. {es} then stops enriching documents with geoip data and adds the
`tags: ["_geoip_expired_database"]` field instead.
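
To find documents that were indexed while the databases were expired, you can
search for the tag described above. This is only a sketch: it assumes an index
named `my-index-000001` and a dynamically mapped `tags` field, so adjust both to
match your data.

[source,console]
--------------------------------------------------
GET my-index-000001/_search
{
  "query": {
    "match": {
      "tags": "_geoip_expired_database"
    }
  }
}
--------------------------------------------------
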

[[using-ingest-geoip]]
==== Using the `geoip` Processor in a Pipeline

[[ingest-geoip-options]]
.`geoip` options
[options="header"]
|======
| Name | Required | Default | Description
| `field` | yes | - | The field to get the IP address from for the geographical lookup.
| `target_field` | no | geoip | The field that will hold the geographical information looked up from the MaxMind database.
| `database_file` | no | GeoLite2-City.mmdb | The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the `ingest-geoip` config directory.
| `properties` | no | [`continent_name`, `country_iso_code`, `country_name`, `region_iso_code`, `region_name`, `city_name`, `location`] * | Controls what properties are added to the `target_field` based on the geoip lookup.
| `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document.
| `first_only` | no | `true` | If `true`, only the first geoip data found is returned, even if `field` contains an array.
|======

*Depends on what is available in `database_file`:

* If the GeoLite2 City database is used, then the following fields may be added under the `target_field`: `ip`,
`country_iso_code`, `country_name`, `continent_name`, `region_iso_code`, `region_name`, `city_name`, `timezone`, `latitude`, `longitude`,
and `location`. The fields actually added depend on what has been found and which properties were configured in `properties`.
* If the GeoLite2 Country database is used, then the following fields may be added under the `target_field`: `ip`,
`country_iso_code`, `country_name`, and `continent_name`. The fields actually added depend on what has been found and which properties
were configured in `properties`.
* If the GeoLite2 ASN database is used, then the following fields may be added under the `target_field`: `ip`,
`asn`, `organization_name`, and `network`. The fields actually added depend on what has been found and which properties were configured
in `properties`.
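
As an illustrative sketch, the following pipeline uses the GeoLite2 ASN database
and restricts the output to the ASN-related properties listed above. The
pipeline name `geoip-asn` and the `ip` source field are assumptions made for
this example.

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/geoip-asn
{
  "description" : "Add ASN info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip",
        "database_file" : "GeoLite2-ASN.mmdb",
        "properties" : ["asn", "organization_name", "network"],
        "ignore_missing" : true
      }
    }
  ]
}
--------------------------------------------------

With `ignore_missing` set to `true`, documents that lack the `ip` field pass
through the processor unchanged.
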

Here is an example that uses the default city database and adds the geographical information to the `geoip` field based on the `ip` field:

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip"
      }
    }
  ]
}
PUT my-index-000001/_doc/my_id?pipeline=geoip
{
  "ip": "8.8.8.8"
}
GET my-index-000001/_doc/my_id
--------------------------------------------------

Which returns:

[source,console-result]
--------------------------------------------------
{
  "found": true,
  "_index": "my-index-000001",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 55,
  "_primary_term": 1,
  "_source": {
    "ip": "8.8.8.8",
    "geoip": {
      "continent_name": "North America",
      "country_name": "United States",
      "country_iso_code": "US",
      "location": { "lat": 37.751, "lon": -97.822 }
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term":1/"_primary_term" : $body._primary_term/]

Here is an example that uses the default country database and adds the
geographical information to the `geo` field based on the `ip` field. This
database is included in the module, so no extra setup is needed:

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip",
        "target_field" : "geo",
        "database_file" : "GeoLite2-Country.mmdb"
      }
    }
  ]
}
PUT my-index-000001/_doc/my_id?pipeline=geoip
{
  "ip": "8.8.8.8"
}
GET my-index-000001/_doc/my_id
--------------------------------------------------

Which returns:

[source,console-result]
--------------------------------------------------
{
  "found": true,
  "_index": "my-index-000001",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 65,
  "_primary_term": 1,
  "_source": {
    "ip": "8.8.8.8",
    "geo": {
      "continent_name": "North America",
      "country_name": "United States",
      "country_iso_code": "US"
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]

Not all IP addresses have geo information in the database. When no information
is found, no `target_field` is inserted into the document.

Here is an example of how a document is indexed when information for "80.231.5.0"
cannot be found:

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip"
      }
    }
  ]
}
PUT my-index-000001/_doc/my_id?pipeline=geoip
{
  "ip": "80.231.5.0"
}
GET my-index-000001/_doc/my_id
--------------------------------------------------

Which returns:

[source,console-result]
--------------------------------------------------
{
  "_index" : "my-index-000001",
  "_id" : "my_id",
  "_version" : 1,
  "_seq_no" : 71,
  "_primary_term": 1,
  "found" : true,
  "_source" : {
    "ip" : "80.231.5.0"
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ s/"_primary_term" : 1/"_primary_term" : $body._primary_term/]

[[ingest-geoip-mappings-note]]
===== Recognizing Location as a Geopoint

Although this processor enriches your document with a `location` field containing
the estimated latitude and longitude of the IP address, this field will not be
indexed as a {ref}/geo-point.html[`geo_point`] type in Elasticsearch without explicitly defining it
as such in the mapping.

For example, the following request creates an index named `my_ip_locations`
that maps `geoip.location` as a `geo_point`:

[source,console]
--------------------------------------------------
PUT my_ip_locations
{
  "mappings": {
    "properties": {
      "geoip": {
        "properties": {
          "location": { "type": "geo_point" }
        }
      }
    }
  }
}
--------------------------------------------------

////
[source,console]
--------------------------------------------------
PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip"
      }
    }
  ]
}

PUT my_ip_locations/_doc/1?refresh=true&pipeline=geoip
{
  "ip": "8.8.8.8"
}

GET /my_ip_locations/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "geo_distance": {
          "distance": "1m",
          "geoip.location": {
            "lon": -97.822,
            "lat": 37.751
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

[source,console-result]
--------------------------------------------------
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value": 1,
      "relation": "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "my_ip_locations",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "geoip" : {
            "continent_name" : "North America",
            "country_name" : "United States",
            "country_iso_code" : "US",
            "location" : {
              "lon" : -97.822,
              "lat" : 37.751
            }
          },
          "ip" : "8.8.8.8"
        }
      }
    ]
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took" : 3/"took" : $body.took/]
////

[[manage-geoip-database-updates]]
==== Manage your own GeoIP2 database updates

If you can't <<geoip-automatic-updates,automatically update>> your GeoIP2
databases from the Elastic endpoint, you have a few other options:

* <<use-proxy-geoip-endpoint,Use a proxy endpoint>>
* <<use-custom-geoip-endpoint,Use a custom endpoint>>
* <<manually-update-geoip-databases,Manually update your GeoIP2 databases>>

[[use-proxy-geoip-endpoint]]
**Use a proxy endpoint**

If you can't connect directly to the Elastic GeoIP endpoint, consider setting up
a secure proxy. You can then specify the proxy endpoint URL in the
<<ingest-geoip-downloader-endpoint,`ingest.geoip.downloader.endpoint`>> setting
of each node’s `elasticsearch.yml` file.

[[use-custom-geoip-endpoint]]
**Use a custom endpoint**

You can create a service that mimics the Elastic GeoIP endpoint. You can then
get automatic updates from this service.

. Download your `.mmdb` database files from the
http://dev.maxmind.com/geoip/geoip2/geolite2[MaxMind site].

. Copy your database files to a single directory.

. From your {es} directory, run:
+
[source,sh]
----
./bin/elasticsearch-geoip -s my/source/dir [-t target/directory]
----

. Serve the static database files from your directory. For example, you can use
Docker to serve the files from an nginx server:
+
[source,sh]
----
docker run -v my/source/dir:/usr/share/nginx/html:ro nginx
----

. Specify the service's endpoint URL in the
<<ingest-geoip-downloader-endpoint,`ingest.geoip.downloader.endpoint`>> setting
of each node’s `elasticsearch.yml` file.
+
By default, {es} checks the endpoint for updates every three days. To use
another polling interval, use the <<cluster-update-settings,update cluster
settings API>> to set
<<ingest-geoip-downloader-poll-interval,`ingest.geoip.downloader.poll.interval`>>.
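
For example, the following request sets the polling interval to seven days. The
`7d` value and the choice of a `persistent` setting are only illustrations. The
interval must be greater than one day.

[source,console]
--------------------------------------------------
PUT _cluster/settings
{
  "persistent": {
    "ingest.geoip.downloader.poll.interval": "7d"
  }
}
--------------------------------------------------
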

[[manually-update-geoip-databases]]
**Manually update your GeoIP2 databases**

. Use the <<cluster-update-settings,update cluster settings API>> to set
`ingest.geoip.downloader.enabled` to `false`. This disables automatic updates
that may overwrite your database changes. This also deletes all downloaded
databases.

. Download your `.mmdb` database files from the
http://dev.maxmind.com/geoip/geoip2/geolite2[MaxMind site].
+
You can also use custom city, country, and ASN `.mmdb` files. These files must
be uncompressed and use the respective `-City.mmdb`, `-Country.mmdb`, or
`-ASN.mmdb` extensions.

. On {ess} deployments, upload the database files using
a {cloud}/ec-custom-bundles.html[custom bundle].

. On self-managed deployments, copy the database files to `$ES_CONFIG/ingest-geoip`.

. In your `geoip` processors, configure the `database_file` parameter to use a
custom database file.
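
The first and last steps above can be sketched as follows. The pipeline name
`geoip-custom` and the file name `Custom-City.mmdb` are hypothetical. Use the
name of the file you copied to the `ingest-geoip` config directory or uploaded
in your custom bundle.

[source,console]
--------------------------------------------------
PUT _cluster/settings
{
  "persistent": {
    "ingest.geoip.downloader.enabled": false
  }
}

PUT _ingest/pipeline/geoip-custom
{
  "description" : "Add geoip info from a custom database",
  "processors" : [
    {
      "geoip" : {
        "field" : "ip",
        "database_file" : "Custom-City.mmdb"
      }
    }
  ]
}
--------------------------------------------------
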

[[ingest-geoip-settings]]
===== Node Settings

The `geoip` processor supports the following setting:

`ingest.geoip.cache_size`::

    The maximum number of results that should be cached. Defaults to `1000`.

This is a node setting and applies to all `geoip` processors: there is one cache shared by all defined `geoip` processors.

[[geoip-cluster-settings]]
===== Cluster settings

[[ingest-geoip-downloader-enabled]]
`ingest.geoip.downloader.enabled`::
(<<dynamic-cluster-setting,Dynamic>>, Boolean)
If `true`, {es} automatically downloads and manages updates for GeoIP2 databases
from the `ingest.geoip.downloader.endpoint`. If `false`, {es} does not download
updates and deletes all downloaded databases. Defaults to `true`.

[[ingest-geoip-downloader-endpoint]]
`ingest.geoip.downloader.endpoint`::
(<<static-cluster-setting,Static>>, string)
Endpoint URL used to download updates for GeoIP2 databases. Defaults to
`https://geoip.elastic.co/v1/database`. {es} stores downloaded database files in
each node's <<es-tmpdir,temporary directory>> at
`$ES_TMPDIR/geoip-databases/<node_id>`.

[[ingest-geoip-downloader-poll-interval]]
`ingest.geoip.downloader.poll.interval`::
(<<dynamic-cluster-setting,Dynamic>>, <<time-units,time value>>)
How often {es} checks for GeoIP2 database updates at the
`ingest.geoip.downloader.endpoint`. Must be greater than `1d` (one day). Defaults
to `3d` (three days).