[[tune-for-search-speed]]
== Tune for search speed

[float]
=== Give memory to the filesystem cache

Elasticsearch heavily relies on the filesystem cache in order to make search
fast. In general, you should make sure that at least half the available memory
goes to the filesystem cache so that Elasticsearch can keep hot regions of the
index in physical memory.

[float]
=== Use faster hardware

If your search is I/O-bound, you should investigate giving more memory to the
filesystem cache (see above) or buying faster drives. In particular, SSD drives
are known to perform better than spinning disks. Always use local storage;
remote filesystems such as `NFS` or `SMB` should be avoided. Also beware of
virtualized storage such as Amazon's `Elastic Block Storage`. Virtualized
storage works very well with Elasticsearch, and it is appealing since it is so
fast and simple to set up, but it is also unfortunately inherently slower on an
ongoing basis than dedicated local storage. If you put an index on `EBS`, be
sure to use provisioned IOPS, otherwise operations could quickly be throttled.

If your search is CPU-bound, you should investigate buying faster CPUs.

[float]
=== Document modeling

Documents should be modeled so that search-time operations are as cheap as
possible.

In particular, joins should be avoided. <<nested,`nested`>> can make queries
several times slower and <<parent-join,parent-child>> relations can make
queries hundreds of times slower. So if the same questions can be answered
without joins by denormalizing documents, significant speedups can be expected.
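
For example, the following sketch (index and field names are hypothetical)
denormalizes a product with two variants into two flat documents that repeat
the shared `product_name` field, so that variants can be matched with plain
queries instead of a `nested` query:

[source,console]
--------------------------------------------------
PUT products/_doc/1
{
  "product_name": "shirt",
  "variant_color": "red",
  "variant_size": "M"
}

PUT products/_doc/2
{
  "product_name": "shirt",
  "variant_color": "blue",
  "variant_size": "L"
}
--------------------------------------------------

The trade-off is some duplicated storage for the repeated parent fields.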

[float]
=== Search as few fields as possible

The more fields a <<query-dsl-query-string-query,`query_string`>> or
<<query-dsl-multi-match-query,`multi_match`>> query targets, the slower it is.
A common technique to improve search speed over multiple fields is to copy
their values into a single field at index time, and then use this field at
search time. This can be automated with the <<copy-to,`copy_to`>> directive of
mappings without having to change the source of documents. Here is an example
of an index containing movies that optimizes queries that search over both the
name and the plot of the movie by indexing both values into the `name_and_plot`
field.

[source,console]
--------------------------------------------------
PUT movies
{
  "mappings": {
    "properties": {
      "name_and_plot": {
        "type": "text"
      },
      "name": {
        "type": "text",
        "copy_to": "name_and_plot"
      },
      "plot": {
        "type": "text",
        "copy_to": "name_and_plot"
      }
    }
  }
}
--------------------------------------------------

[float]
=== Pre-index data

You should leverage patterns in your queries to optimize the way data is
indexed. For instance, if all your documents have a `price` field and most
queries run <<search-aggregations-bucket-range-aggregation,`range`>>
aggregations on a fixed list of ranges, you could make this aggregation faster
by pre-indexing the ranges into the index and using a
<<search-aggregations-bucket-terms-aggregation,`terms`>> aggregation.

For instance, if documents look like:

[source,console]
--------------------------------------------------
PUT index/_doc/1
{
  "designation": "spoon",
  "price": 13
}
--------------------------------------------------

and search requests look like:

[source,console]
--------------------------------------------------
GET index/_search
{
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 10 },
          { "from": 10, "to": 100 },
          { "from": 100 }
        ]
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

Then documents could be enriched with a `price_range` field at index time,
which should be mapped as a <<keyword,`keyword`>>:

[source,console]
--------------------------------------------------
PUT index
{
  "mappings": {
    "properties": {
      "price_range": {
        "type": "keyword"
      }
    }
  }
}

PUT index/_doc/1
{
  "designation": "spoon",
  "price": 13,
  "price_range": "10-100"
}
--------------------------------------------------

And then search requests could aggregate this new field rather than running a
`range` aggregation on the `price` field.

[source,console]
--------------------------------------------------
GET index/_search
{
  "aggs": {
    "price_ranges": {
      "terms": {
        "field": "price_range"
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

[float]
[[map-ids-as-keyword]]
=== Consider mapping identifiers as `keyword`

include::../mapping/types/numeric.asciidoc[tag=map-ids-as-keyword]

[float]
=== Avoid scripts

In general, scripts should be avoided. If they are absolutely needed, you
should prefer the `painless` and `expressions` engines.
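
As an illustration of the kind of script that is worth avoiding, the sketch
below (the VAT factor is hypothetical) sorts on a value computed per document
at search time. Indexing the computed value into its own field and sorting on
that field directly would be much cheaper:

[source,console]
--------------------------------------------------
GET index/_search
{
  "sort": {
    "_script": {
      "type": "number",
      "script": {
        "lang": "painless",
        "source": "doc['price'].value * params.vat",
        "params": {
          "vat": 1.2
        }
      },
      "order": "asc"
    }
  }
}
--------------------------------------------------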

[float]
=== Search rounded dates

Queries on date fields that use `now` are typically not cacheable since the
range that is being matched changes all the time. However, switching to a
rounded date is often acceptable in terms of user experience, and has the
benefit of making better use of the query cache.

For instance, the below query:

[source,console]
--------------------------------------------------
PUT index/_doc/1
{
  "my_date": "2016-05-11T16:30:55.328Z"
}

GET index/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "my_date": {
            "gte": "now-1h",
            "lte": "now"
          }
        }
      }
    }
  }
}
--------------------------------------------------

could be replaced with the following query:

[source,console]
--------------------------------------------------
GET index/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "range": {
          "my_date": {
            "gte": "now-1h/m",
            "lte": "now/m"
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

In that case we rounded to the minute, so if the current time is `16:31:29`,
the range query will match everything whose value of the `my_date` field is
between `15:31:00` and `16:31:59`. And if several users run a query that
contains this range in the same minute, the query cache could help speed things
up a bit. The longer the interval that is used for rounding, the more the query
cache can help, but beware that too aggressive rounding might also hurt user
experience.

NOTE: It might be tempting to split ranges into a large cacheable part and
smaller not-cacheable parts in order to be able to leverage the query cache,
as shown below:

[source,console]
--------------------------------------------------
GET index/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "bool": {
          "should": [
            {
              "range": {
                "my_date": {
                  "gte": "now-1h",
                  "lte": "now-1h/m"
                }
              }
            },
            {
              "range": {
                "my_date": {
                  "gt": "now-1h/m",
                  "lt": "now/m"
                }
              }
            },
            {
              "range": {
                "my_date": {
                  "gte": "now/m",
                  "lte": "now"
                }
              }
            }
          ]
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

However, such a practice might make the query run slower in some cases, since
the overhead introduced by the `bool` query may defeat the savings from better
leveraging the query cache.

[float]
=== Force-merge read-only indices

Indices that are read-only may benefit from being <<indices-forcemerge,merged
down to a single segment>>. This is typically the case with time-based indices:
only the index for the current time frame is getting new documents while older
indices are read-only. Shards that have been force-merged into a single segment
can use simpler and more efficient data structures to perform searches.

IMPORTANT: Do not force-merge indices to which you are still writing, or to
which you will write again in the future. Instead, rely on the automatic
background merge process to perform merges as needed to keep the index running
smoothly. If you continue to write to a force-merged index then its performance
may become much worse.
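
As a minimal sketch, assuming a hypothetical read-only index named
`logs-2016-05` that will not receive further writes, a force merge down to a
single segment looks like this:

[source,console]
--------------------------------------------------
POST logs-2016-05/_forcemerge?max_num_segments=1
--------------------------------------------------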

[float]
=== Warm up global ordinals

Global ordinals are a data structure that is used in order to run
<<search-aggregations-bucket-terms-aggregation,`terms`>> aggregations on
<<keyword,`keyword`>> fields. They are loaded lazily in memory because
Elasticsearch does not know which fields will be used in `terms` aggregations
and which fields won't. You can tell Elasticsearch to load global ordinals
eagerly when starting or refreshing a shard by configuring mappings as
described below:

[source,console]
--------------------------------------------------
PUT index
{
  "mappings": {
    "properties": {
      "foo": {
        "type": "keyword",
        "eager_global_ordinals": true
      }
    }
  }
}
--------------------------------------------------

[float]
=== Warm up the filesystem cache

If the machine running Elasticsearch is restarted, the filesystem cache will be
empty, so it will take some time before the operating system loads hot regions
of the index into memory so that search operations are fast. You can explicitly
tell the operating system which files should be loaded into memory eagerly
depending on the file extension using the
<<preload-data-to-file-system-cache,`index.store.preload`>> setting.

WARNING: Loading data into the filesystem cache eagerly on too many indices or
too many files will make search _slower_ if the filesystem cache is not large
enough to hold all the data. Use with caution.
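
For example, a sketch that preloads the norms (`nvd`) and doc values (`dvd`)
files of a new index into the filesystem cache; the choice of extensions is
workload-dependent and illustrative here:

[source,console]
--------------------------------------------------
PUT preloaded_index
{
  "settings": {
    "index.store.preload": ["nvd", "dvd"]
  }
}
--------------------------------------------------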

[float]
=== Use index sorting to speed up conjunctions

<<index-modules-index-sorting,Index sorting>> can be useful in order to make
conjunctions faster at the cost of slightly slower indexing. Read more about it
in the <<index-modules-index-sorting-conjunctions,index sorting documentation>>.
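
As a minimal sketch, assuming a hypothetical index whose queries frequently
filter on a `username` field, the index could be sorted on that field at
creation time:

[source,console]
--------------------------------------------------
PUT events
{
  "settings": {
    "index": {
      "sort.field": "username",
      "sort.order": "asc"
    }
  },
  "mappings": {
    "properties": {
      "username": {
        "type": "keyword"
      }
    }
  }
}
--------------------------------------------------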

[float]
[[preference-cache-optimization]]
=== Use `preference` to optimize cache utilization

There are multiple caches that can help with search performance, such as the
https://en.wikipedia.org/wiki/Page_cache[filesystem cache], the
<<shard-request-cache,request cache>> or the <<query-cache,query cache>>. Yet
all these caches are maintained at the node level, meaning that if you run the
same request twice in a row, have one <<glossary-replica-shard,replica>> or
more, and use https://en.wikipedia.org/wiki/Round-robin_DNS[round-robin], the
default routing algorithm, then those two requests will go to different shard
copies, preventing node-level caches from helping.

Since it is common for users of a search application to run similar requests
one after another, for instance in order to analyze a narrower subset of the
index, using a preference value that identifies the current user or session
could help optimize usage of the caches.
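
For instance, the sketch below routes all requests that carry a hypothetical
session identifier `session-1234` to the same shard copies by passing it as the
`preference` parameter:

[source,console]
--------------------------------------------------
GET index/_search?preference=session-1234
{
  "query": {
    "match": {
      "designation": "spoon"
    }
  }
}
--------------------------------------------------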

[float]
=== Replicas might help with throughput, but not always

In addition to improving resiliency, replicas can help improve throughput. For
instance, if you have a single-shard index and three nodes, you will need to
set the number of replicas to 2 in order to have 3 copies of your shard in
total so that all nodes are utilized.

Now imagine that you have an index with two shards and two nodes. In one case,
the number of replicas is 0, meaning that each node holds a single shard. In
the second case the number of replicas is 1, meaning that each node holds two
shards. Which setup is going to perform best in terms of search performance?
Usually, the setup that has fewer shards per node in total will perform better.
The reason for that is that it gives a greater share of the available
filesystem cache to each shard, and the filesystem cache is probably
Elasticsearch's number 1 performance factor. At the same time, beware that a
setup that does not have replicas is subject to failure in case of a single
node failure, so there is a trade-off between throughput and availability.

So what is the right number of replicas? If you have a cluster that has
`num_nodes` nodes, `num_primaries` primary shards _in total_ and if you want to
be able to cope with `max_failures` node failures at once at most, then the
right number of replicas for you is
`max(max_failures, ceil(num_nodes / num_primaries) - 1)`.
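
For example, with `num_nodes = 3`, `num_primaries = 1` and `max_failures = 1`,
this gives `max(1, ceil(3 / 1) - 1) = max(1, 2) = 2` replicas, which matches
the single-shard, three-node scenario above.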

[float]
=== Tune your queries with the Profile API

You can also analyze how expensive each component of your queries and
aggregations is by using the {ref}/search-profile.html[Profile API]. This might
allow you to tune your queries to be less expensive, resulting in a positive
performance result and reduced load. Also note that Profile API payloads can be
easily visualized for better readability in the
{kibana-ref}/xpack-profiler.html[Search Profiler], which is a Kibana dev tools
UI available in all X-Pack licenses, including the free X-Pack Basic license.

Some caveats to the Profile API are that:

- the Profile API as a debugging tool adds significant overhead to search
  execution and can also have a very verbose output
- given the added overhead, the resulting took times are not reliable
  indicators of actual took time, but can be used comparatively between clauses
  for relative timing differences
- the Profile API is best for exploring possible reasons behind the most costly
  clauses of a query but isn't intended for accurately measuring absolute
  timings of each clause
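
Profiling is enabled per request by setting the top-level `profile` flag, as in
the sketch below, which reuses the `designation` field from the earlier
examples:

[source,console]
--------------------------------------------------
GET index/_search
{
  "profile": true,
  "query": {
    "match": {
      "designation": "spoon"
    }
  }
}
--------------------------------------------------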

[float]
[[faster-phrase-queries]]
=== Faster phrase queries with `index_phrases`

The <<text,`text`>> field has an <<index-phrases,`index_phrases`>> option that
indexes 2-shingles and is automatically leveraged by query parsers to run phrase
queries that don't have a slop. If your use-case involves running lots of phrase
queries, this can speed up queries significantly.
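
A minimal sketch of a mapping that enables this option, assuming a hypothetical
`comments` index with a `body` field:

[source,console]
--------------------------------------------------
PUT comments
{
  "mappings": {
    "properties": {
      "body": {
        "type": "text",
        "index_phrases": true
      }
    }
  }
}
--------------------------------------------------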

[float]
[[faster-prefix-queries]]
=== Faster prefix queries with `index_prefixes`

The <<text,`text`>> field has an <<index-prefixes,`index_prefixes`>> option that
indexes prefixes of all terms and is automatically leveraged by query parsers to
run prefix queries. If your use-case involves running lots of prefix queries,
this can speed up queries significantly.
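
Similarly, a minimal sketch that enables prefix indexing with its default
prefix lengths, assuming a hypothetical `usernames` index:

[source,console]
--------------------------------------------------
PUT usernames
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "index_prefixes": {}
      }
    }
  }
}
--------------------------------------------------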
|