|
@@ -5,223 +5,233 @@
|
|
|
[glossary]
|
|
|
[[glossary-analysis]] analysis ::
|
|
|
|
|
|
- Analysis is the process of converting <<glossary-text,full text>> to
|
|
|
- <<glossary-term,terms>>. Depending on which analyzer is used, these phrases:
|
|
|
- `FOO BAR`, `Foo-Bar`, `foo,bar` will probably all result in the
|
|
|
- terms `foo` and `bar`. These terms are what is actually stored in
|
|
|
- the index.
|
|
|
- +
|
|
|
- A full text query (not a <<glossary-term,term>> query) for `FoO:bAR` will
|
|
|
- also be analyzed to the terms `foo`,`bar` and will thus match the
|
|
|
- terms stored in the index.
|
|
|
- +
|
|
|
- It is this process of analysis (both at index time and at search time)
|
|
|
- that allows Elasticsearch to perform full text queries.
|
|
|
- +
|
|
|
- Also see <<glossary-text,text>> and <<glossary-term,term>>.
|
|
|
+Analysis is the process of converting <<glossary-text,full text>> to
|
|
|
+<<glossary-term,terms>>. Depending on which analyzer is used, these phrases:
|
|
|
+`FOO BAR`, `Foo-Bar`, `foo,bar` will probably all result in the
|
|
|
+terms `foo` and `bar`. These terms are what is actually stored in
|
|
|
+the index.
|
|
|
++
|
|
|
+A full text query (not a <<glossary-term,term>> query) for `FoO:bAR` will
|
|
|
+also be analyzed to the terms `foo` and `bar` and will thus match the
|
|
|
+terms stored in the index.
|
|
|
++
|
|
|
+It is this process of analysis (both at index time and at search time)
|
|
|
+that allows Elasticsearch to perform full text queries.
|
|
|
++
|
|
|
+Also see <<glossary-text,text>> and <<glossary-term,term>>.
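As a rough illustration of the analysis entry above, the following Python sketch lowercases the input and splits it on any non-alphanumeric character. The `toy_analyze` function is hypothetical and is not one of Elasticsearch's built-in analyzers; real analyzers are configurable and may produce different terms.

```python
import re


def toy_analyze(text):
    """Hypothetical analyzer: lowercase, then split on non-alphanumeric characters."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


# The three example phrases and the example query all reduce to the same terms.
for phrase in ["FOO BAR", "Foo-Bar", "foo,bar", "FoO:bAR"]:
    print(phrase, "->", toy_analyze(phrase))
```

Because the same function runs on both the indexed phrases and the query string, the query terms line up with the stored terms, which is the point the entry makes about index-time and search-time analysis.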
|
|
|
|
|
|
[[glossary-cluster]] cluster ::
|
|
|
|
|
|
- A cluster consists of one or more <<glossary-node,nodes>> which share the
|
|
|
- same cluster name. Each cluster has a single master node which is
|
|
|
- chosen automatically by the cluster and which can be replaced if the
|
|
|
- current master node fails.
|
|
|
-
|
|
|
+A cluster consists of one or more <<glossary-node,nodes>> which share the
|
|
|
+same cluster name. Each cluster has a single master node which is
|
|
|
+chosen automatically by the cluster and which can be replaced if the
|
|
|
+current master node fails.
|
|
|
+
|
|
|
[[glossary-ccr]] {ccr} (CCR)::
|
|
|
|
|
|
- The {ccr} feature enables you to replicate indices in remote clusters to your
|
|
|
- local cluster. For more information, see
|
|
|
- {stack-ov}/xpack-ccr.html[{ccr-cap}].
|
|
|
-
|
|
|
+The {ccr} feature enables you to replicate indices in remote clusters to your
|
|
|
+local cluster. For more information, see
|
|
|
+{stack-ov}/xpack-ccr.html[{ccr-cap}].
|
|
|
+
|
|
|
[[glossary-ccs]] {ccs} (CCS)::
|
|
|
|
|
|
- The {ccs} feature enables any node to act as a federated client across
|
|
|
- multiple clusters. See <<modules-cross-cluster-search>>.
|
|
|
+The {ccs} feature enables any node to act as a federated client across
|
|
|
+multiple clusters. See <<modules-cross-cluster-search>>.
|
|
|
|
|
|
[[glossary-document]] document ::
|
|
|
|
|
|
- A document is a JSON document which is stored in Elasticsearch. It is
|
|
|
- like a row in a table in a relational database. Each document is
|
|
|
- stored in an <<glossary-index,index>> and has a <<glossary-type,type>> and an
|
|
|
- <<glossary-id,id>>.
|
|
|
- +
|
|
|
- A document is a JSON object (also known in other languages as a hash /
|
|
|
- hashmap / associative array) which contains zero or more
|
|
|
- <<glossary-field,fields>>, or key-value pairs.
|
|
|
- +
|
|
|
- The original JSON document that is indexed will be stored in the
|
|
|
- <<glossary-source_field,`_source` field>>, which is returned by default when
|
|
|
- getting or searching for a document.
|
|
|
+A document is a JSON document which is stored in Elasticsearch. It is
|
|
|
+like a row in a table in a relational database. Each document is
|
|
|
+stored in an <<glossary-index,index>> and has a <<glossary-type,type>> and an
|
|
|
+<<glossary-id,id>>.
|
|
|
++
|
|
|
+A document is a JSON object (also known in other languages as a hash /
|
|
|
+hashmap / associative array) which contains zero or more
|
|
|
+<<glossary-field,fields>>, or key-value pairs.
|
|
|
++
|
|
|
+The original JSON document that is indexed will be stored in the
|
|
|
+<<glossary-source_field,`_source` field>>, which is returned by default when
|
|
|
+getting or searching for a document.
|
|
|
|
|
|
[[glossary-field]] field ::
|
|
|
|
|
|
- A <<glossary-document,document>> contains a list of fields, or key-value
|
|
|
- pairs. The value can be a simple (scalar) value (eg a string, integer,
|
|
|
- date), or a nested structure like an array or an object. A field is
|
|
|
- similar to a column in a table in a relational database.
|
|
|
- +
|
|
|
- The <<glossary-mapping,mapping>> for each field has a field _type_ (not to
|
|
|
- be confused with document <<glossary-type,type>>) which indicates the type
|
|
|
- of data that can be stored in that field, eg `integer`, `string`,
|
|
|
- `object`. The mapping also allows you to define (amongst other things)
|
|
|
- how the value for a field should be analyzed.
|
|
|
+A <<glossary-document,document>> contains a list of fields, or key-value
|
|
|
+pairs. The value can be a simple (scalar) value (e.g. a string, integer,
|
|
|
+date), or a nested structure like an array or an object. A field is
|
|
|
+similar to a column in a table in a relational database.
|
|
|
++
|
|
|
+The <<glossary-mapping,mapping>> for each field has a field _type_ (not to
|
|
|
+be confused with document <<glossary-type,type>>) which indicates the type
|
|
|
+of data that can be stored in that field, e.g. `integer`, `string`,
|
|
|
+`object`. The mapping also allows you to define (amongst other things)
|
|
|
+how the value for a field should be analyzed.
|
|
|
|
|
|
[[glossary-filter]] filter ::
|
|
|
|
|
|
- A filter is a non-scoring <<glossary-query,query>>, meaning that it does not score documents.
|
|
|
- It is only concerned about answering the question - "Does this document match?".
|
|
|
- The answer is always a simple, binary yes or no. This kind of query is said to be made
|
|
|
- in a <<query-filter-context,filter context>>,
|
|
|
- hence it is called a filter. Filters are simple checks for set inclusion or exclusion.
|
|
|
- In most cases, the goal of filtering is to reduce the number of documents that have to be examined.
|
|
|
-
|
|
|
+A filter is a non-scoring <<glossary-query,query>>, meaning that it does not score documents.
|
|
|
+It is only concerned with answering the question "Does this document match?".
|
|
|
+The answer is always a simple, binary yes or no. This kind of query is said to be made
|
|
|
+in a <<query-filter-context,filter context>>,
|
|
|
+hence it is called a filter. Filters are simple checks for set inclusion or exclusion.
|
|
|
+In most cases, the goal of filtering is to reduce the number of documents that have to be examined.
|
|
|
+
|
|
|
[[glossary-follower-index]] follower index ::
|
|
|
-
|
|
|
- Follower indices are the target indices for <<glossary-ccr,{ccr}>>. They exist
|
|
|
- in your local cluster and replicate <<glossary-leader-index,leader indices>>.
|
|
|
-
|
|
|
+
|
|
|
+Follower indices are the target indices for <<glossary-ccr,{ccr}>>. They exist
|
|
|
+in your local cluster and replicate <<glossary-leader-index,leader indices>>.
|
|
|
+
|
|
|
[[glossary-id]] id ::
|
|
|
|
|
|
- The ID of a <<glossary-document,document>> identifies a document. The
|
|
|
- `index/id` of a document must be unique. If no ID is provided,
|
|
|
- then it will be auto-generated. (also see <<glossary-routing,routing>>)
|
|
|
+The ID of a <<glossary-document,document>> identifies a document. The
|
|
|
+`index/id` of a document must be unique. If no ID is provided,
|
|
|
+then it will be auto-generated. (Also see <<glossary-routing,routing>>.)
|
|
|
|
|
|
[[glossary-index]] index ::
|
|
|
|
|
|
- An index is like a _table_ in a relational database. It has a
|
|
|
- <<glossary-mapping,mapping>> which contains a <<glossary-type,type>>,
|
|
|
- which contains the <<glossary-field,fields>> in the index.
|
|
|
- +
|
|
|
- An index is a logical namespace which maps to one or more
|
|
|
- <<glossary-primary-shard,primary shards>> and can have zero or more
|
|
|
- <<glossary-replica-shard,replica shards>>.
|
|
|
-
|
|
|
+An index is like a _table_ in a relational database. It has a
|
|
|
+<<glossary-mapping,mapping>> which contains a <<glossary-type,type>>,
|
|
|
+which contains the <<glossary-field,fields>> in the index.
|
|
|
++
|
|
|
+An index is a logical namespace which maps to one or more
|
|
|
+<<glossary-primary-shard,primary shards>> and can have zero or more
|
|
|
+<<glossary-replica-shard,replica shards>>.
|
|
|
+
|
|
|
[[glossary-leader-index]] leader index ::
|
|
|
-
|
|
|
- Leader indices are the source indices for <<glossary-ccr,{ccr}>>. They exist
|
|
|
- on remote clusters and are replicated to
|
|
|
- <<glossary-follower-index,follower indices>>.
|
|
|
+
|
|
|
+Leader indices are the source indices for <<glossary-ccr,{ccr}>>. They exist
|
|
|
+on remote clusters and are replicated to
|
|
|
+<<glossary-follower-index,follower indices>>.
|
|
|
|
|
|
[[glossary-mapping]] mapping ::
|
|
|
|
|
|
- A mapping is like a _schema definition_ in a relational database. Each
|
|
|
- <<glossary-index,index>> has a mapping, which defines a <<glossary-type,type>>,
|
|
|
- plus a number of index-wide settings.
|
|
|
- +
|
|
|
- A mapping can either be defined explicitly, or it will be generated
|
|
|
- automatically when a document is indexed.
|
|
|
+A mapping is like a _schema definition_ in a relational database. Each
|
|
|
+<<glossary-index,index>> has a mapping, which defines a <<glossary-type,type>>,
|
|
|
+plus a number of index-wide settings.
|
|
|
++
|
|
|
+A mapping can either be defined explicitly, or it will be generated
|
|
|
+automatically when a document is indexed.
|
|
|
|
|
|
[[glossary-node]] node ::
|
|
|
|
|
|
- A node is a running instance of Elasticsearch which belongs to a
|
|
|
- <<glossary-cluster,cluster>>. Multiple nodes can be started on a single
|
|
|
- server for testing purposes, but usually you should have one node per
|
|
|
- server.
|
|
|
- +
|
|
|
- At startup, a node will use unicast to discover an existing cluster with
|
|
|
- the same cluster name and will try to join that cluster.
|
|
|
-
|
|
|
- [[glossary-primary-shard]] primary shard ::
|
|
|
-
|
|
|
- Each document is stored in a single primary <<glossary-shard,shard>>. When
|
|
|
- you index a document, it is indexed first on the primary shard, then
|
|
|
- on all <<glossary-replica-shard,replicas>> of the primary shard.
|
|
|
- +
|
|
|
- By default, an <<glossary-index,index>> has one primary shard. You can specify
|
|
|
- more primary shards to scale the number of <<glossary-document,documents>>
|
|
|
- that your index can handle.
|
|
|
- +
|
|
|
- You cannot change the number of primary shards in an index, once the index is
|
|
|
- index is created. However, an index can be split into a new index using the
|
|
|
- <<indices-split-index, split API>>.
|
|
|
- +
|
|
|
- See also <<glossary-routing,routing>>
|
|
|
+A node is a running instance of Elasticsearch which belongs to a
|
|
|
+<<glossary-cluster,cluster>>. Multiple nodes can be started on a single
|
|
|
+server for testing purposes, but usually you should have one node per
|
|
|
+server.
|
|
|
++
|
|
|
+At startup, a node will use unicast to discover an existing cluster with
|
|
|
+the same cluster name and will try to join that cluster.
|
|
|
+
|
|
|
+[[glossary-primary-shard]] primary shard ::
|
|
|
+
|
|
|
+Each document is stored in a single primary <<glossary-shard,shard>>. When
|
|
|
+you index a document, it is indexed first on the primary shard, then
|
|
|
+on all <<glossary-replica-shard,replicas>> of the primary shard.
|
|
|
++
|
|
|
+By default, an <<glossary-index,index>> has one primary shard. You can specify
|
|
|
+more primary shards to scale the number of <<glossary-document,documents>>
|
|
|
+that your index can handle.
|
|
|
++
|
|
|
+You cannot change the number of primary shards in an index once the
|
|
|
+index is created. However, an index can be split into a new index using the
|
|
|
+<<indices-split-index, split API>>.
|
|
|
++
|
|
|
+See also <<glossary-routing,routing>>.
|
|
|
|
|
|
[[glossary-query]] query ::
|
|
|
|
|
|
- A query is the basic component of a search. A search can be defined by one or more queries
|
|
|
- which can be mixed and matched in endless combinations. While <<glossary-filter,filters>> are
|
|
|
- queries that only determine if a document matches, those queries that also calculate how well
|
|
|
- the document matches are known as "scoring queries". Those queries assign it a score, which is
|
|
|
- later used to sort matched documents. Scoring queries take more resources than <<glossary-filter,non scoring queries>>
|
|
|
- and their query results are not cacheable. As a general rule, use query clauses for full-text
|
|
|
- search or for any condition that requires scoring, and use filters for everything else.
|
|
|
-
|
|
|
- [[glossary-replica-shard]] replica shard ::
|
|
|
-
|
|
|
- Each <<glossary-primary-shard,primary shard>> can have zero or more
|
|
|
- replicas. A replica is a copy of the primary shard, and has two
|
|
|
- purposes:
|
|
|
- +
|
|
|
- 1. increase failover: a replica shard can be promoted to a primary
|
|
|
- shard if the primary fails
|
|
|
- 2. increase performance: get and search requests can be handled by
|
|
|
- primary or replica shards.
|
|
|
- +
|
|
|
- By default, each primary shard has one replica, but the number of
|
|
|
- replicas can be changed dynamically on an existing index. A replica
|
|
|
- shard will never be started on the same node as its primary shard.
|
|
|
+A query is the basic component of a search. A search can be defined by one or more queries
|
|
|
+which can be mixed and matched in endless combinations. While <<glossary-filter,filters>> are
|
|
|
+queries that only determine if a document matches, those queries that also calculate how well
|
|
|
+the document matches are known as "scoring queries". They assign each matching document a score, which is
|
|
|
+later used to sort matched documents. Scoring queries take more resources than <<glossary-filter,non-scoring queries>>
|
|
|
+and their query results are not cacheable. As a general rule, use query clauses for full-text
|
|
|
+search or for any condition that requires scoring, and use filters for everything else.
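The difference between a filter and a scoring query can be sketched in a few lines of Python. This is a toy model only: the `filter_match` and `score_match` helpers and the sample documents are hypothetical, and a raw term count stands in for a relevance score, which is far simpler than Elasticsearch's actual scoring.

```python
# Hypothetical sample documents, keyed by ID.
docs = {
    1: "quick brown fox",
    2: "quick quick dog",
    3: "lazy dog",
}


def filter_match(doc_text, term):
    """Filter: answers only 'does this document match?' with True or False."""
    return term in doc_text.split()


def score_match(doc_text, term):
    """Toy scoring query: the term count stands in for a relevance score."""
    return doc_text.split().count(term)


# A filter yields a plain set of matching documents, in no particular ranking.
matching_ids = [doc_id for doc_id, text in docs.items() if filter_match(text, "quick")]

# A scoring query yields (id, score) pairs that are then sorted by score.
scored = [(doc_id, score_match(text, "quick")) for doc_id, text in docs.items()]
ranked = sorted((pair for pair in scored if pair[1] > 0),
                key=lambda pair: pair[1], reverse=True)

print(matching_ids)
print(ranked)
```

The filter does strictly less work per document, which is consistent with the entry's advice to reserve scoring queries for conditions that actually need a score.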
|
|
|
+
|
|
|
+[[glossary-recovery]] recovery ::
|
|
|
+The process of syncing a shard copy from a source shard. Upon completion, the recovery process makes the shard copy available for queries.
|
|
|
++
|
|
|
+Recovery automatically occurs anytime a shard moves to a different node in the same cluster, including:
|
|
|
+
|
|
|
+* Node startup
|
|
|
+* Node failure
|
|
|
+* Index shard replication
|
|
|
+* Snapshot restoration
|
|
|
+
|
|
|
+[[glossary-replica-shard]] replica shard ::
|
|
|
+
|
|
|
+Each <<glossary-primary-shard,primary shard>> can have zero or more
|
|
|
+replicas. A replica is a copy of the primary shard, and has two
|
|
|
+purposes:
|
|
|
++
|
|
|
+1. provide failover: a replica shard can be promoted to a primary
|
|
|
+shard if the primary fails
|
|
|
+2. increase performance: get and search requests can be handled by
|
|
|
+primary or replica shards.
|
|
|
++
|
|
|
+By default, each primary shard has one replica, but the number of
|
|
|
+replicas can be changed dynamically on an existing index. A replica
|
|
|
+shard will never be started on the same node as its primary shard.
|
|
|
|
|
|
[[glossary-routing]] routing ::
|
|
|
|
|
|
- When you index a document, it is stored on a single
|
|
|
- <<glossary-primary-shard,primary shard>>. That shard is chosen by hashing
|
|
|
- the `routing` value. By default, the `routing` value is derived from
|
|
|
- the ID of the document or, if the document has a specified parent
|
|
|
- document, from the ID of the parent document (to ensure that child and
|
|
|
- parent documents are stored on the same shard).
|
|
|
- +
|
|
|
- This value can be overridden by specifying a `routing` value at index
|
|
|
- time, or a <<mapping-routing-field,routing
|
|
|
- field>> in the <<glossary-mapping,mapping>>.
|
|
|
+When you index a document, it is stored on a single
|
|
|
+<<glossary-primary-shard,primary shard>>. That shard is chosen by hashing
|
|
|
+the `routing` value. By default, the `routing` value is derived from
|
|
|
+the ID of the document or, if the document has a specified parent
|
|
|
+document, from the ID of the parent document (to ensure that child and
|
|
|
+parent documents are stored on the same shard).
|
|
|
++
|
|
|
+This value can be overridden by specifying a `routing` value at index
|
|
|
+time, or a <<mapping-routing-field,routing
|
|
|
+field>> in the <<glossary-mapping,mapping>>.
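A minimal sketch of the routing idea described above, assuming the shard is picked as a hash of the routing value modulo the number of primary shards. The `pick_shard` helper is hypothetical, and CRC32 is only a stand-in for whatever hash function Elasticsearch uses internally.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # illustrative value, not a recommendation


def pick_shard(routing, num_primary_shards):
    """Map a routing value to a shard number with a stable hash (CRC32 as a stand-in)."""
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


# By default the routing value is derived from the document ID.
doc_shard = pick_shard("doc-1", NUM_PRIMARY_SHARDS)

# A child document routed by its parent's ID lands on the parent's shard.
parent_shard = pick_shard("parent-42", NUM_PRIMARY_SHARDS)
child_shard = pick_shard("parent-42", NUM_PRIMARY_SHARDS)  # child uses the parent's ID

print(doc_shard, parent_shard, child_shard)
```

In a scheme like this, the result depends on the shard count, so changing the number of primary shards would send the same routing value to a different shard.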
|
|
|
|
|
|
[[glossary-shard]] shard ::
|
|
|
|
|
|
- A shard is a single Lucene instance. It is a low-level “worker” unit
|
|
|
- which is managed automatically by Elasticsearch. An index is a logical
|
|
|
- namespace which points to <<glossary-primary-shard,primary>> and
|
|
|
- <<glossary-replica-shard,replica>> shards.
|
|
|
- +
|
|
|
- Other than defining the number of primary and replica shards that an
|
|
|
- index should have, you never need to refer to shards directly.
|
|
|
- Instead, your code should deal only with an index.
|
|
|
- +
|
|
|
- Elasticsearch distributes shards amongst all <<glossary-node,nodes>> in the
|
|
|
- <<glossary-cluster,cluster>>, and can move shards automatically from one
|
|
|
- node to another in the case of node failure, or the addition of new
|
|
|
- nodes.
|
|
|
-
|
|
|
- [[glossary-source_field]] source field ::
|
|
|
-
|
|
|
- By default, the JSON document that you index will be stored in the
|
|
|
- `_source` field and will be returned by all get and search requests.
|
|
|
- This allows you access to the original object directly from search
|
|
|
- results, rather than requiring a second step to retrieve the object
|
|
|
- from an ID.
|
|
|
+A shard is a single Lucene instance. It is a low-level “worker” unit
|
|
|
+which is managed automatically by Elasticsearch. An index is a logical
|
|
|
+namespace which points to <<glossary-primary-shard,primary>> and
|
|
|
+<<glossary-replica-shard,replica>> shards.
|
|
|
++
|
|
|
+Other than defining the number of primary and replica shards that an
|
|
|
+index should have, you never need to refer to shards directly.
|
|
|
+Instead, your code should deal only with an index.
|
|
|
++
|
|
|
+Elasticsearch distributes shards amongst all <<glossary-node,nodes>> in the
|
|
|
+<<glossary-cluster,cluster>>, and can move shards automatically from one
|
|
|
+node to another in the case of node failure, or the addition of new
|
|
|
+nodes.
|
|
|
+
|
|
|
+[[glossary-source_field]] source field ::
|
|
|
+
|
|
|
+By default, the JSON document that you index will be stored in the
|
|
|
+`_source` field and will be returned by all get and search requests.
|
|
|
+This gives you access to the original object directly from search
|
|
|
+results, rather than requiring a second step to retrieve the object
|
|
|
+from an ID.
|
|
|
|
|
|
[[glossary-term]] term ::
|
|
|
|
|
|
- A term is an exact value that is indexed in Elasticsearch. The terms
|
|
|
- `foo`, `Foo`, `FOO` are NOT equivalent. Terms (i.e. exact values) can
|
|
|
- be searched for using _term_ queries.
|
|
|
- +
|
|
|
- See also <<glossary-text,text>> and <<glossary-analysis,analysis>>.
|
|
|
+A term is an exact value that is indexed in Elasticsearch. The terms
|
|
|
+`foo`, `Foo`, `FOO` are NOT equivalent. Terms (i.e. exact values) can
|
|
|
+be searched for using _term_ queries.
|
|
|
++
|
|
|
+See also <<glossary-text,text>> and <<glossary-analysis,analysis>>.
|
|
|
|
|
|
[[glossary-text]] text ::
|
|
|
|
|
|
- Text (or full text) is ordinary unstructured text, such as this
|
|
|
- paragraph. By default, text will be <<glossary-analysis,analyzed>> into
|
|
|
- <<glossary-term,terms>>, which is what is actually stored in the index.
|
|
|
- +
|
|
|
- Text <<glossary-field,fields>> need to be analyzed at index time in order to
|
|
|
- be searchable as full text, and keywords in full text queries must be
|
|
|
- analyzed at search time to produce (and search for) the same terms
|
|
|
- that were generated at index time.
|
|
|
- +
|
|
|
- See also <<glossary-term,term>> and <<glossary-analysis,analysis>>.
|
|
|
+Text (or full text) is ordinary unstructured text, such as this
|
|
|
+paragraph. By default, text will be <<glossary-analysis,analyzed>> into
|
|
|
+<<glossary-term,terms>>, which is what is actually stored in the index.
|
|
|
++
|
|
|
+Text <<glossary-field,fields>> need to be analyzed at index time in order to
|
|
|
+be searchable as full text, and keywords in full text queries must be
|
|
|
+analyzed at search time to produce (and search for) the same terms
|
|
|
+that were generated at index time.
|
|
|
++
|
|
|
+See also <<glossary-term,term>> and <<glossary-analysis,analysis>>.
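To make the relationship between text, terms, and analysis concrete, here is a toy inverted index in Python. Everything in it is hypothetical (`toy_analyze`, `build_index`, `term_lookup`, `full_text_lookup`, and the sample documents); it only illustrates why an unanalyzed term lookup for `Foo` finds nothing while an analyzed full text lookup for `FoO` succeeds.

```python
import re
from collections import defaultdict


def toy_analyze(text):
    """Hypothetical analyzer: lowercase, then split on non-alphanumeric characters."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def build_index(docs):
    """Index-time analysis: store each analyzed term with the IDs of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in toy_analyze(text):
            index[term].add(doc_id)
    return index


def term_lookup(index, term):
    """Term-style lookup: the value is compared exactly, with no analysis."""
    return index.get(term, set())


def full_text_lookup(index, query):
    """Full-text-style lookup: analyze the query, then match any resulting term."""
    result = set()
    for term in toy_analyze(query):
        result |= index.get(term, set())
    return result


index = build_index({1: "Foo-Bar", 2: "BAZ qux"})
print(term_lookup(index, "Foo"))       # set(): only the term "foo" was stored
print(full_text_lookup(index, "FoO"))  # {1}
```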
|
|
|
|
|
|
[[glossary-type]] type ::
|
|
|
|
|
|
- A type used to represent the _type_ of document, e.g. an `email`, a `user`, or a `tweet`.
|
|
|
- Types are deprecated and are in the process of being removed. See <<removal-of-types>>.
|
|
|
+A type is used to represent the _type_ of document, e.g. an `email`, a `user`, or a `tweet`.
|
|
|
+Types are deprecated and are in the process of being removed. See <<removal-of-types>>.
|
|
|
|