
[DOCS] Streamline GS search topic. (#45941)

* Streamline GS search topic.

* Added missing comma.

* Update docs/reference/getting-started.asciidoc 

Co-Authored-By: István Zoltán Szabó <istvan.szabo@elastic.co>
debadair 6 years ago
parent
commit
b237721ac2
1 changed file with 77 additions and 271 deletions

+ 77 - 271
docs/reference/getting-started.asciidoc

@@ -32,19 +32,35 @@ trial of Elasticsearch Service] in the cloud.
 [[getting-started-install]]
 == Get {es} up and running
 
-To take {es} for a test drive, you can create a one-click cloud deployment
-on the https://www.elastic.co/cloud/elasticsearch-service/signup[Elasticsearch Service],
-or <<run-elasticsearch-local, set up a multi-node {es} cluster>> on your own
+To take {es} for a test drive, you can create a 
+https://www.elastic.co/cloud/elasticsearch-service/signup[hosted deployment]  on 
+the {es} Service or set up a multi-node {es} cluster on your own
 Linux, macOS, or Windows machine.
 
+[float]
+[[run-elasticsearch-hosted]]
+=== Run {es} on Elastic Cloud
+
+When you create a deployment on the {es} Service, the service provisions
+a three-node {es} cluster along with Kibana and APM.
+
+To create a deployment:
+
+. Sign up for a https://www.elastic.co/cloud/elasticsearch-service/signup[free trial] 
+and verify your email address.
+. Set a password for your account.
+. Click **Create Deployment**.
+
+Once you've created a deployment, you're ready to <<getting-started-index>>.
 
 [float]
 [[run-elasticsearch-local]]
 === Run {es} locally on Linux, macOS, or Windows
 
-When you create a cluster on the Elasticsearch Service, you automatically
-get a three-node cluster. By installing from the tar or zip archive, you can
-start multiple instances of {es} locally to see how a multi-node cluster behaves.
+When you create a deployment on the {es} Service, a master node and
+two data nodes are provisioned automatically. By installing from the tar or zip 
+archive, you can start multiple instances of {es} locally to see how a multi-node 
+cluster behaves.
 
 To run a three-node {es} cluster locally:
 
@@ -332,82 +348,14 @@ yellow open   bank  l7sSYV2cQXmu6_4rJWVIww   5   1       1000            0    12
 [[getting-started-search]]
 == Start searching
 
-Now let's start with some simple searches. There are two basic ways to run searches: one is by sending search parameters through the {ref}/search-uri-request.html[REST request URI] and the other by sending them through the {ref}/search-request-body.html[REST request body]. The request body method allows you to be more expressive and also to define your searches in a more readable JSON format. We'll try one example of the request URI method but for the remainder of this tutorial, we will exclusively be using the request body method.
-
-The REST API for search is accessible from the `_search` endpoint. This example returns all documents in the bank index:
-
-[source,js]
---------------------------------------------------
-GET /bank/_search?q=*&sort=account_number:asc&pretty
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-
-Let's first dissect the search call. We are searching (`_search` endpoint) in the bank index, and the `q=*` parameter instructs Elasticsearch to match all documents in the index. The `sort=account_number:asc` parameter indicates to sort the results using the `account_number` field of each document in an ascending order. The `pretty` parameter, again, just tells Elasticsearch to return pretty-printed JSON results.
-
-And the response (partially shown):
-
-[source,js]
---------------------------------------------------
-{
-  "took" : 63,
-  "timed_out" : false,
-  "_shards" : {
-    "total" : 5,
-    "successful" : 5,
-    "skipped" : 0,
-    "failed" : 0
-  },
-  "hits" : {
-    "total" : {
-        "value": 1000,
-        "relation": "eq"
-    },
-    "max_score" : null,
-    "hits" : [ {
-      "_index" : "bank",
-      "_type" : "_doc",
-      "_id" : "0",
-      "sort": [0],
-      "_score" : null,
-      "_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"bradshawmckenzie@euron.com","city":"Hobucken","state":"CO"}
-    }, {
-      "_index" : "bank",
-      "_type" : "_doc",
-      "_id" : "1",
-      "sort": [1],
-      "_score" : null,
-      "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
-    }, ...
-    ]
-  }
-}
---------------------------------------------------
-// TESTRESPONSE[s/"took" : 63/"took" : $body.took/]
-// TESTRESPONSE[s/\.\.\./$body.hits.hits.2, $body.hits.hits.3, $body.hits.hits.4, $body.hits.hits.5, $body.hits.hits.6, $body.hits.hits.7, $body.hits.hits.8, $body.hits.hits.9/]
+Once you have ingested some data into an {es} index, you can search it
+by sending requests to the `_search` endpoint. To access the full suite of
+search capabilities, you use the {es} Query DSL to specify the
+search criteria in the request body. You specify the name of the index you 
+want to search in the request URI.
 
-As for the response, we see the following parts:
-
-* `took` – time in milliseconds for Elasticsearch to execute the search
-* `timed_out` – tells us if the search timed out or not
-* `_shards` – tells us how many shards were searched, as well as a count of the successful/failed searched shards
-* `hits` – search results
-* `hits.total` – an object that contains information about the total number of documents matching our search criteria
-** `hits.total.value` - the value of the total hit count (must be interpreted in the context of `hits.total.relation`).
-** `hits.total.relation` - whether `hits.total.value` is the exact hit count, in which case it is equal to `"eq"` or a
-                           lower bound of the total hit count (greater than or equals), in which case it is equal to `gte`.
-* `hits.hits` – actual array of search results (defaults to first 10 documents)
-* `hits.sort` - sort value of the sort key for each result (missing if sorting by score)
-* `hits._score` and `max_score` - ignore these fields for now
-
-The accuracy of `hits.total` is controlled by the request parameter `track_total_hits`, when set to true
-the request will track the total hits accurately (`"relation": "eq"`). It defaults to `10,000`
-which means that the total hit count is accurately tracked up to `10,000` documents.
-You can force an accurate count by setting `track_total_hits` to true explicitly.
-See the <<request-body-search-track-total-hits, request body>> documentation
-for more details.
-
-Here is the same exact search above using the alternative request body method:
+For example, the following request retrieves all of the documents in the `bank`
+index sorted by account number:
 
 [source,js]
 --------------------------------------------------
@@ -422,11 +370,8 @@ GET /bank/_search
 // CONSOLE
 // TEST[continued]
 
-The difference here is that instead of passing `q=*` in the URI, we provide a JSON-style query request body to the `_search` API. We'll discuss this JSON query in the next section.
-
-////
-Hidden response just so we can assert that it is indeed the same but don't have
-to clutter the docs with it:
+By default, the `hits` section of the response includes the first 10 documents
+that match the search criteria:
 
 [source,js]
 --------------------------------------------------
@@ -441,23 +386,23 @@ to clutter the docs with it:
   },
   "hits" : {
     "total" : {
-       "value": 1000,
-       "relation": "eq"
+        "value": 1000,
+        "relation": "eq"
     },
-    "max_score": null,
+    "max_score" : null,
     "hits" : [ {
       "_index" : "bank",
       "_type" : "_doc",
       "_id" : "0",
       "sort": [0],
-      "_score": null,
+      "_score" : null,
       "_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"bradshawmckenzie@euron.com","city":"Hobucken","state":"CO"}
     }, {
       "_index" : "bank",
       "_type" : "_doc",
       "_id" : "1",
       "sort": [1],
-      "_score": null,
+      "_score" : null,
       "_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
     }, ...
     ]
@@ -467,54 +412,31 @@ to clutter the docs with it:
 // TESTRESPONSE[s/"took" : 63/"took" : $body.took/]
 // TESTRESPONSE[s/\.\.\./$body.hits.hits.2, $body.hits.hits.3, $body.hits.hits.4, $body.hits.hits.5, $body.hits.hits.6, $body.hits.hits.7, $body.hits.hits.8, $body.hits.hits.9/]
 
-////
-
-It is important to understand that once you get your search results back, Elasticsearch is completely done with the request and does not maintain any kind of server-side resources or open cursors into your results. This is in stark contrast to many other platforms such as SQL wherein you may initially get a partial subset of your query results up-front and then you have to continuously go back to the server if you want to fetch (or page through) the rest of the results using some kind of stateful server-side cursor.
-
-[float]
-[[getting-started-query-lang]]
-=== Introducing the Query Language
-
-Elasticsearch provides a JSON-style domain-specific language that you can use to execute queries. This is referred to as the {ref}/query-dsl.html[Query DSL]. The query language is quite comprehensive and can be intimidating at first glance but the best way to actually learn it is to start with a few basic examples.
-
-Going back to our last example, we executed this query:
-
-[source,js]
---------------------------------------------------
-GET /bank/_search
-{
-  "query": { "match_all": {} }
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-
-Dissecting the above, the `query` part tells us what our query definition is and the `match_all` part is simply the type of query that we want to run. The `match_all` query is simply a search for all documents in the specified index.
-
-In addition to the `query` parameter, we also can pass other parameters to
-influence the search results. In the example in the section above we passed in
-`sort`, here we pass in `size`:
+The response also provides the following information about the search request:
 
-[source,js]
---------------------------------------------------
-GET /bank/_search
-{
-  "query": { "match_all": {} },
-  "size": 1
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
+* `took` – how long it took {es} to run the query, in milliseconds
+* `timed_out` – whether or not the search request timed out
+* `_shards` – how many shards were searched and a breakdown of how many shards
+succeeded, failed, or were skipped
+* `max_score` – the score of the most relevant document found
+* `hits.total.value` – how many matching documents were found
+* `hits.sort` – the document's sort value (when not sorting by relevance score)
+* `hits._score` – the document's relevance score (not applicable when using `match_all`)
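
As a rough illustration of reading these fields, the following Python sketch pulls them out of a response that has already been parsed into a dictionary. The `sample_response` literal is trimmed from the response shown in this diff, and `summarize_response` is a hypothetical helper, not part of any Elasticsearch client. Note that in the response itself, `max_score` sits under `hits`, while `sort` and `_score` sit on each entry of `hits.hits`.

```python
# Trimmed copy of the response shown in this diff; only the fields
# discussed in the list are kept.
sample_response = {
    "took": 63,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 1000, "relation": "eq"},
        "max_score": None,
        "hits": [
            {"_index": "bank", "_id": "0", "sort": [0], "_score": None},
            {"_index": "bank", "_id": "1", "sort": [1], "_score": None},
        ],
    },
}


def summarize_response(response):
    """Collect the request-level fields described in the list (hypothetical helper)."""
    hits = response["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "shards_failed": response["_shards"]["failed"],
        "total_hits": hits["total"]["value"],
        # "eq" means the count is exact rather than a lower bound.
        "total_is_exact": hits["total"]["relation"] == "eq",
        "max_score": hits["max_score"],
        # Per-document values live on each entry of hits.hits.
        "sort_values": [hit.get("sort") for hit in hits["hits"]],
    }


summary = summarize_response(sample_response)
print(summary["total_hits"], summary["sort_values"])
```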
 
-Note that if `size` is not specified, it defaults to 10.
+Each search request is self-contained: {es} does not maintain any
+state information across requests. To page through the search hits, you specify
+the `from` and `size` parameters in your request. 
 
-This example does a `match_all` and returns documents 10 through 19:
+For example, the following request gets hits 10 through 19:
 
 [source,js]
 --------------------------------------------------
 GET /bank/_search
 {
   "query": { "match_all": {} },
+  "sort": [
+    { "account_number": "asc" }
+  ],
   "from": 10,
   "size": 10
 }
@@ -522,67 +444,12 @@ GET /bank/_search
 // CONSOLE
 // TEST[continued]
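
The relationship between a page number and the `from` and `size` parameters can be sketched in Python as follows. This is a minimal sketch that only builds the request body; `page_request` and its zero-based `page` argument are illustrative names, not Elasticsearch terms, and sending the body to the `_search` endpoint is left out.

```python
import json


def page_request(page, size=10):
    """Build a search body for a zero-based page of hits (hypothetical helper)."""
    if page < 0 or size < 1:
        raise ValueError("page must be >= 0 and size must be >= 1")
    return {
        "query": {"match_all": {}},
        # Sorting by account number gives the pages a well-defined order.
        "sort": [{"account_number": "asc"}],
        "from": page * size,  # number of hits to skip
        "size": size,         # number of hits to return
    }


# Page 1 (the second page) with the default size covers hits 10 through 19.
print(json.dumps(page_request(1), indent=2))
```

Because each request is self-contained, fetching the next page means sending a new body with a larger `from` value.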
 
-The `from` parameter (0-based) specifies which document index to start from and the `size` parameter specifies how many documents to return starting at the from parameter. This feature is useful when implementing paging of search results. Note that if `from` is not specified, it defaults to 0.
+Now that you've seen how to submit a basic search request, you can start to
+construct queries that are a bit more interesting than `match_all`.
 
-This example does a `match_all` and sorts the results by account balance in descending order and returns the top 10 (default size) documents.
-
-[source,js]
---------------------------------------------------
-GET /bank/_search
-{
-  "query": { "match_all": {} },
-  "sort": { "balance": { "order": "desc" } }
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-
-Now that we have seen a few of the basic search parameters, let's dig in some more into the Query DSL. Let's first take a look at the returned document fields. By default, the full JSON document is returned as part of all searches. This is referred to as the source (`_source` field in the search hits). If we don't want the entire source document returned, we have the ability to request only a few fields from within source to be returned.
-
-This example shows how to return two fields, `account_number` and `balance` (inside of `_source`), from the search:
-
-[source,js]
---------------------------------------------------
-GET /bank/_search
-{
-  "query": { "match_all": {} },
-  "_source": ["account_number", "balance"]
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-
-Note that the above example simply reduces the `_source` field. It will still only return one field named `_source` but within it, only the fields `account_number` and `balance` are included.
-
-If you come from a SQL background, the above is somewhat similar in concept to the `SQL SELECT FROM` field list.
-
-Now let's move on to the query part. Previously, we've seen how the `match_all` query is used to match all documents. Let's now introduce a new query called the {ref}/query-dsl-match-query.html[`match` query], which can be thought of as a basic fielded search query (i.e. a search done against a specific field or set of fields).
-
-This example returns the account numbered 20:
-
-[source,js]
---------------------------------------------------
-GET /bank/_search
-{
-  "query": { "match": { "account_number": 20 } }
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-
-This example returns all accounts containing the term "mill" in the address:
-
-[source,js]
---------------------------------------------------
-GET /bank/_search
-{
-  "query": { "match": { "address": "mill" } }
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-
-This example returns all accounts containing the term "mill" or "lane" in the address:
+To search for specific _terms_ within a field, you can use a `match` query. 
+For example, the following request searches the `address` field to find 
+customers whose addresses contain `mill` or `lane`:
 
 [source,js]
 --------------------------------------------------
@@ -594,7 +461,9 @@ GET /bank/_search
 // CONSOLE
 // TEST[continued]
 
-This example is a variant of `match` (`match_phrase`) that returns all accounts containing the phrase "mill lane" in the address:
+To perform a phrase search rather than matching individual terms, you use
+`match_phrase` instead of `match`. For example, the following request only 
+matches addresses that contain the phrase `mill lane`: 
 
 [source,js]
 --------------------------------------------------
@@ -606,74 +475,13 @@ GET /bank/_search
 // CONSOLE
 // TEST[continued]
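
To make the difference between `match` and `match_phrase` concrete, here is a deliberately simplified Python model of the two behaviours. It is a toy under stated assumptions: text is lowercased and split on whitespace, which is far cruder than a real Elasticsearch analyzer, and `toy_match` and `toy_match_phrase` are hypothetical names. The sample addresses are illustrative.

```python
def tokens(text):
    # Toy analysis: lowercase and split on whitespace. Real analyzers do more.
    return text.lower().split()


def toy_match(field_value, query_text):
    """True if the field contains ANY of the query terms."""
    field_terms = set(tokens(field_value))
    return any(term in field_terms for term in tokens(query_text))


def toy_match_phrase(field_value, query_text):
    """True if the query terms appear adjacent and in order in the field."""
    field_terms = tokens(field_value)
    phrase = tokens(query_text)
    if not phrase:
        return False
    width = len(phrase)
    return any(
        field_terms[i:i + width] == phrase
        for i in range(len(field_terms) - width + 1)
    )


addresses = ["990 Mill Road", "880 Holmes Lane", "198 Mill Lane"]
for address in addresses:
    print(address, toy_match(address, "mill lane"),
          toy_match_phrase(address, "mill lane"))
```

In this model all three addresses satisfy the term match, but only the one containing both words side by side satisfies the phrase match.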
 
-Let's now introduce the {ref}/query-dsl-bool-query.html[`bool` query]. The `bool` query allows us to compose smaller queries into bigger queries using boolean logic.
+To construct more complex queries, you can use a `bool` query to combine
+multiple query criteria. You can designate criteria as required (must match), 
+desirable (should match), or undesirable (must not match).
 
-This example composes two `match` queries and returns all accounts containing "mill" and "lane" in the address:
-
-[source,js]
---------------------------------------------------
-GET /bank/_search
-{
-  "query": {
-    "bool": {
-      "must": [
-        { "match": { "address": "mill" } },
-        { "match": { "address": "lane" } }
-      ]
-    }
-  }
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-
-In the above example, the `bool must` clause specifies all the queries that must be true for a document to be considered a match.
-
-In contrast, this example composes two `match` queries and returns all accounts containing "mill" or "lane" in the address:
-
-[source,js]
---------------------------------------------------
-GET /bank/_search
-{
-  "query": {
-    "bool": {
-      "should": [
-        { "match": { "address": "mill" } },
-        { "match": { "address": "lane" } }
-      ]
-    }
-  }
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-
-In the above example, the `bool should` clause specifies a list of queries either of which must be true for a document to be considered a match.
-
-This example composes two `match` queries and returns all accounts that contain neither "mill" nor "lane" in the address:
-
-[source,js]
---------------------------------------------------
-GET /bank/_search
-{
-  "query": {
-    "bool": {
-      "must_not": [
-        { "match": { "address": "mill" } },
-        { "match": { "address": "lane" } }
-      ]
-    }
-  }
-}
---------------------------------------------------
-// CONSOLE
-// TEST[continued]
-
-In the above example, the `bool must_not` clause specifies a list of queries none of which must be true for a document to be considered a match.
-
-We can combine `must`, `should`, and `must_not` clauses simultaneously inside a `bool` query. Furthermore, we can compose `bool` queries inside any of these `bool` clauses to mimic any complex multi-level boolean logic.
-
-This example returns all accounts of anybody who is 40 years old but doesn't live in ID(aho):
+For example, the following request searches the `bank` index for accounts that
+belong to customers who are 40 years old, but excludes anyone who lives in
+Idaho (ID):
 
 [source,js]
 --------------------------------------------------
@@ -694,17 +502,19 @@ GET /bank/_search
 // CONSOLE
 // TEST[continued]
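
The request body for this example is elided by the diff. A plausible shape, sketched in Python as a dictionary builder, combines a `must` clause with a `must_not` clause. The field names `age` and `state` come from the sample documents shown earlier; `age_excluding_state` is a hypothetical helper, and the exact body in the published docs may differ.

```python
import json


def age_excluding_state(age, excluded_state):
    """Build a bool query: `age` must match, `excluded_state` must not."""
    return {
        "query": {
            "bool": {
                # Required criteria: documents must match every clause here.
                "must": [{"match": {"age": age}}],
                # Undesirable criteria: matching documents are excluded.
                "must_not": [{"match": {"state": excluded_state}}],
            }
        }
    }


print(json.dumps(age_excluding_state(40, "ID"), indent=2))
```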
 
-[float]
-[[getting-started-filters]]
-=== Executing filters
-
-In the previous section, we skipped over a little detail called the document score (`_score` field in the search results). The score is a numeric value that is a relative measure of how well the document matches the search query that we specified. The higher the score, the more relevant the document is, the lower the score, the less relevant the document is.
-
-But queries do not always need to produce scores, in particular when they are only used for "filtering" the document set. Elasticsearch detects these situations and automatically optimizes query execution in order not to compute useless scores.
+Each `must`, `should`, and `must_not` element in a Boolean query is referred
+to as a query clause. How well a document meets the criteria in each `must` or
+`should` clause contributes to the document's _relevance score_. The higher the
+score, the better the document matches your search criteria. By default, {es}
+returns documents ranked by these relevance scores. 
 
-The {ref}/query-dsl-bool-query.html[`bool` query] that we introduced in the previous section also supports `filter` clauses which allow us to use a query to restrict the documents that will be matched by other clauses, without changing how scores are computed. As an example, let's introduce the {ref}/query-dsl-range-query.html[`range` query], which allows us to filter documents by a range of values. This is generally used for numeric or date filtering.
+The criteria in a `must_not` clause are treated as a _filter_. They affect whether
+or not the document is included in the results, but do not contribute to
+how documents are scored. You can also explicitly specify arbitrary filters to
+include or exclude documents based on structured data. 
 
-This example uses a bool query to return all accounts with balances between 20000 and 30000, inclusive. In other words, we want to find accounts with a balance that is greater than or equal to 20000 and less than or equal to 30000.
+For example, the following request uses a range filter to limit the results to
+accounts with a balance between $20,000 and $30,000 (inclusive). 
 
 [source,js]
 --------------------------------------------------
@@ -728,10 +538,6 @@ GET /bank/_search
 // CONSOLE
 // TEST[continued]
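
The range-filter request is also elided by the diff. Based on the description removed in this commit (a `match_all` query part plus a `range` filter part), its body can be sketched in Python as below. `balance_between` is a hypothetical helper; `gte` and `lte` are the inclusive bounds of the `range` query.

```python
import json


def balance_between(low, high):
    """Build a bool query that filters on an inclusive balance range."""
    if low > high:
        raise ValueError("low must not exceed high")
    return {
        "query": {
            "bool": {
                # The query part matches everything.
                "must": {"match_all": {}},
                # The filter part restricts results without affecting scores.
                "filter": {
                    "range": {"balance": {"gte": low, "lte": high}}
                },
            }
        }
    }


print(json.dumps(balance_between(20000, 30000), indent=2))
```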
 
-Dissecting the above, the bool query contains a `match_all` query (the query part) and a `range` query (the filter part). We can substitute any other queries into the query and the filter parts. In the above case, the range query makes perfect sense since documents falling into the range all match "equally", i.e., no document is more relevant than another.
-
-In addition to the `match_all`, `match`, `bool`, and `range` queries, there are a lot of other query types that are available and we won't go into them here. Since we already have a basic understanding of how they work, it shouldn't be too difficult to apply this knowledge in learning and experimenting with the other query types.
-
 [[getting-started-aggregations]]
 == Analyze results with aggregations