[Docs] Added _source filtering to documentation

Relates to #3301
Boaz Leskes 12 years ago
commit c63d8c4fb5

+ 48 - 8
docs/reference/docs/get.asciidoc

@@ -65,22 +65,55 @@ are stored.
 The get API allows for `_type` to be optional. Set it to `_all` in order
 to fetch the first document matching the id across all types.
 
+
+[float]
+[[get-source-filtering]]
+=== Source filtering
+
+added[1.0.0.Beta1]
+
+By default, the get operation returns the contents of the `_source` field unless
+you have used the `fields` parameter or the `_source` field is disabled.
+You can turn off `_source` retrieval by using the `_source` parameter:
+
+[source,js]
+--------------------------------------------------
+curl -XGET 'http://localhost:9200/twitter/tweet/1?_source=false'
+--------------------------------------------------
+
+If you only need one or two fields from the complete `_source`, you can use the `_source_include`
+and `_source_exclude` parameters to include or filter out the parts you need. This can be especially helpful
+with large documents, where partial retrieval can save on network overhead. Both parameters take a comma-separated list
+of fields or wildcard expressions. Example:
+
+[source,js]
+--------------------------------------------------
+curl -XGET 'http://localhost:9200/twitter/tweet/1?_source_include=*.id&_source_exclude=entities'
+--------------------------------------------------
+
+If you only want to specify includes, you can use a shorter notation:
+
+[source,js]
+--------------------------------------------------
+curl -XGET 'http://localhost:9200/twitter/tweet/1?_source=*.id,retweeted'
+--------------------------------------------------
+
+
 [float]
 [[get-fields]]
 === Fields
 
-The get operation allows specifying a set of fields that will be
-returned (by default, the `_source` field) by passing the `fields`
-parameter. For example:
+The get operation allows specifying a set of stored fields that will be
+returned by passing the `fields` parameter. For example:
 
 [source,js]
 --------------------------------------------------
 curl -XGET 'http://localhost:9200/twitter/tweet/1?fields=title,content'
 --------------------------------------------------
 
-The returned fields will either be loaded if they are stored, or fetched
-from the `_source` (parsed and extracted). It also supports sub objects
-extraction from _source, like `obj1.obj2`.
+For backward compatibility, if the requested fields are not stored, they will be fetched
+from the `_source` (parsed and extracted). This functionality has been replaced by the
+<<get-source-filtering,source filtering>> parameter.
 
 [float]
 [[_source]]
@@ -95,8 +128,15 @@ without any additional content around it. For example:
 curl -XGET 'http://localhost:9200/twitter/tweet/1/_source'
 --------------------------------------------------
 
-Note, there is also a HEAD variant for the _source endpoint. Curl
-example:
+You can also use the same source filtering parameters to control which parts of the `_source` will be returned:
+
+[source,js]
+--------------------------------------------------
+curl -XGET 'http://localhost:9200/twitter/tweet/1/_source?_source_include=*.id&_source_exclude=entities'
+--------------------------------------------------
+
+Note, there is also a HEAD variant for the `_source` endpoint to efficiently test for document existence.
+Curl example:
 
 [source,js]
 --------------------------------------------------
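
Review note on the `get.asciidoc` changes above: the three query parameters compose in a predictable way, and a small helper makes that explicit. The following is only a sketch under stated assumptions. `build_get_url` and its argument names are hypothetical and not part of any Elasticsearch client; it only assembles the URL shapes shown in the curl examples and does not send a request.

```python
from urllib.parse import urlencode


def build_get_url(base, index, doc_type, doc_id,
                  source=None, include=None, exclude=None):
    """Assemble a get URL with optional source filtering parameters.

    Hypothetical helper: it mirrors the URL shapes in the docs above.
    """
    params = {}
    if source is not None:
        if isinstance(source, bool):
            # `_source=false` turns off `_source` retrieval.
            params["_source"] = "true" if source else "false"
        else:
            # Short notation: a comma-separated list of include patterns.
            params["_source"] = ",".join(source)
    if include:
        params["_source_include"] = ",".join(include)
    if exclude:
        params["_source_exclude"] = ",".join(exclude)

    url = f"{base}/{index}/{doc_type}/{doc_id}"
    if params:
        # Keep `*` and `,` readable, as in the curl examples.
        url += "?" + urlencode(params, safe="*,")
    return url


if __name__ == "__main__":
    print(build_get_url("http://localhost:9200", "twitter", "tweet", 1,
                        include=["*.id"], exclude=["entities"]))
```

Passing `source=["*.id", "retweeted"]` yields the shorter include-only notation, and `source=False` yields `?_source=false`.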

+ 45 - 1
docs/reference/docs/multi-get.asciidoc

@@ -70,11 +70,55 @@ curl 'localhost:9200/test/type/_mget' -d '{
 }'
 --------------------------------------------------
 
+[float]
+[[mget-source-filtering]]
+=== Source filtering
+
+added[1.0.0.Beta1]
+
+By default, the `_source` field will be returned for every document (if stored).
+Similar to the <<get-source-filtering,get>> API, you can retrieve only parts of
+the `_source` (or none of it) by using the `_source` parameter. You can also use
+the URL parameters `_source`, `_source_include` and `_source_exclude` to specify defaults,
+which will be used when there are no per-document instructions.
+
+For example:
+
+[source,js]
+--------------------------------------------------
+curl 'localhost:9200/_mget' -d '{
+    "docs" : [
+        {
+            "_index" : "test",
+            "_type" : "type",
+            "_id" : "1",
+            "_source" : false
+        },
+        {
+            "_index" : "test",
+            "_type" : "type",
+            "_id" : "2",
+            "_source" : ["field3", "field4"]
+        },
+        {
+            "_index" : "test",
+            "_type" : "type",
+            "_id" : "3",
+            "_source" : {
+                "include": ["user"],
+                "exclude": ["user.location"]
+            }
+        }
+    ]
+}'
+--------------------------------------------------
+
+
 [float]
 [[mget-fields]]
 === Fields
 
-Specific fields can be specified to be retrieved per document to get.
+Specific stored fields can be requested per document, similar to the <<get-fields,fields>> parameter of the Get API.
 For example:
 
 [source,js]
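
Review note on the `multi-get.asciidoc` changes above: the text says URL parameters act as defaults that apply only when a document entry carries no `_source` instruction of its own. A minimal sketch of that precedence, and of the three per-document forms from the example (boolean, list, object), might look like this. Both function names are hypothetical, and the code illustrates the documented behaviour rather than the server's actual implementation.

```python
def effective_source(doc, url_default=None):
    """Pick the `_source` instruction that applies to one mget entry.

    Per-document instruction first, then the URL-level default,
    otherwise the documented default of returning the whole `_source`.
    """
    if "_source" in doc:
        return doc["_source"]
    if url_default is not None:
        return url_default
    return True


def normalize_source(spec):
    """Normalize a `_source` value to (enabled, include, exclude)."""
    if isinstance(spec, bool):
        return spec, [], []
    if isinstance(spec, str):
        return True, [spec], []
    if isinstance(spec, list):
        return True, list(spec), []
    if isinstance(spec, dict):
        return (True,
                list(spec.get("include", [])),
                list(spec.get("exclude", [])))
    raise TypeError(f"unsupported _source value: {spec!r}")


if __name__ == "__main__":
    docs = [
        {"_id": "1", "_source": False},
        {"_id": "2", "_source": ["field3", "field4"]},
        {"_id": "3"},  # no per-document instruction: URL default applies
    ]
    for doc in docs:
        print(doc["_id"], normalize_source(effective_source(doc, ["user"])))
```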

+ 7 - 2
docs/reference/search/explain.asciidoc

@@ -62,9 +62,14 @@ This will yield the same result as the previous request.
 === All parameters:
 
 [horizontal]
+`_source`::
+
+    added[1.0.0.Beta1] Set to `true` to retrieve the `_source` of the document explained. You can also
+    retrieve part of the document by using `_source_include` and `_source_exclude` (see <<get-source-filtering,Get API>> for more details).
+
 `fields`::
-    Allows to control which fields to return as part of the
-    document explained (support `_source` for the full document).
+    Allows to control which stored fields to return as part of the
+    document explained.
 
 `routing`::
     Controls the routing in the case the routing was used

+ 2 - 0
docs/reference/search/request-body.asciidoc

@@ -81,6 +81,8 @@ include::request/from-size.asciidoc[]
 
 include::request/sort.asciidoc[]
 
+include::request/source-filtering.asciidoc[]
+
 include::request/fields.asciidoc[]
 
 include::request/script-fields.asciidoc[]

+ 10 - 6
docs/reference/search/request/fields.asciidoc

@@ -1,8 +1,8 @@
 [[search-request-fields]]
 === Fields
 
-Allows to selectively load specific fields for each document represented
-by a search hit. Defaults to load the internal `_source` field.
+Allows to selectively load specific stored fields for each document represented
+by a search hit.
 
 [source,js]
 --------------------------------------------------
@@ -14,10 +14,6 @@ by a search hit. Defaults to load the internal `_source` field.
 }
 --------------------------------------------------
 
-The fields will automatically load stored fields (`store` mapping set to
-`true`), or, if not stored, will load the `_source` and extract it from
-it (allowing to return nested document object).
-
 `*` can be used to load all stored fields from the document.
 
 An empty array will cause only the `_id` and `_type` for each hit to be
@@ -33,6 +29,11 @@ returned, for example:
 }
 --------------------------------------------------
 
+
+For backwards compatibility, if the `fields` parameter specifies fields which are not stored (`store` mapping set to
+`false`), the `_source` will be loaded and the fields extracted from it. This functionality has been replaced by the
+<<search-request-source-filtering,source filtering>> parameter.
+
 Script fields can also be automatically detected and used as fields, so
 things like `_source.obj1.obj2` can be used, though not recommended, as
 `obj1.obj2` will work as well.
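
Review note on the backwards-compatibility paragraph above: "load the `_source` and extract" amounts to walking a dotted path such as `obj1.obj2` through the parsed document. The sketch below illustrates that idea only. `extract_field` is a hypothetical name, and the code assumes plain nested objects (no arrays, no field names that themselves contain dots).

```python
def extract_field(source, path):
    """Return the value at a dotted path in a parsed `_source`, or None."""
    current = source
    for part in path.split("."):
        # Stop as soon as the path leaves the nested-object structure.
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


if __name__ == "__main__":
    doc = {"obj1": {"obj2": {"name": "kimchy"}}, "title": "hello"}
    print(extract_field(doc, "obj1.obj2"))
    print(extract_field(doc, "obj1.missing"))
```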
@@ -40,6 +41,9 @@ things like `_source.obj1.obj2` can be used, though not recommended, as
 [[partial]]
 ==== Partial
 
+deprecated[1.0.0.Beta1,Replaced by <<search-request-source-filtering>>]
+
+
 When loading data from `_source`, partial fields can be used to use
 wildcards to control what part of the `_source` will be loaded based on
 `include` and `exclude` patterns. For example:

+ 64 - 0
docs/reference/search/request/source-filtering.asciidoc

@@ -0,0 +1,64 @@
+[[search-request-source-filtering]]
+=== Source filtering
+
+added[1.0.0.Beta1]
+
+
+Allows to control how the `_source` field is returned with every hit.
+
+By default, operations return the contents of the `_source` field unless
+you have used the `fields` parameter or the `_source` field is disabled.
+You can control `_source` retrieval by using the `_source` parameter.
+
+To disable `_source` retrieval, set it to `false`:
+
+[source,js]
+--------------------------------------------------
+{
+    "_source": false,
+    "query" : {
+        "term" : { "user" : "kimchy" }
+    }
+}
+--------------------------------------------------
+
+The `_source` parameter also accepts one or more wildcard patterns to control what parts of the `_source` should be returned.
+
+For example:
+
+[source,js]
+--------------------------------------------------
+{
+    "_source": "obj.*",
+    "query" : {
+        "term" : { "user" : "kimchy" }
+    }
+}
+--------------------------------------------------
+
+Or
+
+[source,js]
+--------------------------------------------------
+{
+    "_source": [ "obj1.*", "obj2.*" ],
+    "query" : {
+        "term" : { "user" : "kimchy" }
+    }
+}
+--------------------------------------------------
+
+Finally, for complete control, you can specify both include and exclude patterns:
+
+[source,js]
+--------------------------------------------------
+{
+    "_source": {
+        "include": [ "obj1.*", "obj2.*" ],
+        "exclude": [ "*.description" ]
+    },
+    "query" : {
+        "term" : { "user" : "kimchy" }
+    }
+}
+--------------------------------------------------
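
Review note on `source-filtering.asciidoc` above: to make the include/exclude semantics concrete, here is a client-side sketch of how such patterns could be applied to a parsed document. It is an illustration under assumptions, not Elasticsearch's implementation: `filter_source` is a hypothetical name, patterns are matched with Python's `fnmatch` against dotted paths, a pattern that matches an object also covers everything beneath it, and objects left empty after filtering are dropped.

```python
from fnmatch import fnmatchcase


def _matches(path, patterns):
    """True if a pattern matches the dotted path or one of its ancestors."""
    parts = path.split(".")
    prefixes = [".".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return any(fnmatchcase(p, pat) for pat in patterns for p in prefixes)


def filter_source(source, include=(), exclude=(), _prefix=""):
    """Return a copy of `source` restricted by include/exclude patterns."""
    result = {}
    for key, value in source.items():
        path = f"{_prefix}{key}"
        if _matches(path, exclude):
            continue  # excludes win over includes
        if isinstance(value, dict):
            child = filter_source(value, include, exclude, path + ".")
            if child:  # drop objects that end up empty
                result[key] = child
        elif not include or _matches(path, include):
            result[key] = value
    return result


if __name__ == "__main__":
    doc = {
        "obj1": {"name": "a", "description": "long text"},
        "obj2": {"id": 1},
        "other": 5,
    }
    print(filter_source(doc,
                        include=["obj1.*", "obj2.*"],
                        exclude=["*.description"]))
```

With the patterns from the last example in the docs, only `obj1.name` and `obj2.id` survive: `other` matches no include pattern and `obj1.description` matches the exclude pattern.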

+ 6 - 4
docs/reference/search/uri-request.asciidoc

@@ -62,10 +62,12 @@ query.
 |`explain` |For each hit, contain an explanation of how scoring of the
 hits was computed.
 
-|`fields` |The selective fields of the document to return for each hit
-(either retrieved from the index if stored, or from the `_source` if
-not), comma delimited. Defaults to the internal `_source` field. Not
-specifying any value will cause no fields to return.
+|`_source` |added[1.0.0.Beta1] Set to `false` to disable retrieval of the `_source` field. You can also retrieve
+part of the document by using `_source_include` and `_source_exclude` (see the <<search-request-source-filtering, request body>>
+documentation for more details).
+
+|`fields` |The selective stored fields of the document to return for each hit, 
+comma delimited. Not specifying any value will cause no fields to return.
 
 |`sort` |Sorting to perform. Can either be in the form of `fieldName`, or
 `fieldName:asc`/`fieldName:desc`. The fieldName can either be an actual