|
- [[indices-analyze]]
- == Analyze
- Performs the analysis process on a text and returns the token breakdown
- of the text.
- It can be used without specifying an index, against one of the many built-in
- analyzers:
- [source,js]
- --------------------------------------------------
- curl -XGET 'localhost:9200/_analyze' -d '
- {
- "analyzer" : "standard",
- "text" : "this is a test"
- }'
- --------------------------------------------------
- coming[2.0.0-beta1, body based parameters were added in 2.0.0]
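As a rough illustration, the same body-based request can be assembled from a script. This is a minimal sketch, not an official client: the helper name `analyze_request` and the `http://localhost:9200` host are assumptions, and the request is only built, not sent.

```python
import json
import urllib.request


def analyze_request(text, analyzer, index=None, host="http://localhost:9200"):
    """Build (but do not send) a body-based _analyze request.

    `analyze_request` and the default `host` are hypothetical, for illustration.
    """
    # With an index the path is /<index>/_analyze, otherwise /_analyze.
    path = f"/{index}/_analyze" if index else "/_analyze"
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    # method="GET" mirrors `curl -XGET ... -d`; urllib would otherwise use POST.
    return urllib.request.Request(host + path, data=body, method="GET")


req = analyze_request("this is a test", "standard")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Against a running node, the request could then be sent with `urllib.request.urlopen(req)`.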
- If the `text` parameter is provided as an array of strings, it is analyzed as a multi-valued field.
- [source,js]
- --------------------------------------------------
- curl -XGET 'localhost:9200/_analyze' -d '
- {
- "analyzer" : "standard",
- "text" : ["this is a test", "the second text"]
- }'
- --------------------------------------------------
- coming[2.0.0-beta1, body based parameters were added in 2.0.0]
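The difference between the single-valued and multi-valued forms is only the JSON type of `text`. The sketch below assumes a hypothetical helper, `analyze_body`, that accepts either a string or a list of strings and produces the matching body.

```python
import json


def analyze_body(text, analyzer="standard"):
    """Return the JSON body for _analyze (hypothetical helper).

    `text` may be one string or a sequence of strings.
    """
    if isinstance(text, str):
        payload_text = text
    else:
        values = list(text)
        if not all(isinstance(v, str) for v in values):
            raise TypeError("every text value must be a string")
        # Serialized as a JSON array, which is the multi-valued form.
        payload_text = values
    return json.dumps({"analyzer": analyzer, "text": payload_text})


single = analyze_body("this is a test")
multi = analyze_body(["this is a test", "the second text"])
print(single)
print(multi)
```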
- A custom transient analyzer can also be built out of a tokenizer,
- token filters and char filters. Token filters can be given under the shorter `filters`
- parameter name:
- [source,js]
- --------------------------------------------------
- curl -XGET 'localhost:9200/_analyze' -d '
- {
- "tokenizer" : "keyword",
- "filters" : ["lowercase"],
- "text" : "this is a test"
- }'
- curl -XGET 'localhost:9200/_analyze' -d '
- {
- "tokenizer" : "keyword",
- "token_filters" : ["lowercase"],
- "char_filters" : ["html_strip"],
- "text" : "this is a <b>test</b>"
- }'
- --------------------------------------------------
- coming[2.0.0-beta1, body based parameters were added in 2.0.0]
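The two bodies above differ only in which optional keys are present and in the name used for token filters. A minimal sketch of a builder for such bodies follows; the function `custom_analyzer_body` and its `short_name` flag are hypothetical, while the JSON keys are the ones shown in the examples above.

```python
import json


def custom_analyzer_body(text, tokenizer, token_filters=(), char_filters=(),
                         short_name=False):
    """Assemble the body for a transient analyzer (hypothetical helper)."""
    body = {"tokenizer": tokenizer}
    if token_filters:
        # Token filters may be sent as "token_filters" or the shorter "filters".
        key = "filters" if short_name else "token_filters"
        body[key] = list(token_filters)
    if char_filters:
        body["char_filters"] = list(char_filters)
    body["text"] = text
    return body


first = custom_analyzer_body("this is a test", "keyword",
                             token_filters=["lowercase"], short_name=True)
second = custom_analyzer_body("this is a <b>test</b>", "keyword",
                              token_filters=["lowercase"],
                              char_filters=["html_strip"])
print(json.dumps(first))
print(json.dumps(second))
```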
- It can also run against a specific index:
- [source,js]
- --------------------------------------------------
- curl -XGET 'localhost:9200/test/_analyze' -d '
- {
- "text" : "this is a test"
- }'
- --------------------------------------------------
- The above will run an analysis on the "this is a test" text, using the
- default index analyzer associated with the `test` index. An `analyzer`
- can also be provided to use a different analyzer:
- [source,js]
- --------------------------------------------------
- curl -XGET 'localhost:9200/test/_analyze' -d '
- {
- "analyzer" : "whitespace",
- "text" : "this is a test"
- }'
- --------------------------------------------------
- coming[2.0.0-beta1, body based parameters were added in 2.0.0]
- Also, the analyzer can be derived based on a field mapping, for example:
- [source,js]
- --------------------------------------------------
- curl -XGET 'localhost:9200/test/_analyze' -d '
- {
- "field" : "obj1.field1",
- "text" : "this is a test"
- }'
- --------------------------------------------------
- coming[2.0.0-beta1, body based parameters were added in 2.0.0]
- This runs the analysis with the analyzer configured in the
- mapping for `obj1.field1` (or, if none is configured, the default index analyzer).
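The fallback described here can be pictured with a toy model. This is not how Elasticsearch implements the lookup; the `pick_analyzer` function, the `field_analyzers` dictionary and the analyzer names in it are assumptions made only to illustrate the rule.

```python
def pick_analyzer(field, field_analyzers, index_default):
    """Toy model: use the field's configured analyzer, else the index default."""
    return field_analyzers.get(field, index_default)


# Hypothetical mapping: only obj1.field1 has an analyzer configured.
field_analyzers = {"obj1.field1": "whitespace"}

print(pick_analyzer("obj1.field1", field_analyzers, "standard"))
print(pick_analyzer("obj1.field2", field_analyzers, "standard"))
```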
- All parameters can also be supplied as request parameters. For example:
- [source,js]
- --------------------------------------------------
- curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&filters=lowercase&text=this+is+a+test'
- --------------------------------------------------
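When the parameters go in the query string, the text has to be URL-encoded (note the `+` in place of spaces above). A minimal sketch using Python's standard `urlencode` is shown below; the helper name `analyze_url` is hypothetical.

```python
from urllib.parse import urlencode


def analyze_url(params, index=None, host="localhost:9200"):
    """Build an _analyze URL with all parameters in the query string."""
    path = f"/{index}/_analyze" if index else "/_analyze"
    # urlencode escapes reserved characters and turns spaces into '+'.
    return f"{host}{path}?{urlencode(params)}"


url = analyze_url({
    "tokenizer": "keyword",
    "filters": "lowercase",
    "text": "this is a test",
})
print(url)
```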
- For backwards compatibility, the text can also be sent as the raw body of the request,
- provided it doesn't start with `{`:
- [source,js]
- --------------------------------------------------
- curl -XGET 'localhost:9200/_analyze?tokenizer=keyword&token_filters=lowercase&char_filters=html_strip' -d 'this is a <b>test</b>'
- --------------------------------------------------
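A client using this legacy form may want to guard against text that begins with `{`, since such a body would not be read as plain text. The following is a minimal sketch under that assumption; `legacy_text_request` and the `http://localhost:9200` host are hypothetical, and the request is built without being sent.

```python
import urllib.request
from urllib.parse import urlencode


def legacy_text_request(text, params, host="http://localhost:9200"):
    """Build a request that carries the text as the raw body (hypothetical helper)."""
    # Per the rule above, a body starting with '{' is not taken as plain text.
    if text.startswith("{"):
        raise ValueError("text starting with '{' cannot be sent as a raw body")
    url = f"{host}/_analyze?{urlencode(params)}"
    return urllib.request.Request(url, data=text.encode("utf-8"), method="GET")


req = legacy_text_request(
    "this is a <b>test</b>",
    {"tokenizer": "keyword", "token_filters": "lowercase", "char_filters": "html_strip"},
)
print(req.full_url)
print(req.data.decode("utf-8"))
```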
|