lqb
/
elasticsearch
spegling av https://gitee.com/mirrors/elasticsearch.git


			
							123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152
							[[search-aggregations-bucket-adjacency-matrix-aggregation]]
=== Adjacency matrix aggregation
++++
<titleabbrev>Adjacency matrix</titleabbrev>
++++

A bucket aggregation returning a form of {wikipedia}/Adjacency_matrix[adjacency matrix].
The request provides a collection of named filter expressions, similar to the `filters` aggregation
request. 
Each bucket in the response represents a non-empty cell in the matrix of intersecting filters.

Given filters named `A`, `B` and `C` the response would return buckets with the following names:


[options="header"]
|=======================
|  h|A   h|B  h|C   
h|A |A   |A&B |A&C 
h|B |    |B   |B&C 
h|C |    |    |C  
|=======================

The intersecting buckets e.g `A&C` are labelled using a combination of the two filter names with a default separator
of `&`. Note that the response does not also include a `C&A` bucket as this would be the
same set of documents as `A&C`. The matrix is said to be _symmetric_ so we only return half of it. To do this we sort 
the filter name strings and always use the lowest of a pair as the value to the left of the separator. 


[[adjacency-matrix-agg-ex]]
==== Example

The following `interactions` aggregation uses `adjacency_matrix` to determine
which groups of individuals exchanged emails.

[source,console,id=adjacency-matrix-aggregation-example]
--------------------------------------------------
PUT emails/_bulk?refresh
{ "index" : { "_id" : 1 } }
{ "accounts" : ["hillary", "sidney"]}
{ "index" : { "_id" : 2 } }
{ "accounts" : ["hillary", "donald"]}
{ "index" : { "_id" : 3 } }
{ "accounts" : ["vladimir", "donald"]}

GET emails/_search
{
  "size": 0,
  "aggs" : {
    "interactions" : {
      "adjacency_matrix" : {
        "filters" : {
          "grpA" : { "terms" : { "accounts" : ["hillary", "sidney"] }},
          "grpB" : { "terms" : { "accounts" : ["donald", "mitt"] }},
          "grpC" : { "terms" : { "accounts" : ["vladimir", "nigel"] }}
        }
      }
    }
  }
}
--------------------------------------------------

The response contains buckets with document counts for each filter and
combination of filters. Buckets with no matching documents are excluded from the
response.

[source,console-result]
--------------------------------------------------
{
  "took": 9,
  "timed_out": false,
  "_shards": ...,
  "hits": ...,
  "aggregations": {
    "interactions": {
      "buckets": [
        {
          "key":"grpA",
          "doc_count": 2
        },
        {
          "key":"grpA&grpB",
          "doc_count": 1
        },
        {
          "key":"grpB",
          "doc_count": 2
        },
        {
          "key":"grpB&grpC",
          "doc_count": 1
        },
        {
          "key":"grpC",
          "doc_count": 1
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 9/"took": $body.took/]
// TESTRESPONSE[s/"_shards": \.\.\./"_shards": $body._shards/]
// TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]

[role="child_attributes"]
[[adjacency-matrix-agg-params]]
==== Parameters

`filters`::
(Required, object)
Filters used to create buckets.
+
.Properties of `filters`
[%collapsible%open]
====
`<filter>`::
(Required, <<query-dsl,Query DSL object>>)
Query used to filter documents. The key is the filter name.
+
At least one filter is required. The total number of filters cannot exceed the
<<indices-query-bool-max-clause-count,`indices.query.bool.max_clause_count`>>
setting. See <<adjacency-matrix-agg-filter-limits>>.
====

`separator`::
(Optional, string)
Separator used to concatenate filter names. Defaults to `&`.

[[adjacency-matrix-agg-response]]
==== Response body

`key`::
(string)
Filters for the bucket. If the bucket uses multiple filters, filter names are
concatenated using a `separator`.

`document_count`::
(integer)
Number of documents matching the bucket's filters.

[[adjacency-matrix-agg-usage]]
==== Usage
On its own this aggregation can provide all of the data required to create an undirected weighted graph.
However, when used with child aggregations such as a `date_histogram` the results can provide the
additional levels of data required to perform {wikipedia}/Dynamic_network_analysis[dynamic network analysis]
where examining interactions _over time_ becomes important.

[[adjacency-matrix-agg-filter-limits]]
==== Filter limits
For N filters the matrix of buckets produced can be N²/2 which can be costly.
The circuit breaker settings prevent results producing too many buckets and to avoid excessive disk seeks
the `indices.query.bool.max_clause_count` setting is used to limit the number of filters.