How to understand this description of 'collapse' in the Elasticsearch document?

Time:01-14

ES version:6.4.3

First, please imagine that I have an index set up like this:

  1. create a new index "test_1",
  2. store some data,
#### 1.create a new index "test_1"
DELETE test_1

PUT /test_1/
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "_doc": {
      "properties": {
          "title": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
      }
    }
  }
}

GET /test_1/_mapping
GET /test_1/_refresh
GET /test_1/_search

#### 2.put some doc
POST _bulk
{ "index" : { "_index" : "test_1", "_id" : "100" } }
{ "title" : ["100","101"] }
{ "index" : { "_index" : "test_1", "_id" : "101" } }
{ "title" : "100" }
  3. test agg
#### 3.test agg 
GET /test_1/_search
{
 "size": 0,
 "aggs": {
   "title": {
     "terms": {
       "field": "title.keyword",
       "size": 100
     }
   }
 }
}

It works as expected, and the results are as follows:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "title": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "100",
          "doc_count": 2
        },
        {
          "key": "101",
          "doc_count": 1
        }
      ]
    }
  }
}
  4. test collapse
#### 4. test collapse
GET /test_1/_search
{
  "_source": false,
  "from":0,
  "size": 10,
 "query": {
    "match_all": {
    }
  },
  "collapse": {
    "field": "title.keyword",
    "inner_hits": {
      "name": "latest",
      "size": 1
    }
  }
}

The result is an error:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_state_exception",
        "reason": "failed to collapse 0, the collapse field must be single valued"
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "test_1",
        "node": "1TlabepgQSi-5WvjVm6MuQ",
        "reason": {
          "type": "illegal_state_exception",
          "reason": "failed to collapse 0, the collapse field must be single valued"
        }
      }
    ],
    "caused_by": {
      "type": "illegal_state_exception",
      "reason": "failed to collapse 0, the collapse field must be single valued",
      "caused_by": {
        "type": "illegal_state_exception",
        "reason": "failed to collapse 0, the collapse field must be single valued"
      }
    }
  },
  "status": 500
}

So my question is: why is this error reported? Is it related to this statement about collapse in the Elasticsearch documentation:

The field used for collapsing must be a single valued keyword or numeric field with doc_values activated.

If the two are related, why does the error say "failed to collapse 0"? Where does this 0 come from? I would sincerely appreciate any answer.

CodePudding user response:

First of all, thanks for providing a reproducible example; that helps a lot!

Regarding collapse: yes, it only works on single-valued fields. In your first document, title is an array with two values, so the field is multi-valued, and that is not allowed for collapsing.
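To see which of your documents break the single-value requirement, here is a minimal Python sketch. It works on plain dictionaries that mirror the two documents from the bulk request; multi_valued_ids is a hypothetical helper written for this illustration, not part of Elasticsearch or its client library.

```python
# Hypothetical helper for illustration; it does not talk to Elasticsearch.
def multi_valued_ids(docs, field):
    """Return the _id of every document whose `field` holds more than one value."""
    offenders = []
    for doc_id, source in docs.items():
        value = source.get(field)
        # An array with two or more entries makes the field multi-valued.
        if isinstance(value, (list, tuple)) and len(value) > 1:
            offenders.append(doc_id)
    return offenders


# The two documents from the bulk request in the question.
docs = {
    "100": {"title": ["100", "101"]},
    "101": {"title": "100"},
}

print(multi_valued_ids(docs, "title"))  # ['100']
```

Only the document with _id 100 is flagged, which matches the document the collapse request fails on.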

Simply put, the 0 in the error message is the internal document ID, i.e. an incremental number that each document is assigned when it is indexed. In your case, 0 stands for the first document that was indexed. If you swap the order of the documents in your bulk call, you'll see 1 instead.
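The numbering can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions: it numbers documents by their position in a list, whereas the real internal IDs are managed by Lucene, and toy_collapse and error_for are hypothetical names that are not how Elasticsearch implements collapsing.

```python
# Toy model only: NOT the Elasticsearch implementation.
# Assumption: the internal ID is simply the position in indexing order.
def toy_collapse(indexed_docs, field):
    """Group docs by `field`, keeping the first internal ID seen per value."""
    groups = {}
    for internal_id, source in enumerate(indexed_docs):
        value = source[field]
        values = value if isinstance(value, list) else [value]
        if len(values) > 1:
            # Mirror the wording of the error from the question.
            raise ValueError(
                f"failed to collapse {internal_id}, "
                "the collapse field must be single valued"
            )
        groups.setdefault(values[0], internal_id)
    return groups


def error_for(indexed_docs):
    """Return the error message raised by toy_collapse, or None if it succeeds."""
    try:
        toy_collapse(indexed_docs, "title")
    except ValueError as exc:
        return str(exc)
    return None


original_order = [{"title": ["100", "101"]}, {"title": "100"}]
swapped_order = list(reversed(original_order))

print(error_for(original_order))  # failed to collapse 0, ...
print(error_for(swapped_order))   # failed to collapse 1, ...
```

With the original order, the multi-valued document is indexed first and the message names 0; with the order swapped, the same document is second and the message names 1.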
