I'm an Elastic beginner and I have trouble understanding how to find the most popular search terms used by my users.
Each time a user searches for something, Logstash enters a document such as this in Elastic:
{
"_index" : "user_searches-2022.02.14",
"_type" : "doc",
"_id" : "xGQA-H4BVgDEPVU6QZPf",
"_score" : 1.0,
"_source" : {
"message" : """[Large line in apache combined log format]""",
"@timestamp" : "2022-02-14T11:31:13.395Z",
"search_string": "hello world",
"search_terms" : ["hello", "world"]
}
},
The search_string is extracted from the URL; search_terms is the search_string split into words (only one of these is needed, but I'm not yet certain which one).
I can't figure out which query would give me the counts of the search terms. I've had some success using "significant_text": {"field": "search_string"}, but it treats the whole string as a single term instead of splitting it into words. _termvectors, on the other hand, appears to work only on a single document, not on the entire index.
CodePudding user response:
I assume you want to count hello and world separately, and that search_terms is mapped as text. If so, and you set fielddata to true in the mapping for the search_terms field, you can use a terms aggregation as below to get the count of each word.
{
"size": 0,
"aggs": {
"asd": {
"terms": {
"field": "search_terms",
"size": 10
}
}
}
}
Note that using fielddata=true for text fields can cause high memory usage.
If the search_terms field is mapped as keyword in the index mapping, you should be able to get the counts with the above query without setting fielddata.
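For reference, enabling fielddata on an existing text field could look roughly like the request below. This is only a sketch under assumptions: it assumes Elasticsearch 7.x or later (on 6.x the mapping type, here doc, would have to be appended to the path) and that search_terms really is mapped as text. Since the indices are created daily, the same setting would presumably also have to go into the index template so that new indices pick it up.

```
PUT /user_searches-*/_mapping
{
  "properties": {
    "search_terms": {
      "type": "text",
      "fielddata": true
    }
  }
}
```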
CodePudding user response:
Here's how I did it in the end, without changing anything else:
GET /user_searches-*/_search
{
"size": 0,
"aggs": {
"search_term_count": {
"terms": {
"field": "search_terms.keyword"
}
}
}
}
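This works when the indices use the default dynamic mapping, which indexes string values as text and adds a keyword sub-field; search_terms.keyword refers to that sub-field. Note that a terms aggregation returns only the 10 most frequent buckets unless size is set. A variant with an explicit size and a time window might look like the sketch below, where the value 50 and the 30-day range are arbitrary choices for illustration, not something from my setup.

```
GET /user_searches-*/_search
{
  "size": 0,
  "query": {
    "range": { "@timestamp": { "gte": "now-30d/d" } }
  },
  "aggs": {
    "search_term_count": {
      "terms": {
        "field": "search_terms.keyword",
        "size": 50
      }
    }
  }
}
```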
