Need Help.
I have documents in Elasticsearch with an "interests" field like this:
{ "interests": "A,C,D,E" }, { "interests": "B,C,D" }, { "interests": "A,B,C,D,E" }, { "interests": "D,E" }
I want to get data like this: { "key": "A", "doc_count": 2 }, { "key": "B", "doc_count": 2 }, { "key": "C", "doc_count": 3 }, { "key": "D", "doc_count": 4 }, { "key": "E", "doc_count": 3 }
What steps should I take? Thank you.
CodePudding user response:
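A simple terms aggregation should do the job (the field must be mapped as keyword; with the default dynamic mapping, aggregate on interests.keyword instead):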
POST test/_search
{
  "aggs": {
    "interests": {
      "terms": { "field": "interests" }
    }
  }
}
UPDATE
Since the interests field is a single comma-separated string, you first need to split it into an array of values. If you don't want to reindex all your data into a new index, you can do this with an ingest pipeline that updates the documents in place.
First, create a pipeline with a split processor, followed by a trim processor that strips whitespace around each value:
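To see why the split is needed, here is a small plain-Python sketch (not Elasticsearch code) of what counting looks like when each document holds one unsplit string. It assumes the field is indexed as a single keyword term, and the variable names are made up for illustration.

```python
from collections import Counter

# Sample values copied from the question.
docs = ["A,C,D,E", "B,C,D", "A,B,C,D,E", "D,E"]

# Assumption: the whole string is one keyword term, so each bucket key
# is an entire comma-separated list rather than a single interest.
unsplit_buckets = Counter(docs)

for key, doc_count in unsplit_buckets.items():
    print({"key": key, "doc_count": doc_count})
```

Every key here is a full list such as "A,C,D,E" with a count of 1, which is not the per-interest count the question asks for.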
PUT _ingest/pipeline/splitter
{
  "processors": [
    {
      "split": {
        "field": "interests",
        "separator": ","
      }
    },
    {
      "trim": {
        "field": "interests"
      }
    }
  ]
}
Then update your index using that new pipeline, like this:
POST your-index/_update_by_query?pipeline=splitter
Your interests field will be changed from this:
{
  "interests" : "A, C, D, E"
}
To this:
{
  "interests" : [
    "A",
    "C",
    "D",
    "E"
  ]
}
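As a sanity check, the effect of the split and trim steps followed by a terms aggregation can be emulated in plain Python on the sample data from the question. This is only a sketch under assumptions: the helper names split_and_trim and terms_buckets are hypothetical, Python's str.split treats the separator literally (the split processor treats it as a regular expression), and the buckets are sorted by key to match the listing in the question, whereas Elasticsearch orders buckets by doc_count descending by default.

```python
from collections import Counter

# Sample values copied from the question.
docs = ["A,C,D,E", "B,C,D", "A,B,C,D,E", "D,E"]

def split_and_trim(value, separator=","):
    """Emulate the split processor followed by the trim processor."""
    return [part.strip() for part in value.split(separator)]

def terms_buckets(values_per_doc):
    """Count how many documents contain each distinct value."""
    counts = Counter()
    for values in values_per_doc:
        # A document contributes at most once per distinct term.
        counts.update(set(values))
    # Sorted by key for readability (not the Elasticsearch default order).
    return [{"key": k, "doc_count": c} for k, c in sorted(counts.items())]

buckets = terms_buckets(split_and_trim(d) for d in docs)
for bucket in buckets:
    print(bucket)
```

On the sample data this gives A: 2, B: 2, C: 3, D: 4, E: 3, the counts requested in the question, so re-running the terms aggregation after the update should return per-interest buckets.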
