I would like to transform a dataframe with the following layout:
| image | finding1 | finding2 | nofinding |
| ------- | -------- | -------- | --------- |
| 039.png | true | false | false |
| 006.png | true | true | false |
| 012.png | false | false | true |
into a dictionary with the following structure:
{
"039.png" : [
"finding1"
],
"006.png" : [
"finding1",
"finding2"
],
"012.png" : [
"nofinding"
]}
CodePudding user response:
IIUC, you could replace the False to NA (assuming boolean False here, for strings use 'false'), then stack to remove the values and use groupby.agg to aggregate as list before converting to dictionary:
dic = (df
.set_index('image')
.replace({False: pd.NA})
.stack()
.reset_index(1)
.groupby(level='image', sort=False)['level_1'].agg(list)
.to_dict()
)
output:
{'039.png': ['finding1'],
'006.png': ['finding1', 'finding2'],
'012.png': ['nofinding']}
