For example, I have this dataFrame,
| plates | food |
|---|---|
| 1 | eggs |
| 1 | bacon |
| 2 | waffles |
| 2 | toast |
| 3 | |
| 3 | |
| 4 | cereal |
| 4 | milk |
The result I want is...
| food |
|---|
| eggs bacon |
| waffles toast |
| None |
| cereal milk |
This is what I have
result = df['food'].groupby(df['plates'].agg('sum')
but obviously the rows that aren't grouped are ignored. I want to find those values that do not have a group and fill those rows with Nan.
CodePudding user response:
I'm assuming, the missing values in food column are empty strings (""):
Then:
x = df.groupby("plates")["food"].agg(lambda x: " ".join(x).strip() or np.nan)
print(x)
Prints:
plates
1 eggs bacon
2 waffles toast
3 NaN
4 cereal milk
Name: food, dtype: object
CodePudding user response:
Try using .agg(list) with str.join()
df.groupby('plates')['food'].agg(list).str.join(' ')
plates
1 eggs bacon
2 waffles toast
3 NaN
4 cereal milk
