First, I ran a groupby operation on the df below: df.groupby('a')['b'].agg(list).reset_index(name='b')
a b
A 1
A 2
B 5
B 5
B 4
C 6
Resulting in this df:
a b
A [1,2]
B [5,5,4]
C [6]
Now I want to explode these lists into cumulative lists, one row per prefix, like this:
a b
A [1]
A [1,2]
B [5]
B [5,5]
B [5,5,4]
C [6]
CodePudding user response:
Starting from the original (ungrouped) df, you first need to wrap each cell value in a list; a groupwise cumsum then concatenates those lists:
df['out'] = df['b'].apply(lambda x: [x]).groupby(df['a']).apply(lambda x: x.cumsum())
Out[382]:
0 [1]
1 [1, 2]
2 [5]
3 [5, 5]
4 [5, 5, 4]
5 [6]
Name: b, dtype: object
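For anyone who wants to run this end to end, here is a minimal sketch of the same wrap-then-cumsum idea, assuming the original frame has the columns a and b shown in the question. The group_keys=False argument is an addition of this sketch, not part of the one-liner above; it is there so the result keeps the original row index and aligns with df, since recent pandas versions may otherwise prepend the group key to the index.

```python
import pandas as pd

# Original (ungrouped) frame from the question.
df = pd.DataFrame({"a": ["A", "A", "B", "B", "B", "C"],
                   "b": [1, 2, 5, 5, 4, 6]})

# Wrap every value in a one-element list so that "+" means concatenation.
wrapped = df["b"].apply(lambda x: [x])

# Cumulative "sum" of lists within each group.
# group_keys=False (an assumption added in this sketch) keeps the original
# row index, so the assignment below lines up row by row.
df["out"] = (
    wrapped.groupby(df["a"], group_keys=False)
           .apply(lambda s: s.cumsum())
)

print(df)
```

Note that this builds a new list at every step, so for very long groups the cost grows quadratically with group size.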
CodePudding user response:
Since DataFrame.expanding() seems to work only on numeric data, I resorted to this nested list comprehension:
# assumes df is the original frame, with rows already ordered by 'a'
df['b'] = [subdf['b'].tolist()[:i + 1]
           for group, subdf in df.groupby('a')
           for i in range(subdf.shape[0])]
print(df)
a b
0 A [1]
1 A [1, 2]
2 B [5]
3 B [5, 5]
4 B [5, 5, 4]
5 C [6]
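If you only have the already-grouped frame (one list per key, as in the question), a variation on the same slicing idea might look like the sketch below. It builds every prefix of each list and then relies on DataFrame.explode to put each prefix on its own row. The names grouped, prefixes and result are illustrative, and ignore_index=True assumes pandas 1.1 or newer.

```python
import pandas as pd

# Grouped frame as shown in the question: one list per key.
grouped = pd.DataFrame({"a": ["A", "B", "C"],
                        "b": [[1, 2], [5, 5, 4], [6]]})

# For each list, build all of its prefixes: [1, 2] -> [[1], [1, 2]].
prefixes = grouped["b"].apply(
    lambda lst: [lst[:i + 1] for i in range(len(lst))]
)

# explode() unpacks only the outer list, so each prefix (itself a list)
# ends up on its own row, with the key in column "a" repeated.
result = grouped.assign(b=prefixes).explode("b", ignore_index=True)

print(result)
```

This avoids going back to the ungrouped data, at the cost of materializing all prefixes in memory before exploding.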
