Home > Net >  Copy rows to new dataframe based on change in date column and transaction
Copy rows to new dataframe based on change in date column and transaction

Time:02-06

I've checked several websites and StackOverflow, but I wasn't able to find any documentation or help around this topic, so hoping that someone in the community might know the answer. I have a list of groceries, quantity, transaction (B for buy and S for sell) and date purchased/sold:

TRANSACTIONS    Qty Transaction Date
Apple   5   B   1/1/22
Banana  5   B   1/1/22
Cereal  3   B   1/1/22
Milk    4   B   1/1/22
Banana  2   S   2/2/22
Milk    1   S   2/2/22
Toy     4   B   2/2/22

Grocery Transactions

What I would like to do is generate a new dataframe that incorporates the transactions to show the quantity remaining based on the transaction and each date there was a transaction:

BALANCE Qty Date
Apple   5   1/1/22
Banana  5   1/1/22
Cereal  3   1/1/22
Milk    4   1/1/22
Apple   5   2/2/22
Banana  3   2/2/22
Cereal  3   2/2/22
Milk    3   2/2/22
Toy     4   2/2/22

Grocery Balance

CodePudding user response:

Here's a one-liner solution (formatted):

new_df = (
    df.pivot(index='TRANSACTIONS', columns='Date')
    .pipe(lambda x: x.assign(Qty=x.Qty.fillna(0), Transaction=x.Transaction.ffill(axis=1)))
    .stack()
    .sort_index(level=1)
    .reset_index()
    .dropna()
    .pipe(lambda x: x.assign(Qty=x['Qty'].where(x['Transaction'] == 'B', -x['Qty'])))
    .pipe(lambda x: x.assign(Qty=x.groupby('TRANSACTIONS')['Qty'].cumsum().astype(int)))
    .drop(['Transaction'], axis=1)
    .rename({'TRANSACTIONS': 'BALANCE'}, axis=1)
)

Output:

>>> new_df
  BALANCE    Date  Qty
0   Apple  1/1/22    5
1  Banana  1/1/22    5
2  Cereal  1/1/22    3
3    Milk  1/1/22    4
5   Apple  2/2/22    5
6  Banana  2/2/22    3
7  Cereal  2/2/22    3
8    Milk  2/2/22    3
9     Toy  2/2/22    4
  •  Tags:  
  • Related