Home > Mobile >  Using previous row to calculate sum of current row
Using previous row to calculate sum of current row

Time:01-27

I have a python issue which I can't seem to figure out. The following dataframe illustrates persons who are spending money in given years and months:

ID Year Month Amount
134 2020 11 -199
134 2020 12 -50
134 2021 1 40
135 2020 11 -365
135 2020 12 -23
135 2021 1 400

However, I want to add a new column that sums these expenses each month.

Example: A person spends 50 euro's in the 1st month and 100 in the 2nd. This will amount to 150 euro's spent in total in month 2. I want to illustrate this spending behaviour in the new column with the assumption that everyone has 100 euro's as starting capital. (This is why In the example below, a person spends 200 but the new column shows -100.

Here is the column that I would like:

ID Year Month Amount NewColumn
134 2020 11 -200 -100
134 2020 12 -50 -150
134 2021 1 40 -110
135 2020 11 -365 -265
135 2020 12 -23 -288
135 2021 1 400 112

Here is the code to recreate the first dataframe:

data = {'ID': ['134', '134','134','135','135','135',], 'Year': [2020, 2020, 2021, 2020, 2020, 2021,], 'Month': [11, 12, 1, 11, 12, 1], 'Amount': [-199, -50, 40, -365, -23, 400]}  
df = pd.DataFrame(data)  

I've tried cumsum() and .rolling but can't figure it out. Hopefully I've given enough info and otherwise I'll be happy to supply some more

Thanks in advance

CodePudding user response:

You can use cumsum() 100:

df["NewColumn"] = df.groupby(["ID"])["Amount"].cumsum() 100

Output:

    ID  Year        Month   Amount  NewColumn
0   134     2020    11      -199    -99
1   134     2020    12      -50     -149
2   134     2021    1       40      -109
3   135     2020    11      -365    -265
4   135     2020    12      -23     -288
5   135     2021    1       400     112

PS: there is a small difference to your expected output, as you started with -200, but in the data it was -199 :-)

  •  Tags:  
  • Related