I'm trying to calculate the cumulative AUC (area under the curve) of a DataFrame column, from the first row to the current row.
Example:
| | points | AUC |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 1 | 0.5 |
| 2 | 2 | 2 |
| 3 | 3 | 4.5 |
| 4 | 4 | 8 |
| 5 | 5 | 12.5 |
| 6 | 4 | 17 |
| 7 | 0 | 19 |
| 8 | -2 | 18 |
| 9 | -2 | 16 |
I can use np.trapz(), but I have to calculate it row by row in a for loop:
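import numpy as np
# Recompute the trapezoid over rows 0..i for each row (cost grows quadratically).
for i in range(len(df)):
    df.loc[df.index[i], "AUC"] = np.trapz(df["points"].iloc[:i + 1])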
Is there any way to apply it to the whole column without using a for loop?
The second problem is that my dataframe gets updated every minute. Either I recalculate the cumulative AUC from the beginning of the df, which takes longer and longer as it grows, or I take only part of the df (e.g. df.tail(25)) and apply a function to it, in which case I lose the AUC of the curve before iloc[-25].
CodePudding user response:
I would try something like this (the trapezoid sum up to row i is the running sum of points minus half of the first and current values):
np.cumsum(df.points) - (df.points + df.points.iloc[0]) / 2
Here is a working example: https://abstra.show/dezL0ASX4s
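For reference, here is a minimal sketch of the cumulative-sum idea applied to the data from the question. The helper name `cumulative_trapezoid` is made up for illustration, and it assumes unit spacing between rows, which is also the default for np.trapz. It subtracts half of the first and current values from the running sum, so it should also hold when the first value is not zero.

```python
import numpy as np
import pandas as pd

def cumulative_trapezoid(values):
    """Running trapezoid-rule area with unit spacing; the first entry is 0.

    Hypothetical helper for illustration, not part of NumPy or pandas.
    """
    y = np.asarray(values, dtype=float)
    if y.size == 0:
        return y
    # Sum of y[0..i] minus half of the two endpoints y[0] and y[i].
    return np.cumsum(y) - (y + y[0]) / 2.0

df = pd.DataFrame({"points": [0, 1, 2, 3, 4, 5, 4, 0, -2, -2]})
df["AUC"] = cumulative_trapezoid(df["points"])
print(df)
```

This replaces the loop with a single pass over the column, so the cost grows linearly with the number of rows instead of quadratically.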

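For the second problem (the dataframe growing every minute), one possible approach is to keep the AUC column and only add the area of the new trapezoids on top of the last stored value. Below is a sketch under stated assumptions: `extend_auc` is a hypothetical helper, the existing df already has correct "points" and "AUC" columns, and rows are evenly spaced by one unit.

```python
import numpy as np
import pandas as pd

def extend_auc(df, new_points):
    """Append new_points to df and continue the cumulative AUC (unit spacing).

    Hypothetical helper: only the last stored row is needed, not the history.
    """
    new = np.asarray(new_points, dtype=float)
    if new.size == 0:
        return df.copy()
    if df.empty:
        # No history yet: the area at the very first point is 0.
        auc = np.cumsum(new) - (new + new[0]) / 2.0
        return pd.DataFrame({"points": new, "AUC": auc})
    last_point = df["points"].iloc[-1]
    last_auc = df["AUC"].iloc[-1]
    # Left edge of each new trapezoid: the last stored point, then the new ones.
    left = np.concatenate(([last_point], new[:-1]))
    auc = last_auc + np.cumsum((left + new) / 2.0)
    added = pd.DataFrame({"points": new, "AUC": auc})
    return pd.concat([df, added], ignore_index=True)

# First eight rows of the example, with their AUC already computed.
history = pd.DataFrame({
    "points": [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 0.0],
    "AUC": [0.0, 0.5, 2.0, 4.5, 8.0, 12.5, 17.0, 19.0],
})
updated = extend_auc(history, [-2, -2])
print(updated.tail(3))
```

Because each update depends only on the last row, the work per update stays proportional to the number of new rows, and nothing before iloc[-25] has to be recomputed or kept in memory.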