I have a pandas DataFrame with a column named "Type", like this:
| | Type |
|---|---|
| 0 | 1 |
| 1 | 2 |
| 2 | 4 |
| 3 | 3 |
| 4 | 2 |
| 5 | 2 |
| 6 | 3 |
| 7 | 4 |
| 8 | 2 |
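What I want is the probability of occurrence of each type over the last 5 rows (the current row and the four before it). For the table above we would get something like this (POx-5 is the probability of occurrence of Type x in those 5 rows; the count is always divided by 5, even for the first rows where fewer than 5 rows exist):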
| | Type | PO1-5 | PO2-5 | PO3-5 | PO4-5 |
|---|---|---|---|---|---|
| 0 | 1 | 0.2 | 0 | 0 | 0 |
| 1 | 2 | 0.2 | 0.2 | 0 | 0 |
| 2 | 4 | 0.2 | 0.2 | 0 | 0.2 |
| 3 | 3 | 0.2 | 0.2 | 0.2 | 0.2 |
| 4 | 2 | 0.2 | 0.4 | 0.2 | 0.2 |
| 5 | 2 | 0 | 0.6 | 0.2 | 0.2 |
| 6 | 3 | 0 | 0.4 | 0.4 | 0.2 |
| 7 | 4 | 0 | 0.4 | 0.4 | 0.2 |
| 8 | 2 | 0 | 0.6 | 0.2 | 0.2 |
How can I do this and add the result to the DataFrame? I have no clue how to achieve this.
CodePudding user response:
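I found the answer, but wanted to leave the question here in case anyone hits the same problem.
Say we have 4 types. Use a rolling window of 5 rows, count how often each type appears in it, and divide by 5. Passing min_periods=1 lets the first rows use a partial window instead of returning NaN, which matches the table in the question:
for i in range(4):
    df['PO' + str(i + 1) + '-5'] = df['Type'].rolling(5, min_periods=1).apply(lambda v: sum(v == i + 1) / 5)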
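
For anyone who wants to try it end to end, here is a minimal self-contained sketch on the sample data from the question. It swaps the per-type `apply` loop for a different but equivalent approach: one indicator column per type via `pd.get_dummies`, then a rolling sum. The names `window`, `indicators`, and `shares` are only illustrative, and the sketch assumes the window includes the current row, as the expected table suggests.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"Type": [1, 2, 4, 3, 2, 2, 3, 4, 2]})

window = 5  # assumption: the current row plus the four rows before it

# One indicator column per type: 1.0 where the row has that type, else 0.0.
indicators = pd.get_dummies(df["Type"]).astype(float)

# Rolling count of each type, divided by the full window size.
# min_periods=1 lets the first rows use a partial window instead of NaN.
shares = indicators.rolling(window, min_periods=1).sum() / window
shares.columns = [f"PO{t}-{window}" for t in shares.columns]

df = df.join(shares)
print(df)
```

Note that `get_dummies` only creates columns for types that actually occur in the data, so a type that never appears would have no `POx-5` column here, whereas the `range(4)` loop would produce a column of zeros for it.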
