I will need to pivot a column in pandas, would greatly appreciate any help.
Input:
| ID | Status | Date |
|---|---|---|
| 1 | Online | 2022-06-31 |
| 1 | Offline | 2022-07-28 |
| 2 | Online | 2022-08-01 |
| 3 | Online | 2022-07-03 |
| 3 | None | 2022-07-05 |
| 4 | Offline | 2022-05-02 |
| 5 | Online | 2022-04-04 |
| 5 | Online | 2022-04-06 |
Output: Pivot on Status
| ID | Date | Online | Offline | None |
|---|---|---|---|---|
| 1 | 2022-06-31 | 1 | 0 | 0 |
| 1 | 2022-07-28 | 0 | 1 | 0 |
| 2 | 2022-08-01 | 1 | 0 | 0 |
| 3 | 2022-07-03 | 1 | 0 | 0 |
| 3 | 2022-07-05 | 1 | 0 | 0 |
| 4 | 2022-05-02 | 0 | 0 | 1 |
| 5 | 2022-04-04 | 1 | 0 | 0 |
| 5 | 2022-04-06 | 1 | 0 | 0 |
Or even better output if I am able to merge the counts for example:
Output: Pivot on Status & merge
| ID | Online | Offline | None |
|---|---|---|---|
| 1 | 1 | 1 | 0 |
| 2 | 1 | 0 | 0 |
| 3 | 2 | 0 | 0 |
| 4 | 0 | 0 | 1 |
| 5 | 2 | 0 | 0 |
The main issue here is that I won't know the status values i.e. Offline, Online, None.
I believe doing it in pandas might be easier due to the dynamic nature of not knowing column values for the column I want to pivot on.
CodePudding user response:
df.assign(seq=1).pivot_table(index='ID', columns='Status', values='seq', aggfunc='sum').fillna(0)
Status None Offline Online
ID
1 0.0 1.0 1.0
2 0.0 0.0 1.0
3 1.0 0.0 1.0
4 0.0 1.0 0.0
5 0.0 0.0 2.0
