I have a pandas dataframe with the following content (X represents some value)
| personal id | time point | label |
|---|---|---|
| 1 | 1 | X |
| . | . | . |
| 1 | 12 | X |
| 2 | 400 | X |
| . | . | . |
| 2 | 411 | X |
Basically, for every person, I have labels for 12 consecutive time entries (not always from 1 to 12). What I want is the following form
| personal id | label (first time point) | label (seconde time point) | ... | label (twelfth time point) |
|---|---|---|---|---|
| 1 | X | X | ... | X |
| 2 | X | X | ... | X |
How can I achieve this? Since I want to ignore the time point numbers and only want the twelve labels next to each other, pivot tables doesn't seem to do the job.
CodePudding user response:
Use GroupBy.cumcount for counter by personal id and then use DataFrame.pivot:
df = df.assign(g = df.groupby('personal id').cumcount()).pivot('personal id','g','label')
