I have a dataset organized like the first table below, and I would like to transform it into a table like the second in a relatively efficient way: one column per date, with each row's value filled in for every date from start to end, and 0 elsewhere.
Input:
| id | start | end | value |
|---|---|---|---|
| A | 01-01-2021 | 01-02-2021 | 3 |
| B | 01-04-2021 | 01-06-2021 | 4 |
| A | 01-04-2021 | 01-05-2021 | 5 |
| C | 01-02-2021 | 01-03-2021 | 6 |
Target:
| id | 01-01-2021 | 01-02-2021 | 01-03-2021 | 01-04-2021 | 01-05-2021 | 01-06-2021 | 01-07-2021 |
|---|---|---|---|---|---|---|---|
| A | 3 | 3 | 0 | 5 | 5 | 0 | 0 |
| B | 0 | 0 | 0 | 4 | 4 | 4 | 0 |
| C | 0 | 6 | 6 | 0 | 0 | 0 | 0 |
Thanks!
CodePudding user response:
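You can use `melt` followed by `pivot_table`. Note that this marks only the start and end date of each interval; dates strictly in between are not filled:
out = (
    df.melt(id_vars=['id', 'value'], value_name='col')  # stack start/end into a single 'col' column
      .pivot_table(index='id', columns='col', values='value', fill_value=0)
      .reset_index()  # optional
)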
Output:
| id | 01-01-2021 | 01-02-2021 | 01-03-2021 | 01-04-2021 | 01-05-2021 | 01-06-2021 |
|---|---|---|---|---|---|---|
| A | 3 | 3 | 0 | 5 | 5 | 0 |
| B | 0 | 0 | 0 | 4 | 0 | 4 |
| C | 0 | 6 | 6 | 0 | 0 | 0 |
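The output above has 0 for B under 01-05-2021, while the target expects every date between start and end to carry the value. A minimal sketch of one way to fill the whole interval, assuming the dates are MM-DD-YYYY on a daily grid (the question does not say; for DD-MM-YYYY monthly data the format string and `freq` would need to change). The names `long`, `wide` and `fmt` are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["A", "B", "A", "C"],
    "start": ["01-01-2021", "01-04-2021", "01-04-2021", "01-02-2021"],
    "end": ["01-02-2021", "01-06-2021", "01-05-2021", "01-03-2021"],
    "value": [3, 4, 5, 6],
})

# Assumption: MM-DD-YYYY strings, one column per calendar day.
fmt = "%m-%d-%Y"
start = pd.to_datetime(df["start"], format=fmt)
end = pd.to_datetime(df["end"], format=fmt)

# Long format: one row per (id, date) for every date in [start, end].
long = pd.DataFrame(
    [
        (i, d, v)
        for i, s, e, v in zip(df["id"], start, end, df["value"])
        for d in pd.date_range(s, e, freq="D")
    ],
    columns=["id", "date", "value"],
)

# Wide format: dates become columns, uncovered dates become 0.
# aggfunc="sum" is a choice made here for overlapping intervals of one id.
wide = long.pivot_table(
    index="id", columns="date", values="value", aggfunc="sum", fill_value=0
)

# Convert the datetime column labels back to the original string format.
wide.columns = wide.columns.strftime(fmt)
print(wide)
```

Building the long table in a Python loop is fine for moderately sized data; for very long intervals or millions of rows a vectorized expansion would likely be faster.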
CodePudding user response:
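Here is one way (like any start/end pivot, it marks only the endpoints of each interval):
import pandas as pd
# Pivot "start": one column per start date
t1 = df.pivot(index='id', columns='start', values='value')
# Pivot "end": one column per end date
t2 = df.pivot(index='id', columns='end', values='value')
# Combine; a plain pd.concat(axis=1) would repeat any date that is both a start and an end (01-02-2021 here)
new = t1.combine_first(t2).fillna(0)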
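The target also contains a 01-07-2021 column that no input row touches, so pivoting alone cannot produce it. A rough sketch of adding such empty dates with `reindex`, assuming MM-DD-YYYY daily dates and a grid of 01-01-2021 through 01-07-2021 read off the target header. `combine_first` is used here as one way to merge the two pivots without repeating a date label, and `grid` and `full` are illustrative names:

```python
import pandas as pd

df = pd.DataFrame({
    "id": ["A", "B", "A", "C"],
    "start": ["01-01-2021", "01-04-2021", "01-04-2021", "01-02-2021"],
    "end": ["01-02-2021", "01-06-2021", "01-05-2021", "01-03-2021"],
    "value": [3, 4, 5, 6],
})

# Endpoint table: one pivot per boundary, merged on the union of date labels.
t1 = df.pivot(index="id", columns="start", values="value")
t2 = df.pivot(index="id", columns="end", values="value")
new = t1.combine_first(t2).fillna(0)

# Assumption: the desired columns are every day from 01-01-2021 to 01-07-2021.
fmt = "%m-%d-%Y"
grid = pd.date_range("2021-01-01", "2021-01-07", freq="D").strftime(fmt)

# reindex fixes the column order and adds missing dates as all-zero columns.
full = new.reindex(columns=grid, fill_value=0).astype(int)
print(full)
```

This still marks only the start and end dates; it fixes the column set, not the gaps inside each interval.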
