I have a table like this:
d = {
"person_id" : [10,10,10,10,20,20],
"type" : ["Exit", "Enter", "Exit", "Enter", "Exit", "Enter"],
"exit_date": ["2022-01-01", "2022-01-02", "2022-01-05", "2022-01-07", "2022-02-01", "2022-02-05"]
}
pd.DataFrame(d)
I wanto convert dataframe into this:
d = {
"person_id":[10, 10,20],
"Exit": ["2022-01-01","2022-01-05", "2022-02-01"],
"Enter": ["2022-01-02","2022-01-07", "2022-02-05"]
}
pd.DataFrame(d)
Thanks.
CodePudding user response:
Try:
>>> df.assign(index=df.groupby(['person_id', 'type']).cumcount()) \
.pivot(['index', 'person_id'], 'type', 'exit_date') \
.reset_index(level=1).to_dict('list')
{'person_id': [10, 20, 10],
'Enter': ['2022-01-02', '2022-02-05', '2022-01-07'],
'Exit': ['2022-01-01', '2022-02-01', '2022-01-05']}
CodePudding user response:
Here is just an alternative to this answer. Using pivot_table, one can pass a grouper directly:
(df.pivot_table(index=[df.groupby(['person_id', 'type']).cumcount(), 'person_id'],
columns='type', values='exit_date', aggfunc='first')
.droplevel(0).reset_index()
)
output:
type person_id Enter Exit
0 10 2022-01-02 2022-01-01
1 20 2022-02-05 2022-02-01
2 10 2022-01-07 2022-01-05
