How to split rows with respect to date?-CodePudding

I have a table like this:

d = {
    "person_id" : [10,10,10,10,20,20],
    "type" : ["Exit", "Enter", "Exit", "Enter", "Exit", "Enter"],
    "exit_date": ["2022-01-01", "2022-01-02", "2022-01-05", "2022-01-07", "2022-02-01", "2022-02-05"]
}
pd.DataFrame(d)

I wanto convert dataframe into this:

d = {
    "person_id":[10, 10,20],
    "Exit": ["2022-01-01","2022-01-05", "2022-02-01"],
    "Enter": ["2022-01-02","2022-01-07", "2022-02-05"]
}
pd.DataFrame(d)

Thanks.

CodePudding user response：

Try:

>>> df.assign(index=df.groupby(['person_id', 'type']).cumcount()) \
      .pivot(['index', 'person_id'], 'type', 'exit_date') \
      .reset_index(level=1).to_dict('list')

{'person_id': [10, 20, 10],
 'Enter': ['2022-01-02', '2022-02-05', '2022-01-07'],
 'Exit': ['2022-01-01', '2022-02-01', '2022-01-05']}

CodePudding user response：

Here is just an alternative to this answer. Using pivot_table, one can pass a grouper directly:

(df.pivot_table(index=[df.groupby(['person_id', 'type']).cumcount(), 'person_id'],
                columns='type', values='exit_date', aggfunc='first')
   .droplevel(0).reset_index()
)

output:

type  person_id       Enter        Exit
0            10  2022-01-02  2022-01-01
1            20  2022-02-05  2022-02-01
2            10  2022-01-07  2022-01-05