I have two columns, one with an ID and the orhter with transaction dates.
| ID | Date |
|---|---|
| A | 2019-04-30 |
| A | 2020-12-15 |
| B | 2019-08-24 |
| B | 2019-09-29 |
| B | 2019-10-14 |
And so on...
I want to create a column that orders the date by ID.
| ID | Date | Type |
|---|---|---|
| A | 2019-04-30 | Order_1 |
| A | 2020-12-15 | Order_2 |
| B | 2019-08-24 | Order_1 |
| B | 2019-09-29 | Order_2 |
| B | 2019-10-14 | Order_3 |
Thanks!
CodePudding user response:
Use GroupBy.cumcount for counter with add 1 with cast to strings, so possible add Order_:
df['Type'] = 'Order_' df.groupby('ID').cumcount().add(1).astype(str)
df['diff'] = df.groupby('ID')['Date'].diff().dt.days
I got a problem, when I pivot the table, we orders the date alphabetically (Order_1, Order_10, Order_100,..) and not Order_1, Order_2. Because it is a string. Is there a way around this?
Convert values to numeric and after pivoting use DataFrame.add_prefix:
df['Type'] = df.groupby('ID').cumcount().add(1)
df1 = df.pivot('ID','Type','diff').add_prefix('Order_')
