I have a dataframe that looks like this:
| Part | Date |
|---|---|
| 1 | 9/1/2021 |
| 1 | 9/8/2021 |
| 1 | 9/15/2021 |
| 2 | 9/1/2020 |
| 2 | 9/1/2021 |
| 2 | 9/1/2022 |
The dataframe is already sorted by part, then by date.
I am trying to find the average days between each date grouped by the Part column.
So the desired output would be:
| Part | Avg Days |
|---|---|
| 1 | 7 |
| 2 | 365 |
How would you go about processing this data to achieve the desired output?
Any assistance on this would be greatly appreciated!
Thank you
CodePudding user response:
You can groupby "Date", use apply diff to get the time delta between consecutive rows, and get the mean:
(df.groupby('Part')['Date']
.apply(lambda s: s.diff().mean())
.to_frame()
.reset_index()
)
output:
Part Date
1 7 days
2 365 days
