Home > Net >  get last week date from last group for each group by pandas?
get last week date from last group for each group by pandas?

Time:02-08

        hy_code date       week_date   last_week_date
340     880301  2006/10/31  2006/11/5   NaT
341     880301  2006/11/1   2006/11/5   NaT
342     880301  2006/11/2   2006/11/5   NaT
343     880301  2006/11/3   2006/11/5   NaT
355     880301  2006/11/21  2006/11/26  2006/11/5
482916  880969  2021/12/24  2021/12/26  NaT
482918  880969  2021/12/28  2022/1/2    2021/12/26
482919  880969  2021/12/29  2022/1/2    2021/12/26
482920  880969  2021/12/30  2022/1/2    2021/12/26
482921  880969  2021/12/31  2022/1/2    2021/12/26

Goal

  • I want to get last week date for each group week date.The expected result is above.

Try

df['new_week_date']=tmp_df.groupby(['hy_code','week_date'])['week_date'].shift(1), but failed.

    hy_code date    week_date   last_week_date
340 880301  2006-10-31  2006-11-05  NaT
341 880301  2006-11-01  2006-11-05  2006-11-05
342 880301  2006-11-02  2006-11-05  2006-11-05
343 880301  2006-11-03  2006-11-05  2006-11-05
355 880301  2006-11-21  2006-11-26  NaT
482916  880969  2021-12-24  2021-12-26  NaT
482918  880969  2021-12-28  2022-01-02  2021-12-26
482919  880969  2021-12-29  2022-01-02  2022-01-02
482920  880969  2021-12-30  2022-01-02  2022-01-02
482921  880969  2021-12-31  2022-01-02  2022-01-02

CodePudding user response:

You can first drop_duplicates, then create the shift week date and merge back

out = df.drop_duplicates(['hy_code','week_date']).copy()
out['lastweekdate'] = out.groupby('hy_code')['week_date'].shift(1)

df = df.merge(out.drop('date',1),how='left')
df
Out[233]: 
   hy_code        date   week_date lastweekdate
0   880301  2006/10/31   2006/11/5          NaN
1   880301   2006/11/1   2006/11/5          NaN
2   880301   2006/11/2   2006/11/5          NaN
3   880301   2006/11/3   2006/11/5          NaN
4   880301  2006/11/21  2006/11/26    2006/11/5
5   880969  2021/12/24  2021/12/26          NaN
6   880969  2021/12/28    2022/1/2   2021/12/26
7   880969  2021/12/29    2022/1/2   2021/12/26
8   880969  2021/12/30    2022/1/2   2021/12/26
9   880969  2021/12/31    2022/1/2   2021/12/26

CodePudding user response:

You can first compute the shifted week as a series with a double groupby:

s = (df
   .groupby(['hy_code','week_date'],sort=False)['week_date'].first()
   .groupby('hy_code').shift()
   .rename('last_week_date')
 )
hy_code  week_date 
880301   2006/11/5            NaN
         2006/11/26     2006/11/5
880969   2021/12/26           NaN
         2022/1/2      2021/12/26
Name: last_week_date, dtype: object

Then merge it to the original data:

# ensure last_week_date is not pre-existing
# df = df.drop(columns='last_week_date')
# merge
df.merge(s, left_on=['hy_code','week_date'], right_index=True)

Output:

        hy_code        date   week_date last_week_date
340      880301  2006/10/31   2006/11/5            NaN
341      880301   2006/11/1   2006/11/5            NaN
342      880301   2006/11/2   2006/11/5            NaN
343      880301   2006/11/3   2006/11/5            NaN
355      880301  2006/11/21  2006/11/26      2006/11/5
482916   880969  2021/12/24  2021/12/26            NaN
482918   880969  2021/12/28    2022/1/2     2021/12/26
482919   880969  2021/12/29    2022/1/2     2021/12/26
482920   880969  2021/12/30    2022/1/2     2021/12/26
482921   880969  2021/12/31    2022/1/2     2021/12/26
  •  Tags:  
  • Related