I have a DataFrame as shown below:
Note: Datetime is the index
            Name  target_mtd
Datetime
2021-12-01   Amy        1000
2021-12-02   Amy        2500
2021-12-03   Amy        4000
2021-12-01  Bobo        2000
2021-12-02  Bobo        3000
2021-12-03  Bobo        4000
I would like to convert the cumulative month-to-date column target_mtd into daily values within each group, so I ran the following code:
df['target_daily'] = df.groupby([df.index.month, 'Name'])['target_mtd'].transform(lambda x:x.diff())
It gives the following result, which is not what I expected:
            Name  target_mtd  target_daily
Datetime
2021-12-01   Amy        1000           NaN
2021-12-02   Amy        2500          1500
2021-12-03   Amy        4000          1500
2021-12-01  Bobo        2000           NaN
2021-12-02  Bobo        3000          1000
2021-12-03  Bobo        4000          1000
The expected result keeps the first value of each group instead of NaN:
            Name  target_mtd  target_daily
Datetime
2021-12-01   Amy        1000          1000
2021-12-02   Amy        2500          1500
2021-12-03   Amy        4000          1500
2021-12-01  Bobo        2000          2000
2021-12-02  Bobo        3000          1000
2021-12-03  Bobo        4000          1000
Thanks!
CodePudding user response:
You can replace the missing values with the values of the original column using Series.fillna:
df['target_daily'] = (df.groupby([df.index.month, 'Name'])['target_mtd']
                        .diff()
                        .fillna(df['target_mtd']))
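As a quick check, here is a minimal, self-contained sketch of the same approach run on the sample data from the question. The frame is rebuilt by hand and the variable name `daily` is only illustrative; note that diff() returns floats, so the filled column is float as well.

```python
import pandas as pd

# Rebuild the sample frame from the question (Datetime is the index).
df = pd.DataFrame(
    {
        "Name": ["Amy"] * 3 + ["Bobo"] * 3,
        "target_mtd": [1000, 2500, 4000, 2000, 3000, 4000],
    },
    index=pd.DatetimeIndex(
        ["2021-12-01", "2021-12-02", "2021-12-03"] * 2, name="Datetime"
    ),
)

# Difference within each (month, Name) group; the first row of every group
# is NaN after diff(), so it is filled back from the original column.
daily = (
    df.groupby([df.index.month, "Name"])["target_mtd"]
    .diff()
    .fillna(df["target_mtd"])
)

df["target_daily"] = daily
print(df)
```

The first row of each Name keeps its month-to-date value, and the later rows hold the day-over-day differences.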
If the data spans multiple years, group by monthly periods instead, so that the same month in different years ends up in separate groups:
df['target_daily'] = (df.groupby([df.index.to_period('M'), 'Name'])['target_mtd']
                        .diff()
                        .fillna(df['target_mtd']))
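Or use pd.Grouper with a monthly frequency, which also keeps the same month of different years in separate groups (in pandas 2.2 and later the month-end alias is spelled 'ME' instead of 'M'):
df['target_daily'] = (df.groupby([pd.Grouper(freq='M'), 'Name'])['target_mtd']
                        .diff()
                        .fillna(df['target_mtd']))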
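To illustrate why the month number alone is not enough once the data covers more than one year, here is a small sketch with made-up values for a single name across two Decembers. The data and the names `by_month_number` and `by_period` are assumptions for demonstration only.

```python
import pandas as pd

# Hypothetical data: the month-to-date target restarts in December 2022.
df = pd.DataFrame(
    {"Name": ["Amy"] * 4, "target_mtd": [1000, 2500, 500, 1200]},
    index=pd.DatetimeIndex(
        ["2021-12-01", "2021-12-02", "2022-12-01", "2022-12-02"],
        name="Datetime",
    ),
)

# Grouping by month number puts both Decembers into one group, so the
# first day of December 2022 is differenced against December 2021.
by_month_number = (
    df.groupby([df.index.month, "Name"])["target_mtd"]
    .diff()
    .fillna(df["target_mtd"])
)

# Grouping by monthly period keeps 2021-12 and 2022-12 apart, so each
# month starts again from its own first value.
by_period = (
    df.groupby([df.index.to_period("M"), "Name"])["target_mtd"]
    .diff()
    .fillna(df["target_mtd"])
)

print(pd.DataFrame({"by_month_number": by_month_number, "by_period": by_period}))
```

With the month number, the row for 2022-12-01 comes out as 500 - 2500 = -2000; with monthly periods it keeps its own value of 500.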
