How do I upsample a dataframe using resample() to get the initial values divided over the new sample frequency?
Dataframe with monthly sample frequency
date revenue
0 2021-11-01 00:00:00+00:00 300
1 2021-10-01 00:00:00+00:00 500
2 2021-09-01 00:00:00+00:00 100
3 2021-08-01 00:00:00+00:00 50
4 2021-07-01 00:00:00+00:00 200
5 2021-06-01 00:00:00+00:00 150
Approximate expected Dataframe with revenue divided over the days in that month
revenue
date
2021-06-01 00:00:00+00:00 4.8
2021-06-02 00:00:00+00:00 4.8
2021-06-03 00:00:00+00:00 4.8
2021-06-04 00:00:00+00:00 4.8
2021-06-05 00:00:00+00:00 4.8
... ...
2021-11-28 00:00:00+00:00 9.6
2021-11-29 00:00:00+00:00 9.6
2021-11-30 00:00:00+00:00 9.6
2021-11-31 00:00:00+00:00 9.6
I.e., I want to be sure that each value gets divided by the number of days in that specific month.
CodePudding user response:
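You can use asfreq to convert the time series from monthly to daily frequency, then use ffill to forward-fill the values. Finally, divide the revenue by the daysinmonth attribute of the DatetimeIndex to get the distributed revenue. Because asfreq only spans the range between the first and last existing index values, a placeholder row at the end of the last month is added first so that the final month is filled as well.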
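import numpy as np
import pandas as pd

s = df.set_index('date')
# Add a placeholder row at the end of the last month so asfreq covers it
s.loc[s.index.max() + pd.offsets.MonthEnd()] = np.nan
s = s.asfreq('D').ffill()
# Divide each daily value by the number of days in its month
s['revenue'] /= s.index.daysinmonth
print(s)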
revenue
date
2021-06-01 00:00:00+00:00 5.000000
2021-06-02 00:00:00+00:00 5.000000
2021-06-03 00:00:00+00:00 5.000000
2021-06-04 00:00:00+00:00 5.000000
2021-06-05 00:00:00+00:00 5.000000
...
2021-07-24 00:00:00+00:00 6.451613
2021-07-25 00:00:00+00:00 6.451613
...
2021-11-30 00:00:00+00:00 10.000000
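
If you want to try this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question (assuming the dates are UTC timestamps) and uses a slightly different route from the snippet above: reindex over an explicit daily date_range instead of adding a placeholder row and calling asfreq. The variable names (full_range, daily, monthly_totals) are only illustrative. It also sums the daily values back up per month as a sanity check.

```python
import pandas as pd

# Sample data mirroring the question (assumption: dates are UTC timestamps).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-11-01", "2021-10-01", "2021-09-01",
         "2021-08-01", "2021-07-01", "2021-06-01"],
        utc=True,
    ),
    "revenue": [300, 500, 100, 50, 200, 150],
})

# Monthly revenue as a Series indexed by date, oldest month first.
s = df.set_index("date")["revenue"].sort_index()

# Daily range from the first month start to the end of the last month.
full_range = pd.date_range(
    s.index.min(), s.index.max() + pd.offsets.MonthEnd(), freq="D"
)

# Put each monthly value on its first day, then forward-fill the rest of the month.
daily = s.reindex(full_range).ffill()

# Divide every daily value by the number of days in its own month.
daily = daily / daily.index.days_in_month

# Sanity check: daily values should add back up to the monthly totals.
monthly_totals = daily.groupby(daily.index.month).sum()

print(daily)
print(monthly_totals)
```

With this data, June ends up as 150 / 30 = 5.0 per day and July as 200 / 31, which matches the output above, and the per-month sums should come back to the original revenue figures up to floating-point rounding.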
