I have the following data, and I would like to add a new column containing the current month-over-month percent change. The date is the index of my dataframe.
date close
1/26/1990 421.2999878
1/29/1990 418.1000061
1/30/1990 410.7000122
1/31/1990 415.7999878
2/23/1990 419.5
2/26/1990 421
2/27/1990 422.6000061
2/28/1990 425.7999878
3/26/1990 438.7999878
3/27/1990 439.5
3/28/1990 436.7000122
3/29/1990 435.3999939
3/30/1990 435.5
The simplest way I could think of to do this is to add a column containing the previous month-end date and, for convenience, another containing the previous month-end 'close' value. From those I can calculate the current month-over-month change. So in the end, I would have a table with those two extra columns alongside 'close'.
I was able to add the previous month-end date just fine, but I am having problems looking up the previous month-end close based on that date. In the code below, the first line works fine and adds the previous month-end date as a new column. The second line does not work. The idea is to use the prev_month_end date to look up the month-end close value and add that as a column.
df['prev_month_end'] = df.index + pd.offsets.BMonthEnd(-1)
df['prev_month_close'] = df[df.index == df['prev_month_end']]
Any help or suggestions on how to accomplish this would be appreciated.
CodePudding user response:
You can get prev_month_close by merging the dataframe with itself on prev_month_end:
df.reset_index(inplace=True)
df = df[['date', 'close', 'prev_month_end']].merge(
    df[['date', 'close']].rename(columns={'close': 'prev_month_close',
                                          'date': 'prev_month_end'}),
    how='left', on='prev_month_end')
OUTPUT
date close prev_month_end prev_month_close
0 1990-01-26 421.299988 1989-12-29 NaN
1 1990-01-29 418.100006 1989-12-29 NaN
2 1990-01-30 410.700012 1989-12-29 NaN
3 1990-01-31 415.799988 1989-12-29 NaN
4 1990-02-23 419.500000 1990-01-31 415.799988
5 1990-02-26 421.000000 1990-01-31 415.799988
6 1990-02-27 422.600006 1990-01-31 415.799988
7 1990-02-28 425.799988 1990-01-31 415.799988
8 1990-03-26 438.799988 1990-02-28 425.799988
9 1990-03-27 439.500000 1990-02-28 425.799988
10 1990-03-28 436.700012 1990-02-28 425.799988
11 1990-03-29 435.399994 1990-02-28 425.799988
12 1990-03-30 435.500000 1990-02-28 425.799988
Or, without using reset_index, merge the prev_month_end column against the index:
df = df[['close', 'prev_month_end']].merge(
    df[['close']].rename(columns={'close': 'prev_month_close'}),
    how='left', left_on='prev_month_end', right_index=True)
OUTPUT
close prev_month_end prev_month_close
date
1990-01-26 421.299988 1989-12-29 NaN
1990-01-29 418.100006 1989-12-29 NaN
1990-01-30 410.700012 1989-12-29 NaN
1990-01-31 415.799988 1989-12-29 NaN
1990-02-23 419.500000 1990-01-31 415.799988
1990-02-26 421.000000 1990-01-31 415.799988
1990-02-27 422.600006 1990-01-31 415.799988
1990-02-28 425.799988 1990-01-31 415.799988
1990-03-26 438.799988 1990-02-28 425.799988
1990-03-27 439.500000 1990-02-28 425.799988
1990-03-28 436.700012 1990-02-28 425.799988
1990-03-29 435.399994 1990-02-28 425.799988
1990-03-30 435.500000 1990-02-28 425.799988
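Both variants stop at prev_month_close, while the question ultimately asks for the percent change. The following is a minimal sketch of that last step, built on the second variant. It assumes the sample data from the question with the dates already parsed into a DatetimeIndex. The column name mom_pct_change is borrowed from the other answer, and the helper name prev is only illustrative.

```python
import pandas as pd

# Sample data from the question; dates are assumed to be parsed already.
dates = pd.DatetimeIndex(
    ["1990-01-26", "1990-01-29", "1990-01-30", "1990-01-31",
     "1990-02-23", "1990-02-26", "1990-02-27", "1990-02-28",
     "1990-03-26", "1990-03-27", "1990-03-28", "1990-03-29", "1990-03-30"],
    name="date",
)
df = pd.DataFrame(
    {"close": [421.2999878, 418.1000061, 410.7000122, 415.7999878,
               419.5, 421.0, 422.6000061, 425.7999878,
               438.7999878, 439.5, 436.7000122, 435.3999939, 435.5]},
    index=dates,
)

# Previous business month end for every row (first line of the question).
df["prev_month_end"] = df.index + pd.offsets.BMonthEnd(-1)

# Look up the close on that date by merging the column against the index.
prev = df[["close"]].rename(columns={"close": "prev_month_close"})
df = df.merge(prev, how="left", left_on="prev_month_end", right_index=True)

# Percent change of each close relative to the previous month-end close.
# Rows whose previous month end is not in the data stay NaN.
df["mom_pct_change"] = (df["close"] / df["prev_month_close"] - 1) * 100

print(df)
```

Note that the lookup only finds a value when the previous business month end is itself a row in the data. If that date is missing (for example, because of a holiday), the row gets NaN.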
CodePudding user response:
We can convert the index to a monthly PeriodIndex, group the dataframe by that period, and take the last close of each month. Then we shift the period index back by one month, map it to those month-end closing values, and finally calculate the percent change:
i = pd.to_datetime(df.index).to_period('M')
s = i.shift(-1).map(df.groupby(i)['close'].last())
df['mom_pct_change'] = df['close'].sub(s).div(s).mul(100)
close mom_pct_change
date
1/26/1990 421.299988 NaN
1/29/1990 418.100006 NaN
1/30/1990 410.700012 NaN
1/31/1990 415.799988 NaN
2/23/1990 419.500000 0.889854
2/26/1990 421.000000 1.250604
2/27/1990 422.600006 1.635406
2/28/1990 425.799988 2.405002
3/26/1990 438.799988 3.053077
3/27/1990 439.500000 3.217476
3/28/1990 436.700012 2.559893
3/29/1990 435.399994 2.254581
3/30/1990 435.500000 2.278068
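To make the intermediate steps easier to inspect, here is a self-contained sketch of the same approach on the sample data. It assumes the index holds date strings, as in the output above. The names month_end_close, prev_close and monthly_change are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data from the question, with the dates as plain strings.
df = pd.DataFrame(
    {"close": [421.2999878, 418.1000061, 410.7000122, 415.7999878,
               419.5, 421.0, 422.6000061, 425.7999878,
               438.7999878, 439.5, 436.7000122, 435.3999939, 435.5]},
    index=pd.Index(
        ["1/26/1990", "1/29/1990", "1/30/1990", "1/31/1990",
         "2/23/1990", "2/26/1990", "2/27/1990", "2/28/1990",
         "3/26/1990", "3/27/1990", "3/28/1990", "3/29/1990", "3/30/1990"],
        name="date",
    ),
)

# One monthly period per row, e.g. 1990-01 for every January date.
i = pd.to_datetime(df.index).to_period("M")

# Last close seen in each month: one value per period.
month_end_close = df.groupby(i)["close"].last()

# For each row, step one month back and look up that month's last close.
# Months that are not in the data (here 1989-12) map to NaN.
prev_close = i.shift(-1).map(month_end_close).to_numpy()

df["mom_pct_change"] = df["close"].sub(prev_close).div(prev_close).mul(100)

# Month-end to month-end change only: one row per month.
monthly_change = month_end_close.pct_change().mul(100)

print(month_end_close)
print(monthly_change)
print(df)
```

Unlike a lookup on an exact previous month-end date, this uses the last row available in each month, so it still produces a value when the calendar month end is missing from the data.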

