I have a dataframe as below:
d = {'col1': ["A", "A", "A", "B", "B", "B"], 'col2': [2015, 2016, 2017, 2015, 2016, 2017], 'col3': [10, 20, 25, 10, 12, 14]}
I would like to get the difference of col3 by col1 and col2. Such that, col1 represent companies, while col2 years and col3 stock price. I am trying to get the stock price difference each year. Such that, the output should be:
d2= {'col4': ['nan', 10, 5, 'nan', 2, 4]}
Thank you for your suggestions in advance.
Note: we cannot re-index the dataframe, we have col5, col6, col7... etc.. many other columns.
CodePudding user response:
groupby diff:
df.groupby('col1').col3.transform('diff')
0 NaN
1 10.0
2 5.0
3 NaN
4 2.0
5 2.0
Name: col3, dtype: float64
