define a function to have n periods after one index

Time:01-09

Assume we have the dataframe below, where indicator is simply a binary column:

    date       indicator
    2021-12-01  1
    2021-12-02  1
    2021-12-03 -1
    2021-12-04 -1
    2021-12-05 -1
    2021-12-06  1
    2021-12-07  1
    2021-12-08  1
    2021-12-09 -1
    2021-12-10 -1

To label the dates on which the indicator changes, I have defined the following function:

import numpy as np

def sign_change(df, indicator, strategy_name):
    df[strategy_name] = 0
    # diff of the sign: +2 marks a -1 -> 1 flip, -2 marks a 1 -> -1 flip
    df["sign_change"] = np.sign(df[indicator]).diff()
    con1 = df["sign_change"] == 2
    con2 = df["sign_change"] == -2
    df.loc[con1, strategy_name] = 1
    df.loc[con2, strategy_name] = -1
    # drop the helper column; columns= already implies axis=1
    df.drop(columns="sign_change", inplace=True)
    return df
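
To make the "diff equals 2" rule concrete, here is a minimal sketch on the sample data, assuming the indicator is coded as 1 / -1 as in the output tables. The names `sign_diff` and `change_dates` are only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question (assumption: indicator coded as 1 / -1).
df = pd.DataFrame({
    "date": pd.date_range("2021-12-01", periods=10).strftime("%Y-%m-%d"),
    "indicator": [1, 1, -1, -1, -1, 1, 1, 1, -1, -1],
})

# diff() of the sign is NaN on the first row, 0 where nothing changed,
# +2 on a -1 -> 1 flip and -2 on a 1 -> -1 flip.
sign_diff = np.sign(df["indicator"]).diff()

# Dates on which the sign flipped in either direction.
change_dates = df.loc[sign_diff.abs() == 2, "date"].tolist()
print(change_dates)  # ['2021-12-03', '2021-12-06', '2021-12-09']
```

If the indicator were coded as 1 / 0 instead, the diff would only ever be +1 or -1, so the two conditions would have to compare against those values.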

However, I want to add another input, n_periods, so that the function flags not only the date on which the indicator changed but n_periods dates in total, starting from the change, as long as the new status persists.

For example, n_periods = 2 should generate the output below (column strategy_name_):

    date       indicator  strategy_name  strategy_name_
    2021-12-01  1          0              0
    2021-12-02  1          0              0
    2021-12-03 -1         -1             -1
    2021-12-04 -1          0             -1
    2021-12-05 -1          0              0
    2021-12-06  1          1              1
    2021-12-07  1          0              1
    2021-12-08  1          0              0
    2021-12-09 -1         -1             -1
    2021-12-10 -1          0             -1

CodePudding user response:

It looks to me like you are actually looking at the last 3 rows, including the current one, when you say n_periods = 2.

For 12-03, the indicators of the last 3 rows are 1, 1, -1: there is a value change, so you take the last value, which is -1.

For 12-05, the indicators of the last 3 rows are -1, -1, -1: there is no change, therefore 0.
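
The two cases above can be checked by listing the windows by hand. This is only an illustrative sketch: `window_at` is a hypothetical helper, and the month-day index labels are an assumption made to keep the lookups short.

```python
import pandas as pd

# Indicator values from the question, indexed by "MM-DD" strings.
indicator = pd.Series(
    [1, 1, -1, -1, -1, 1, 1, 1, -1, -1],
    index=pd.date_range("2021-12-01", periods=10).strftime("%m-%d"),
)
n_periods = 2

def window_at(label):
    """Hypothetical helper: the current row plus the n_periods rows before it."""
    pos = indicator.index.get_loc(label)
    # Near the start of the data the window is simply shorter.
    return indicator.iloc[max(0, pos - n_periods): pos + 1].tolist()

print(window_at("12-03"))  # [1, 1, -1]   -> values differ, keep the last one (-1)
print(window_at("12-05"))  # [-1, -1, -1] -> no change, so 0
```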

So, if my assumption is correct, you can use rolling and apply to check whether there is a value change or not.

df['strategy_name_'] = (df.rolling(n_periods + 1, min_periods=1)
                         .indicator
                         .apply(lambda x: 0 if (x.iloc[0] == x).all() else x.iloc[-1])
                         .astype(int))

The lambda function returns 0 if all values in the rolling window are the same; otherwise it returns the last value in the window.

         date  indicator  strategy_name_
0  2021-12-01          1               0
1  2021-12-02          1               0
2  2021-12-03         -1              -1
3  2021-12-04         -1              -1
4  2021-12-05         -1               0
5  2021-12-06          1               1
6  2021-12-07          1               1
7  2021-12-08          1               0
8  2021-12-09         -1              -1
9  2021-12-10         -1              -1
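
For reuse, the rolling approach could be wrapped in a function that takes n_periods as an input, as the question asks. The following is a sketch rather than a definitive implementation: the name `flag_n_periods` is hypothetical, the function returns a copy instead of modifying df in place, and `raw=True` is a choice made here so that each window arrives as a NumPy array.

```python
import pandas as pd

def flag_n_periods(df, indicator, strategy_name, n_periods):
    """Hypothetical helper: flag a change and keep the flag for n_periods rows."""
    window = n_periods + 1  # current row plus the n_periods rows before it
    out = df.copy()
    out[strategy_name] = (
        out[indicator]
        .rolling(window, min_periods=1)
        # 0 if the window is constant, otherwise the most recent value
        .apply(lambda w: 0 if (w == w[0]).all() else w[-1], raw=True)
        .astype(int)
    )
    return out

# Sample data from the question (assumption: indicator coded as 1 / -1).
df = pd.DataFrame({
    "date": pd.date_range("2021-12-01", periods=10).strftime("%Y-%m-%d"),
    "indicator": [1, 1, -1, -1, -1, 1, 1, 1, -1, -1],
})

result = flag_n_periods(df, "indicator", "strategy_name_", n_periods=2)
print(result["strategy_name_"].tolist())
# [0, 0, -1, -1, 0, 1, 1, 0, -1, -1]
```

With n_periods = 0 the window contains a single row, so every value is 0; n_periods = 1 flags only the change dates themselves.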