Home > Blockchain >  Is there a way to conditionally copy values in a vectorized way on a PANDAS df?
Is there a way to conditionally copy values in a vectorized way on a PANDAS df?

Time:01-18

Is there a way to conditionally copy values in a vectorized way (i.e. without using a for loop) on a PANDAS df?

import numpy as np
import pandas as pd

Open = {'Open': np.array([86.34, 84.04, 79.06, 78.46, 75.85, 80.78, 79.66, 80.67, 82.32,80.1 , 77.63, 77.  , 79.15, 76.32, 77.  , 77.11, 77.04, 79.74,79.92, 79.09])}
High = {'High': np.array([86.45, 84.24, 80.29, 79.11, 79.98, 80.98, 80.57, 82.18, 83.25,81.25, 78.28, 79.2 , 79.19, 77.55, 79.  , 77.5 , 81.93, 81.04,82.48, 86.74])}
Low = {'Low': np.array([83.15, 79.07, 75.59, 76.99, 74.78, 77.45, 78.48, 80.11, 80.35, 77.  , 71.96, 76.15, 76.73, 75.83, 76.11, 73.46, 76.55, 78.7 ,77.65, 78.47])
Close = {'Close': np.array([84.02, 79.17, 77.28, 77.56, 79.24, 79.86, 79.91, 82.03, 81.83,77.63, 76.19, 79.13, 76.85, 76.98, 78.31, 77.49, 81.65, 80.57,77.92, 85.51])}
pos = {'pos': np.array([0, -1,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, -1,  0,  0,  0,  0, 0,  0,  0])}
stop = {'stop': np.array([ 0.  , 84.24,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  , 0.  ,  0.  ,  0.  , 79.19,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,0.  ,  0.  ])}
index = pd.date_range('2021-1-1',periods=20)
df = pd.DataFrame(dict(Open, **High, **Low, **Close, **pos, **stop), index = index)
df
             Open   High    Low  Close  pos   stop
2021-01-01  86.34  86.45  83.15  84.02    0   0.00
2021-01-02  84.04  84.24  79.07  79.17   -1  84.24
2021-01-03  79.06  80.29  75.59  77.28    0   0.00
2021-01-04  78.46  79.11  76.99  77.56    0   0.00
2021-01-05  75.85  79.98  74.78  79.24    0   0.00
2021-01-06  80.78  80.98  77.45  79.86    0   0.00
2021-01-07  79.66  80.57  78.48  79.91    0   0.00
2021-01-08  80.67  82.18  80.11  82.03    0   0.00
2021-01-09  82.32  83.25  80.35  81.83    0   0.00
2021-01-10  80.10  81.25  77.00  77.63    0   0.00
2021-01-11  77.63  78.28  71.96  76.19    0   0.00
2021-01-12  77.00  79.20  76.15  79.13    0   0.00
2021-01-13  79.15  79.19  76.73  76.85   -1  79.19
2021-01-14  76.32  77.55  75.83  76.98    0   0.00
2021-01-15  77.00  79.00  76.11  78.31    0   0.00
2021-01-16  77.11  77.50  73.46  77.49    0   0.00
2021-01-17  77.04  81.93  76.55  81.65    0   0.00
2021-01-18  79.74  81.04  78.70  80.57    0   0.00
2021-01-19  79.92  82.48  77.65  77.92    0   0.00
2021-01-20  79.09  86.74  78.47  85.51    0   0.00

In the stop column of the above df, I'm trying to copy the stop value down the column based on the following conditions:

  1. stop in current row == 0
  2. stop in previous row != 0
  3. High in current row < stop in previous row

I'm able to perform the operation of interest one row at a time using the below:

df.loc[:,'stop'] = np.where((df['stop'] == 0) & (df['stop'].shift(1) != 0) & (df['High'] < df['stop'].shift(1)), df['stop'].shift(1),df['stop'])

However, the result I'm after is:

             Open   High    Low  Close  pos   stop
2021-01-01  86.34  86.45  83.15  84.02    0   0.00
2021-01-02  84.04  84.24  79.07  79.17   -1  84.24
2021-01-03  79.06  80.29  75.59  77.28    0  84.24
2021-01-04  78.46  79.11  76.99  77.56    0  84.24
2021-01-05  75.85  79.98  74.78  79.24    0  84.24
2021-01-06  80.78  80.98  77.45  79.86    0  84.24
2021-01-07  79.66  80.57  78.48  79.91    0  84.24
2021-01-08  80.67  82.18  80.11  82.03    0  84.24
2021-01-09  82.32  83.25  80.35  81.83    0  84.24
2021-01-10  80.10  81.25  77.00  77.63    0  84.24
2021-01-11  77.63  78.28  71.96  76.19    0  84.24
2021-01-12  77.00  79.20  76.15  79.13    0  84.24
2021-01-13  79.15  79.19  76.73  76.85   -1  79.19
2021-01-14  76.32  77.55  75.83  76.98    0  79.19
2021-01-15  77.00  79.00  76.11  78.31    0  79.19
2021-01-16  77.11  77.50  73.46  77.49    0  79.19
2021-01-17  77.04  81.93  76.55  81.65    0   0.00
2021-01-18  79.74  81.04  78.70  80.57    0   0.00
2021-01-19  79.92  82.48  77.65  77.92    0   0.00
2021-01-20  79.09  86.74  78.47  85.51    0   0.00

CodePudding user response:

You can mask the stop column to NaN with all 0 , then we just need ffill and mask those unwanted cell by condition

s = df[['stop']].mask(df['stop'].eq(0)).ffill()
df.update(s.mask(s['stop']<df['High']))
df
             Open   High    Low  Close  pos   stop
2021-01-01  86.34  86.45  83.15  84.02    0   0.00
2021-01-02  84.04  84.24  79.07  79.17   -1  84.24
2021-01-03  79.06  80.29  75.59  77.28    0  84.24
2021-01-04  78.46  79.11  76.99  77.56    0  84.24
2021-01-05  75.85  79.98  74.78  79.24    0  84.24
2021-01-06  80.78  80.98  77.45  79.86    0  84.24
2021-01-07  79.66  80.57  78.48  79.91    0  84.24
2021-01-08  80.67  82.18  80.11  82.03    0  84.24
2021-01-09  82.32  83.25  80.35  81.83    0  84.24
2021-01-10  80.10  81.25  77.00  77.63    0  84.24
2021-01-11  77.63  78.28  71.96  76.19    0  84.24
2021-01-12  77.00  79.20  76.15  79.13    0  84.24
2021-01-13  79.15  79.19  76.73  76.85   -1  79.19
2021-01-14  76.32  77.55  75.83  76.98    0  79.19
2021-01-15  77.00  79.00  76.11  78.31    0  79.19
2021-01-16  77.11  77.50  73.46  77.49    0  79.19
2021-01-17  77.04  81.93  76.55  81.65    0   0.00
2021-01-18  79.74  81.04  78.70  80.57    0   0.00
2021-01-19  79.92  82.48  77.65  77.92    0   0.00
2021-01-20  79.09  86.74  78.47  85.51    0   0.00
  •  Tags:  
  • Related