Here is some sample data with an "Input" column and a "Trigger" column. The "Input" column is mostly False but contains True segments (the sample data has one). I am trying to create a third column, "Output", that is a modified version of "Input": each True segment in "Output" should begin earlier than in "Input", at the most recent True value of "Trigger" before the segment starts. I want to achieve this with vectorized operations and avoid explicit loops (e.g. for).
import pandas as pd
index = pd.date_range('2020-01-01', '2020-01-13', freq='D')
columns = ['Input', 'Trigger']
data = [[False, False],
[False, False],
[False, True],
[False, False],
[False, False],
[False, True],
[False, False],
[False, False],
[True, False],
[True, False],
[True, False],
[True, False],
[False, False]
]
df = pd.DataFrame(data, index, columns)
I don't know how to achieve this, but with the sample data above the result should look like this:
columns = ['Input', 'Trigger', 'Output']
data = [[False, False, False],
[False, False, False],
[False, True, False],
[False, False, False],
[False, False, False],
[False, True, True],
[False, False, True],
[False, False, True],
[True, False, True],
[True, False, True],
[True, False, True],
[True, False, True],
[False, False, False]
]
pd.DataFrame(data, index, columns)
CodePudding user response:
You can use the cumulative sum of the Trigger column to form groups (a new group starts at each True trigger), then backfill the True values of Input within each group.
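df['Output'] = (
    df['Input']
    .replace({False: None})                   # keep True, turn False into missing values
    .groupby(df['Trigger'].cumsum()).bfill()  # new group at each trigger; backfill missing values within the group
    .fillna(False)                            # anything still missing becomes False
)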
Output:
            Input  Trigger  Output
2020-01-01  False    False   False
2020-01-02  False    False   False
2020-01-03  False     True   False
2020-01-04  False    False   False
2020-01-05  False    False   False
2020-01-06  False     True    True
2020-01-07  False    False    True
2020-01-08  False    False    True
2020-01-09   True    False    True
2020-01-10   True    False    True
2020-01-11   True    False    True
2020-01-12   True    False    True
2020-01-13  False    False   False
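To see why the grouping works, it may help to look at the group labels on their own. The snippet below is a small sketch that rebuilds only the Trigger column from the sample data (the names `trigger` and `groups` are just for illustration) and prints the cumulative sum used as the group key. Each True trigger bumps the label by one, so every row belongs to the group opened by the most recent trigger.

```python
import pandas as pd

# Trigger column from the sample data in the question
trigger = pd.Series(
    [False, False, True, False, False, True] + [False] * 7,
    index=pd.date_range('2020-01-01', '2020-01-13', freq='D'),
    name='Trigger',
)

# True counts as 1, so the running total increases at every trigger
groups = trigger.cumsum()
print(groups.tolist())
# [0, 0, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2]
```

The True segment of Input (2020-01-09 to 2020-01-12) lies entirely in group 2, which starts at the trigger on 2020-01-06, so backfilling within that group extends the segment back to that date and no further.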
NB: this does not account for the case where a trigger falls within a True stretch of Input, as the expected behavior there is unclear.
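As a possible alternative that avoids going through None and object dtype, the same grouping can be combined with a reversed cumulative maximum. This is only a sketch, not the method from the answer above: it assumes the sample data from the question, and the names `groups` and `rev` are illustrative. Reversing the rows turns "backfill within the group" into "carry a 1 forward within the group", which `cummax` does directly.

```python
import pandas as pd

# Sample data from the question
index = pd.date_range('2020-01-01', '2020-01-13', freq='D')
df = pd.DataFrame({
    'Input':   [False] * 8 + [True] * 4 + [False],
    'Trigger': [False, False, True, False, False, True] + [False] * 7,
}, index=index)

# A new group starts at every True trigger
groups = df['Trigger'].cumsum()

# Walk each group backwards: once a 1 is seen, the running max stays 1
rev = df['Input'].astype(int).iloc[::-1]
out = rev.groupby(groups.iloc[::-1]).cummax()

# Restore chronological order and the boolean dtype
df['Output'] = out.iloc[::-1].astype(bool)
print(df)
```

On the sample data this should give the same Output column as the backfill approach. Like that approach, it makes no special provision for a trigger that falls inside a True stretch.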
