I am using the following code to build a mask for a DataFrame. The mask should be True for all values in a row where one cell in that row meets a certain condition, for instance where one cell value is exactly 21.
mask_pipe21 = np.column_stack([output[col].str.contains("^21$", regex=True, na=False) for col in output])
What I want to do now is to return True not for the row that contains 21, but for the row after it.
To use a simpler example from the pandas documentation:
s1 = pd.Series(['Mouse', 'dog', 'house and parrot', '23', np.nan])
s1.str.contains('og', regex=False)
0 False
1 True
2 False
3 False
4 NaN
Instead of this result, using the same logic, I would like to get:
0 False
1 False
2 True
3 False
4 NaN
Does anybody know how I could achieve this with my line of code? Thanks in advance.
CodePudding user response:
Try
s1 = pd.Series(['Mouse', 'dog', 'house and parrot', '23', np.nan])
s1.str.contains('og').shift(1)
>>>
0 NaN
1 False
2 True
3 False
4 False
This is not exactly your desired output: the shift introduces a NaN in the first row, and the original NaN in the last row is shifted out. You may therefore want to fix those values afterwards.
CodePudding user response:
A complete solution would be to fill the gap left by the shift with False, and then to mask the positions where the original Series is NaN.
(s1.str.contains('og', regex=False)
.shift(fill_value=False)
.mask(s1.isna())
)
Output:
0 False
1 False
2 True
3 False
4 NaN
dtype: object
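The same shift idea can probably be carried over to the DataFrame mask from the question. The following is only a sketch: the small `output` frame is made-up sample data standing in for the real one, and the names `hits`, `row_hit`, `next_row` and `mask_after21` are chosen here for illustration. It assumes every column holds strings, as the `.str` accessor in the original line requires.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `output` frame (string columns only).
output = pd.DataFrame({
    "a": ["5", "21", "7", "8"],
    "b": ["x", "y", "21", np.nan],
})

# Cell-wise test, same condition as in the question; missing cells count as False.
hits = output.apply(lambda col: col.str.contains("^21$", regex=True, na=False))

# One flag per row: does any cell in this row equal "21"?
row_hit = hits.any(axis=1)

# Move every flag down by one row; the first row has no predecessor, so fill with False.
next_row = row_hit.shift(1, fill_value=False)

# Repeat the row flag across all columns to get a mask with the frame's shape.
mask_after21 = np.repeat(next_row.to_numpy()[:, None], output.shape[1], axis=1)

print(next_row.tolist())
```

If a cell-wise mask is wanted instead (True only in the column where the previous row held 21), `hits.shift(1, fill_value=False)` should give that directly, without the `any(axis=1)` step.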
