How to use pandas.Series.str.contains to return true value for row after row that contains the given

Time:01-29

I am using the following code to build a mask for a DataFrame. The mask should return True for all values in a row where one cell in that row meets a certain condition, for instance where one cell value is exactly 21.

 mask_pipe21 = np.column_stack([output[col].str.contains("^21$", regex=True, na=False) for col in output])

What I want to do now is to return True not for the row that contains 21, but for the row after it.

To use a simpler example from the pandas documentation:

import numpy as np
import pandas as pd

s1 = pd.Series(['Mouse', 'dog', 'house and parrot', '23', np.nan])
s1.str.contains('og', regex=False)
0    False
1     True
2    False
3    False
4      NaN

Instead of this result, with the same logic, I would like to return:

0    False
1    False
2     True
3    False
4      NaN

Does anybody know how I could achieve this with my code line? Thanks in advance.

CodePudding user response:

Try

s1 = pd.Series(['Mouse', 'dog', 'house and parrot', '23', np.nan])
s1.str.contains('og').shift(1)
>>>
0      NaN
1    False
2     True
3    False
4    False

This is not exactly the output you wanted: the first row becomes NaN, and the NaN in the last row is replaced by the shifted False. You may want to fix those values afterwards.

CodePudding user response:

A complete solution is to fill the value introduced by the shift with False, and to mask the positions that were NaN in the original Series.

(s1.str.contains('og', regex=False)
   .shift(fill_value=False)
   .mask(s1.isna())
)

Output:

0    False
1    False
2     True
3    False
4      NaN
dtype: object
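
The same shift can probably be applied to the DataFrame mask from the question. The following is only a sketch under assumptions: the small `output` frame is a made-up stand-in for the asker's data, and the names `hits`, `mask_after_21` and `row_after_21` are hypothetical. Because `na=False` is passed, the mask has no NaN, so only the shifted-in first row needs filling.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's `output` frame (string columns only).
output = pd.DataFrame({
    "a": ["10", "21", "30", "40"],
    "b": ["x", "y", "21", np.nan],
})

# Per-cell match: True where a cell is exactly "21"; missing values count as False.
hits = output.apply(lambda col: col.str.contains("^21$", regex=True, na=False))

# Move every match down by one row. The first row has no predecessor,
# so it is filled with False instead of NaN.
mask_after_21 = hits.shift(1, fill_value=False).to_numpy()

# Row-level variant: True for each row that follows a row containing "21" anywhere.
row_after_21 = hits.any(axis=1).shift(1, fill_value=False)
```

Here `mask_after_21` has the same shape as the array built with `np.column_stack` in the question, while `row_after_21` is a single boolean Series that can be used directly as `output[row_after_21]`.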