Compare 2 consecutive cells in a dataframe for equality

I want to detect whether 2 or more consecutive values in a column of a dataframe are greater than 0.5. My approach so far: I check whether each value is less than 0.5 and store the result in a "condition" column (see the table below). How can I now detect whether 2 consecutive cells in that column hold the same value (rows 4-5)? Or is it possible to detect the problem directly in the "data" column? If 2 consecutive cells are False, the dataframe can be discarded.

I would be very grateful for any help!

	data	condition
0	0.1	True
1	0.1	True
2	0.25	True
3	0.3	True
4	0.6	False
5	0.7	False
6	0.3	True
7	0.1	True
8	0.9	False
9	0.1	True

CodePudding user response：

You can compute a boolean Series that is True wherever the value is greater than 0.5 (i.e., invalid). Then apply a boolean AND (&) between this Series and its shifted copy: the result is True wherever two consecutive values are both invalid. Use .any() to check whether that happens anywhere and decide whether to discard the dataset:

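s = df['data'].gt(0.5)  # True where the value is invalid (> 0.5)
(s & s.shift(fill_value=False)).any()  # fill_value=False keeps the shifted Series boolean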

Output: True -> the dataset is invalid
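
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the "data" column from the question and wraps the check in a small helper. The function name has_consecutive_invalid and its threshold parameter are made up for illustration; they are not part of pandas.

```python
import pandas as pd

# Values copied from the "data" column in the question.
df = pd.DataFrame({"data": [0.1, 0.1, 0.25, 0.3, 0.6, 0.7, 0.3, 0.1, 0.9, 0.1]})


def has_consecutive_invalid(series, threshold=0.5):
    """Return True if two neighbouring values both exceed `threshold`.

    Hypothetical helper: the name and signature are assumptions.
    """
    invalid = series.gt(threshold)
    # shift() moves every flag down by one row, so row i is paired with row i-1.
    # fill_value=False gives the first row a "valid" predecessor.
    previous_invalid = invalid.shift(fill_value=False)
    return bool((invalid & previous_invalid).any())


discard = has_consecutive_invalid(df["data"])
print(discard)  # rows 4 and 5 (0.6, 0.7) are both above 0.5
```

A single value above 0.5, such as the 0.9 near the end of the column, does not trigger the check on its own because its neighbours are below the threshold.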

CodePudding user response：

You can use the .diff method and check where the result is equal to zero, i.e. where a value is identical to the one in the previous row:

df['eq_to_prev'] = df.data.diff().eq(0)
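
Note that equality with the previous row is not quite the same as "two consecutive False". As a rough sketch of the difference, the snippet below applies the .diff check to "data" and, separately, compares "condition" with its shifted copy and keeps only the repeats that are False. The "condition" column is rebuilt here with lt(0.5) as described in the question, and the column name repeated_false is an assumption chosen for illustration.

```python
import pandas as pd

# Values copied from the "data" column in the question.
df = pd.DataFrame({"data": [0.1, 0.1, 0.25, 0.3, 0.6, 0.7, 0.3, 0.1, 0.9, 0.1]})
df["condition"] = df["data"].lt(0.5)  # True where the value is acceptable

# Exact equality with the previous row of "data" (the first row has no predecessor).
df["eq_to_prev"] = df["data"].diff().eq(0)

# Same flag as the previous row of "condition"; the first row compares against NaN.
same_as_prev = df["condition"].eq(df["condition"].shift())

# Keep only repeats where the flag itself is False (hypothetical column name).
df["repeated_false"] = same_as_prev & ~df["condition"]

discard = bool(df["repeated_false"].any())
print(df)
print(discard)
```

With the data from the question, eq_to_prev only marks row 1 (0.1 followed by 0.1), while repeated_false only marks row 5, the second of the two consecutive False entries.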