I have a pandas.DataFrame of the form shown below (a NumPy-based solution is fine too). I want to output the value of 'moID' whenever the value in column 'time' differs from the previous row. Here is a simple example, with the rows that should be output marked '<<<'.
index 'moID' 'time'
0 1 0 <<<
1 25 0
2 3 1 <<<
3 45 1
4 12 1
5 2 2 <<<
6 34 1 <<<
7 4 1
8 12 1
9 2 3 <<<
10 5 3
11 37 3
12 85 0 <<<
13 2 0
14 45 1 <<<
15 55 1
16 2 3 <<<
17 23 3
18 42 0 <<<
19 1 0
20 42 1 <<<
21 2 2 <<<
22 41 2
23 3 1 <<<
24 52 1
25 2 1
26 24 3 <<<
27 3 3
28 5 3
The expected result is:
index 'moID'
1
3
2
34
2
85
45
2
42
42
2
3
24
Please help.
CodePudding user response:
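You can use shift and ne to check whether each row's time differs from the previous row's. This creates a boolean Series (False where the time is the same as in the previous row, True where it differs). The first row is compared against NaN, so it is always True. Then use the Series as a mask to filter the desired items: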
out = df.loc[df['time'].ne(df['time'].shift()), 'moID']
Output:
0 1
2 3
5 2
6 34
9 2
12 85
14 45
16 2
18 42
20 42
21 2
23 3
26 24
Name: moID, dtype: int64
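For anyone who wants to reproduce this, here is a minimal self-contained sketch of the shift/ne approach. The DataFrame is rebuilt from the sample data in the question; the intermediate name changed is only for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "moID": [1, 25, 3, 45, 12, 2, 34, 4, 12, 2, 5, 37, 85, 2, 45,
             55, 2, 23, 42, 1, 42, 2, 41, 3, 52, 2, 24, 3, 5],
    "time": [0, 0, 1, 1, 1, 2, 1, 1, 1, 3, 3, 3, 0, 0, 1,
             1, 3, 3, 0, 0, 1, 2, 2, 1, 1, 1, 3, 3, 3],
})

# shift() moves every value down one row, so each row is compared
# with its predecessor; row 0 is compared with NaN and is always True.
changed = df["time"].ne(df["time"].shift())

# Keep moID only on the rows where time changed.
out = df.loc[changed, "moID"]
print(out.tolist())  # [1, 3, 2, 34, 2, 85, 45, 2, 42, 42, 2, 3, 24]
```

Because the comparison is a plain inequality, this should also work when 'time' holds non-numeric labels such as strings.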
CodePudding user response:
You can use boolean indexing as follows:
result = df.moID[df.time.diff() != 0]
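df.time.diff() != 0 generates a boolean Series, which is used to index the moID column. The diff of the first row is NaN, and NaN != 0 evaluates to True, so the first row is always included.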
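To see how the mask is built, here is a small sketch on a hypothetical six-element 'time' Series (not the data from the question); it only illustrates the intermediate values.

```python
import pandas as pd

# Hypothetical short sample, chosen only to illustrate the mask.
time = pd.Series([0, 0, 1, 1, 2, 1])

# Difference from the previous row: NaN, 0.0, 1.0, 0.0, 1.0, -1.0
step = time.diff()

# NaN != 0 is True, so the first row is kept along with every change.
mask = step != 0
print(mask.tolist())  # [True, False, True, False, True, True]
```

Note that diff relies on subtraction, so this approach suits numeric columns; for string labels, comparing against shift() is the safer choice.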
The result, for your source data, is:
0 1
2 3
5 2
6 34
9 2
12 85
14 45
16 2
18 42
20 42
21 2
23 3
26 24
Name: moID, dtype: int64
The left column is the index and the right one holds the actual moID values.
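Since the question says a NumPy solution would also be fine, the same idea can be sketched without pandas. This is one possible way, assuming the two columns are available as plain arrays (the names mo_id and time are chosen here for illustration).

```python
import numpy as np

# Sample data copied from the question, as plain arrays.
mo_id = np.array([1, 25, 3, 45, 12, 2, 34, 4, 12, 2, 5, 37, 85, 2, 45,
                  55, 2, 23, 42, 1, 42, 2, 41, 3, 52, 2, 24, 3, 5])
time = np.array([0, 0, 1, 1, 1, 2, 1, 1, 1, 3, 3, 3, 0, 0, 1,
                 1, 3, 3, 0, 0, 1, 2, 2, 1, 1, 1, 3, 3, 3])

# Compare each element with its predecessor; the first element has no
# predecessor, so it is marked True explicitly.
changed = np.r_[True, time[1:] != time[:-1]]

result = mo_id[changed]
print(result.tolist())  # [1, 3, 2, 34, 2, 85, 45, 2, 42, 42, 2, 3, 24]
```

Unlike the pandas versions, this returns only the values; np.flatnonzero(changed) would give the matching row positions if they are needed.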
