I have some time series data:
time x y
1s 34 8017
1s 43 5019
1s 1 8017
2s 64 8870
2s 34 8305
2s 11 8305
3s 343 8221
3s 3 8221
3s 143 8221
that I grouped by second with pandas, using df.groupby(data.index.second). This produces 3 groups; group 1, which corresponds to the first second, looks like this:
time x y
1s 34 8017
1s 43 5019
1s 1 8017
How can I remove the first group (the 1st second) and the last group (the 3rd second)?
I only want to keep this group (group 2):
time x y
2s 64 8870
2s 34 8305
2s 11 8305
I have tried this without success and maybe the groupby function is not the way to go.
CodePudding user response:
You can do something like this
df2 = df[df['time']=='2s']
This keeps only the '2s' rows, so both '1s' and '3s' are dropped. The original df is not modified; the filtered result is stored in a new variable, which we can call df2.
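A minimal, self-contained sketch of the boolean-mask approach above, assuming the data is a DataFrame with a string column named time (as in the printed tables); the sample values are copied from the question:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "time": ["1s", "1s", "1s", "2s", "2s", "2s", "3s", "3s", "3s"],
    "x": [34, 43, 1, 64, 34, 11, 343, 3, 143],
    "y": [8017, 5019, 8017, 8870, 8305, 8305, 8221, 8221, 8221],
})

# Boolean mask: True only for the rows of the 2nd second
mask = df["time"] == "2s"

# Indexing with the mask returns a new frame; df itself keeps all its rows
df2 = df[mask]
print(df2)
```

Note that this hard-codes the value to keep, so it only fits when you already know which second you want.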
CodePudding user response:
I solved it by saving all the groups in a list of (key, group) pairs
l = list(df.groupby(data.index.second))
and then deleting the first and last entries from the list
del l[0]
del l[-1]
see https://docs.python.org/3/library/stdtypes.html#dict
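For reference, here is a rough sketch of that idea end to end. It assumes the frame has a DatetimeIndex (the question uses data.index.second); the timestamps below are made up for illustration, and slicing with [1:-1] stands in for the two del statements:

```python
import pandas as pd

# Hypothetical timestamps; only the seconds (1, 2, 3) matter here
idx = pd.to_datetime(
    ["2021-01-01 00:00:01"] * 3
    + ["2021-01-01 00:00:02"] * 3
    + ["2021-01-01 00:00:03"] * 3
)
data = pd.DataFrame(
    {"x": [34, 43, 1, 64, 34, 11, 343, 3, 143],
     "y": [8017, 5019, 8017, 8870, 8305, 8305, 8221, 8221, 8221]},
    index=idx,
)

# Iterating a groupby yields (key, sub-frame) tuples, sorted by key by default
groups = list(data.groupby(data.index.second))

# Drop the first and last group without mutating the list
middle = groups[1:-1]

# Stitch the remaining sub-frames back into a single DataFrame
result = pd.concat([frame for _, frame in middle])
print(result)
```

Because the list holds tuples, the sub-frames need to be unpacked (as in the pd.concat line) before they can be used as a DataFrame again.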
CodePudding user response:
I note that you answered your own question, but perhaps this is of some use: using filter,
df.groupby('time').filter(lambda g: g.name not in ['1s','3s'])
produces
time x y
3 2s 64 8870
4 2s 34 8305
5 2s 11 8305
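If you would rather not hard-code '1s' and '3s', one possible variation is to read the first and last keys from the data. This is only a sketch: it assumes a time column as above and that the rows are already in time order, and the name edge_keys is made up for the example:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "time": ["1s", "1s", "1s", "2s", "2s", "2s", "3s", "3s", "3s"],
    "x": [34, 43, 1, 64, 34, 11, 343, 3, 143],
    "y": [8017, 5019, 8017, 8870, 8305, 8305, 8221, 8221, 8221],
})

# Keys of the first and last group, assuming rows are sorted by time
edge_keys = {df["time"].iloc[0], df["time"].iloc[-1]}

# filter keeps every group for which the function returns True;
# g.name is the group's key
result = df.groupby("time").filter(lambda g: g.name not in edge_keys)
print(result)
```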
CodePudding user response:
You can take the unique times, drop the first and last of them with [1:-1], and keep only the matching rows using Series.isin with boolean indexing:
df = df[df['time'].isin(df['time'].unique()[1:-1])]
print(df)
time x y
3 2s 64 8870
4 2s 34 8305
5 2s 11 8305
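The same idea can be applied directly to a DatetimeIndex, which is what the question's data.index.second suggests. The sketch below uses made-up timestamps. One caveat: unique() returns values in order of first appearance, not sorted, so "first" and "last" only mean earliest and latest if the data is already in time order.

```python
import pandas as pd

# Hypothetical timestamps; only the seconds (1, 2, 3) matter here
idx = pd.to_datetime(
    ["2021-01-01 00:00:01"] * 3
    + ["2021-01-01 00:00:02"] * 3
    + ["2021-01-01 00:00:03"] * 3
)
data = pd.DataFrame(
    {"x": [34, 43, 1, 64, 34, 11, 343, 3, 143],
     "y": [8017, 5019, 8017, 8870, 8305, 8305, 8221, 8221, 8221]},
    index=idx,
)

# Second of each row, taken from the index
secs = data.index.second

# Unique seconds in order of appearance, without the first and last one
keep = secs.unique()[1:-1]

# Boolean mask over the rows, then ordinary boolean indexing
middle = data[secs.isin(keep)]
print(middle)
```

Unlike the groupby versions, this never builds the groups at all, which keeps it short when all you need is the filtered rows.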
