I have a pandas DataFrame that looks like this:
| Date_Time | level |
|---|---|
| 2018-02-12 13:22:27 | 5 |
| 2018-02-12 13:17:27 | 7 |
| 2018-02-12 13:12:27 | 2 |
| 2018-02-12 13:07:27 | 6 |
| 2018-02-13 13:12:27 | 4 |
| 2018-02-13 13:17:27 | 5 |
How do I remove all rows for any date that has fewer than 4 entries? For example, 2018-02-13 has only 2 entries, so its rows should be removed, giving this table:
| Date_Time | level |
|---|---|
| 2018-02-12 13:22:27 | 5 |
| 2018-02-12 13:17:27 | 7 |
| 2018-02-12 13:12:27 | 2 |
| 2018-02-12 13:07:27 | 6 |
I tried using a for loop, but that takes too long to run.
Answer:
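You can group by the calendar date, use transform('count') to give each row the number of rows that share its date, and then keep only the rows where that count is at least 4 with ge(4). This assumes Date_Time is already a datetime column; if it holds strings, convert it first with pd.to_datetime: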
df[df.groupby(df['Date_Time'].dt.date)['Date_Time'].transform('count').ge(4)]
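
For reference, here is a minimal self-contained sketch of the same approach on the sample data from the question. The names `min_rows`, `rows_per_day`, and `filtered` are made up for illustration, and the threshold of 4 is an assumption taken from the "< 4 entries" wording and the `ge(4)` call above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Date_Time": [
        "2018-02-12 13:22:27",
        "2018-02-12 13:17:27",
        "2018-02-12 13:12:27",
        "2018-02-12 13:07:27",
        "2018-02-13 13:12:27",
        "2018-02-13 13:17:27",
    ],
    "level": [5, 7, 2, 6, 4, 5],
})

# The .dt accessor only works on datetime columns, so convert the strings first.
df["Date_Time"] = pd.to_datetime(df["Date_Time"])

# Assumed threshold: keep only dates that have at least this many rows.
min_rows = 4

# For every row, count how many rows fall on the same calendar date.
# transform returns a Series aligned with df's index, so it can be used as a mask.
rows_per_day = df.groupby(df["Date_Time"].dt.date)["Date_Time"].transform("count")

# Boolean mask: True where the row's date has enough entries.
filtered = df[rows_per_day.ge(min_rows)]
print(filtered)
```

Because `transform` produces one value per original row, the filtering is a single vectorized mask rather than a Python-level loop over dates, which is likely why it runs much faster than the for loop mentioned in the question.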
