Home > Blockchain >  Removing rows after threshold has been reach
Removing rows after threshold has been reach

Time:02-03

Consider this table:

City degree_day
Asheville 0
Asheville 15
Asheville 30
Asheville 40
Asheville 75
Asheville 90
clayton 0
clayton 20
clayton 30
clayton 80
clayton 100
clayton 120

I would like to remove all rows after the threshold has been reached. Lets say the threshold is 60. If I do the following code it will only give me values less than or equal to the threshold:

dat %>% group_by(City) %>% filter(degree_day <= 60)
City degree_day
Asheville 0
Asheville 15
Asheville 30
Asheville 40
clayton 0
clayton 20
clayton 30

Whereas my desired output includes the first row after the filter threshold:

City degree_day
Asheville 0
Asheville 15
Asheville 30
Asheville 40
Asheville 75
clayton 0
clayton 20
clayton 30
clayton 80

How can I achieve this result?

CodePudding user response:

My solution may not be very clever, see if someone else can have better idea.

I'll filter again in the other direction (i.e. degree_day > 60), and slice() the minimum row of that, which will be the next record after threshold. Then combine the two together.

library(tidyverse)

rbind(dat %>% filter(degree_day <= 60), 
      dat %>% filter(degree_day > 60) %>% group_by(City) %>% slice_min(degree_day)) %>% 
  arrange(City)

# A tibble: 9 x 2
  City      degree_day
  <chr>          <int>
1 Asheville          0
2 Asheville         15
3 Asheville         30
4 Asheville         40
5 Asheville         75
6 clayton            0
7 clayton           20
8 clayton           30
9 Clayton           80

CodePudding user response:

A solution via defining temporary subsets of dfand finally binding them back again:

df1 <- df %>% group_by(City) %>% filter(degree_day <= 60)
df2 <- df %>% group_by(City) %>% filter(degree_day > 60) %>% slice_head()
df3 <- bind_rows(df1, df2) %>% arrange(City)
# A tibble: 10 × 2
# Groups:   City [2]
   City  degree_day
   <chr>      <dbl>
 1 A              0
 2 A             12
 3 A             25
 4 A             30
 5 A             75
 6 B              0
 7 B             20
 8 B             27
 9 B             58
10 B            100

Data:

df <- data.frame(
  City = c(rep("A", 7), rep("B", 7)),
  degree_day = c(0,12,25,30,75,90,100,
                 0,20,27,58,100,120,150)
)
  •  Tags:  
  • Related