I have data which as below
df <- data.frame("ID" = c(rep("AA",8), rep("BB",6)),
"A" = c(1:14),
"Flag" = c("N","Y",rep("N",2), rep("Y",3), "N", "N", "Y", "Y", "N", rep("Y", 2)))
For each ID I want to find the maximum length of a continuous Flag == "Y"
That is
AA 3
BB 2
Can't figure out how to go about
CodePudding user response:
Using by
by(df,list(df$ID),function(x){
tmp=rle(x$Flag)
max(tmp$lengths[tmp$values=="Y"])
})
: AA
[1] 3
---------------------------------------------------------------------
: BB
[1] 2
CodePudding user response:
A dplyr solution. rle is the key function here.
df %>%
group_by(ID) %>%
summarise(max = max(rle(Flag)$l[rle(Flag)$v == "Y"]))
# A tibble: 2 x 2
ID max
<chr> <int>
1 AA 3
2 BB 2
CodePudding user response:
Another possible solution:
library(tidyverse)
df <- data.frame("ID" = c(rep("AA",8), rep("BB",6)),
"A" = c(1:14),
"Flag" = c("N","Y",rep("N",2), rep("Y",3), "N", "N", "Y", "Y", "N", rep("Y", 2)))
df %>%
mutate(aux = data.table::rleid(Flag)) %>%
add_count(ID, aux, name = "nSeq") %>%
group_by(ID) %>%
slice_max(nSeq, n = 1) %>%
ungroup %>%
select(ID, nSeq) %>%
distinct
#> # A tibble: 2 × 2
#> ID nSeq
#> <chr> <int>
#> 1 AA 3
#> 2 BB 2
