Home > Blockchain >  Length of a series by group
Length of a series by group

Time:01-06

I have data which as below

df <- data.frame("ID" = c(rep("AA",8), rep("BB",6)),
"A" = c(1:14),
"Flag" = c("N","Y",rep("N",2), rep("Y",3), "N", "N", "Y", "Y", "N", rep("Y", 2)))

For each ID I want to find the maximum length of a continuous Flag == "Y"

That is

AA    3
BB    2

Can't figure out how to go about

CodePudding user response:

Using by

by(df,list(df$ID),function(x){
  tmp=rle(x$Flag)
  max(tmp$lengths[tmp$values=="Y"])
})

: AA
[1] 3
--------------------------------------------------------------------- 
: BB
[1] 2

CodePudding user response:

A dplyr solution. rle is the key function here.

df %>% 
  group_by(ID) %>% 
  summarise(max = max(rle(Flag)$l[rle(Flag)$v == "Y"]))

# A tibble: 2 x 2
  ID      max
  <chr> <int>
1 AA        3
2 BB        2

CodePudding user response:

Another possible solution:

library(tidyverse)

df <- data.frame("ID" = c(rep("AA",8), rep("BB",6)),
"A" = c(1:14),
"Flag" = c("N","Y",rep("N",2), rep("Y",3), "N", "N", "Y", "Y", "N", rep("Y", 2)))

df %>% 
  mutate(aux = data.table::rleid(Flag)) %>% 
  add_count(ID, aux, name = "nSeq") %>% 
  group_by(ID) %>% 
  slice_max(nSeq, n = 1) %>% 
  ungroup %>% 
  select(ID, nSeq) %>% 
  distinct

#> # A tibble: 2 × 2
#>   ID     nSeq
#>   <chr> <int>
#> 1 AA        3
#> 2 BB        2
  •  Tags:  
  • Related