Filter dataframe by slicing group-first values

In this dataframe:

df <- data.frame(
  PP_by = c("A","B","A","A","B","B", rep("C",2),"A","A","A","B"),
  Sequ = c(1,1,1,1,1,1,4,4,4,4,4,4),
  Line = c(55,55,77,77,77,77,99,99,99,99,33,33),
  pp = rnorm(12),
  pp2 = c(1.1,1.2,1.1,1.2,1.3,1.5,1.3,1.2,1.5,1.7,1.0,1.9),
  other = 1:12,
  other2 = letters[1:12]
)

Within each Sequ group, I want to keep the rows whose Line equals the first Line value of that group. I've tried this, but it returns an empty tibble:

df %>%
   group_by(Sequ) %>%
   slice(first(Line))
# A tibble: 0 × 7
# Groups:   Sequ [0]
# … with 7 variables: PP_by <chr>, Sequ <dbl>, Line <dbl>, pp <dbl>, pp2 <dbl>, other <int>, other2 <chr>

The desired result is this:

df
   PP_by Sequ Line          pp pp2 other other2
1      A    1   55  0.30277770 1.1     1      a
2      B    1   55  1.55909373 1.2     2      b
7      C    4   99  1.33796806 1.3     7      g
8      C    4   99 -0.23230445 1.2     8      h
9      A    4   99 -0.12740409 1.5     9      i
10     A    4   99 -0.02168540 1.7    10      j

CodePudding user response:

According to ?slice

slice() lets you index rows by their (integer) locations.

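first(Line) returns the first value of Line in each group, not a row position. In the attempt above, slice() is therefore asked for row 55 (in the Sequ == 1 group) and row 99 (in the Sequ == 4 group); each group has only 6 rows, so nothing is returned. Comparing Line == first(Line) gives a logical vector, and wrapping it in which() converts that into the integer positions slice() requires.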

library(dplyr)
df %>%
  group_by(Sequ) %>%
  # Line == first(Line) is logical; which() turns it into row positions
  slice(which(Line == first(Line))) %>%
  ungroup()
# A tibble: 6 × 7
  PP_by  Sequ  Line     pp   pp2 other other2
  <chr> <dbl> <dbl>  <dbl> <dbl> <int> <chr> 
1 A         1    55 -0.664   1.1     1 a     
2 B         1    55 -0.520   1.2     2 b     
3 C         4    99 -2.11    1.3     7 g     
4 C         4    99  0.165   1.2     8 h     
5 A         4    99  2.43    1.5     9 i     
6 A         4    99  0.657   1.7    10 j
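
To see the difference between a value and a position in isolation, here is a minimal sketch on a plain vector. The vector `line` is only an illustration that mirrors the Line column of the Sequ == 1 group; it is not part of the original data frame.

```r
library(dplyr)

# Illustrative vector: the Line values of the Sequ == 1 group
line <- c(55, 55, 77, 77, 77, 77)

first(line)                  # the value 55, not a row position
line == first(line)          # logical vector marking the matching rows
which(line == first(line))   # integer positions of the matching rows

# Inside slice(), the last expression is what selects rows 1 and 2;
# passing first(line) directly would ask for row 55, which does not exist.
```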

CodePudding user response:

You can do:

library(tidyverse)
df %>%
   group_by(Sequ) %>%
   filter(Line == first(Line))

which gives:

# A tibble: 6 x 7
# Groups:   Sequ [2]
  PP_by  Sequ  Line     pp   pp2 other other2
  <chr> <dbl> <dbl>  <dbl> <dbl> <int> <chr> 
1 A         1    55  0.425   1.1     1 a     
2 B         1    55 -0.744   1.2     2 b     
3 C         4    99  0.497   1.3     7 g     
4 C         4    99  1.39    1.2     8 h     
5 A         4    99  0.685   1.5     9 i     
6 A         4    99  0.128   1.7    10 j
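
Note that the pp column differs between the outputs in this thread only because the data frame uses rnorm() without a seed. As a rough check that the slice() and filter() approaches select the same rows, something like the following could be run. The seed value and the names `by_slice` and `by_filter` are my own assumptions, chosen just for illustration.

```r
library(dplyr)

set.seed(42)  # assumption: any fixed seed works, it only makes pp reproducible
df <- data.frame(
  PP_by = c("A","B","A","A","B","B", rep("C",2),"A","A","A","B"),
  Sequ = c(1,1,1,1,1,1,4,4,4,4,4,4),
  Line = c(55,55,77,77,77,77,99,99,99,99,33,33),
  pp = rnorm(12),
  pp2 = c(1.1,1.2,1.1,1.2,1.3,1.5,1.3,1.2,1.5,1.7,1.0,1.9),
  other = 1:12,
  other2 = letters[1:12]
)

# Position-based selection: which() supplies the row indices for slice()
by_slice <- df %>%
  group_by(Sequ) %>%
  slice(which(Line == first(Line))) %>%
  ungroup()

# Condition-based selection: filter() takes the logical vector directly
by_filter <- df %>%
  group_by(Sequ) %>%
  filter(Line == first(Line)) %>%
  ungroup()

# Both keep original rows 1, 2, 7, 8, 9 and 10 (tracked by the `other` column)
identical(by_slice$other, by_filter$other)
```

filter() is the shorter of the two because it accepts the logical condition as is; slice() is mainly useful when you already have row positions.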