Group column by regular interval

Time:01-15

I'm struggling to solve a seemingly simple task. In this type of data:

df <- data.frame(
  A = c(1,2,3,4,1,2,3,2,2,2,2,1,1,1,1,6,7)
)

I want to add an id variable that groups the rows of column A into consecutive blocks of a fixed size, say 5. An additional difficulty is that the number of rows is 17, which is not a multiple of 5, so the last group is shorter:

df <- data.frame(
  A = c(1,2,3,4,1,2,3,2,2,2,2,1,1,1,1,6,7),
  id = c("a","a","a","a","a", 
         "b","b","b","b","b",
         "c","c","c","c","c",
         "d","d"))

How can this be done?

CodePudding user response:

base R

df$id <- rep(letters, each = 5, length.out = nrow(df))

dplyr

library(dplyr)
df %>% 
  mutate(id = rep(letters, each = 5, length.out = n()))

output

df
   A id
1  1  a
2  2  a
3  3  a
4  4  a
5  1  a
6  2  b
7  3  b
8  2  b
9  2  b
10 2  b
11 2  c
12 1  c
13 1  c
14 1  c
15 1  c
16 6  d
17 7  d

CodePudding user response:

A possible solution:

library(dplyr)

df <- data.frame(
  A = c(1,2,3,4,1,2,3,2,2,2,2,1,1,1,1,6,7)
)

df %>% 
  mutate(id = rep(letters, each=5)[1:n()])

#>    A id
#> 1  1  a
#> 2  2  a
#> 3  3  a
#> 4  4  a
#> 5  1  a
#> 6  2  b
#> 7  3  b
#> 8  2  b
#> 9  2  b
#> 10 2  b
#> 11 2  c
#> 12 1  c
#> 13 1  c
#> 14 1  c
#> 15 1  c
#> 16 6  d
#> 17 7  d

CodePudding user response:

df <- data.frame(A = c(1, 2, 3, 4, 1, 2, 3, 2, 2, 2, 2, 1, 1, 1, 1, 6, 7))

library(dplyr)

df %>% 
  mutate(grp = (row_number() - 1) %/% 5)
#>    A grp
#> 1  1   0
#> 2  2   0
#> 3  3   0
#> 4  4   0
#> 5  1   0
#> 6  2   1
#> 7  3   1
#> 8  2   1
#> 9  2   1
#> 10 2   1
#> 11 2   2
#> 12 1   2
#> 13 1   2
#> 14 1   2
#> 15 1   2
#> 16 6   3
#> 17 7   3

Created on 2022-01-14 by the reprex package (v2.0.1)

or

df %>% 
  mutate(grp = letters[(row_number() - 1) %/% 5 + 1])

   A grp
1  1   a
2  2   a
3  3   a
4  4   a
5  1   a
6  2   b
7  3   b
8  2   b
9  2   b
10 2   b
11 2   c
12 1   c
13 1   c
14 1   c
15 1   c
16 6   d
17 7   d
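
One caveat with the letter-based ids: `letters` has only 26 elements, so with blocks of 5 the labels run out after 130 rows (indexing past the end gives `NA`, while `rep()` with `length.out` silently starts again at "a"). If the data may be longer than that, a numeric id is probably safer. Below is a minimal base R sketch of the same integer-division idea used above; the name `block_size` is only illustrative and not from the original answers.

```r
# Same 17-row example as in the question
df <- data.frame(
  A = c(1,2,3,4,1,2,3,2,2,2,2,1,1,1,1,6,7)
)

# Illustrative name (an assumption): number of rows per group
block_size <- 5

# Integer division on the zero-based row index:
# rows 1-5 -> 1, rows 6-10 -> 2, rows 11-15 -> 3, rows 16-17 -> 4
df$grp <- (seq_len(nrow(df)) - 1) %/% block_size + 1

# Rows per group; the last group holds the remainder
table(df$grp)
```

Because the id is just a number, this works for any number of rows, and the last group simply contains whatever rows are left over.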