I'm struggling to solve a seemingly simple task. In this type of data:
df <- data.frame(
A = c(1,2,3,4,1,2,3,2,2,2,2,1,1,1,1,6,7)
)
I want to add an id variable that assigns the rows of column A to consecutive groups of a fixed size, say 5. An additional difficulty is that the number of rows is 17, which is not a multiple of 5, so the last group is shorter. The desired result:
df <- data.frame(
A = c(1,2,3,4,1,2,3,2,2,2,2,1,1,1,1,6,7),
id = c("a","a","a","a","a",
"b","b","b","b","b",
"c","c","c","c","c",
"d","d"))
How can this be done?
CodePudding user response:
base R
df$id <- rep(letters, each = 5, length.out = nrow(df))
dplyr
library(dplyr)
df %>%
mutate(id = rep(letters, each = 5, length.out = n()))
output
df
A id
1 1 a
2 2 a
3 3 a
4 4 a
5 1 a
6 2 b
7 3 b
8 2 b
9 2 b
10 2 b
11 2 c
12 1 c
13 1 c
14 1 c
15 1 c
16 6 d
17 7 d
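To see why this handles the leftover rows, it may help to look at the id vector on its own. A minimal sketch, assuming the same 17 rows as in the question: `each = 5` repeats every letter five times, and the length argument cuts the result off at the requested length, so the last group simply ends up shorter.

```r
# Build the id vector for 17 rows without touching the data frame
ids <- rep(letters, each = 5, length.out = 17)

# Count how many rows fall into each group
counts <- table(ids)
print(counts)
```

The counts should come out as 5, 5, 5 and 2 for "a" to "d".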
CodePudding user response:
A possible solution:
library(dplyr)
df <- data.frame(
A = c(1,2,3,4,1,2,3,2,2,2,2,1,1,1,1,6,7)
)
df %>%
mutate(id = rep(letters, each=5)[1:n()])
#> A id
#> 1 1 a
#> 2 2 a
#> 3 3 a
#> 4 4 a
#> 5 1 a
#> 6 2 b
#> 7 3 b
#> 8 2 b
#> 9 2 b
#> 10 2 b
#> 11 2 c
#> 12 1 c
#> 13 1 c
#> 14 1 c
#> 15 1 c
#> 16 6 d
#> 17 7 d
CodePudding user response:
df <- data.frame(A = c(1, 2, 3, 4, 1, 2, 3, 2, 2, 2, 2, 1, 1, 1, 1, 6, 7))
library(dplyr)
df %>%
mutate(grp = (row_number() - 1) %/% 5)
#> A grp
#> 1 1 0
#> 2 2 0
#> 3 3 0
#> 4 4 0
#> 5 1 0
#> 6 2 1
#> 7 3 1
#> 8 2 1
#> 9 2 1
#> 10 2 1
#> 11 2 2
#> 12 1 2
#> 13 1 2
#> 14 1 2
#> 15 1 2
#> 16 6 3
#> 17 7 3
Created on 2022-01-14 by the reprex package (v2.0.1)
or
df %>%
mutate(grp = letters[(row_number() - 1) %/% 5 + 1])
A grp
1 1 a
2 2 a
3 3 a
4 4 a
5 1 a
6 2 b
7 3 b
8 2 b
9 2 b
10 2 b
11 2 c
12 1 c
13 1 c
14 1 c
15 1 c
16 6 d
17 7 d
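One caveat with letter labels, in case the real data is longer: `letters` has only 26 elements, so with groups of 5 the labels run out after 130 rows. Indexing past the end gives NA, while `rep()` with a length argument recycles and starts again at "a". A rough sketch of an integer-based alternative follows; `block_id()` is a hypothetical helper name made up for illustration, not a function from base R or dplyr.

```r
# Hypothetical helper: integer group ids 1, 2, 3, ... for blocks of `size` rows
block_id <- function(n, size) {
  (seq_len(n) - 1) %/% size + 1
}

df <- data.frame(A = c(1, 2, 3, 4, 1, 2, 3, 2, 2, 2, 2, 1, 1, 1, 1, 6, 7))
df$grp <- block_id(nrow(df), 5)

# Letter labels stop being usable after 26 groups:
letters[27]                                      # NA
ids <- rep(letters, each = 5, length.out = 135)
ids[131]                                         # "a" again, so ids are no longer unique
```

Integer ids have no such limit, and they can still be turned into labels afterwards if that is needed.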
