I'm struggling to split my data frame into 2 or more parts. My real data has many columns and rows, but consider a toy example:
test = data.frame(car = c("A", "A", "B", "C", "D", "E", "B", "C", "D"), value = c(5,4,3,5, 6, 6, 7 ,8 ,10))
#result
# car value group
#1 A 5 1
#2 A 4 1
#3 B 3 2
#4 C 5 1
#5 D 6 2
#6 E 6 2
#7 B 7 2
#8 C 8 1
#9 D 10 2
The only restriction I need is:
The same car cannot be in different groups. A given car, for example car A, appears in several rows of my real data frame, and every time it occurs it must have the same group, for example group = 1. A group can contain several different cars, but the same car can never be in different groups.
Any hints? I tried test %>% mutate(group = ntile(car, 4)) without success.
CodePudding user response:
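# Assign each distinct value of the first column to one of `groups` groups
gr <- function(df, groups) {
  # Integer code per distinct car, wrapped into 0..(groups - 1)
  g <- as.integer(factor(df[[1]])) %% groups
  # Relabel the remainders as consecutive integers starting at 1
  df$groups <- as.integer(factor(g))
  df
}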
gr(test, 1)
car value groups
1 A 5 1
2 A 4 1
3 B 3 1
4 C 5 1
5 D 6 1
gr(test, 2)
car value groups
1 A 5 2
2 A 4 2
3 B 3 1
4 C 5 2
5 D 6 1
gr(test, 3)
car value groups
1 A 5 2
2 A 4 2
3 B 3 3
4 C 5 1
5 D 6 2
gr(test, 4)
car value groups
1 A 5 2
2 A 4 2
3 B 3 3
4 C 5 4
5 D 6 1
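To check that gr() respects the restriction from the question (a car never ends up in two groups), something like the following should work. This is only a sketch in base R: it repeats the toy data and gr() from above so it runs on its own, and `res` and `groups_per_car` are names I made up for the check.

```r
test <- data.frame(
  car = c("A", "A", "B", "C", "D", "E", "B", "C", "D"),
  value = c(5, 4, 3, 5, 6, 6, 7, 8, 10)
)

# Same logic as gr() above: factor codes of the first column, modulo `groups`
gr <- function(df, groups) {
  g <- as.integer(factor(df[[1]])) %% groups
  df$groups <- as.integer(factor(g))
  df
}

res <- gr(test, 3)

# Count how many distinct group labels each car received
groups_per_car <- tapply(res$groups, res$car, function(x) length(unique(x)))

# TRUE when every car maps to exactly one group
all(groups_per_car == 1)
```

Because the group is computed from the car's factor code alone, every row of the same car gets the same label, so the check should hold for any number of groups.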
CodePudding user response:
Using a dplyr approach:
library(dplyr)
test = data.frame(car = c("A", "A", "B", "C", "D", "E", "B", "C", "D"), value = c(5,4,3,5, 6, 6, 7 ,8 ,10))
test %>%
  mutate(group = 1 + match(car, car) %% 4)
#> car value group
#> 1 A 5 2
#> 2 A 4 2
#> 3 B 3 4
#> 4 C 5 1
#> 5 D 6 2
#> 6 E 6 3
#> 7 B 7 4
#> 8 C 8 1
#> 9 D 10 2
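Note that match(car, car) returns the row of each car's first occurrence, so the ids skip numbers whenever a car repeats, and on larger data the cars may not spread evenly over the groups. If an even spread matters, one possible variant is to number the distinct cars consecutively with match(car, unique(car)) and deal them out round-robin. A minimal sketch in base R, where `n_groups` and `car_id` are illustrative names, not part of the answer above:

```r
test <- data.frame(
  car = c("A", "A", "B", "C", "D", "E", "B", "C", "D"),
  value = c(5, 4, 3, 5, 6, 6, 7, 8, 10)
)

n_groups <- 4  # assumed number of groups, change as needed

# Consecutive id per distinct car, in order of first appearance
car_id <- match(test$car, unique(test$car))

# Round-robin assignment: ids 1, 2, 3, 4, 5 become groups 1, 2, 3, 4, 1
test$group <- (car_id - 1) %% n_groups + 1

# Number of distinct cars in each group
table(unique(test[c("car", "group")])$group)
```

The same expression can be dropped into mutate() if you prefer to stay in dplyr.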
