Let me know if I need a dummy example for this, but essentially I have a df of subgroups, each a different length (typically 30-35k values). I'd like to add a column that recycles the vector c(1:200) within each subgroup, including a partial final cycle when the subgroup length is not a multiple of 200. From this question I figure I can use rep_len() to get around the data frame's refusal to partially recycle. The problem is that I can't hard-code length.out in rep_len(), because it changes with each subgroup. I tried doing this:
df_new <- df %>%
group_by(subgroup) %>%
mutate(newcol <- rep_len(1:200, length.out=.))
That threw an invalid length.out error. I also tried:
df_new <- df %>%
group_by(subgroup) %>%
mutate(newcol <- rep_len(1:200, length.out=nrow(.)))
But this throws an error saying that length.out is the length of my entire df rather than of the current subgroup. Any help would be appreciated!
Answer:
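The dplyr package has a function n() that returns the number of rows in the current group. Inside mutate() on a grouped data frame it is evaluated once per group, so it gives rep_len() the right length.out for each subgroup: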
library(dplyr)
mtcars %>%
  group_by(cyl) %>%
  # n() is the row count of the current cyl group
  mutate(newcol = rep_len(1:200, length.out = n()))
Also, inside mutate() you should use "=" and not "<-". With "<-" the new column is not named newcol; it is named after the whole expression instead.
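To see the per-group recycling on something small enough to check by eye, here is a minimal sketch. The toy data frame and its column names (subgroup, value) are made up for illustration, and 1:3 stands in for 1:200 so the partial final cycle is visible.

```r
library(dplyr)

# Hypothetical toy data: two subgroups of unequal length (7 and 4 rows)
df <- data.frame(
  subgroup = rep(c("a", "b"), times = c(7, 4)),
  value = seq_len(11)
)

df_new <- df %>%
  group_by(subgroup) %>%
  # n() is evaluated per group, so the pattern restarts in each subgroup
  mutate(newcol = rep_len(1:3, length.out = n())) %>%
  ungroup()

df_new$newcol
# 1 2 3 1 2 3 1 1 2 3 1
```

The pattern is cut off after the 7th row of subgroup "a" and starts again from 1 in subgroup "b", which is the partial recycling asked for. The ungroup() at the end is optional and only there so later operations don't stay grouped.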
