Count identical values in a string, than group, and count the results using tidyverse and dplyr-CodePudding

Let´s say I have this dataset with 3 variables. One of them student is a string with the "name" of students.

df = data.frame(quest = c(2,4,6), test = rep(c("math","science","arts"),3), student = c("0risk, student1, student3","student1","0risk, student2, student3")) %>% arrange(quest, test)

I would like to count students grouped by quest and test and return something like that

Any advice would be helpful. Thank you

CodePudding user response：

Using

tidyr::separate_rows you could split the student column into rows,
get the counts using dplyr::count,
tidyr::complete your dataset to get all combos of the categorical variables and
finally use tidyr::pivot_wider to convert to wide format:

library(dplyr)
library(tidyr)

df %>% 
  tidyr::separate_rows(student) %>% 
  count(quest, test, student, .drop = FALSE) %>% 
  tidyr::complete(quest, test, student, fill = list(n = 0)) %>% 
  tidyr::pivot_wider(names_from = test, values_from = n, values_fill = 0)
#> # A tibble: 12 × 5
#>    quest student   arts  math science
#>    <dbl> <chr>    <dbl> <dbl>   <dbl>
#>  1     2 0risk        0     3       0
#>  2     2 student1     0     3       0
#>  3     2 student2     0     0       0
#>  4     2 student3     0     3       0
#>  5     4 0risk        0     0       0
#>  6     4 student1     0     0       3
#>  7     4 student2     0     0       0
#>  8     4 student3     0     0       0
#>  9     6 0risk        3     0       0
#> 10     6 student1     0     0       0
#> 11     6 student2     3     0       0
#> 12     6 student3     3     0       0