I am looking for a data.table solution for this problem. I have data like this:
library(data.table)
codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")
codes3 <- c("C1", "C2", "C3")
data <- data.table(
id = c(1,1,2,3,3,4,4,4),
code = c("A1","A3", "B1", "A2", "B2","A1","B2","C1")
)
For each unique id, I wish to count how many of the vectors codes1, codes2, and codes3 contain at least one of that id's data$code values, so that a match within a given vector is counted only once. I wish to end up with the following:
data_want <- data.table(
id = c(1,2,3,4),
match = c(1,1,2,3)
)
Answer:
Place the code vectors in a list. After grouping by 'id', loop over the list with lapply and check whether any element of each vector is %in% the 'code' column. Then Reduce the resulting list of logical values to an integer by adding them with `+` (TRUE counts as 1 and FALSE as 0).
library(data.table)
data[, .(match = Reduce(`+`, lapply(list(codes1, codes2, codes3),
    \(x) any(x %in% code)))), by = id]
-output
      id match
   <num> <int>
1:     1     1
2:     2     1
3:     3     2
4:     4     3
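
As a possible alternative, the same count can be sketched with a join instead of a loop over the vectors. The names `lookup`, `grp`, and `res` below are hypothetical and not part of the original question; the idea is to tag every code with the vector it came from and then count the distinct tags per id. One caveat under this approach: an id whose codes match none of the vectors is dropped by the inner join rather than reported with a count of 0.

```r
library(data.table)

codes1 <- c("A1", "A2", "A3")
codes2 <- c("B1", "B2", "B3")
codes3 <- c("C1", "C2", "C3")
data <- data.table(
  id = c(1, 1, 2, 3, 3, 4, 4, 4),
  code = c("A1", "A3", "B1", "A2", "B2", "A1", "B2", "C1")
)

# Hypothetical lookup table: one row per code, with 'grp' naming the
# vector the code belongs to (taken from the list names via idcol).
lookup <- rbindlist(
  list(codes1 = data.table(code = codes1),
       codes2 = data.table(code = codes2),
       codes3 = data.table(code = codes3)),
  idcol = "grp"
)

# Inner join on 'code' (nomatch = NULL drops unmatched rows), then count
# the distinct vectors matched within each id.
res <- lookup[data, on = "code", nomatch = NULL][
  , .(match = uniqueN(grp)), by = id]
res
```

For the sample data this should give the same match column as the Reduce approach. The join version may be easier to extend when the number of code vectors grows, since only the list passed to rbindlist changes.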
