I have this data:
data <- structure(list(client_id = c("A", "B",
"C", "D", "E", "F"
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))
I need to match it against this data_to_match:
data_to_match <- structure(list(client_id = c("A", "E", "F"
)), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"
))
Such as:
data[data$client_id %in% data_to_match,] # returns empty
I have tried the following without success (these are the solutions suggested in various questions here):
as.list(data_to_match)
list(data_to_match)
unlist(as.list(data_to_match))
But if I create a list from scratch, it works perfectly:
data_to_match_created_as_list <- c("A", "E", "F")
data[data$client_id %in% data_to_match_created_as_list,] # it returns the right rows
In the end, my question is: how do I transform data_to_match into a list like data_to_match_created_as_list?
In addition, what is the right name for these different types of list? I looked up how to transform a vector or a one-column data frame into a list, and the solutions do not produce the same thing as a list created from scratch (as in the example above and my multiple attempts).
CodePudding user response:
While @Marco_CH's answer works perfectly, if you just want an answer to the question:
At the end, my question is, how do I transform the data_to_match to a list like data_to_match_created_as_list?
Wrapping the unlisted data_to_match in array() will work:
array(unlist(data_to_match))
# [1] "A" "E" "F"
data[data$client_id %in% array(unlist(data_to_match)),]
# returns the three rows of data whose client_id is "A", "E" or "F"
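As a rough sketch of what array() contributes here: unlist() on its own already flattens the one-column data frame into a character vector, and array() only adds a dim attribute on top. The plain data.frame() objects and the names flat and arr below are assumptions made so the snippet runs in base R without the tibble package.

```r
# Plain data frames standing in for the tibbles from the question (assumption).
data <- data.frame(client_id = c("A", "B", "C", "D", "E", "F"),
                   stringsAsFactors = FALSE)
data_to_match <- data.frame(client_id = c("A", "E", "F"),
                            stringsAsFactors = FALSE)

# unlist() flattens the list of columns into one atomic character vector;
# use.names = FALSE drops the generated names (client_id1, client_id2, ...).
flat <- unlist(data_to_match, use.names = FALSE)

# array() wraps the same values in a one-dimensional array.
arr <- array(flat)

# Both forms give the same logical mask when used with %in%.
mask_flat <- data$client_id %in% flat
mask_arr  <- data$client_id %in% arr

# drop = FALSE keeps the result a data frame even with a single column.
matched <- data[mask_flat, , drop = FALSE]
```

Under these assumptions, the array() call looks optional: the unlisted vector by itself should be enough for the %in% test.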
CodePudding user response:
Why not just use the name of the column (which gives you a vector)?
data[data$client_id %in% data_to_match$client_id,]
or
data[data$client_id %in% data_to_match[[1]],]
Output:
# A tibble: 3 × 1
client_id
<chr>
1 A
2 E
3 F
Check:
data[data$client_id %in% data_to_match$client_id,] == data[data$client_id %in% data_to_match_created_as_list,]
client_id
[1,] TRUE
[2,] TRUE
[3,] TRUE
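To address the naming part of the question, here is a small sketch of why the original call returns nothing. It assumes plain data.frame() objects in place of the tibbles so that it runs in base R; the names hits_df, ids and hits_vec are made up for illustration.

```r
# Plain data frames standing in for the tibbles from the question (assumption).
data <- data.frame(client_id = c("A", "B", "C", "D", "E", "F"),
                   stringsAsFactors = FALSE)
data_to_match <- data.frame(client_id = c("A", "E", "F"),
                            stringsAsFactors = FALSE)

# A data frame is a list of columns, so %in% compares each id against the
# single list element (the whole column), not against its individual values.
is.list(data_to_match)                        # TRUE
hits_df <- data$client_id %in% data_to_match
any(hits_df)                                  # FALSE

# Extracting the column gives an atomic character vector, which is also
# what c("A", "E", "F") creates: a vector, not a list.
ids <- data_to_match$client_id
is.list(ids)                                  # FALSE
identical(ids, c("A", "E", "F"))              # TRUE

# With the vector on the right-hand side, the match behaves as expected.
hits_vec <- data$client_id %in% ids
data[hits_vec, , drop = FALSE]
```

In R's terminology, c("A", "E", "F") is an atomic (character) vector, while a data frame or tibble is a list whose elements are its columns; that difference is what makes the two %in% calls behave differently.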
CodePudding user response:
This question could be approached in different ways, depending on how many common columns the data have, what their structure is and what the expected output should look like.
If you want a vector
Reduce(intersect, c(data, data_to_match))
#> [1] "A" "E" "F"
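# keeps the values that occur a second time (assumes no repeats within each column; needs R >= 4.1)
unlist(c(data, data_to_match), use.names = FALSE) |>
(\(.) .[duplicated(.)])()
#> [1] "A" "E" "F"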
a tibble
dplyr::right_join(data, data_to_match, "client_id")
# A tibble: 3 x 1
client_id
<chr>
1 A
2 E
3 F
or matrix
mapply(intersect, data, data_to_match)
client_id
[1,] "A"
[2,] "E"
[3,] "F"
