I know there are many questions about duplicate removal but I could find anything that matches my needs.
i have
df<-data.frame(var1=c("A", "A", "B", "B", "C", "D", "E"), var2=c(1, 2, 3, 4,5, 5, 6 ))
A is mapped to 1, 2
B is mapped to 2, 3
5 is mapped to C, D
and only E is uniquely mapped to 6 and 6 is uniquely mapped to E
I would like filter the dataset so that only
var1 var2
7 E 6
is returned. Base oder tidyverse solution are welcomed.
I have tried
unique(df$var1, df$var2)
df[!duplicated(df),]
df %>% distinct(var1, var2)
but without the wanted result.
CodePudding user response:
Using a custom function to determine if the mapping is unique you could achieve your desired result like so:
df <- data.frame(
var1 = c("A", "A", "B", "B", "C", "D", "E"),
var2 = c(1, 2, 3, 4, 5, 5, 6)
)
is_unique <- function(x, y) ave(as.numeric(factor(x)), y, FUN = function(x) length(unique(x)) == 1)
df[is_unique(df$var2, df$var1) & is_unique(df$var1, df$var2), ]
#> var1 var2
#> 7 E 6
CodePudding user response:
Using igraph tools:
library(igraph)
g = graph_from_data_frame(df)
cmp = components(g)
cmp$membership[cmp$membership %in% which(cmp$csize == 2)]
# E 6
# 4 4
plot(g)

