I have two data frames with different numbers of rows.
df1 is longer than df2, and they share several common rows.
My example:
df1 <- data.frame(name1 = c("a", "b", "c"),
name2 = c("a1", "b1", "c1"),
name3 = c("a2", "b2", "c2"))
df1
name1 name2 name3
1 a a1 a2
2 b b1 b2
3 c c1 c2
df2 <- data.frame(name1 = c("a", "b", "m"),
name2 = c("a3","b3", "m1"),
name3 = c("a4", "b4", "m2"))
df2
name1 name2 name3
1 a a3 a4
2 b b3 b4
3 m m1 m2
I would like to exclude the rows that the two data frames have in common and keep only the remaining row of df2 (here, the one with name1 = "m"), using tidyverse. Any suggestions?
Desired output
name1 name2 name3
m m1 m2
CodePudding user response:
library(dplyr)
# rows of df1 whose name1 has no match in df2
anti_join(df1, df2, by = "name1")
name1 name2 name3
1 c c1 c2
# rows of df2 whose name1 has no match in df1 (the desired output)
anti_join(df2, df1, by = "name1")
name1 name2 name3
1 m m1 m2
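Note that the "common" rows in the question agree only on name1, not on name2 or name3, so the choice of `by` matters. The following is a minimal sketch using the data from the question; the object names `only_in_df2`, `only_in_df2_filter`, and `all_cols` are made up for illustration.

```r
library(dplyr)

# Data as given in the question
df1 <- data.frame(name1 = c("a", "b", "c"),
                  name2 = c("a1", "b1", "c1"),
                  name3 = c("a2", "b2", "c2"))
df2 <- data.frame(name1 = c("a", "b", "m"),
                  name2 = c("a3", "b3", "m1"),
                  name3 = c("a4", "b4", "m2"))

# Match on name1 only: keep rows of df2 whose name1 does not occur in df1
only_in_df2 <- anti_join(df2, df1, by = "name1")

# The same idea written with filter() and %in%
only_in_df2_filter <- filter(df2, !name1 %in% df1$name1)

# Match on all three columns: no row of df2 equals a row of df1 in every
# column, so nothing is excluded and all rows of df2 are kept
all_cols <- anti_join(df2, df1, by = c("name1", "name2", "name3"))
```

With this data, `only_in_df2` and `only_in_df2_filter` each contain the single row with name1 equal to "m", while `all_cols` still has all three rows of df2.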
CodePudding user response:
We may use anti_join (originally posted as a comment, before the other answer was posted). To keep the rows of df2 that have no match in df1, pass df2 as the first argument:
library(dplyr)
anti_join(df2, df1, by = "name1")
data
df1 <- structure(list(name1 = c("a", "b", "c"), name2 = c("a1", "b1",
"c1"), name3 = c("a2", "b2", "c2")), class = "data.frame", row.names = c(NA,
-3L))
df2 <- structure(list(name1 = c("a", "b", "m"), name2 = c("a3", "b3",
"m1"), name3 = c("a4", "b4", "m2")), class = "data.frame", row.names = c(NA,
-3L))
