I need to remove reversed duplicate pairs from a large dataset, but I will describe the problem using a smaller one.
| Var1 | Var2 |
|---|---|
| A | B |
| A | C |
| A | D |
| B | A |
| E | F |
| G | H |
I want to keep only one of the rows with values "A-B" and drop the row with the reverse "B-A". All the other rows should also remain.
Thanks in advance.
CodePudding user response:
df[!(df$Var1 == 'B' & df$Var2 == 'A'), ]
With dplyr:
dplyr::filter(df, !(Var1 == 'B' & Var2 == 'A'))
Reproducible input data:
df <- data.frame(
Var1 = c('A', 'A', 'A', 'B', 'E', 'G'),
Var2 = c('B', 'C', 'D', 'A', 'F', 'H')
)
CodePudding user response:
If I understand correctly:
df <- data.frame(
Var1 = c('A', 'A', 'A', 'B', 'E', 'G'),
Var2 = c('B', 'C', 'D', 'A', 'F', 'H')
)
fltr <- !duplicated(apply(df, 1, function(x) paste0(sort(x), collapse = "")))
df[fltr, ]
#>   Var1 Var2
#> 1    A    B
#> 2    A    C
#> 3    A    D
#> 5    E    F
#> 6    G    H
Created on 2022-01-11 by the reprex package (v2.0.1)
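Since the question mentions a large dataset, a vectorised variant may be worth trying: the row-wise apply() can be slow on many rows, and pasting the sorted values with an empty separator could in principle merge distinct pairs (for example "AB", "C" and "A", "BC"). The following is only a sketch, assuming both columns are character vectors; the names lo, hi and result are illustrative and not part of the original answers.

```r
# Same sample data as in the answers above
df <- data.frame(
  Var1 = c("A", "A", "A", "B", "E", "G"),
  Var2 = c("B", "C", "D", "A", "F", "H"),
  stringsAsFactors = FALSE  # keep characters; pmin/pmax do not work on factors
)

# Order each pair element-wise so that (A, B) and (B, A) give the same key
lo <- pmin(df$Var1, df$Var2)
hi <- pmax(df$Var1, df$Var2)

# Keep only the first occurrence of each unordered pair
result <- df[!duplicated(data.frame(lo, hi)), ]
result
```

Because the key is kept as two separate columns rather than one pasted string, no separator has to be chosen. As with the apply() version, the first row of each pair is the one that is kept.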
