I have the data below and want to filter it conditionally: if a ticker such as AMZN has a row with C (confirmed), delete that ticker's E row; if it has no C row, keep the E row. For this data, the result should be one C row for AMZN and one E row for AAPL.
Any ideas on how best to achieve this in R?
| AMZN| C| 1|
| AMZN| E| 2|
| AAPL| E| 2|
CodePudding user response:
You may try this with dplyr:
library(dplyr)
df %>%
  group_by(V1) %>%
  # per ticker: if any row is "C", drop the "E" rows; otherwise keep only the "E" rows
  filter(if (any(V2 == "C")) V2 != "E" else V2 == "E") %>%
  ungroup()
# V1 V2 V3
# <chr> <chr> <int>
#1 AMZN C 1
#2 AAPL E 2
Data
It is easier to help if you provide data in a reproducible format:
df <- structure(list(V1 = c("AMZN", "AMZN", "AAPL"), V2 = c("C", "E",
"E"), V3 = c(1L, 2L, 2L)), class = "data.frame", row.names = c(NA, -3L))
CodePudding user response:
library(dplyr)
filter(df, (V1 == "AMZN" & V2 == "C") | (V1 != "AMZN" & V2 == "E"))
V1 V2 V3
1 AMZN C 1
2 AAPL E 2
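The filter above names AMZN explicitly, so it would need editing if another ticker later gained a C row. Here is a minimal base R sketch of the same idea that looks up the confirmed tickers from the data instead. It assumes the sample `df` from the question; `confirmed`, `has_c` and `result` are names made up for illustration.

```r
# Sample data from the question (column names V1, V2, V3 as used in the answers)
df <- data.frame(
  V1 = c("AMZN", "AMZN", "AAPL"),
  V2 = c("C", "E", "E"),
  V3 = c(1L, 2L, 2L),
  stringsAsFactors = FALSE
)

# Tickers that have at least one confirmed ("C") row
confirmed <- unique(df$V1[df$V2 == "C"])

# Flag every row whose ticker is in that set
has_c <- df$V1 %in% confirmed

# Confirmed tickers: drop their "E" rows; all other tickers: keep only "E" rows
result <- df[(has_c & df$V2 != "E") | (!has_c & df$V2 == "E"), ]
result
```

On the sample data this keeps rows 1 (AMZN, C) and 3 (AAPL, E).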
CodePudding user response:
Here is a possible base R solution:
df[as.logical(with(df, ave(V2, V1, FUN = function(i)
  if (any(i == "C")) i != "E" else i == "E"))), ]
Output
V1 V2 V3
1 AMZN C 1
3 AAPL E 2
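The call above works because ave() writes the logical result of FUN back into the character vector V2, producing the strings "TRUE" and "FALSE", which as.logical() then converts back. Here is a sketch of a variant that stays logical throughout. It assumes the same `df`; `has_c`, `keep` and `result` are illustrative names.

```r
# Sample data from the question
df <- data.frame(
  V1 = c("AMZN", "AMZN", "AAPL"),
  V2 = c("C", "E", "E"),
  V3 = c(1L, 2L, 2L),
  stringsAsFactors = FALSE
)

# For every row, TRUE if that row's ticker has any "C" row
# (ave() recycles the single value returned by any() across the group)
has_c <- ave(df$V2 == "C", df$V1, FUN = any)

# Tickers with a "C": keep the non-"E" rows; tickers without: keep the "E" rows
keep <- ifelse(has_c, df$V2 != "E", df$V2 == "E")

result <- df[keep, ]
result
```

Passing a logical vector as the first argument of ave() means no string round trip is needed before subsetting.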
Or using data.table:
library(data.table)
setDT(df)[, .SD[if (any(V2 == "C")) V2 != "E" else V2 == "E"], by = .(V1)]
Data
df <-
  structure(list(
    V1 = c("AMZN", "AMZN", "AAPL"),
    V2 = c("C", "E", "E"),
    V3 = c(1L, 2L, 2L)
  ),
  class = "data.frame",
  row.names = c(NA, -3L))
