Hello coding community,
If my data frame looks like:
ID Col1 Col2 Col3 Col4
Per1 1 2 3 4
Per2 2 NA NA NA
Per3 NA NA 5 NA
Is there any syntax to delete the row associated with ID = Per2, on the basis that Col2, Col3, AND Col4 = NA? I am hoping for code that will allow me to delete a row on the basis that three specific columns (Col2, Col3, and Col4) ALL are NA. This code would NOT delete the row ID = Per3, even though there are three NAs.
Please note that I know how to delete a specific row, but my data frame is big so I do not want to manually sort through all rows/columns.
Big thanks!
CodePudding user response:
Test for NA and delete rows with a number of NA's equal to the number of columns tested using rowSums.
dat[!rowSums(is.na(dat[c('Col2', 'Col3', 'Col4')])) == 3, ]
# ID Col1 Col2 Col3 Col4
# 1 Per1 1 2 3 4
# 3 Per3 NA NA 5 NA
CodePudding user response:
You can use if_all
library(dplyr)
filter(df, !if_all(c(Col2, Col3, Col4), ~ is.na(.)))
# ID Col1 Col2 Col3 Col4
# 1 Per1 1 2 3 4
# 2 Per3 NA NA 5 NA
data
df <- structure(list(ID = c("Per1", "Per2", "Per3"), Col1 = c(1L, 2L,
NA), Col2 = c(2L, NA, NA), Col3 = c(3L, NA, 5L), Col4 = c(4L,
NA, NA)), class = "data.frame", row.names = c(NA, -3L))
CodePudding user response:
Using if_any
library(dplyr)
df %>%
filter(if_any(Col2:Col4, complete.cases))
ID Col1 Col2 Col3 Col4
1 Per1 1 2 3 4
2 Per3 NA NA 5 NA
