if want to filter data on the basis of ID columns . so if ID columns have NA in all three columns then filter data for example create a subset of df5 if id1 id2 and id3 column have NA
df5 <- structure(list(name = c("ka","au","wa","ny","de","ws","ol","yb","uy","hj"),
`ID 1` = c(653, NA, 240, NA, 826, 872, 556, 440,
NA), `ID 2` = c(385, NA, 970, 240, 551, 854, 460, NA, NA),
`ID 3` =c(385, NA, 970, NA, 551, 854, 460, NA, NA)), row.names = c(NA,
-9L), class = "data.frame")
CodePudding user response:
code
df5[complete.cases(df5[ , 2:4]), ]
output
name ID 1 ID 2 ID 3
1 ka 653 385 385
3 wa 240 970 970
5 de 826 551 551
6 ws 872 854 854
7 ol 556 460 460
CodePudding user response:
Here is a tidyverse approach, which will ONLY filter rows that are ALL NA (i.e. rows with 2 NAs or 1NA are retained).
Also in your comment, you mentioned you would like to have dynamic column names, I also included this in my answer. It will sum up the number of NAs in each row with column names that starts with ID, and filter away rows that are have ncol() - 1 number of NA (since you don't count the name column, therefore you need the - 1.
library(tidyverse)
df5 %>% rowwise() %>%
filter(sum(is.na(c_across(starts_with("ID")))) != ncol(df5) - 1) %>%
ungroup()
Output
# A tibble: 7 x 4
name ID1 ID2 ID3
<chr> <dbl> <dbl> <dbl>
1 ka 653 385 385
2 wa 240 970 970
3 ny NA 240 NA
4 de 826 551 551
5 ws 872 854 854
6 ol 556 460 460
7 yb 440 NA NA
