remove records if particular cell have NA-CodePudding

if want to filter data on the basis of ID columns . so if ID columns have NA in all three columns then filter data for example create a subset of df5 if id1 id2 and id3 column have NA


df5 <- structure(list(name = c("ka","au","wa","ny","de","ws","ol","yb","uy","hj"),
  `ID 1` = c(653, NA, 240, NA, 826, 872, 556, 440, 
 NA), `ID 2` = c(385, NA, 970, 240, 551, 854, 460, NA, NA),
 `ID 3` =c(385, NA, 970, NA, 551, 854, 460, NA, NA)), row.names = c(NA, 
      -9L), class = "data.frame")

CodePudding user response：

code

df5[complete.cases(df5[ , 2:4]), ]

output

  name ID 1 ID 2 ID 3
1   ka  653  385  385
3   wa  240  970  970
5   de  826  551  551
6   ws  872  854  854
7   ol  556  460  460

CodePudding user response：

Here is a tidyverse approach, which will ONLY filter rows that are ALL NA (i.e. rows with 2 NAs or 1NA are retained).

Also in your comment, you mentioned you would like to have dynamic column names, I also included this in my answer. It will sum up the number of NAs in each row with column names that starts with ID, and filter away rows that are have ncol() - 1 number of NA (since you don't count the name column, therefore you need the - 1.

library(tidyverse)

df5 %>% rowwise() %>% 
  filter(sum(is.na(c_across(starts_with("ID")))) != ncol(df5) - 1) %>% 
  ungroup()

Output

# A tibble: 7 x 4
  name    ID1   ID2   ID3
  <chr> <dbl> <dbl> <dbl>
1 ka      653   385   385
2 wa      240   970   970
3 ny       NA   240    NA
4 de      826   551   551
5 ws      872   854   854
6 ol      556   460   460
7 yb      440    NA    NA