How to remove columns that are entirely empty from a data set in R

I'm using R and am stuck on the following problem. I have a data set called data with 48 columns: id, title_1, title_2, ..., title_47. id is a personal ID, and the other 47 columns contain the values H, L, P, N, or "" (by "" I mean empty). My goal is to remove the columns in which every value is empty. The id column is always filled with a number, so I think I need a for loop over title_1 to title_47 that checks whether each of those columns consists entirely of empty values.

CodePudding user response：

You can use Filter() to keep only the columns that are not made up entirely of "" values. Here I use an example data frame in which the column title_2 contains only "" values:

Filter(function(x)!all(x == ""), df)

Output:

  id title_1
1  1       2
2  2       3
3  3       5

Data used as an example:

df <- data.frame(id = c(1,2,3),
                 title_1 = c(2,3,5),
                 title_2 = c("", "", ""))
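
The same idea can be sketched on data shaped more like the question's, with H/L/P/N values and an id column. The data frame below is a hypothetical stand-in for data (only three of the 47 title columns), and the helper name is_all_empty is made up for this example. Treating NA as empty is also an assumption; if NA should count as a real value, drop the is.na() part.

```r
# Hypothetical stand-in for the asker's `data` (3 of the 47 title columns).
data <- data.frame(
  id      = c(101, 102, 103),
  title_1 = c("H", "", "N"),
  title_2 = c("", "", ""),
  title_3 = c("", NA, ""),
  stringsAsFactors = FALSE
)

# TRUE when every value in the column is "" or NA.
# Assumption: NA counts as empty here.
is_all_empty <- function(x) all(is.na(x) | x == "")

# Keep only the columns that are not entirely empty.
cleaned <- Filter(function(x) !is_all_empty(x), data)
names(cleaned)
```

Because Filter() visits every column, no explicit for loop over title_1 to title_47 is needed; id is kept because its values are never "".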

CodePudding user response：

Using the data from @Quinten (many thanks), here is a solution with sapply:

df <- data.frame(id = c(1,2,3),
                 title_1 = c(2,3,5),
                 title_2 = c("", "", ""))

# drop = FALSE keeps the result a data frame even if only one column remains
df[, !sapply(df, function(x) all(x == "")), drop = FALSE]

  id title_1
1  1       2
2  2       3
3  3       5

CodePudding user response：

Another possible solution, based on dplyr (using @Quinten's data, with thanks):

library(dplyr)

df %>% 
  select(which(colSums(df != "") != 0))

#>   id title_1
#> 1  1       2
#> 2  2       3
#> 3  3       5