I have the following dataset, and I need to remove rows that are entirely empty or that have the same value in every column:
df <- data.frame(players = c("", "Uncredited", "C", "D", "E"),
                 assists = c("", "Uncredited", 4, 4, 3),
                 ratings = c("", "Uncredited", 4, 7, ""))
df
players assists ratings
<chr> <chr> <chr>
Uncredited Uncredited Uncredited
C 4 4
D 4 7
E 3
In this example, the first row is entirely empty and the second row has the same value, Uncredited, in every column. Hence, the first two rows should be removed.
Desired Output
players assists ratings
<chr> <dbl> <chr>
C 4 4
D 4 7
E 3
Any suggestions would be appreciated. Thanks!
CodePudding user response:
You can use apply to loop over the rows and keep only those that have more than one distinct value. Note that a row whose values are all empty also has only one distinct value, so the first condition is a special case of the second.
df[apply(df,
         MARGIN = 1, # rowwise
         FUN = function(x) length(unique(x)) > 1), ]
#> players assists ratings
#> 3 C 4 4
#> 4 D 4 7
#> 5 E 3
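The same check can probably be done without a row-by-row loop. The following is only a sketch, assuming every column is character (as in the question) and there are no NA values; the names m, keep and result are illustrative. It compares each column against the first one and keeps rows where at least one comparison differs.

```r
df <- data.frame(players = c("", "Uncredited", "C", "D", "E"),
                 assists = c("", "Uncredited", 4, 4, 3),
                 ratings = c("", "Uncredited", 4, 7, ""))

m <- as.matrix(df)                 # character matrix, one row per record
keep <- rowSums(m != m[, 1]) > 0   # TRUE if any column differs from the first
result <- df[keep, ]               # drops the all-empty and all-identical rows
```

If the columns had mixed types, as.matrix (and apply, which calls it) would convert everything to character first, so it may be safer to check the column types before relying on either version.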
CodePudding user response:
We could use if_any to keep the rows in which at least one of the other columns differs from players:
library(dplyr)
df %>%
filter(if_any(assists:ratings, ~ .x != players))
Output:
players assists ratings
1 C 4 4
2 D 4 7
3 E 3
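If the column names are not known in advance, a rowwise count of distinct values might be an alternative. This is a sketch rather than a definitive solution: it assumes all columns share one type (character here) so that c_across can combine them, and res is just an illustrative name.

```r
library(dplyr)

df <- data.frame(players = c("", "Uncredited", "C", "D", "E"),
                 assists = c("", "Uncredited", 4, 4, 3),
                 ratings = c("", "Uncredited", 4, 7, ""))

res <- df %>%
  rowwise() %>%                                        # evaluate the filter per row
  filter(n_distinct(c_across(everything())) > 1) %>%   # keep rows with 2+ distinct values
  ungroup()
```

Rowwise operations tend to be slower than the if_any version on large data, so this is mostly useful when the set of columns varies.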
