I want to use mutate to combine two variables, each holding the values 0, 1 and NA, into a new variable containing their sum. However, R either counts NA as 0 or returns only NA. Is there an easy fix that excludes the NA values?
I am using an R textbook that does not address my specific problem.
Code I have tried:
(1)
library(tidyverse)
df <- df |>
  mutate((naked_man = naked_fj + naked_naked), na.rm = TRUE)
This returns NA for every observation.
Data:
| naked_fj | naked_naked | naked_man (problem VAR) |
|---|---|---|
| 0 | 0 | NA |
| 1 | 0 | NA |
| NA | 1 | NA |
| 0 | NA | NA |
CodePudding user response:
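You are just setting up the mutate call incorrectly: the assignment should be written as naked_man = naked_fj + naked_naked, without the extra parentheses or the na.rm argument. You can also use tidyr::drop_na to remove the rows that contain NA values from the data frame.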
library(tidyverse)
df <- data.frame(naked_fj = c(0, 1, NA, 0),
                 naked_naked = c(0, 0, 1, NA))
df <- df |>
  mutate(naked_man = naked_fj + naked_naked) |>
  drop_na()
RESULT:
  naked_fj naked_naked naked_man
1        0           0         0
2        1           0         1
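For context on why the original attempt produced NA: in R, arithmetic with NA propagates the missing value, while sum() can skip it when na.rm = TRUE is set. A minimal sketch using the same example data (only base R is assumed; the names plus_result and sum_result are made up for illustration):

```r
# Example data from the question
df <- data.frame(naked_fj = c(0, 1, NA, 0),
                 naked_naked = c(0, 0, 1, NA))

# `+` propagates NA: any row with a missing value gives NA
plus_result <- df$naked_fj + df$naked_naked
print(plus_result)

# sum() with na.rm = TRUE skips the missing value instead
sum_result <- sum(c(NA, 1), na.rm = TRUE)
print(sum_result)
```

This is also why drop_na() leaves only the two complete rows: rows 3 and 4 contain an NA and are removed. If those rows should be kept, a row-wise sum with na.rm = TRUE is one option.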
CodePudding user response:
To sum across columns while excluding NA values, one way to adapt your code in dplyr is to use rowwise():
df |>
  rowwise() |>
  mutate(naked_man = sum(c(naked_fj, naked_naked), na.rm = TRUE))
#   naked_fj naked_naked naked_man
#      <dbl>       <dbl>     <dbl>
# 1        0           0         0
# 2        1           0         1
# 3       NA           1         1
# 4        0          NA         0
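As a possible alternative to rowwise(), the same sum can be computed in a vectorized way by combining rowSums() with across() inside mutate. This is only a sketch: it assumes dplyr 1.0.0 or later (where across() is available), R 4.1 or later for the |> pipe, and the example data from the question. The name df_sum is made up for illustration.

```r
library(dplyr)

# Example data from the question
df <- data.frame(naked_fj = c(0, 1, NA, 0),
                 naked_naked = c(0, 0, 1, NA))

# across() selects the two columns as a data frame; rowSums() then
# adds them row by row, skipping NA values
df_sum <- df |>
  mutate(naked_man = rowSums(across(c(naked_fj, naked_naked)), na.rm = TRUE))

print(df_sum)
```

One thing to keep in mind with either approach: a row where both values are NA would get a sum of 0 rather than NA.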
If you do not need dplyr, base R may be easier:
df$naked_man <- rowSums(df, na.rm = TRUE)
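Note that rowSums(df, ...) adds up every column in df, so it relies on df holding only the two input columns. If the data frame has other columns, one option is to select the two inputs by name, as in this sketch (the extra id column is hypothetical, added only for illustration):

```r
df <- data.frame(id = 1:4,                       # hypothetical extra column
                 naked_fj = c(0, 1, NA, 0),
                 naked_naked = c(0, 0, 1, NA))

# Sum only the two named columns, skipping NA values
df$naked_man <- rowSums(df[, c("naked_fj", "naked_naked")], na.rm = TRUE)

print(df)
```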
Data:
df <- read.table(text = "naked_fj naked_naked naked_man
0 0 NA
1 0 NA
NA 1 NA
0 NA NA", header = TRUE)
df <- df[,-3]
df[] <- lapply(df[], as.numeric)
