Here is my dataframe:
P1 <- c("UP", "UP", "UP", "UP", "UP", "UP", "UP", NA, NA, NA, NA, NA, NA)
P2 <- c(NA, "UP", "UP", "UP", "UP", "UP", "UP", "UP", "UP", NA, NA, NA, NA)
P3 <- c(NA, NA, "Normal", "Normal", NA, "Normal", "Normal", NA, "Normal", "UP", NA, "UP", NA)
P4 <- c(NA, NA, NA, NA, "Normal", "Normal", "Normal", NA, NA, NA, "UP", "UP", NA)
P5 <- c(NA, NA, NA, "Normal", NA, NA, "Normal", NA, NA, NA, NA, NA, "UP")
df <- data.frame(P1, P2, P3, P4, P5)
I am trying to add a status column based on the values in columns P1 to P4. But for some reason, it returns NA instead of "ANTI" or "UNKNOWN":
df['status'] <- ifelse(df$P1=="UP" | df$P2 == "UP", "PRO",
ifelse(df$P3=="UP" | df$P4 == "UP", "ANTI", "UNKNOWN"))
CodePudding user response:
Please find one possible alternative solution using the library data.table
Reprex
- Code
library(data.table)
setDT(df)[, status := fcase(P1 == "UP" | P2 == "UP", "PRO",
                            P3 == "UP" | P4 == "UP", "ANTI",
                            default = "UNKNOWN")][]
- Output
#> P1 P2 P3 P4 P5 status
#> 1: UP <NA> <NA> <NA> <NA> PRO
#> 2: UP UP <NA> <NA> <NA> PRO
#> 3: UP UP Normal <NA> <NA> PRO
#> 4: UP UP Normal <NA> Normal PRO
#> 5: UP UP <NA> Normal <NA> PRO
#> 6: UP UP Normal Normal <NA> PRO
#> 7: UP UP Normal Normal Normal PRO
#> 8: <NA> UP <NA> <NA> <NA> PRO
#> 9: <NA> UP Normal <NA> <NA> PRO
#> 10: <NA> <NA> UP <NA> <NA> ANTI
#> 11: <NA> <NA> <NA> UP <NA> ANTI
#> 12: <NA> <NA> UP UP <NA> ANTI
#> 13: <NA> <NA> <NA> <NA> UP UNKNOWN
Created on 2022-01-06 by the reprex package (v2.0.1)
CodePudding user response:
With ifelse you have to guard every comparison with !is.na(df$var), which can become quite verbose. If you aren't married to using ifelse, a more parsimonious solution may be case_when from dplyr:
library(dplyr)
df2 <- df %>%
  mutate(status = case_when(
    P1 == "UP" | P2 == "UP" ~ "PRO",
    P3 == "UP" | P4 == "UP" ~ "ANTI",
    TRUE ~ "UNKNOWN"
  ))
# P1 P2 P3 P4 P5 status
# 1 UP <NA> <NA> <NA> <NA> PRO
# 2 UP UP <NA> <NA> <NA> PRO
# 3 UP UP Normal <NA> <NA> PRO
# 4 UP UP Normal <NA> Normal PRO
# 5 UP UP <NA> Normal <NA> PRO
# 6 UP UP Normal Normal <NA> PRO
# 7 UP UP Normal Normal Normal PRO
# 8 <NA> UP <NA> <NA> <NA> PRO
# 9 <NA> UP Normal <NA> <NA> PRO
# 10 <NA> <NA> UP <NA> <NA> ANTI
# 11 <NA> <NA> <NA> UP <NA> ANTI
# 12 <NA> <NA> UP UP <NA> ANTI
# 13 <NA> <NA> <NA> <NA> UP UNKNOWN
If you do need to use ifelse statements:
df3 <- df
df3['status'] <- ifelse((!is.na(df$P1) & df$P1=="UP") | (!is.na(df$P2) & df$P2 == "UP"), "PRO",
ifelse((!is.na(df$P3) & df$P3=="UP") | (!is.na(df$P4) & df$P4 == "UP"), "ANTI", "UNKNOWN"))
# > df3
# P1 P2 P3 P4 P5 status
# 1 UP <NA> <NA> <NA> <NA> PRO
# 2 UP UP <NA> <NA> <NA> PRO
# 3 UP UP Normal <NA> <NA> PRO
# 4 UP UP Normal <NA> Normal PRO
# 5 UP UP <NA> Normal <NA> PRO
# 6 UP UP Normal Normal <NA> PRO
# 7 UP UP Normal Normal Normal PRO
# 8 <NA> UP <NA> <NA> <NA> PRO
# 9 <NA> UP Normal <NA> <NA> PRO
# 10 <NA> <NA> UP <NA> <NA> ANTI
# 11 <NA> <NA> <NA> UP <NA> ANTI
# 12 <NA> <NA> UP UP <NA> ANTI
# 13 <NA> <NA> <NA> <NA> UP UNKNOWN
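The repeated !is.na(...) & ... == "UP" guards above can be folded into a small helper to cut down the verbosity. This is only a sketch: is_up and df4 are hypothetical names introduced here for illustration, not part of the original code, and the data frame is rebuilt so the snippet runs on its own.

```r
# Rebuild the data frame from the question so the sketch is self-contained.
P1 <- c("UP", "UP", "UP", "UP", "UP", "UP", "UP", NA, NA, NA, NA, NA, NA)
P2 <- c(NA, "UP", "UP", "UP", "UP", "UP", "UP", "UP", "UP", NA, NA, NA, NA)
P3 <- c(NA, NA, "Normal", "Normal", NA, "Normal", "Normal", NA, "Normal", "UP", NA, "UP", NA)
P4 <- c(NA, NA, NA, NA, "Normal", "Normal", "Normal", NA, NA, NA, "UP", "UP", NA)
P5 <- c(NA, NA, NA, "Normal", NA, NA, "Normal", NA, NA, NA, NA, NA, "UP")
df <- data.frame(P1, P2, P3, P4, P5)

# Hypothetical helper (an assumption, not from the original answer):
# TRUE only where x is non-missing and equal to "UP".
# For a missing value, FALSE & NA evaluates to FALSE, so NA never propagates.
is_up <- function(x) !is.na(x) & x == "UP"

df4 <- df
df4$status <- ifelse(is_up(df4$P1) | is_up(df4$P2), "PRO",
              ifelse(is_up(df4$P3) | is_up(df4$P4), "ANTI", "UNKNOWN"))
df4
```

The logic is the same as the ifelse version above; the helper just keeps the NA handling in one place, so adding another column to a condition means adding one more is_up() call.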
CodePudding user response:
Alternatives have already been provided in the other answers, so I'll explain why you get those NA values in the status column. It is because of the NA values in your data.
Consider this small example -
x <- c(1, 2, NA, 1)
x == 1
#[1] TRUE FALSE NA TRUE
If you have NA in the data and compare it with ==, the result is NA, which in turn makes ifelse return NA. A simple and quick fix that doesn't require changing much of your code is to replace == with %in%, which returns FALSE for NA values.
x <- c(1, 2, NA, 1)
x %in% 1
#[1] TRUE FALSE FALSE TRUE
Implementing it in your case you get -
df <- transform(df, status = ifelse(P1 %in% "UP" | P2 %in% "UP", "PRO",
                             ifelse(P3 %in% "UP" | P4 %in% "UP", "ANTI", "UNKNOWN")))
# P1 P2 P3 P4 P5 status
#1 UP <NA> <NA> <NA> <NA> PRO
#2 UP UP <NA> <NA> <NA> PRO
#3 UP UP Normal <NA> <NA> PRO
#4 UP UP Normal <NA> Normal PRO
#5 UP UP <NA> Normal <NA> PRO
#6 UP UP Normal Normal <NA> PRO
#7 UP UP Normal Normal Normal PRO
#8 <NA> UP <NA> <NA> <NA> PRO
#9 <NA> UP Normal <NA> <NA> PRO
#10 <NA> <NA> UP <NA> <NA> ANTI
#11 <NA> <NA> <NA> UP <NA> ANTI
#12 <NA> <NA> UP UP <NA> ANTI
#13 <NA> <NA> <NA> <NA> UP UNKNOWN
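It may also help to see why the original == version still produced "PRO" for some rows that contain NA, yet returned NA where "ANTI" or "UNKNOWN" was expected. The sketch below uses only base R; the variable names (cmp, one_up, none_up, res) are made up for illustration.

```r
# Comparing a missing value with == gives NA rather than FALSE.
cmp <- c("UP", NA) == "UP"
cmp                      # TRUE NA

# | follows three-valued logic: a TRUE on either side decides the result,
# but if neither side is TRUE and one is NA, the result stays NA.
one_up  <- TRUE | NA     # like row 1: P1 is "UP", P2 is NA  -> TRUE
none_up <- NA | NA       # like row 10: P1 and P2 are both NA -> NA

# ifelse() returns NA wherever its test is NA, so the nested
# ifelse() result for that row is never used.
res <- ifelse(c(one_up, none_up), "PRO", "would be ANTI or UNKNOWN")
res                      # "PRO" NA
```

So rows where P1 or P2 is "UP" are safe, while rows where both are NA end up with an NA test and therefore an NA status. With %in%, every comparison is TRUE or FALSE, so the test is never NA.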

