ifelse returns NA instead of expected value

Time:01-06

Here is my dataframe:

P1 <- c("UP", "UP", "UP", "UP", "UP", "UP", "UP", NA, NA, NA, NA, NA, NA)
P2 <- c(NA, "UP", "UP", "UP", "UP", "UP", "UP", "UP", "UP", NA, NA, NA, NA)
P3 <- c(NA, NA, "Normal", "Normal", NA, "Normal", "Normal", NA, "Normal", "UP", NA, "UP", NA)
P4 <- c(NA, NA, NA, NA, "Normal", "Normal", "Normal", NA, NA, NA, "UP", "UP", NA)
P5 <- c(NA, NA, NA, "Normal", NA, NA, "Normal", NA, NA, NA, NA, NA, "UP")
df <- data.frame(P1, P2, P3, P4, P5)

I am trying to add a status column based on the values in columns P1 to P4, but for some reason it returns NA instead of "ANTI" or "UNKNOWN":

df['status'] <- ifelse(df$P1=="UP" | df$P2 == "UP", "PRO", 
                          ifelse(df$P3=="UP" | df$P4 == "UP", "ANTI", "UNKNOWN"))


CodePudding user response:

Here is one possible alternative solution using the data.table package:

Reprex

  • Code
library(data.table)

setDT(df)[, status := fcase(P1 =="UP" | P2 == "UP", "PRO",
                            P3 =="UP" | P4 == "UP", "ANTI",
                            default = "UNKNOWN")][]
  • Output
#>       P1   P2     P3     P4     P5  status
#>  1:   UP <NA>   <NA>   <NA>   <NA>     PRO
#>  2:   UP   UP   <NA>   <NA>   <NA>     PRO
#>  3:   UP   UP Normal   <NA>   <NA>     PRO
#>  4:   UP   UP Normal   <NA> Normal     PRO
#>  5:   UP   UP   <NA> Normal   <NA>     PRO
#>  6:   UP   UP Normal Normal   <NA>     PRO
#>  7:   UP   UP Normal Normal Normal     PRO
#>  8: <NA>   UP   <NA>   <NA>   <NA>     PRO
#>  9: <NA>   UP Normal   <NA>   <NA>     PRO
#> 10: <NA> <NA>     UP   <NA>   <NA>    ANTI
#> 11: <NA> <NA>   <NA>     UP   <NA>    ANTI
#> 12: <NA> <NA>     UP     UP   <NA>    ANTI
#> 13: <NA> <NA>   <NA>   <NA>     UP UNKNOWN

Created on 2022-01-06 by the reprex package (v2.0.1)

CodePudding user response:

With ifelse you need to guard every comparison with !is.na(df$var), which quickly becomes verbose. If you aren't married to ifelse, a more parsimonious solution is case_when, which treats an NA condition as not matched:

library(dplyr)
df2 <- df %>%
  mutate(status = case_when(
    P1=="UP" | P2=="UP" ~ "PRO",
    P3=="UP" | P4=="UP" ~ "ANTI",
    TRUE ~ "UNKNOWN"
  ))

#      P1   P2     P3     P4     P5  status
# 1    UP <NA>   <NA>   <NA>   <NA>     PRO
# 2    UP   UP   <NA>   <NA>   <NA>     PRO
# 3    UP   UP Normal   <NA>   <NA>     PRO
# 4    UP   UP Normal   <NA> Normal     PRO
# 5    UP   UP   <NA> Normal   <NA>     PRO
# 6    UP   UP Normal Normal   <NA>     PRO
# 7    UP   UP Normal Normal Normal     PRO
# 8  <NA>   UP   <NA>   <NA>   <NA>     PRO
# 9  <NA>   UP Normal   <NA>   <NA>     PRO
# 10 <NA> <NA>     UP   <NA>   <NA>    ANTI
# 11 <NA> <NA>   <NA>     UP   <NA>    ANTI
# 12 <NA> <NA>     UP     UP   <NA>    ANTI
# 13 <NA> <NA>   <NA>   <NA>     UP UNKNOWN

If you do need to use ifelse statements:

df3 <- df
df3['status'] <- ifelse((!is.na(df$P1) & df$P1=="UP") | (!is.na(df$P2) & df$P2 == "UP"), "PRO", 
                       ifelse((!is.na(df$P3) & df$P3=="UP") | (!is.na(df$P4) & df$P4 == "UP"), "ANTI", "UNKNOWN"))

# > df3
#      P1   P2     P3     P4     P5  status
# 1    UP <NA>   <NA>   <NA>   <NA>     PRO
# 2    UP   UP   <NA>   <NA>   <NA>     PRO
# 3    UP   UP Normal   <NA>   <NA>     PRO
# 4    UP   UP Normal   <NA> Normal     PRO
# 5    UP   UP   <NA> Normal   <NA>     PRO
# 6    UP   UP Normal Normal   <NA>     PRO
# 7    UP   UP Normal Normal Normal     PRO
# 8  <NA>   UP   <NA>   <NA>   <NA>     PRO
# 9  <NA>   UP Normal   <NA>   <NA>     PRO
# 10 <NA> <NA>     UP   <NA>   <NA>    ANTI
# 11 <NA> <NA>   <NA>     UP   <NA>    ANTI
# 12 <NA> <NA>     UP     UP   <NA>    ANTI
# 13 <NA> <NA>   <NA>   <NA>     UP UNKNOWN
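
If the repeated !is.na(...) checks bother you, one option is to wrap the guard in a small helper. This is only a sketch: is_up is a name I made up for illustration, not a function from base R or any package, and demo is a cut-down stand-in for df.

```r
# Hypothetical helper: TRUE only where x is non-missing and equals "UP"
is_up <- function(x) !is.na(x) & x == "UP"

# Small stand-in for df with one row per interesting case
demo <- data.frame(
  P1 = c("UP", NA, NA, NA),
  P2 = c(NA, "UP", NA, NA),
  P3 = c(NA, NA, "UP", NA),
  P4 = c(NA, NA, NA, "Normal")
)

# Same nested ifelse as above, with the NA guard hidden in is_up()
demo$status <- ifelse(is_up(demo$P1) | is_up(demo$P2), "PRO",
               ifelse(is_up(demo$P3) | is_up(demo$P4), "ANTI", "UNKNOWN"))
```

Because is_up never returns NA, neither ifelse test can be NA, so every row gets one of the three labels.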

CodePudding user response:

Alternatives have already been provided in the other answers, so I'll explain why you get those NA values in the status column: it is because of the NA values in your data.

Consider this small example -

x <- c(1, 2, NA, 1)
x == 1
#[1]  TRUE FALSE    NA  TRUE

If you have NA in the data and compare it with ==, the result is NA, and ifelse in turn returns NA wherever its test is NA. A simple and quick fix that leaves most of your code unchanged is to replace == with %in%, which returns FALSE for NA values.

x <- c(1, 2, NA, 1)
x %in% 1
#[1]  TRUE FALSE FALSE  TRUE

Implementing it in your case you get -

df <- transform(df, status = ifelse(P1 %in% "UP" | P2 %in% "UP", "PRO", 
                          ifelse(P3 %in% "UP" | P4 %in% "UP", "ANTI", "UNKNOWN")))

#     P1   P2     P3     P4     P5  status
#1    UP <NA>   <NA>   <NA>   <NA>     PRO
#2    UP   UP   <NA>   <NA>   <NA>     PRO
#3    UP   UP Normal   <NA>   <NA>     PRO
#4    UP   UP Normal   <NA> Normal     PRO
#5    UP   UP   <NA> Normal   <NA>     PRO
#6    UP   UP Normal Normal   <NA>     PRO
#7    UP   UP Normal Normal Normal     PRO
#8  <NA>   UP   <NA>   <NA>   <NA>     PRO
#9  <NA>   UP Normal   <NA>   <NA>     PRO
#10 <NA> <NA>     UP   <NA>   <NA>    ANTI
#11 <NA> <NA>   <NA>     UP   <NA>    ANTI
#12 <NA> <NA>     UP     UP   <NA>    ANTI
#13 <NA> <NA>   <NA>   <NA>     UP UNKNOWN
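
As a side note, the original call returned "PRO" for rows 8 and 9 even though P1 is NA there, while rows 10 to 13 came back as NA. That follows from how | combines with NA. A minimal base R sketch (the variable names are only for illustration):

```r
# R uses three-valued logic: a known TRUE decides an OR, a FALSE does not
na_or_true  <- NA | TRUE
na_or_false <- NA | FALSE

# Two illustrative rows: (P1 = NA, P2 = "UP") and (P1 = NA, P2 = NA)
p1 <- c(NA, NA)
p2 <- c("UP", NA)
test <- p1 == "UP" | p2 == "UP"

# ifelse() returns NA wherever its test is NA, so whatever sits in
# the "no" argument (e.g. a nested ifelse) is never used for that row
status <- ifelse(test, "PRO", "UNKNOWN")
```

Here na_or_true is TRUE and na_or_false is NA, so test is TRUE for the first row and NA for the second, and status is "PRO" followed by NA. With %in% the comparisons are never NA, so this situation cannot arise.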