Assign values based on values in multiple rows

Time:01-25

I have a dataframe like this,

DATA <- data.frame(
    CARS = c(NA, 66, NA, NA, 74, NA, NA, NA),
    EYE_SIGHT = c("GOOD", "EXCELLENT", "POOR", "POOR", "GOOD", "POOR", "EXCELLENT", "GOOD"))

and I want to create a new column ("OUTCOME") with the following rule: if EYE_SIGHT is "GOOD" or "EXCELLENT", then OUTCOME should be 1 wherever there is a value in CARS. However, if the corresponding value in CARS is NA, then OUTCOME should be NA.

Also, if EYE_SIGHT is "POOR" when CARS is NA, then OUTCOME should be 0.

So I would end up with something like this:

   CARS   EYE_SIGHT  OUTCOME
1   NA      GOOD      NA
2   66 EXCELLENT       1
3   NA      POOR       0
4   NA      POOR       0
5   74      GOOD       1
6   NA      POOR       0
7   NA EXCELLENT      NA
8   NA      GOOD      NA

I am not sure how to do this. Any ideas would be appreciated.

CodePudding user response:

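case_when() from the {dplyr} package is a good way to handle multiple if/else conditions. The conditions are checked in order and the first one that matches sets the value, so the final TRUE ~ NA_real_ line acts as the fallback for every row not matched earlier.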

library(dplyr)

DATA %>%
  mutate(OUTCOME = case_when(
    EYE_SIGHT == "POOR" ~ 0,                                   # POOR always gives 0
    EYE_SIGHT %in% c("EXCELLENT", "GOOD") & !is.na(CARS) ~ 1,  # good eyesight and CARS present
    TRUE ~ NA_real_                                            # everything else stays NA
  ))

  CARS EYE_SIGHT OUTCOME
1   NA      GOOD      NA
2   66 EXCELLENT       1
3   NA      POOR       0
4   NA      POOR       0
5   74      GOOD       1
6   NA      POOR       0
7   NA EXCELLENT      NA
8   NA      GOOD      NA
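
If you would rather not load dplyr, the same rules can be sketched in base R with logical indexing. This is only a rough alternative, not something from the original answer. Like the case_when() version, it assumes every "POOR" row gets 0 regardless of CARS (the question only spells out the case where CARS is NA), and it assumes EYE_SIGHT itself has no missing values.

```r
# Rebuild the example data from the question.
DATA <- data.frame(
  CARS = c(NA, 66, NA, NA, 74, NA, NA, NA),
  EYE_SIGHT = c("GOOD", "EXCELLENT", "POOR", "POOR", "GOOD", "POOR", "EXCELLENT", "GOOD")
)

# Start with NA everywhere, then overwrite the rows that match each rule.
DATA$OUTCOME <- NA_real_

# Assumption: POOR eyesight gives 0 whether or not CARS is missing.
DATA$OUTCOME[DATA$EYE_SIGHT == "POOR"] <- 0

# GOOD or EXCELLENT eyesight gives 1 only where CARS has a value.
DATA$OUTCOME[DATA$EYE_SIGHT %in% c("GOOD", "EXCELLENT") & !is.na(DATA$CARS)] <- 1

DATA
```

Rows that match neither rule (GOOD or EXCELLENT with a missing CARS) keep the initial NA, which plays the same role as the TRUE ~ NA_real_ fallback above.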