I have a table in R like this:
x Y
1 2 1
2 1 1
3 NA 1
4 2 NA
5 1 2
6 2 2
7 1 1
and what I'm hoping to do is make a new column called xy which bases on if there is a 1 exist in either x or y.
For example, if x is 1 and y is 2 then the xy should be 1 ; if x is NAand y is 1 then the xyshould be 1. If both x and y is 2 then xyshould be 2.
The priority of the categorical variables 1, 2 and NA is 1>2>NA.
In short what my desired output looks like this:
x Y XY
1 2 1 1
2 1 1 1
3 NA 1 1
4 2 NA 2
5 NA NA NA
6 2 2 2
7 1 1 1
I'm new to R and trying to trim my data. Thank you for your help! I'm really appreciated:)
CodePudding user response:
Try this
library(dplyr)
df |> rowwise() |>
mutate(z1 = coalesce(c_across(x) , 0) , z2 = coalesce(c_across(Y) , 0)) |>
mutate(XY = case_when(any(c_across(z1:z2) == 1) ~ 1 , any(c_across(z1:z2) == 2) ~ 2)) |>
select(-z1 , -z2) |> ungroup() -> ans
- output
# A tibble: 7 × 3
x Y XY
<int> <int> <dbl>
1 2 1 1
2 1 1 1
3 NA 1 1
4 2 NA 2
5 NA NA NA
6 2 2 2
7 1 1 1
- data
df <- structure(list(x = c(2L, 1L, NA, 2L, NA, 2L, 1L), Y = c(1L, 1L,
1L, NA, NA, 2L, 1L)), row.names = c("1", "2", "3", "4", "5",
"6", "7"), class = "data.frame")
CodePudding user response:
You could do it with a case_when (remembering that it evaluates from the bottom and up):
library(dplyr)
df <-
df |>
mutate(XY = case_when(x == 1 | Y == 1 ~ 1,
x == 2 | Y == 2 ~ 2,
TRUE ~ NA_real_))
Or apply the same logic using base functionalities:
df$XY <- NA
df$XY[df$x == 2 | df$Y == 2] <- 2
df$XY[df$x == 1 | df$Y == 1] <- 1
Output:
x Y XY
<dbl> <dbl> <dbl>
1 2 1 1
2 1 1 1
3 NA 1 1
4 2 NA 2
5 NA NA NA
6 2 2 2
7 1 1 1
Data:
library(readr)
df <- read_table("
x Y
2 1
1 1
NA 1
2 NA
NA NA
2 2
1 1")
