Creating a column based off of multiple columns

I have a data set with multiple columns. A sample of the first three columns is as follows:

df$a1 <- c("00845", "486", "49392", "04186", "5990")

df$a2 <- c("34580", "2761", "27800", "4439", "5849")

df$a3 <- c("0340", "49392", "78831", "70714", "486")

I want to create a column df$b which gives me a "1" if any of the columns a1-a15 contain the string "2761".

a1	a2	a3	...	a15	b
00845	34580	0340	...	4280	0
486	2761	49392	...	25000	1
49392	27800	78831	...	7955	0
04186	4439	70714	...	27800	0
5990	5849	486	...	4400	0

So far, I've developed the following code:

df %>%
  mutate(d = c(0, 1)[(a1:a15 %in% c("2761")) + 1])

but it doesn't work. Any help would be greatly appreciated!

CodePudding user response：

We can use if_any() to check whether any of the columns 'a1' to 'a15' in a row holds the string "2761". if_any() returns a logical vector, which is coerced to binary with + or as.integer() to create the new column 'd'.

library(dplyr)
df <- df %>%
     mutate(d = +(if_any(matches("^a\\d+$"), ~ . %in% "2761")))
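
For anyone who wants to try this out, here is a minimal reproducible sketch. It assumes only the three sample columns shown in the question (a1 to a3, not the full a1 to a15), and the data frame is rebuilt here just for illustration.

```r
library(dplyr)

# Sample data from the question, limited to the three columns that were shown
df <- data.frame(
  a1 = c("00845", "486", "49392", "04186", "5990"),
  a2 = c("34580", "2761", "27800", "4439", "5849"),
  a3 = c("0340", "49392", "78831", "70714", "486"),
  stringsAsFactors = FALSE
)

# matches("^a\\d+$") selects every column named "a" followed by digits;
# if_any() is TRUE for a row when at least one selected column equals "2761";
# the unary + turns TRUE/FALSE into 1/0
out <- df %>%
  mutate(d = +(if_any(matches("^a\\d+$"), ~ .x %in% "2761")))

print(out)
```

With this sample, only the second row has "2761" (in a2), so it should be the only row flagged.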

CodePudding user response：

You may use dplyr's rowwise() and c_across() as follows:

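library(dplyr)

df |>
  rowwise() |>
  mutate(
    # c_across() collects a1..a15 for the current row; any() flags a match
    b = grepl(pattern = "2761", x = c_across(a1:a15)) |> any() |> as.numeric()
  ) |>
  ungroup()  # drop the rowwise grouping so later verbs see the whole table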
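
One thing to keep in mind with the grepl() approach: it looks for "2761" anywhere inside each value, while %in% or == only accepts an exact match. The short sketch below shows the difference. The vector `codes` is made up purely for illustration and is not from the question's data.

```r
# Hypothetical values, chosen to show substring vs. exact matching
codes <- c("2761", "32761", "27610", "486")

# grepl() matches "2761" as a substring, so longer codes containing it also hit
substring_hit <- grepl("2761", codes, fixed = TRUE)

# == (or %in%) only matches values that are exactly "2761"
exact_hit <- codes == "2761"

# Anchoring the regular expression makes grepl() behave like an exact match
anchored_hit <- grepl("^2761$", codes)

print(data.frame(codes, substring_hit, exact_hit, anchored_hit))
```

If the columns hold whole codes and only the exact code should count, the anchored pattern or an equality test is probably the safer choice.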