Creating a new column based on TRUE/FALSE of several columns in R

Time:01-24

I am new to R and this is my first post. Please help me out.

I have a dataset that has 10 columns that look like this:

Red Blue Green
True False False
True False False
False True False
False False True

I want one column that should look like:

Color
Red
Red
Blue
Green

Each row should get the name of the column that holds 'True'. There is only one 'True' across the columns in any given row.

I tried: df <- df %>% add_column(color = ifelse(.$col_name == TRUE, colnames(df)[1], ""))

Red   Blue  Green col_1 col_2 col_3
True  False False Red
True  False False Red
False True  False       Blue
False False True              Green

This creates 10 extra columns, which I hoped to merge later, but I am stuck. Can anyone please help?

Thank you!

CodePudding user response:

If you have a logical dataframe (columns of TRUE/FALSE values):

cbind(df, col = names(df)[max.col(df)])
    Red  Blue Green   col
1  TRUE FALSE FALSE   Red
2  TRUE FALSE FALSE   Red
3 FALSE  TRUE FALSE  Blue
4 FALSE FALSE  TRUE Green

On the other hand, if your data holds the strings "True"/"False" as presented above, convert them to logical first:

df1 <- df  # work on a copy so the original dataframe is preserved
df1[] <- as.logical(as.matrix(df1))  # "True"/"False" strings become TRUE/FALSE

cbind(df1, color = names(df)[max.col(df1)])

    Red  Blue Green color
1  TRUE FALSE FALSE   Red
2  TRUE FALSE FALSE   Red
3 FALSE  TRUE FALSE  Blue
4 FALSE FALSE  TRUE Green

If copying the data is expensive, do the conversion inline instead:

cbind(df, col = names(df)[max.col(array(as.logical(unlist(df)), dim(df)))])
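One caveat with max.col: a row with no TRUE at all still gets a column index, because all of its values tie, and the default tie-breaking is random. Below is a minimal sketch of a guard, assuming such rows should end up as NA. The dataframe `flags` and its extra all-FALSE row are made up for illustration and are not part of the question's data.

```r
# Hypothetical example data: the last row has no TRUE at all.
flags <- data.frame(
  Red   = c(TRUE, TRUE, FALSE, FALSE, FALSE),
  Blue  = c(FALSE, FALSE, TRUE, FALSE, FALSE),
  Green = c(FALSE, FALSE, FALSE, TRUE, FALSE)
)

m <- as.matrix(flags)                      # logical matrix, one column per color
idx <- max.col(m, ties.method = "first")   # deterministic tie-breaking
idx[rowSums(m) == 0] <- NA                 # rows without any TRUE get no color

flags$color <- names(flags)[idx]           # NA index yields NA color
print(flags)
```

Setting ties.method = "first" also makes the result reproducible if a row ever contains more than one TRUE, although the question states that does not happen.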

CodePudding user response:

Here are tidyverse approaches.

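library(tidyverse)  # loads dplyr, tidyr and tibble
df <- tibble(Red = c(TRUE, TRUE, FALSE, FALSE), Blue = c(FALSE, FALSE, TRUE, FALSE), Green = c(FALSE, FALSE, FALSE, TRUE))

Approach 1: case_when, a vectorised multiple if-else.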

df %>%
  mutate(color = case_when(Red ~ "Red",
                           Blue ~ "Blue",
                           Green ~ "Green"))

Swap mutate with transmute to return only the new color column.

Approach 2: Pivot to long format and keep the column names whose value is TRUE. This returns only the color column.

df %>%
  pivot_longer(everything(), names_to = "color") %>%
  filter(value) %>%
  select(color)
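Approach 2 drops the original columns and relies on every row having exactly one TRUE. A possible variant, sketched under the assumption that you want the color attached back to the original rows, is shown here. The `row_id` helper column is introduced only for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

df <- tibble(Red   = c(TRUE, TRUE, FALSE, FALSE),
             Blue  = c(FALSE, FALSE, TRUE, FALSE),
             Green = c(FALSE, FALSE, FALSE, TRUE))

# Tag each row with a hypothetical helper id so the result can be joined back.
df_id <- df %>% mutate(row_id = row_number())

colors <- df_id %>%
  pivot_longer(-row_id, names_to = "color") %>%  # one row per (row_id, column)
  filter(value) %>%                              # keep only the TRUE cells
  select(row_id, color)

# Rows without any TRUE would get NA for color after the left join.
result <- df_id %>%
  left_join(colors, by = "row_id") %>%
  select(-row_id)

print(result)
```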

Approach 3: Subset the column names by the position of the TRUE in each row.

df %>%
  mutate(color = names(.)[apply(., 1, which)])

CodePudding user response:

A base R approach, using ifelse:

df$col_1 <- ifelse(df$Red, "Red", "")
df$col_2 <- ifelse(df$Blue, "Blue", "")
df$col_3 <- ifelse(df$Green, "Green", "")
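The three helper columns still need to be merged into one, which is where the question got stuck. Since exactly one of them is non-empty in each row, pasting them together is one way to do it. Here is a minimal sketch, assuming logical columns and the example data from the question. The final column name `color` is my own choice.

```r
df <- data.frame(
  Red   = c(TRUE, TRUE, FALSE, FALSE),
  Blue  = c(FALSE, FALSE, TRUE, FALSE),
  Green = c(FALSE, FALSE, FALSE, TRUE)
)

# One helper column per color: the color name where TRUE, empty string otherwise.
df$col_1 <- ifelse(df$Red, "Red", "")
df$col_2 <- ifelse(df$Blue, "Blue", "")
df$col_3 <- ifelse(df$Green, "Green", "")

# Only one helper is non-empty per row, so concatenating them yields the color.
df$color <- paste0(df$col_1, df$col_2, df$col_3)

# Drop the helper columns once they have been merged.
df[c("col_1", "col_2", "col_3")] <- NULL
print(df)
```

With 10 columns this gets repetitive, so the max.col or tidyverse answers above are probably less typing.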