I am new to R and this is my first post. Please help me out.
I have a dataset with 10 columns that look like this (three shown):
| Red | Blue | Green |
|---|---|---|
| True | False | False |
| True | False | False |
| False | True | False |
| False | False | True |
I want one column that should look like:
| Color |
|---|
| Red |
| Red |
| Blue |
| Green |
Each row's color should be the name of the column that holds 'True'. There is only one 'True' across the columns in any given row.
I tried: df <- df %>% add_column(color = ifelse(.$col_name == TRUE, colnames(df)[1], ""))
| Red | Blue | Green | col_1 | col_2 | col_3 |
|---|---|---|---|---|---|
| True | False | False | Red | | |
| True | False | False | Red | | |
| False | True | False | Blue | | |
| False | False | True | Green | | |
This creates 10 extra columns, which I hoped to merge later, but I am stuck. Can anyone please help?
Thank you!
CodePudding user response:
If you have a logical data frame:
cbind(df, col = names(df)[max.col(df)])
Red Blue Green col
1 True False False Red
2 True False False Red
3 False True False Blue
4 False False True Green
On the other hand, if your data is stored as text, as presented above, convert it to logical first:
df1 <- df  # work on a copy so the original data frame is preserved
df1[] <- as.logical(as.matrix(df1))  # "True"/"False" strings become TRUE/FALSE
cbind(df1, color = names(df)[max.col(df1)])
Red Blue Green color
1 TRUE FALSE FALSE Red
2 TRUE FALSE FALSE Red
3 FALSE TRUE FALSE Blue
4 FALSE FALSE TRUE Green
If copying the data frame is expensive, do the conversion inline:
cbind(df, col = names(df)[max.col(array(as.logical(unlist(df)), dim(df)))])
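For anyone who wants to check this end to end, here is a minimal self-contained sketch of the same idea. It assumes the columns are stored as the character strings "True"/"False" exactly as displayed in the question; the small `df` is rebuilt here only for illustration, and `ties.method = "first"` is an addition of mine so the result is deterministic even if a row had more than one TRUE.

```r
# Illustrative data, assumed to be character "True"/"False" as shown in the question
df <- data.frame(
  Red   = c("True", "True", "False", "False"),
  Blue  = c("False", "False", "True", "False"),
  Green = c("False", "False", "False", "True"),
  stringsAsFactors = FALSE
)

# Convert every cell to logical while keeping the row/column layout
m <- array(as.logical(unlist(df)), dim(df))

# max.col() gives, for each row, the index of the largest value (TRUE counts as 1)
df$color <- names(df)[max.col(m, ties.method = "first")]
df$color
```

Note that a row with no TRUE at all would still be assigned a column (the first one, with this tie rule), so this relies on the question's guarantee of exactly one TRUE per row.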
CodePudding user response:
Here are tidyverse approaches.
library(tidyverse)
df <- tibble(Red = c(TRUE, TRUE, FALSE, FALSE), Blue = c(FALSE, FALSE, TRUE, FALSE), Green = c(FALSE, FALSE, FALSE, TRUE))
Approach 1: case_when, a vectorised multiple if-else.
df %>%
  mutate(color = case_when(Red ~ "Red",
                           Blue ~ "Blue",
                           Green ~ "Green"))
Swap mutate with transmute to only return the new color column.
Approach 2: pivot to long format, so the column names become values, and keep only the TRUE cells.
df %>%
  pivot_longer(everything(), names_to = "color") %>%
  filter(value) %>%
  select(color)
Approach 3: index the column names by the position of the TRUE in each row.
df %>%
  mutate(color = names(.)[apply(., 1, which)])
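Approach 3 only works when every row has exactly one TRUE, since otherwise `apply(..., which)` no longer returns one index per row. A small sketch of a guard for that assumption, written in base R without the pipe; the `n_true` and `color` names are just illustrative:

```r
# Same toy data as above, as a plain data frame
df <- data.frame(
  Red   = c(TRUE, TRUE, FALSE, FALSE),
  Blue  = c(FALSE, FALSE, TRUE, FALSE),
  Green = c(FALSE, FALSE, FALSE, TRUE)
)

# Count the TRUE values in each row; the lookup below needs exactly one per row
n_true <- rowSums(df)
stopifnot(all(n_true == 1))

# Same lookup as Approach 3: position of the TRUE in each row, mapped to a column name
color <- names(df)[apply(df, 1, which)]
color
```

If the check fails on real data, the offending rows can be inspected with `which(n_true != 1)` before deciding how to handle them.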
CodePudding user response:
A base R approach, using ifelse:
df$col_1 <- ifelse(df$Red, "Red", "")
df$col_2 <- ifelse(df$Blue, "Blue", "")
df$col_3 <- ifelse(df$Green, "Green", "")
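These three lines leave the result spread over helper columns. One possible way to finish, assuming exactly one TRUE per row as the question states, is to paste the helper columns together and then drop them. The sketch below rebuilds a small `df` for illustration and is not part of the original answer:

```r
# Toy data matching the question
df <- data.frame(
  Red   = c(TRUE, TRUE, FALSE, FALSE),
  Blue  = c(FALSE, FALSE, TRUE, FALSE),
  Green = c(FALSE, FALSE, FALSE, TRUE)
)

# Helper columns from the answer above
df$col_1 <- ifelse(df$Red, "Red", "")
df$col_2 <- ifelse(df$Blue, "Blue", "")
df$col_3 <- ifelse(df$Green, "Green", "")

# Only one helper is non-empty per row, so concatenating them yields the color
df$color <- paste0(df$col_1, df$col_2, df$col_3)

# Remove the helper columns once they have been merged
df <- df[, setdiff(names(df), c("col_1", "col_2", "col_3"))]
df$color
```

With 10 columns this gets repetitive, so the `max.col` or `apply` approaches above are likely easier to maintain.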
