How can I add column based on elements of three columns?

Time:02-05

I have a small data.frame in which I am trying to compare the columns row-wise. A sample of it appears in the examples below.

The three columns Coder1, Coder2 and Coder3 contain each coder's judgement of which syllable is stressed. The column concat is a concatenated version of their answers.

I am trying to add two new columns:

  1. This column should be named StressAgreeLax. For this I need code that checks whether Coder1 is identical to Coder2, Coder1 to Coder3, or Coder2 to Coder3. If any pair agrees, it should add the agreed syllable; if not, it should add "DISAGREE".

For example (C1, C2 and C3 stand for Coder1, Coder2 and Coder3):

  C1     C2      C3         StressAgreeLax
  fin    fin     pen        fin 
  fin    pen     other      DISAGREE
  pen    fin     fin        fin
  ante   pen     ante       ante
  fin    fin     fin        fin

  2. The second column should be easier. It should be named StressAgreeStrict. Here all three coders have to agree on the stressed syllable. Again, the agreed stress should go in the new column; if they do not all agree, it should be "DISAGREE".

For example, with both new columns:

  C1     C2      C3         StressAgreeLax   StressAgreeStrict
  fin    fin     pen        fin                  DISAGREE
  fin    pen     other      DISAGREE             DISAGREE
  pen    fin     fin        fin                  DISAGREE
  ante   pen     ante       ante                 DISAGREE
  fin    fin     fin        fin                  fin

This is beyond my R knowledge. I tried various ifelse() combinations and case_when(), as well as match(), but nothing worked.

CodePudding user response:

One solution with ifelse could be:

library(dplyr)

data=data.frame(C1=c("fin","fin","pen","ante","fin"),
                C2=c("fin","pen","fin","pen","fin"),
                C3=c("pen","other","fin","ante","fin"))

data = data %>% 
  mutate("StressAgreeLax"=ifelse(C1==C2,C1,ifelse(C1==C3,C1,ifelse(C2==C3,C2,"DISAGREE")))) %>% 
  mutate("StressAgreeStrict"=ifelse(C1==C2 & C2==C3, C1,"DISAGREE"))

data

    C1  C2    C3 StressAgreeLax StressAgreeStrict
1  fin fin   pen            fin          DISAGREE
2  fin pen other       DISAGREE          DISAGREE
3  pen fin   fin            fin          DISAGREE
4 ante pen  ante           ante          DISAGREE
5  fin fin   fin            fin               fin
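
Since the question mentions case_when(), the same logic can also be written that way. This is only a sketch under the assumption that the three columns are character vectors named C1, C2 and C3 as above; the name result is made up for illustration. One difference worth knowing: case_when() treats an NA condition as FALSE, so a row with a missing code falls through to "DISAGREE", whereas ifelse() would return NA for it.

```r
library(dplyr)

data <- data.frame(
  C1 = c("fin", "fin", "pen", "ante", "fin"),
  C2 = c("fin", "pen", "fin", "pen", "fin"),
  C3 = c("pen", "other", "fin", "ante", "fin"),
  stringsAsFactors = FALSE
)

# 'result' is a hypothetical name used only in this sketch
result <- data %>%
  mutate(
    # lax: any two coders agree; conditions are checked top to bottom
    StressAgreeLax = case_when(
      C1 == C2 | C1 == C3 ~ C1,
      C2 == C3            ~ C2,
      TRUE                ~ "DISAGREE"
    ),
    # strict: all three coders agree
    StressAgreeStrict = case_when(
      C1 == C2 & C2 == C3 ~ C1,
      TRUE                ~ "DISAGREE"
    )
  )

result
```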

CodePudding user response:

Here are the data with a couple of functions that should help with the matching.

These will only work with the three-coder setup. If the number of coders isn't fixed, you'd need a few more changes (and possibly a new method of determining agreement), but this should be fine for what you provided.

Additional notes in the inline comments:

data <- structure(
  list(
    C1 = c("fin", "fin", "pen", "ante", "fin"),
    C2 = c("fin", "pen", "fin", "pen", "fin"),
    C3 = c("pen", "other", "fin", "ante", "fin"),
    StressAgreeLax = c("fin", "DISAGREE", "fin", "ante","fin")
  ),
  row.names = c(NA, -5L),
  class = "data.frame"
)

stress_agree <- function(x, y, z) {
  # check for equal lengths
  n <- length(x)
  stopifnot(n == length(y), n == length(z))
            
  # set up output to default to disagree
  out <- rep("DISAGREE", n)
  
  # find the possible matches (positions)
  agree_x <- which(x == y | x == z)
  agree_y <- which(z == y)
  
  # replace output with those values
  out[agree_x] <- x[agree_x]
  out[agree_y] <- y[agree_y]
  
  # output
  out
}

stress_agree_strict <- function(x, y, z) {
  # check for equal lengths
  n <- length(x)
  stopifnot(n == length(y), n == length(z))
  
  # set up output
  out <- rep("DISAGREE", n)
  
  # find agreements
  agree <- x == y & x == z
  
  # replace output with value
  out[agree] <- x[agree]
  
  # return
  out
}

data$StressAgreeLax_check <- with(data, stress_agree(C1, C2, C3))
data$StressAgreeStrict <- with(data, stress_agree_strict(C1, C2, C3))
data
#>     C1  C2    C3 StressAgreeLax StressAgreeLax_check StressAgreeStrict
#> 1  fin fin   pen            fin                  fin          DISAGREE
#> 2  fin pen other       DISAGREE             DISAGREE          DISAGREE
#> 3  pen fin   fin            fin                  fin          DISAGREE
#> 4 ante pen  ante           ante                 ante          DISAGREE
#> 5  fin fin   fin            fin                  fin               fin

Created on 2022-02-04 by the reprex package (v2.0.1)
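
If the number of coders is not fixed, one possible way to generalise the two functions is to count how often each label occurs in a row and require a minimum number of votes. The following is a rough base-R sketch, not part of the answer above: the function name stress_agree_n and the argument min_agree are invented for illustration, ties between labels are treated as disagreement, and it assumes there are no missing values.

```r
# Hypothetical helper: agreement across any number of coder columns.
# codes     : data frame or matrix, one row per item, one column per coder
# min_agree : minimum number of coders that must share the winning label
stress_agree_n <- function(codes, min_agree = 2L) {
  codes <- as.matrix(codes)
  out <- apply(codes, 1, function(row) {
    counts <- table(row)                   # how often each label occurs
    top <- counts[counts == max(counts)]   # label(s) with the highest count
    # agree only if a single label wins and it has enough votes
    if (length(top) == 1L && top[[1]] >= min_agree) names(top) else "DISAGREE"
  })
  unname(out)
}

data <- data.frame(
  C1 = c("fin", "fin", "pen", "ante", "fin"),
  C2 = c("fin", "pen", "fin", "pen", "fin"),
  C3 = c("pen", "other", "fin", "ante", "fin"),
  stringsAsFactors = FALSE
)

coders <- c("C1", "C2", "C3")
lax    <- stress_agree_n(data[coders], min_agree = 2L)  # any two agree
strict <- stress_agree_n(data[coders], min_agree = 3L)  # all three agree
```

With three coders, min_agree = 2L corresponds to the lax rule and min_agree = 3L to the strict rule.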

Tip: run dput() on your data and paste the R code it returns into your question; that makes it easier for us to work with your data:

data <- structure(
  list(
    C1 = c("fin", "fin", "pen", "ante", "fin"),
    C2 = c("fin", "pen", "fin", "pen", "fin"),
    C3 = c("pen", "other", "fin", "ante", "fin"),
    StressAgreeLax = c("fin", "DISAGREE", "fin", "ante","fin")
  ),
  row.names = c(NA, -5L),
  class = "data.frame"
)

dput(data)
#> structure(list(C1 = c("fin", "fin", "pen", "ante", "fin"), C2 = c("fin", 
#> "pen", "fin", "pen", "fin"), C3 = c("pen", "other", "fin", "ante", 
#> "fin"), StressAgreeLax = c("fin", "DISAGREE", "fin", "ante", 
#> "fin")), row.names = c(NA, -5L), class = "data.frame")


CodePudding user response:

We may use rowwise to get the Mode value if there are duplicates, or else return 'DISAGREE'.

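library(dplyr)

# Most frequent value in x; ties resolve to the first value seen
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# df is defined in the data section at the end of this answer
df %>%
  rowwise() %>%
  mutate(
    # C1:C3 keeps the newly created columns out of the row-wise comparison
    StressAgreeLax = replace(Mode(c_across(C1:C3)),
                             n_distinct(c_across(C1:C3)) == 3, "DISAGREE"),
    StressAgreeStrict = if (n_distinct(c_across(C1:C3)) > 1)
      "DISAGREE" else StressAgreeLax
  ) %>%
  ungroup()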

Output:

# A tibble: 5 × 5
  C1    C2    C3    StressAgreeLax StressAgreeStrict
  <chr> <chr> <chr> <chr>          <chr>            
1 fin   fin   pen   fin            DISAGREE         
2 fin   pen   other DISAGREE       DISAGREE         
3 pen   fin   fin   fin            DISAGREE         
4 ante  pen   ante  ante           DISAGREE         
5 fin   fin   fin   fin            fin              
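
To see why the distinct-value check is needed alongside Mode, it may help to call Mode on single rows. The sketch below repeats the Mode definition from this answer so that it runs on its own; the variable names two_agree, none_agree and all_differ are made up for illustration.

```r
# Mode() as defined in the answer above
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Two coders agree: the shared label is returned
two_agree <- Mode(c("pen", "fin", "fin"))

# Nobody agrees: every label occurs once, and which.max() picks the first one,
# so Mode alone cannot signal disagreement
none_agree <- Mode(c("fin", "pen", "other"))

# Counting distinct labels tells the two cases apart
all_differ <- length(unique(c("fin", "pen", "other"))) == 3

two_agree
none_agree
all_differ
```

Here two_agree and none_agree are both "fin", which is why the answer replaces the result with 'DISAGREE' whenever a row holds three distinct values.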

Data:

df <- structure(list(C1 = c("fin", "fin", "pen", "ante", "fin"), C2 = c("fin", 
"pen", "fin", "pen", "fin"), C3 = c("pen", "other", "fin", "ante", 
"fin")), row.names = c(NA, -5L), class = "data.frame")