I have a small data.frame in which I am trying to compare the columns rowwise. My df looks like this:
The three columns Coder1, Coder2 and Coder3 contain each coder's judgement of which syllable is stressed. The column concat is a concatenated version of their answers.
I am trying to add two new columns:
- The first column should be named StressAgreeLax. For this I need code that checks whether any two coders agree, i.e. whether Coder1 is identical to Coder2, Coder1 to Coder3, or Coder2 to Coder3. If so, it should add the agreed syllable; if not, it should add "DISAGREE".
e.g. (C1, C2, C3 = Coder1, Coder2, Coder3):
C1    C2    C3     StressAgreeLax
fin   fin   pen    fin
fin   pen   other  DISAGREE
pen   fin   fin    fin
ante  pen   ante   ante
fin   fin   fin    fin
- The second column should be easier. It should be named StressAgreeStrict. Here all three coders have to agree on the stressed syllable. Again, the agreed stress should be put in the new column, and "DISAGREE" otherwise.
e.g. (C1, C2, C3 = Coder1, Coder2, Coder3):
C1    C2    C3     StressAgreeLax  StressAgreeStrict
fin   fin   pen    fin             DISAGREE
fin   pen   other  DISAGREE        DISAGREE
pen   fin   fin    fin             DISAGREE
ante  pen   ante   ante            DISAGREE
fin   fin   fin    fin             fin
This is beyond my R knowledge. I tried various ifelse() combinations and case_when(), as well as match(), but nothing worked.
CodePudding user response:
One solution with ifelse() could be:
library(dplyr)

data <- data.frame(C1 = c("fin", "fin", "pen", "ante", "fin"),
                   C2 = c("fin", "pen", "fin", "pen", "fin"),
                   C3 = c("pen", "other", "fin", "ante", "fin"))

data <- data %>%
  # lax: any two coders agree
  mutate(StressAgreeLax = ifelse(C1 == C2, C1,
                          ifelse(C1 == C3, C1,
                          ifelse(C2 == C3, C2, "DISAGREE")))) %>%
  # strict: all three coders agree
  mutate(StressAgreeStrict = ifelse(C1 == C2 & C2 == C3, C1, "DISAGREE"))
data
    C1  C2    C3 StressAgreeLax StressAgreeStrict
1  fin fin   pen            fin          DISAGREE
2  fin pen other       DISAGREE          DISAGREE
3  pen fin   fin            fin          DISAGREE
4 ante pen  ante           ante          DISAGREE
5  fin fin   fin            fin               fin
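Since the question mentions case_when(), the same logic can also be written that way. This is only a minimal sketch, assuming the columns are named C1, C2 and C3 as above. One difference from nested ifelse() is that case_when() treats a missing (NA) comparison as "not matched", so rows with missing codes fall through to the later branches instead of returning NA.

```r
library(dplyr)

# Same example data as above
data <- data.frame(C1 = c("fin", "fin", "pen", "ante", "fin"),
                   C2 = c("fin", "pen", "fin", "pen", "fin"),
                   C3 = c("pen", "other", "fin", "ante", "fin"))

data <- data %>%
  mutate(
    # lax: the first branch whose condition is TRUE wins
    StressAgreeLax = case_when(
      C1 == C2 | C1 == C3 ~ C1,
      C2 == C3            ~ C2,
      TRUE                ~ "DISAGREE"
    ),
    # strict: all three codes must be identical
    StressAgreeStrict = case_when(
      C1 == C2 & C2 == C3 ~ C1,
      TRUE                ~ "DISAGREE"
    )
  )

data
```

The branches are checked top to bottom, so the order mirrors the nesting of the ifelse() calls.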
CodePudding user response:
Here are the data with a couple of functions that should help with the matching.
These will only work with the three-coder setup. If the number of coders isn't fixed, you'd need a few more changes (and possibly a new method of determining agreement), but this should be fine for what you provided.
Additional notes in the inline comments:
data <- structure(
  list(
    C1 = c("fin", "fin", "pen", "ante", "fin"),
    C2 = c("fin", "pen", "fin", "pen", "fin"),
    C3 = c("pen", "other", "fin", "ante", "fin"),
    StressAgreeLax = c("fin", "DISAGREE", "fin", "ante", "fin")
  ),
  row.names = c(NA, -5L),
  class = "data.frame"
)

stress_agree <- function(x, y, z) {
  # check for equal lengths
  n <- length(x)
  stopifnot(n == length(y), n == length(z))
  # set up output to default to disagree
  out <- rep("DISAGREE", n)
  # find the possible matches (positions)
  agree_x <- which(x == y | x == z)
  agree_y <- which(z == y)
  # replace output with those values
  out[agree_x] <- x[agree_x]
  out[agree_y] <- y[agree_y]
  # output
  out
}

stress_agree_strict <- function(x, y, z) {
  # check for equal lengths
  n <- length(x)
  stopifnot(n == length(y), n == length(z))
  # set up output
  out <- rep("DISAGREE", n)
  # find agreements
  agree <- x == y & x == z
  # replace output with value
  out[agree] <- x[agree]
  # return
  out
}

data$StressAgreeLax_check <- with(data, stress_agree(C1, C2, C3))
data$StressAgreeStrict <- with(data, stress_agree_strict(C1, C2, C3))
data
#>     C1  C2    C3 StressAgreeLax StressAgreeLax_check StressAgreeStrict
#> 1  fin fin   pen            fin                  fin          DISAGREE
#> 2  fin pen other       DISAGREE             DISAGREE          DISAGREE
#> 3  pen fin   fin            fin                  fin          DISAGREE
#> 4 ante pen  ante           ante                 ante          DISAGREE
#> 5  fin fin   fin            fin                  fin               fin
Created on 2022-02-04 by the reprex package (v2.0.1)
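Regarding the note above about a variable number of coders: one possible way to generalise is to count the codes in each row and require a unique most frequent value. The following is a rough sketch in base R, not a drop-in replacement; agree_n, min_agree and no_match are hypothetical names introduced here, and treating ties (e.g. a 2-2 split among four coders) as disagreement is an assumption you may want to change.

```r
# Hypothetical helper: returns the code shared by at least `min_agree` coders
# in each row, otherwise `no_match`.
agree_n <- function(df, cols, min_agree = 2, no_match = "DISAGREE") {
  m <- as.matrix(df[cols])
  res <- apply(m, 1, function(row) {
    counts <- table(row)                       # frequency of each code (NA is dropped)
    if (length(counts) == 0) return(no_match)  # every coder value was missing
    top <- counts[counts == max(counts)]       # most frequent code(s)
    # require a single winner that enough coders agree on
    if (length(top) == 1 && top[[1]] >= min_agree) names(top) else no_match
  })
  unname(res)
}

data <- data.frame(
  C1 = c("fin", "fin", "pen", "ante", "fin"),
  C2 = c("fin", "pen", "fin", "pen", "fin"),
  C3 = c("pen", "other", "fin", "ante", "fin")
)
coders <- c("C1", "C2", "C3")

data$StressAgreeLax    <- agree_n(data, coders, min_agree = 2)
data$StressAgreeStrict <- agree_n(data, coders, min_agree = length(coders))
data
```

With three coders, min_agree = 2 corresponds to the lax rule and min_agree = 3 to the strict rule.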
Tip: try using dput() with your data to return R code to make it easier for us to work with your data:
data <- structure(
  list(
    C1 = c("fin", "fin", "pen", "ante", "fin"),
    C2 = c("fin", "pen", "fin", "pen", "fin"),
    C3 = c("pen", "other", "fin", "ante", "fin"),
    StressAgreeLax = c("fin", "DISAGREE", "fin", "ante", "fin")
  ),
  row.names = c(NA, -5L),
  class = "data.frame"
)

dput(data)
#> structure(list(C1 = c("fin", "fin", "pen", "ante", "fin"), C2 = c("fin",
#> "pen", "fin", "pen", "fin"), C3 = c("pen", "other", "fin", "ante",
#> "fin"), StressAgreeLax = c("fin", "DISAGREE", "fin", "ante",
#> "fin")), row.names = c(NA, -5L), class = "data.frame")
CodePudding user response:
We may use rowwise() to get the Mode value if there are duplicates, or else return 'DISAGREE':
library(dplyr)

# most frequent value of x (the first one in case of ties)
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

df %>%
  rowwise %>%
  mutate(StressAgreeLax = replace(Mode(c_across(everything())),
                                  n_distinct(c_across(everything())) == 3, 'DISAGREE'),
         StressAgreeStrict = if (n_distinct(c_across(everything())) > 1)
                               "DISAGREE" else
                               StressAgreeLax) %>%
  ungroup
Output:
# A tibble: 5 × 5
  C1    C2    C3    StressAgreeLax StressAgreeStrict
  <chr> <chr> <chr> <chr>          <chr>
1 fin   fin   pen   fin            DISAGREE
2 fin   pen   other DISAGREE       DISAGREE
3 pen   fin   fin   fin            DISAGREE
4 ante  pen   ante  ante           DISAGREE
5 fin   fin   fin   fin            fin
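One thing to watch: c_across(everything()) looks at every column, so if the real data frame also holds other columns (the question mentions a concat column), they would be counted as well. A hedged variant that restricts the comparison to the coder columns might look like the sketch below. It assumes the coder columns are named C1 to C3 and sit next to each other; the concat separator and the temporary n_codes column are made up for illustration.

```r
library(dplyr)

# most frequent value of x (the first one in case of ties)
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

df <- data.frame(
  C1 = c("fin", "fin", "pen", "ante", "fin"),
  C2 = c("fin", "pen", "fin", "pen", "fin"),
  C3 = c("pen", "other", "fin", "ante", "fin")
)
# hypothetical extra column, standing in for the concat column from the question
df$concat <- paste(df$C1, df$C2, df$C3, sep = "-")

res <- df %>%
  rowwise() %>%
  mutate(
    # number of distinct codes among the three coders only
    n_codes = n_distinct(c_across(C1:C3)),
    StressAgreeLax = if (n_codes == 3) "DISAGREE" else Mode(c_across(C1:C3)),
    StressAgreeStrict = if (n_codes == 1) C1 else "DISAGREE"
  ) %>%
  ungroup() %>%
  select(-n_codes)

res
```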
Data:
df <- structure(list(C1 = c("fin", "fin", "pen", "ante", "fin"),
                     C2 = c("fin", "pen", "fin", "pen", "fin"),
                     C3 = c("pen", "other", "fin", "ante", "fin")),
                row.names = c(NA, -5L), class = "data.frame")

