Suppose we have the data below and would like to derive the variables z1, z2 and z3 as AB * sys, CC * dia and AD * hr.
Could you please help me achieve this in R?
The mutate() approach below works, but I would rather not write each product out by hand. Is there a more robust approach for working with more variables, maybe a for loop?
AB <- c(1,2,3,4,5,6)
CC <- c(2,3,4,5,6,7)
AD <- c(3,4,5,6,7,8)
x4 <- c('A','B','C','D','E','F')
sys <- c(1,2,3,4,5,6)
dia <- c(2,3,4,5,6,7)
hr <- c(3,4,5,6,7,8)
testa <- data.frame(AB,CC,AD,x4,sys,dia,hr)
library(dplyr)
testa %>% mutate(xy = AB * sys, zy = CC * dia, yy = AD * hr)
Answer:
Let's define the columns needed with a list:
needs <- list(z1=c("AB", "sys"), z2=c("CC", "dia"), z3=c("AD", "hr"))
From here ...
base R
testa <- cbind(testa, lapply(needs, function(z) apply(testa[z], 1, prod)))
testa
#   AB CC AD x4 sys dia hr z1 z2 z3
# 1  1  2  3  A   1   2  3  1  4  9
# 2  2  3  4  B   2   3  4  4  9 16
# 3  3  4  5  C   3   4  5  9 16 25
# 4  4  5  6  D   4   5  6 16 25 36
# 5  5  6  7  E   5   6  7 25 36 49
# 6  6  7  8  F   6   7  8 36 49 64
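Since the question mentions a for loop, the same `needs` list can also drive one. This is only a sketch under the assumption that every entry of `needs` names existing numeric columns. It swaps the row-wise `apply(..., 1, prod)` for `Reduce()` with `*`, which multiplies whole columns at once. The data is rebuilt here so the snippet runs on its own.

```r
# Sample data from the question
testa <- data.frame(
  AB  = c(1, 2, 3, 4, 5, 6),
  CC  = c(2, 3, 4, 5, 6, 7),
  AD  = c(3, 4, 5, 6, 7, 8),
  x4  = c("A", "B", "C", "D", "E", "F"),
  sys = c(1, 2, 3, 4, 5, 6),
  dia = c(2, 3, 4, 5, 6, 7),
  hr  = c(3, 4, 5, 6, 7, 8)
)

# Each new column name maps to the columns whose product it should hold
needs <- list(z1 = c("AB", "sys"), z2 = c("CC", "dia"), z3 = c("AD", "hr"))

for (nm in names(needs)) {
  # testa[needs[[nm]]] is a list of columns; Reduce() multiplies them element-wise
  testa[[nm]] <- Reduce(`*`, testa[needs[[nm]]])
}

testa
```

Adding another derived variable then only means adding one more entry to `needs`; an entry may also list more than two columns.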
dplyr
mutate() accepts an unnamed data.frame argument and adds all of its columns at once. It's a little more verbose than the base R version (which does not require the data.frame(.) wrapper), but it's not horrible.
testa %>%
mutate(data.frame(lapply(needs, function(z) apply(testa[z], 1, prod))))
#   AB CC AD x4 sys dia hr z1 z2 z3
# 1  1  2  3  A   1   2  3  1  4  9
# 2  2  3  4  B   2   3  4  4  9 16
# 3  3  4  5  C   3   4  5  9 16 25
# 4  4  5  6  D   4   5  6 16 25 36
# 5  5  6  7  E   5   6  7 25 36 49
# 6  6  7  8  F   6   7  8 36 49 64
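One thing to watch in the dplyr version: `testa[z]` inside mutate() looks up the object named `testa`, not whatever was piped in, so it could silently use stale data if earlier pipeline steps had changed the frame. A possible variation, assuming the magrittr pipe `%>%` (the native `|>` has no dot placeholder), is to index the `.` placeholder instead. The data is rebuilt here so the sketch is self-contained.

```r
library(dplyr)

# Sample data from the question
testa <- data.frame(
  AB  = c(1, 2, 3, 4, 5, 6),
  CC  = c(2, 3, 4, 5, 6, 7),
  AD  = c(3, 4, 5, 6, 7, 8),
  x4  = c("A", "B", "C", "D", "E", "F"),
  sys = c(1, 2, 3, 4, 5, 6),
  dia = c(2, 3, 4, 5, 6, 7),
  hr  = c(3, 4, 5, 6, 7, 8)
)

needs <- list(z1 = c("AB", "sys"), z2 = c("CC", "dia"), z3 = c("AD", "hr"))

res <- testa %>%
  # `.` is the piped-in data frame; .[z] selects the columns to multiply
  mutate(data.frame(lapply(needs, function(z) Reduce(`*`, .[z]))))

res
```

Because the dot appears only inside a nested call, magrittr still passes the data as the first argument of mutate(), so the call shape stays the same as in the snippet above.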
Answer:
I am not sure where the variable "x4" fits into the equation, but here is an option, assuming the number of columns on the LHS equals the number of columns on the RHS.
library(purrr)
library(dplyr) # needed for bind_cols()
testa <- data.frame(AB, CC, AD) # x4 is removed from here
testb <- data.frame(sys, dia, hr) # this is required for your equation
tmp_df <- purrr::map2_dfr(testa, testb, .f = ~ .x * .y) # multiply the columns pairwise
names(tmp_df) <- c("xy", "zy", "yy") # set the column names
bind_cols(testa, testb, tmp_df) # final summary dataframe
Output:
  AB CC AD sys dia hr xy zy yy
1  1  2  3   1   2  3  1  4  9
2  2  3  4   2   3  4  4  9 16
3  3  4  5   3   4  5  9 16 25
4  4  5  6   4   5  6 16 25 36
5  5  6  7   5   6  7 25 36 49
6  6  7  8   6   7  8 36 49 64
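If the original data frame (including x4) should stay intact, the same pairwise idea can probably be written in base R without splitting the data. This is a sketch, assuming the left-hand and right-hand columns are numeric and listed in matching order; the vector names `lhs`, `rhs` and `out` are made up for illustration.

```r
# Sample data from the question
testa <- data.frame(
  AB  = c(1, 2, 3, 4, 5, 6),
  CC  = c(2, 3, 4, 5, 6, 7),
  AD  = c(3, 4, 5, 6, 7, 8),
  x4  = c("A", "B", "C", "D", "E", "F"),
  sys = c(1, 2, 3, 4, 5, 6),
  dia = c(2, 3, 4, 5, 6, 7),
  hr  = c(3, 4, 5, 6, 7, 8)
)

lhs <- c("AB", "CC", "AD")    # left-hand columns (hypothetical helper vector)
rhs <- c("sys", "dia", "hr")  # right-hand columns, in matching order
out <- c("z1", "z2", "z3")    # names for the new columns

# Two data frames of equal shape multiply element-wise, column by column
testa[out] <- testa[lhs] * testa[rhs]

testa
```

Columns are paired by position, not by name, so the order of `lhs` and `rhs` is what determines which products are computed.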

