Derive multiple columns from multiple columns with different names in R

Time:01-21

Consider the data below. We would like to derive the variables z1, z2 and z3 as AB * sys, CC * dia and AD * hr.

Could you please help me achieve this in R?

We can use mutate() as shown below, but I would rather not write out every product by hand. Is there a more robust approach for a larger number of variables, maybe a for loop?

AB <- c(1,2,3,4,5,6)
CC <- c(2,3,4,5,6,7)
AD <- c(3,4,5,6,7,8)
x4 <- c('A','B','C','D','E','F')
sys <- c(1,2,3,4,5,6)
dia <- c(2,3,4,5,6,7)
hr <- c(3,4,5,6,7,8)

library(dplyr) # provides %>% and mutate()

testa <- data.frame(AB,CC,AD,x4,sys,dia,hr)

# the manual approach: one hand-written product per new column
testa %>% mutate(xy=AB*sys, zy=CC*dia, yy=AD*hr)


CodePudding user response:

Let's define the columns needed with a list:

needs <- list(z1=c("AB", "sys"), z2=c("CC", "dia"), z3=c("AD", "hr"))

From here ...

base R

testa <- cbind(testa, lapply(needs, function(z) apply(testa[z], 1, prod)))
testa
#   AB CC AD x4 sys dia hr z1 z2 z3
# 1  1  2  3  A   1   2  3  1  4  9
# 2  2  3  4  B   2   3  4  4  9 16
# 3  3  4  5  C   3   4  5  9 16 25
# 4  4  5  6  D   4   5  6 16 25 36
# 5  5  6  7  E   5   6  7 25 36 49
# 6  6  7  8  F   6   7  8 36 49 64
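Since the question asks about a for loop, here is a minimal sketch of the same idea written as a loop. It reuses the needs mapping from above and swaps the row-wise apply(., 1, prod) for Reduce(), which multiplies whole columns element-wise. The loop variable name new_col is my own choice for illustration.

```r
# Rebuild the example data from the question
testa <- data.frame(
  AB = c(1, 2, 3, 4, 5, 6), CC = c(2, 3, 4, 5, 6, 7), AD = c(3, 4, 5, 6, 7, 8),
  x4 = c("A", "B", "C", "D", "E", "F"),
  sys = c(1, 2, 3, 4, 5, 6), dia = c(2, 3, 4, 5, 6, 7), hr = c(3, 4, 5, 6, 7, 8)
)

# Same mapping as above: new column name -> the columns to multiply
needs <- list(z1 = c("AB", "sys"), z2 = c("CC", "dia"), z3 = c("AD", "hr"))

# Loop over the mapping; Reduce() multiplies the selected columns element-wise
for (new_col in names(needs)) {
  testa[[new_col]] <- Reduce(`*`, testa[needs[[new_col]]])
}

testa$z1
# [1]  1  4  9 16 25 36
```

Adding another derived variable then only means adding one more entry to needs; the loop itself stays unchanged.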

dplyr

mutate() accepts an unnamed data.frame argument and adds all of its columns at once. This is a little more verbose than the base R version (where cbind() does not need the data.frame(.) wrapper), but it's not horrible.

testa %>%
  mutate(data.frame(lapply(needs, function(z) apply(testa[z], 1, prod))))
#   AB CC AD x4 sys dia hr z1 z2 z3
# 1  1  2  3  A   1   2  3  1  4  9
# 2  2  3  4  B   2   3  4  4  9 16
# 3  3  4  5  C   3   4  5  9 16 25
# 4  4  5  6  D   4   5  6 16 25 36
# 5  5  6  7  E   5   6  7 25 36 49
# 6  6  7  8  F   6   7  8 36 49 64
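To see the unnamed data.frame behaviour in isolation, here is a small sketch on a toy frame. The names df, total and ratio are made up for illustration, and I am assuming a reasonably recent dplyr (1.0.0 or later, as far as I know).

```r
library(dplyr)

# Toy data, unrelated to the question's columns
df <- data.frame(a = 1:3, b = 4:6)

# The data.frame passed to mutate() has no name of its own,
# so each of its columns (total, ratio) becomes a new column of the result
res <- df %>% mutate(data.frame(total = a + b, ratio = b / a))

names(res)
```

If the argument were named, e.g. mutate(new = data.frame(...)), the whole data.frame would instead be stored as a single nested column called new.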

CodePudding user response:

I am not sure where the variable "x4" fits into the equation, but here is an option, assuming the number of columns on the LHS equals the number of columns on the RHS.

library(purrr)
library(dplyr) # provides bind_cols()
testa <- data.frame(AB,CC,AD) # x4 is removed from here
testb <- data.frame(sys,dia,hr) # this is required for your equation

tmp_df <- purrr::map2_dfr(testa, testb, .f = ~ .x * .y) # multiply the columns pairwise, by position
names(tmp_df) <- c("xy","zy","yy") # set the column names
bind_cols(testa,testb,tmp_df) # final summary dataframe

Output:

  AB CC AD sys dia hr xy zy yy
1  1  2  3   1   2  3  1  4  9
2  2  3  4   2   3  4  4  9 16
3  3  4  5   3   4  5  9 16 25
4  4  5  6   4   5  6 16 25 36
5  5  6  7   5   6  7 25 36 49
6  6  7  8   6   7  8 36 49 64
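The same positional pairing can also be sketched in base R with Map(), which avoids splitting the data into two frames and keeps x4 in place. The helper vectors lhs, rhs and new are names I introduced for illustration; the approach assumes they all have the same length and are listed in matching order.

```r
# Rebuild the example data from the question, including x4
testa <- data.frame(
  AB = c(1, 2, 3, 4, 5, 6), CC = c(2, 3, 4, 5, 6, 7), AD = c(3, 4, 5, 6, 7, 8),
  x4 = c("A", "B", "C", "D", "E", "F"),
  sys = c(1, 2, 3, 4, 5, 6), dia = c(2, 3, 4, 5, 6, 7), hr = c(3, 4, 5, 6, 7, 8)
)

lhs <- c("AB", "CC", "AD")   # left-hand factors
rhs <- c("sys", "dia", "hr") # right-hand factors, in matching order
new <- c("xy", "zy", "yy")   # names for the derived columns

# Map() walks both column sets in parallel and multiplies each pair;
# the result is assigned to the new column names in one step
testa[new] <- Map(`*`, testa[lhs], testa[rhs])

testa$zy
# [1]  4  9 16 25 36 49
```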