I have an initial value as well as three datasets in a list:
value <- 10
df1 <- data.frame(id = c("a", "b", "c"), quantity = c(3, 7, -2))
df2 <- data.frame(id = c("d", "e", "f"), quantity = c(0, -3, 9))
df3 <- data.frame(id = c("g", "h", "i"), quantity = c(-2, 0, 4))
df_list <- list(df1, df2, df3)
I would like to apply multiple functions to each data frame in the list, dynamically update the value at the end of the operations, and then run the same procedure on the next list item. The challenge is that the functions themselves take value as an input.
Here is how I would accomplish this without using the apply function:
# Example function 1: Generate `outcome` by multiplying quantity times a random number
df_list[[1]]$outcome <- df_list[[1]]$quantity * sample(1:10, 1)
# Example function 2: Multiply `value` by `quantity` to update `outcome`
df_list[[1]]$outcome <- value * df_list[[1]]$quantity
# Updates `value` by adding the old `value` to the sum of the outcome column:
value <- value + as.numeric(colSums(df_list[[1]]["outcome"]))
# Repeats operations for df_list[[2]] and df_list[[3]]
df_list[[2]]$outcome <- df_list[[2]]$quantity * sample(1:10, 1)
df_list[[2]]$outcome <- value * df_list[[2]]$quantity
value <- value + as.numeric(colSums(df_list[[2]]["outcome"]))
df_list[[3]]$outcome <- df_list[[3]]$quantity * sample(1:10, 1)
df_list[[3]]$outcome <- value * df_list[[3]]$quantity
value <- value + as.numeric(colSums(df_list[[3]]["outcome"]))
I can use base R's lapply to run the functions on each list item, but how do I access (and dynamically update) the non-list object value before proceeding to the next list item?
Answer:
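If value has to be updated between list items, use a for loop: iterate over the indices of the list with seq_along() and reassign value at the end of each iteration, so the next item sees the updated value.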
for (i in seq_along(df_list)) {
  # Multiplies `value` by `quantity` to obtain `outcome` for each row in df_list[[i]]
  df_list[[i]]$outcome <- value * df_list[[i]]$quantity
  # Overwrites `outcome` with `quantity` times a random integer from 1 to 10
  # (as written, this replaces the assignment above)
  df_list[[i]]$outcome <- df_list[[i]]$quantity * sample(1:10, 1)
  # Updates `value` by adding the sum of the `outcome` column to the old `value`
  value <- value + as.numeric(colSums(df_list[[i]]["outcome"]))
}
Output (varies between runs because sample() is random):
> value
[1] 84
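If you would rather avoid an explicit loop, base R's Reduce() can carry value from one list item to the next. The following is only a sketch under some assumptions: step is a hypothetical helper name, the operations follow the order in the question (so outcome ends up as value * quantity), and the random step is left out to keep the result deterministic.

```r
# Same data as in the question
value <- 10
df_list <- list(
  data.frame(id = c("a", "b", "c"), quantity = c(3, 7, -2)),
  data.frame(id = c("d", "e", "f"), quantity = c(0, -3, 9)),
  data.frame(id = c("g", "h", "i"), quantity = c(-2, 0, 4))
)

# Hypothetical helper (not from the original post): takes the state so far
# and one data frame, and returns the new state.
step <- function(state, df) {
  # Uses the current value to compute outcome for this data frame
  df$outcome <- state$value * df$quantity
  list(
    value = state$value + sum(df$outcome),  # updated value for the next item
    dfs   = c(state$dfs, list(df))          # processed data frames so far
  )
}

# Reduce() calls step(init, df_list[[1]]), then feeds each result into the next call
result <- Reduce(step, df_list, list(value = value, dfs = list()))

result$value   # 1890
result$dfs     # the three data frames, each with an outcome column
```

With the question's data, value goes 10 -> 90 -> 630 -> 1890. A plain for loop is arguably easier to read, so this is mainly useful if you prefer a functional style or want the intermediate state returned as one object.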
