What functions in R can recursively "reduce" the rows of a dataframe?

Time:02-02

I'm thinking of a function like Reduce() that accepts a dataframe instead of a vector, together with a function that takes an accumulator and each row of the dataframe.

Consider the following example that creates a dataframe that contains the price and quantity of a list of purchases and uses Reduce() to calculate the running total cost.

purchases = data.frame(
  price = c(1.50, 1.75, 2.00, 2.10, 1.80),
  quantity = c(100, 80, 50, 20, 90)
)
print(purchases)
#>   price quantity
#> 1  1.50      100
#> 2  1.75       80
#> 3  2.00       50
#> 4  2.10       20
#> 5  1.80       90
purchase_costs <- purchases$quantity * purchases$price
print(purchase_costs)
#> [1] 150 140 100  42 162
total_cost <- Reduce(
  function(total_cost, cost) { total_cost + cost },
  purchase_costs,
  accumulate = TRUE
)
print(total_cost)
#> [1] 150 290 390 432 594

Created on 2022-02-01 by the reprex package (v2.0.1)

What functions in R similar to Reduce() might calculate this running total cost by recursively processing each purchase in the dataframe rather than each cost in a vector of costs? Such a Reduce() function might resemble the following:

total_cost <- Reduce(
  function(total_cost, purchase) { total_cost + purchase["quantity"] * purchase["price"] },
  purchases,
  accumulate = TRUE
)

CodePudding user response:

Reduce by itself isn't going to operate row-wise like you want: it works well on a simple vector or list, but not on rows of a frame.
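
One way to make Reduce() see rows is to turn the frame into a list of one-row frames first. This is a minimal sketch, assuming the purchases frame from the question. Using split() by row number is just one of several ways to build that list, and the trailing [-1] drops the init value that Reduce() places at the front of the result when accumulate = TRUE.

```r
purchases <- data.frame(
  price = c(1.50, 1.75, 2.00, 2.10, 1.80),
  quantity = c(100, 80, 50, 20, 90)
)

# A data frame is a list of columns, so Reduce() would iterate over
# price and quantity. Split it into a list of one-row data frames instead.
rows <- split(purchases, seq_len(nrow(purchases)))

total_cost <- Reduce(
  function(total, purchase) total + purchase$quantity * purchase$price,
  rows,
  accumulate = TRUE,
  init = 0
)[-1]  # drop the leading init value

print(total_cost)
#> [1] 150 290 390 432 594
```

Splitting copies every row into its own data frame, so this is likely to be slow on large frames.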

Try this frame-aware function:

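Reduce_frame <- function(data, expr, init) {
  # Capture the expression unevaluated so it can be evaluated once per row.
  expr <- substitute(expr)
  # Pre-allocate the result with NAs of the same type as init.
  out <- rep(init[1][NA], nrow(data))
  for (rn in seq_len(nrow(data))) {
    # Column names resolve in the current row; init resolves in this
    # function and is updated to carry the accumulator forward.
    out[rn] <- init <- eval(expr, envir = data[rn,])
  }
  out
}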

Reduce_frame(purchases, init + quantity*price, init=0)
# [1] 150 290 390 432 594
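
Because the expression is arbitrary, the same helper is not limited to sums. As a hedged illustration, here is a running maximum of the per-purchase cost. This use case is an example only and is not from the question. The helper is repeated so the snippet runs on its own.

```r
purchases <- data.frame(
  price = c(1.50, 1.75, 2.00, 2.10, 1.80),
  quantity = c(100, 80, 50, 20, 90)
)

Reduce_frame <- function(data, expr, init) {
  expr <- substitute(expr)
  out <- rep(init[1][NA], nrow(data))
  for (rn in seq_len(nrow(data))) {
    out[rn] <- init <- eval(expr, envir = data[rn,])
  }
  out
}

# Running maximum of quantity * price; -Inf is a neutral start for max().
running_max <- Reduce_frame(purchases, max(init, quantity*price), init = -Inf)
print(running_max)
# [1] 150 150 150 150 162
```

Note that a column named init in the data would shadow the accumulator, so this sketch assumes no such column exists.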

CodePudding user response:

How's this?

library(tidyverse)

purchases %>% 
  mutate(cost = price * quantity) %>% 
  # default = 0 keeps the first row from being NA
  mutate(total_cost = cost + lag(cumsum(cost), default = 0))

  price quantity cost total_cost
1  1.50      100  150        150
2  1.75       80  140        290
3  2.00       50  100        390
4  2.10       20   42        432
5  1.80       90  162        594
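
For a plain running sum, no reduction helper or extra package is strictly needed. This is a base R sketch of the same idea, assuming the purchases frame from the question:

```r
purchases <- data.frame(
  price = c(1.50, 1.75, 2.00, 2.10, 1.80),
  quantity = c(100, 80, 50, 20, 90)
)

# Vectorised per-row cost, then a cumulative sum for the running total.
purchases$cost <- purchases$price * purchases$quantity
purchases$total_cost <- cumsum(purchases$cost)

print(purchases$total_cost)
#> [1] 150 290 390 432 594
```

This only covers accumulations that have a vectorised form such as cumsum(), cumprod(), or cummax(). A custom accumulator still needs a loop or a Reduce()-style approach.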