R Summarise Last N Columns with Dynamic Headers-CodePudding

I'm trying to summarize a table that has N number of date columns that will always exist at the end of the data set. Is there a way to summarize the last N columns instead of calling each column by name?

df <- df %>% group_by(Project) %>% summarise(???)

Project	Sep-21	Oct-21	Nov-2021
1234	4	5	6
5678	10	11	5
1234	23	1	87
5678	8	16	8

Expected Output:

Project	Sep-21	Oct-21	Nov-2021
1234	27	6	93
5678	18	27	13

CodePudding user response：

A base R solution.

df <- data.frame(
  Project = c(1234,5678,1234,5678),
  Sep = c(4,10,23,8),
  Oct = c(5,11,1,16),
  Nov = c(6,5,87,8)
)

t(sapply(split(df, df$Project), function(x) apply(x[,-1L], 2, sum)))

Output

     Sep Oct Nov
1234  27   6  93
5678  18  27  13

CodePudding user response：

Indexing with summarise and across may be a bit buggy as the index have to consider the grouping column as well. A neater approach is to select the columns of interest first and then do a group_by and use everything() to select the remaining columns

library(dplyr)
df %>% 
 select(Project, 7:ncol(.)) %>% 
 group_by(Project) %>% 
 summarise(across(everything(), sum, na.rm = TRUE), .groups = 'drop')

If we want to directly do without select, subtract the column index from the number of columns used in the group_by for the range

df %>%
    group_by(Project) %>%
    summarise(across((7-1):(ncol(.)-1), sum, na.rm = TRUE), .groups = 'drop')