My dataset:
dt<-data.frame(GrossIncome=seq(0, 10000, by = 1000),
Turnover= seq(0, 100000, by = 10000),
Sellers= seq(0, 1, by = 0.1),
Buyers=seq(0, 1, by = 0.1))
Now I want to summarize this data, dividing the GrossIncome and Turnover totals by 1000.
OUTPUT<-data.frame(
"GrossIncome"=round(sum(dt$GrossIncome)/1000,1),
"Turnover"=round(sum(dt$Turnover)/1000,1),
"GrossIncomeAndTurnover"=round(((sum(dt$Turnover) + sum(dt$Turnover))/1000),1),
"Sellers"=round(sum(dt$Sellers),1),
"Buyers"=round(sum(dt$Buyers),1))
Output
  GrossIncome Turnover GrossIncomeAndTurnover Sellers Buyers
1          55      550                   1100     5.5    5.5
Is there a more elegant solution than the one above? I tried the code below, but it only handles the first two items (GrossIncome and Turnover), not the rest.
dt %>%
  dplyr::select(GrossIncome, Turnover) %>%
  dplyr::summarise_all(sum, na.rm = TRUE) / 1000
Can anybody help me solve this problem?
CodePudding user response:
We can use across() to apply different functions to different columns.
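library(dplyr)

dt %>%
  summarize(
    # Sum, scale by 1000 and round the two monetary columns
    across(c(GrossIncome, Turnover), ~ round(sum(.) / 1000, 1)),
    # Uses the summaries created on the line above
    GrossIncomeAndTurnover = GrossIncome + Turnover,
    # Sum and round the remaining columns without scaling
    across(c(Sellers, Buyers), ~ round(sum(.), 1))
  )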
#   GrossIncome Turnover GrossIncomeAndTurnover Sellers Buyers
# 1          55      550                    605     5.5    5.5
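Note that inside summarize(), the GrossIncome and Turnover summaries are computed first, and these newly created variables (already divided by 1000 and rounded) are what the GrossIncomeAndTurnover calculation uses. My code accounts for this by simply adding them, which gives 605; the 1100 in the question comes from adding Turnover to itself.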
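To make the evaluation order concrete, here is a minimal sketch that spells out each summary by name instead of using across(). It assumes the dt data frame from the question; the names res and question_value are only illustrative.

```r
library(dplyr)

# Same data as in the question
dt <- data.frame(
  GrossIncome = seq(0, 10000, by = 1000),
  Turnover    = seq(0, 100000, by = 10000),
  Sellers     = seq(0, 1, by = 0.1),
  Buyers      = seq(0, 1, by = 0.1)
)

res <- dt %>%
  summarize(
    # Each name now refers to a single summarised value, not the column
    GrossIncome = sum(GrossIncome) / 1000,
    Turnover    = sum(Turnover) / 1000,
    # Evaluated last, so it sees the two summaries created above
    GrossIncomeAndTurnover = GrossIncome + Turnover
  )

# The question's expression adds Turnover twice on the raw column
question_value <- (sum(dt$Turnover) + sum(dt$Turnover)) / 1000

res$GrossIncomeAndTurnover  # 605
question_value              # 1100
```

Because the summaries are already scaled when the last line runs, no further division by 1000 is needed there.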
CodePudding user response:
Something like this?
round_fun <- \(DF) {
out <- apply(DF, 2, sum)
out <- ifelse(out > 1e3, out/1e3, out)
out <- c(out, out['GrossIncome'] + out['Turnover'])
names(out)[5] <- 'GrossIncomeAndTurnover'
return(out)
}
round_fun(dt)
# -------------------------------
> round_fun(dt)
GrossIncome Turnover Sellers Buyers
55.0 550.0 5.5 5.5
GrossIncomeAndTurnover
605.0
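One thing to keep in mind with the function above: the out > 1e3 test scales any total that happens to exceed 1000, so the result depends on the data rather than on the column. A possible variant, sketched here in base R, names the columns to scale explicitly. The function name summarise_scaled and its arguments scale_cols, divisor and digits are hypothetical, chosen only for this illustration.

```r
# Same data as in the question
dt <- data.frame(
  GrossIncome = seq(0, 10000, by = 1000),
  Turnover    = seq(0, 100000, by = 10000),
  Sellers     = seq(0, 1, by = 0.1),
  Buyers      = seq(0, 1, by = 0.1)
)

# Hypothetical helper: sum every column, scale only the named ones
summarise_scaled <- function(DF, scale_cols, divisor = 1000, digits = 1) {
  out <- colSums(DF, na.rm = TRUE)                # named numeric vector of totals
  out[scale_cols] <- out[scale_cols] / divisor    # scale the chosen columns only
  # Append the combined figure under its own name
  out["GrossIncomeAndTurnover"] <- out["GrossIncome"] + out["Turnover"]
  round(out, digits)
}

res <- summarise_scaled(dt, scale_cols = c("GrossIncome", "Turnover"))
res
```

This also applies the rounding that the name round_fun suggests but the original function does not perform.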
CodePudding user response:
Another way is to first summarise all your data and then "format" it.
dt %>%
summarise_all(sum, na.rm = TRUE) %>%
mutate_at(c("GrossIncome", "Turnover"), ~(.) / 1000) %>%
mutate(GrossIncomeAndTurnover = GrossIncome + Turnover) %>%
mutate_all(round, digits = 1)
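The scoped verbs used here (summarise_all(), mutate_at(), mutate_all()) still work, but dplyr 1.0.0 and later mark them as superseded in favour of across(). The same "summarise, then format" pipeline could be sketched as follows, assuming dplyr >= 1.0.0 and the dt data frame from the question; res is just an illustrative name.

```r
library(dplyr)

# Same data as in the question
dt <- data.frame(
  GrossIncome = seq(0, 10000, by = 1000),
  Turnover    = seq(0, 100000, by = 10000),
  Sellers     = seq(0, 1, by = 0.1),
  Buyers      = seq(0, 1, by = 0.1)
)

res <- dt %>%
  # Step 1: sum every column
  summarise(across(everything(), ~ sum(.x, na.rm = TRUE))) %>%
  # Step 2: scale only the monetary columns
  mutate(across(c(GrossIncome, Turnover), ~ .x / 1000)) %>%
  # Step 3: derive the combined figure from the scaled values
  mutate(GrossIncomeAndTurnover = GrossIncome + Turnover) %>%
  # Step 4: round everything, including the new column
  mutate(across(everything(), ~ round(.x, digits = 1)))

res
```

Keeping each step in its own call makes it explicit that the rounding in the last step also covers GrossIncomeAndTurnover.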
