How to create a function to get summary statistics as columns?

Time:01-05

I have three separate workflows to get the mean, standard deviation, and variance of each numeric column. Would it be possible to simplify this into one function that returns a single table with all three summaries as columns?

Mean

iris %>% 
  select(-Species) %>% 
  summarise_all(., mean, na.rm = TRUE) %>% 
  t() %>% 
  as.data.frame() %>% 
  rownames_to_column("Name") %>% 
  rename(Mean = V1)

Standard Deviation

iris %>% 
  select(-Species) %>% 
  summarise_all(., sd, na.rm = TRUE) %>% 
  t() %>% 
  as.data.frame() %>% 
  rownames_to_column("Name") %>% 
  rename(SD = V1)

Variance

iris %>% 
  select(-Species) %>% 
  summarise_all(., var, na.rm = TRUE) %>% 
  t() %>% 
  as.data.frame() %>% 
  rownames_to_column("Name") %>% 
  rename(Variance = V1)

CodePudding user response:

We could reshape to 'long' format and then group by the column name to compute the three summary columns in a single summarise() call:

library(dplyr)
library(tidyr)
iris %>% 
   select(where(is.numeric)) %>% 
   pivot_longer(cols = everything(), names_to = "Name") %>% 
   group_by(Name) %>% 
   summarise(Mean = mean(value, na.rm = TRUE),
            SD = sd(value, na.rm = TRUE), 
            Variance = var(value, na.rm = TRUE))

-output

# A tibble: 4 × 4
  Name          Mean    SD Variance
  <chr>        <dbl> <dbl>    <dbl>
1 Petal.Length  3.76 1.77     3.12 
2 Petal.Width   1.20 0.762    0.581
3 Sepal.Length  5.84 0.828    0.686
4 Sepal.Width   3.06 0.436    0.190
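
Since the question asks for one function, the pipeline above could be wrapped roughly as follows. This is only a sketch: summary_table() is a hypothetical name chosen here, not something provided by dplyr or tidyr, and it assumes the input has at least one numeric column.

```r
library(dplyr)
library(tidyr)

# summary_table() is a hypothetical helper name (an assumption, not a package function)
summary_table <- function(data) {
  data %>%
    select(where(is.numeric)) %>%                  # keep numeric columns only
    pivot_longer(cols = everything(),
                 names_to = "Name") %>%            # one row per (column name, value) pair
    group_by(Name) %>%
    summarise(Mean = mean(value, na.rm = TRUE),
              SD = sd(value, na.rm = TRUE),
              Variance = var(value, na.rm = TRUE))
}

summary_table(iris)
```

Because the function selects numeric columns itself, the same call should also work on other data frames, e.g. summary_table(mtcars).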

CodePudding user response:

library(dplyr)
library(tidyr)
iris %>% 
  select(-Species) %>% 
  summarise_all(list(mean = mean, sd = sd, var = var), na.rm = TRUE) %>% 
  pivot_longer(everything(), names_sep = '_', names_to = c('Name', '.value'))

# A tibble: 4 x 4
  Name          mean    sd   var
  <chr>        <dbl> <dbl> <dbl>
1 Sepal.Length  5.84 0.828 0.686
2 Sepal.Width   3.06 0.436 0.190
3 Petal.Length  3.76 1.77  3.12 
4 Petal.Width   1.20 0.762 0.581
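
The scoped verb summarise_all() is superseded in current dplyr, so the same idea could probably be written with across(). The sketch below assumes dplyr 1.0.0 or later. The names_pattern argument is a swapped-in choice here (not part of the answer above): it splits at the last underscore, so column names that themselves contain '_' should still be handled. res_across is just an illustrative variable name.

```r
library(dplyr)
library(tidyr)

res_across <- iris %>%
  # across() produces columns named <column>_<function>, e.g. Sepal.Length_mean
  summarise(across(where(is.numeric),
                   list(mean = ~ mean(.x, na.rm = TRUE),
                        sd   = ~ sd(.x, na.rm = TRUE),
                        var  = ~ var(.x, na.rm = TRUE)))) %>%
  # the greedy first group makes the split happen at the last underscore
  pivot_longer(everything(),
               names_pattern = "(.*)_(.*)",
               names_to = c("Name", ".value"))

res_across
```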