Find mean of counts within groups

Time:01-21

I have a dataframe that looks like this:

library(tidyverse)    
x <- tibble(
   batch = rep(c(1,2), each=10),
   exp_id = c(rep('a',3),rep('b',2),rep('c',5),rep('d',6),rep('e',4))
 )

I can run the code below to get the count per exp_id:

x %>% group_by(batch,exp_id) %>% 
  summarise(count=n())  

which generates:

  batch exp_id count
  <dbl> <chr>  <int>
1     1 a          3
2     1 b          2
3     1 c          5
4     2 d          6
5     2 e          4

A really ugly way to generate the mean of these counts is:

x %>% group_by(batch,exp_id) %>% 
  summarise(count=n()) %>% 
  ungroup() %>% 
  group_by(batch) %>% 
  summarise(avg_exp = mean(count))

which generates:

  batch avg_exp
  <dbl>   <dbl>
1     1    3.33
2     2    5 

Is there a more succinct and "tidy" way to generate this?

CodePudding user response:

library(dplyr)
group_by(x, batch) %>%
  summarize(avg_exp = mean(table(exp_id)))
# # A tibble: 2 x 2
#   batch avg_exp
#   <dbl>   <dbl>
# 1     1    3.33
# 2     2    5   
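
To see why this works, it may help to look at a single batch in base R. The sketch below rebuilds the batch 1 values by hand (the variable names are only for illustration) and also shows one caveat: if exp_id were a factor rather than a character column, table() would count unused levels as 0 and pull the mean down.

```r
# exp_id values that fall in batch 1 of the example data
ids_batch1 <- c(rep("a", 3), rep("b", 2), rep("c", 5))

# table() counts each distinct value: a = 3, b = 2, c = 5
counts <- table(ids_batch1)

# mean() of those counts is the per-batch average, i.e. 10 / 3
avg_chr <- mean(counts)

# Caveat (assumption: exp_id stored as a factor with levels a to e):
# levels that do not occur in this batch are counted as 0
ids_fct <- factor(ids_batch1, levels = c("a", "b", "c", "d", "e"))
avg_fct <- mean(table(ids_fct))                    # 10 / 5

# droplevels() removes the unused levels and restores 10 / 3
avg_fct_fixed <- mean(table(droplevels(ids_fct)))
```

With the character column from the question this caveat does not apply, so the one-liner above is fine as written.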

CodePudding user response:

Here's another way, using count():

library(dplyr)

x %>%
  count(batch, exp_id, name = "count") %>%
  group_by(batch) %>%
  summarise(count = mean(count))

#  batch count
#  <dbl> <dbl>
#1     1  3.33
#2     2  5   
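
Since the mean of the per-exp_id counts is just the number of rows in a batch divided by the number of distinct exp_id values, a single summarise() may also be enough. This is a sketch rather than one of the answers above, and it assumes exp_id has no NA values (n_distinct() counts NA as a value by default, while table() drops it).

```r
library(dplyr)

# Same example data as in the question
x <- tibble(
  batch  = rep(c(1, 2), each = 10),
  exp_id = c(rep("a", 3), rep("b", 2), rep("c", 5), rep("d", 6), rep("e", 4))
)

# rows per batch divided by distinct exp_id values per batch
res <- x %>%
  group_by(batch) %>%
  summarise(avg_exp = n() / n_distinct(exp_id))
```

For the example data this should give the same two averages as the approaches above (10/3 for batch 1 and 5 for batch 2).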