Display mean plus or minus standard error of the mean

Time:01-27

I am trying to draw a dot plot of my data that shows all the individual points together with the mean ± standard error of the mean (SEM). I built a new data frame called "Descriptive" that holds the mean and SEM for each group. I managed to plot the points and the mean, but not the SEM as error bars. Could anyone offer some advice or suggest how to change the script? Here is an example using the iris data.

library(ggplot2)
library(dplyr)

# Standard error of the mean
SEM <- function(x){sd(x)/sqrt(length(x))}

# Mean and SEM of Sepal.Length for each species
Descriptive <- iris %>%
  group_by(Species) %>%
  summarise(MEAN=mean(Sepal.Length), SEM=SEM(Sepal.Length))
print(Descriptive)

# Raw points plus a dash (shape 95) at the mean; no error bars yet
ggplot(iris, aes(x=Species, y=Sepal.Length)) +
  geom_point() +
  stat_summary(fun=mean, shape=95, size=8, color=2)

CodePudding user response:

Here is one way: pass the Descriptive data set to geom_errorbar through its data argument. In this version the plot draws on two data sets, so the data and mapping arguments are set explicitly in each of the other layers as well.

ggplot() +
  geom_point(
    data = iris,
    mapping = aes(x=Species, y=Sepal.Length)) +
  stat_summary(
    data = iris,
    mapping = aes(x=Species, y=Sepal.Length),
    fun=mean, shape=95, size=8, color=2) +
  geom_errorbar(
    data = Descriptive,
    mapping = aes(x = Species, ymin = MEAN - SEM, ymax = MEAN + SEM),
    width = 0.2
  )
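As a possible alternative to repeating data and mapping in every layer, the sketch below keeps iris and its aesthetics in the top-level ggplot() call and overrides them only in the error-bar layer. It assumes the same Descriptive data frame as in the question; inherit.aes = FALSE is there so that the error-bar layer does not look for Sepal.Length in Descriptive. The name p is just for illustration.

```r
library(ggplot2)
library(dplyr)

# Standard error of the mean, as defined in the question
SEM <- function(x) sd(x) / sqrt(length(x))

# Per-species mean and SEM, as in the question
Descriptive <- iris %>%
  group_by(Species) %>%
  summarise(MEAN = mean(Sepal.Length), SEM = SEM(Sepal.Length))

# iris and its mapping are set once; only geom_errorbar uses Descriptive
p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_point() +
  stat_summary(fun = mean, shape = 95, size = 8, color = 2) +
  geom_errorbar(
    data = Descriptive,
    mapping = aes(x = Species, ymin = MEAN - SEM, ymax = MEAN + SEM),
    inherit.aes = FALSE,  # do not inherit y = Sepal.Length from ggplot()
    width = 0.2
  )
```

Printing p should draw the same three layers as the version above.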


CodePudding user response:

The stat_summary function can handle all of this. The fun argument sets only the middle value, but you can use the fun.data argument instead, which expects a function that returns a data frame with y, ymin, and ymax values, letting it generate error bars:

# Returns the mean (y) and mean -/+ one SEM (ymin, ymax)
SEM <- function(x){data.frame(y=mean(x),
                              ymin=mean(x)-sd(x)/sqrt(length(x)),
                              ymax=mean(x)+sd(x)/sqrt(length(x)))}

Now we can just use this with geom='errorbar' to get what you want:

ggplot(iris, aes(x=Species, y=Sepal.Length)) +
    geom_point() +
    stat_summary(fun.data=SEM, color = 'red', geom='errorbar', width=0.2)
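For what it's worth, ggplot2 also ships a helper, mean_se(), that returns the mean with ymin and ymax one standard error either side, so the hand-written SEM() may not be needed. A minimal sketch comparing the two on the iris data (the names builtin, custom, and p_builtin are just for illustration):

```r
library(ggplot2)

# Hand-written summary function from the answer above
SEM <- function(x) {
  data.frame(y    = mean(x),
             ymin = mean(x) - sd(x) / sqrt(length(x)),
             ymax = mean(x) + sd(x) / sqrt(length(x)))
}

# mean_se() is ggplot2's built-in helper; its default mult = 1 gives +/- 1 SEM
builtin <- mean_se(iris$Sepal.Length)
custom  <- SEM(iris$Sepal.Length)

# Same plot as above, with mean_se in place of SEM
p_builtin <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_point() +
  stat_summary(fun.data = mean_se, color = "red", geom = "errorbar", width = 0.2)
```

mean_se() also takes a mult argument if wider bars are wanted, for example function(x) mean_se(x, mult = 2) for mean ± 2 SEM.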


One other issue is that you've got a lot of overplotting. You might want to use geom_jitter instead:

ggplot(iris, aes(x=Species, y=Sepal.Length)) +
    geom_jitter(width=0.05) +
    stat_summary(fun.data=SEM, color = 'red', geom='errorbar', width=0.2)


Depending on what you want, you can try different geoms in stat_summary.

'errorbar', used above, gives the nice bars but doesn't mark the mean on its own (you'd need to add the mean separately, as you did before):

ggplot(iris, aes(x=Species, y=Sepal.Length)) +
    geom_jitter() +
    stat_summary(fun=mean, shape=95, size=6, color=2) +
    stat_summary(fun.data=SEM, color = 'red', geom='errorbar', width=0.2)


'pointrange' gives a dot for the mean and a line for the range. If you increase size to make the line thicker, you also need to reduce the fatten parameter, which controls how much bigger the dot is drawn than the line.

ggplot(iris, aes(x=Species, y=Sepal.Length)) +
    geom_jitter(width=0.05) +
    stat_summary(fun.data=SEM, color = 'red', geom='pointrange', size= 1, fatten=1)
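A note on versions: as far as I know, from ggplot2 3.4.0 onwards the thickness of the pointrange line is set with linewidth, while size applies to the point, so the size/fatten combination above may behave differently or produce deprecation messages on newer installs. A minimal sketch under that assumption, using the built-in mean_se() helper so that it is self-contained (p_range is just an illustrative name, and the values 1 and 0.5 are arbitrary):

```r
library(ggplot2)

p_range <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_jitter(width = 0.05) +
  # Assumes ggplot2 >= 3.4.0: linewidth sets the range line, size sets the point
  stat_summary(fun.data = mean_se, color = "red", geom = "pointrange",
               linewidth = 1, size = 0.5)
```

Adjusting linewidth and size independently should give the same control over the dot-to-line ratio that fatten provides in the code above.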


'crossbar' gives a box spanning the error range, with a line at the mean:

ggplot(iris, aes(x=Species, y=Sepal.Length)) +
    geom_jitter(width=0.05) +
    stat_summary(fun.data=SEM, color = 'red', geom='crossbar', width=0.2)

