Elegant ggplot to report summary data and trend at each time point in an RCT

Time:02-02

I am analysing an RCT and wish to report summary statistics (mean with 95% CI) for a number of variables at three time points, stratified by treatment allocation. Below is my code so far, which only yields this figure.

library(ggplot2)

set.seed(42)
n <- 100
dat1 <- data.frame(id=1:n,
                   treat = factor(sample(c('Trt','Ctrl'), n, replace=TRUE, prob=c(.5, .5))),
                   time = factor("T1"),
                   outcome1=rbinom(n = n, size = 1, prob = 0.3),
                   st=runif(n, min=24, max=60),
                   qt=runif(n, min=.24, max=.60),
                   zt=runif(n, min=124, max=360)
)
dat2 <- data.frame(id=1:n,
                   treat = dat1$treat,
                   time = factor("T2"),
                   outcome1=dat1$outcome1,
                   st=runif(n, min=34, max=80),
                   qt=runif(n, min=.44, max=.90),
                   zt=runif(n, min=214, max=460)
)
dat3 <- data.frame(id=1:n,
                   treat = dat1$treat,
                   time = factor("T3"),
                   outcome1=dat1$outcome1,
                   st=runif(n, min=44, max=90),
                   qt=runif(n, min=.74, max=1.60),
                   zt=runif(n, min=324, max=1760)
)

dat <- rbind(dat1,dat2, dat3)

ggplot(dat, aes(x = mean(zt), y = time)) +
  geom_point(aes(colour = treat)) +
  coord_flip() +
  geom_line(aes(colour = treat))

I have three questions:

  1. Can a line be added connecting T1 to T2 to T3 to show the trend?
  2. Can the 95% CI for the mean be added to each point without having to calculate a "ymin" and "ymax" for all my response variables?
  3. If I have multiple response variables (in this example "st", "qt" and "zt"), is there a way to produce these all at once as some sort of facet?

example plot

CodePudding user response:

pivot_longer() should do most of what you need. Pivot your st, qt, and zt columns (and whatever other response variables you have) into long format. Here I've put their names in a column called "response_variables" and their values in "value". You can then facet_wrap() by response_variables. After grouping and colouring by treat, stat_summary() draws the mean at each time point, a line connecting the means, and an error bar for the confidence interval (computed by mean_cl_normal, which needs Hmisc). I opted for scales = "free" in facet_wrap(); otherwise you won't see much going on, as zt dominates with its larger range.

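library(dplyr)
library(ggplot2)
library(Hmisc)   # needed by mean_cl_normal
library(tidyr)

dat %>% 
  # keep id, treat, time, outcome1 (columns 1 to 4); stack the responses
  pivot_longer(-(1:4), names_to = "response_variables") %>% 
  ggplot(., aes(x = value, y = time, group = treat, color = treat)) +
  facet_wrap(~response_variables, scales = "free") +
  coord_flip() +
  # confidence interval of the mean as error bars
  stat_summary(fun.data = mean_cl_normal,
               geom = "errorbar") +
  # line connecting the means at T1, T2, T3
  stat_summary(fun = mean,
               geom = "line") +
  # the means themselves
  stat_summary(fun = mean,
               geom = "point")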

plot_output
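
If you also want the numbers behind the plot for a table, the same grouping can be summarised directly. This is only a minimal sketch under a few assumptions: `toy` is a made-up data frame in the same shape as `dat` (with just st and zt), the column names `n_obs`, `avg`, `se`, `lower` and `upper` are my own, and the interval is the usual t-based 95% CI of the mean, which as far as I know is what mean_cl_normal computes by default.

```r
library(dplyr)
library(tidyr)

set.seed(42)
n <- 100

# Toy data shaped like `dat`: one row per id and time point (illustrative only)
toy <- data.frame(
  id    = rep(1:n, times = 3),
  treat = rep(sample(c("Trt", "Ctrl"), n, replace = TRUE), times = 3),
  time  = factor(rep(c("T1", "T2", "T3"), each = n)),
  st    = runif(3 * n, min = 24, max = 90),
  zt    = runif(3 * n, min = 124, max = 1760)
)

summary_tbl <- toy %>%
  # stack the response columns, as in the plotting code
  pivot_longer(c(st, zt), names_to = "response_variables") %>%
  group_by(response_variables, treat, time) %>%
  summarise(
    n_obs = n(),
    avg   = mean(value),
    se    = sd(value) / sqrt(n_obs),
    # t-based 95% confidence interval for the mean
    lower = avg - qt(0.975, df = n_obs - 1) * se,
    upper = avg + qt(0.975, df = n_obs - 1) * se,
    .groups = "drop"
  )

print(summary_tbl)
```

Each row of `summary_tbl` is one response variable, treatment arm and time point, so it should line up with one point and error bar in the faceted plot.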
