How to plot time interval data in R using ggplot?

Time:01-23

I have a dataframe similar to the following:

> library(lubridate)
> library(dplyr)  # needed for %>% and mutate()
> df <- data.frame(name = c("george", "sara", "sam", "bill"),
                   start_date = mdy(c("January 1, 2022", "January 2, 2022", "January 5, 2022", "January 6, 2022")),
                   end_date = mdy(c("January 3, 2022", "January 4, 2022", "January 6, 2022", "January 8, 2022")),
                   group = c(1, 1, 2, 2))

> df <- df %>%
   mutate(date_range = interval(start_date, end_date))
> df
    name start_date   end_date group                     date_range
1 george 2022-01-01 2022-01-03     1 2022-01-01 UTC--2022-01-03 UTC
2   sara 2022-01-02 2022-01-04     1 2022-01-02 UTC--2022-01-04 UTC
3    sam 2022-01-05 2022-01-06     2 2022-01-05 UTC--2022-01-06 UTC
4   bill 2022-01-06 2022-01-08     2 2022-01-06 UTC--2022-01-08 UTC

I would like to create two plots using ggplot if possible:

  1. The first plot should display the date range for each person. It's easier to just show what I mean; see photo (plot 1).

  2. The second plot should average the range for each group and display a boxplot or similar to show the distribution of dates for each group. See photo (plot 2).

Any thoughts? I'm new to this, hence drawing out what I want. I hope it's helpful and clear.

CodePudding user response:

Allan is completely right about using geom_segment for the first plot. I just thought I'd add that there is a geom built for exactly this in the ggalt package.

It's called a dumbbell plot and looks like this:

Dumbbell plot

Here is the code I used to create it:

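library(ggalt)  # provides geom_dumbbell(); also attaches ggplot2

# One dumbbell per person, running from start_date to end_date
df %>%
  ggplot(
    aes(
      x = start_date,
      xend = end_date,
      y = name
    )
  ) +
  geom_dumbbell(
    colour = "#a3c4dc",       # connecting line and start point
    colour_xend = "#0e668b",  # end point
    size = 4
  )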

You can then use all the usual ggplot2 functions to make it look prettier. More on geom_dumbbell can be found in its help documentation or in this blog post.
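As a rough sketch of what that styling might look like, the example below adds an axis scale, labels, and a theme to the same dumbbell plot. The data is rebuilt with base R dates so the block runs on its own; the title text and the one-day axis breaks are illustrative assumptions, not part of the original answer.

```r
library(ggplot2)
library(ggalt)  # provides geom_dumbbell()

# Same data as in the question, rebuilt with base R dates
df <- data.frame(
  name = c("george", "sara", "sam", "bill"),
  start_date = as.Date(c("2022-01-01", "2022-01-02", "2022-01-05", "2022-01-06")),
  end_date = as.Date(c("2022-01-03", "2022-01-04", "2022-01-06", "2022-01-08")),
  group = c(1, 1, 2, 2)
)

p <- ggplot(df, aes(x = start_date, xend = end_date, y = name)) +
  geom_dumbbell(colour = "#a3c4dc", colour_xend = "#0e668b", size = 4) +
  # One tick per day, labelled like "Jan 01" (assumed formatting choice)
  scale_x_date(date_breaks = "1 day", date_labels = "%b %d") +
  # Hypothetical labels, chosen only for illustration
  labs(x = "Date", y = NULL, title = "Date range per person") +
  theme_minimal()

p
```

Any other ggplot2 layer, scale, or theme can be chained on in the same way.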

CodePudding user response:

You can achieve the first plot with geom_segment:

library(ggplot2)

# A black segment from start_date to end_date, with a coloured point at each end
ggplot(df, aes(x = start_date, y = name, colour = name)) +
  geom_segment(aes(xend = end_date, yend = name), colour = "black") +
  geom_point(size = 3) +
  geom_point(aes(x = end_date), size = 3) +
  theme_bw() +
  theme(legend.position = "none")

The second requires a bit of data reshaping, as akrun points out:

library(dplyr)
library(tidyr)

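# Stack start_date and end_date into a single date column, then plot one box per group
df %>%
  pivot_longer(c(start_date, end_date), names_to = "type", values_to = "date") %>%
  ggplot(aes(date, factor(group))) +
  geom_boxplot(aes(colour = factor(group))) +
  theme_bw() +
  theme(legend.position = "none")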

Created on 2022-01-22 by the reprex package (v2.0.1)
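If per-group averages are wanted as numbers rather than as a boxplot, a minimal sketch along these lines may help. It shows the reshaped long-format data and a grouped summary. The names df_long, group_summary, and the summary columns are hypothetical, and averaging the start dates, end dates, and durations is only one possible reading of "average the range".

```r
library(dplyr)
library(tidyr)

# Same data as in the question, rebuilt with base R dates
df <- data.frame(
  name = c("george", "sara", "sam", "bill"),
  start_date = as.Date(c("2022-01-01", "2022-01-02", "2022-01-05", "2022-01-06")),
  end_date = as.Date(c("2022-01-03", "2022-01-04", "2022-01-06", "2022-01-08")),
  group = c(1, 1, 2, 2)
)

# Long format: one row per person and date type (start or end)
df_long <- df %>%
  pivot_longer(c(start_date, end_date), names_to = "type", values_to = "date")

# Per-group averages of start, end, and interval length in days
group_summary <- df %>%
  mutate(duration_days = as.numeric(difftime(end_date, start_date, units = "days"))) %>%
  group_by(group) %>%
  summarise(
    mean_start = mean(start_date),
    mean_end = mean(end_date),
    mean_duration_days = mean(duration_days)
  )

group_summary
```

The long table is what the boxplot code consumes, while the summary table gives one row per group.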
