How to aggregate data from years to decades and plot them?

Time:01-24

This is the graph that I would like to reproduce:

To do that, I need to convert the Year column, because the x-axis of the graph is in decades. How can I do this?

This is what I did to extract the data from the site: [image]

Answer:

First, split the Year strings with strsplit(), build a proper year matrix, bind it back to the famine counts divided by the number of years each famine spans, and reshape to long format (pipeline steps 1 to 6). Next, aggregate the sums by decade and draw them with barplot().

r <- strsplit(data1$Year, '-|–|, ') |>
  rapply(\(y) unlist(lapply(y, \(x) f(max(as.numeric(y)), x))), how='r') |>
  {\(.) t(sapply(., \(x) `length<-`(x, max(lengths(.)))))}() |>
  {\(.) cbind(`colnames<-`(., paste0('year.', seq_len(dim(.)[2]))),
         n=dim(.)[2] - rowSums(is.na(.)))}() |>
  {\(.) data.frame(., f=as.numeric(gsub('\\D', '', 
                                        data1$`Excess Mortality midpoint`))/
               .[, 'n'])}()|>
  reshape(1:3, direction='long') |>
  stats:::aggregate.formula(formula=f ~ as.integer(substr(year, 1, 3)), 
                            FUN=sum) |>
  t() 


## plot
op <- par(mar=c(5, 5, 4, 2) + .1)  ## set new margins, store old pars

## first pass without axes: sets up the coordinates and returns bar midpoints
b <- barplot(r, axes=FALSE, ylim=c(0, max(r[2, ])*1.05),
        main='Famine victims')
asq <- seq(0, max(axTicks(2)), 2e6)  ## y-axis ticks every 2 million
abline(h=asq, col='lightgrey', lty=3)  ## grid lines, drawn behind the bars
barplot(r, names.arg=paste0(r[1, ], '0s'), col='#20254c',
        cex.names=.8, axes=FALSE, add=TRUE)
axis(2, asq, labels=FALSE)
mtext(paste(asq/1e6, 'Million'), 2, 1, at=asq, las=2)
text(b, r[2, ] + 5e5, labels=formatC(r[2, ], format='d', big.mark=','), cex=.7)
box()
box()

par(op)  ## restore old pars


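In line 2, I used this helper function f() to fill up the pseudo-years, i.e. to expand an abbreviated end year (such as the 79 in a range like 1876-79) into a full year. Define it before running the pipeline: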

f <- \(x1, x2, n1=nchar(x1)) {
  u <- lapply(list(x1, x2), as.character)
  s <- c(n1 - nchar(u[[2]]) + 1L, n1)  ## positions of the trailing digits to overwrite
  as.integer(`substr<-`(u[[1]], s[1], s[2], u[[2]]))
}
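
To see what the helper does, here is a small sketch on a single made-up Year entry. The string "1876-79" is a hypothetical example, not a value taken from data1, and the helper is rewritten with function() only so that it also runs on R versions before 4.1.

```r
# Same logic as the helper f(): overwrite the last digits of the full
# year x1 with the (possibly abbreviated) year x2.
f <- function(x1, x2, n1 = nchar(x1)) {
  u <- lapply(list(x1, x2), as.character)
  s <- c(n1 - nchar(u[[2]]) + 1L, n1)
  as.integer(`substr<-`(u[[1]], s[1], s[2], u[[2]]))
}

# Hypothetical Year entry (assumption for illustration only).
parts <- strsplit("1876-79", "-")[[1]]   # "1876" "79"
full  <- max(as.numeric(parts))          # the longest (full) year: 1876

# Expand every part against the full year.
years <- vapply(parts, function(x) f(full, x), integer(1), USE.NAMES = FALSE)
print(years)
```

Here the abbreviated "79" is completed to 1879, while the already complete "1876" is left unchanged.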

You can refine the aggregation method yourself to make the result look exactly like the original, but maybe this is better :)
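
If you want to experiment with the aggregation, it may help to look at the core idea in isolation. The following is only a minimal sketch on made-up toy data (the data frame toy and its columns start, end and deaths are assumptions, not the scraped table), and it builds the decade key with integer division rather than the substr() approach used in the pipeline.

```r
# Made-up toy data for illustration; not taken from the scraped table.
toy <- data.frame(start  = c(1878, 1943),
                  end    = c(1881, 1943),
                  deaths = c(4e6, 2e6))

# One row per famine-year, with the death toll split evenly across the years.
long <- do.call(rbind, lapply(seq_len(nrow(toy)), function(i) {
  yrs <- seq(toy$start[i], toy$end[i])
  data.frame(year = yrs, deaths = toy$deaths[i] / length(yrs))
}))

# Decade key via integer division, e.g. 1878 -> 1870.
long$decade <- long$year %/% 10 * 10

# Sum the per-year shares within each decade.
agg <- aggregate(deaths ~ decade, data = long, FUN = sum)
print(agg)
```

The 1878 to 1881 famine straddles two decades, so its toll is split between the 1870s and the 1880s instead of being assigned to one of them. Calling barplot(agg$deaths, names.arg=paste0(agg$decade, 's')) would then draw one bar per decade.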
