I have a multi level list of rasters and it always gives me a headache how to operate on such structures efficiently. I want to take averages from the rasters in 3rd level of the list, going by the indexes. For example:
# make some dummy rasters
a <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
a[] <- sample(1:5,25,replace=T)
b <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
b[] <- sample(1:5,25,replace=T)
c <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
c[] <- sample(1:5,25,replace=T)
d <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
d[] <- sample(1:5,25,replace=T)
e <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
e[] <- sample(1:5,25,replace=T)
f <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
f[] <- sample(1:5,25,replace=T)
g <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
g[] <- sample(1:5,25,replace=T)
h <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
h[] <- sample(1:5,25,replace=T)
i <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
i[] <- sample(1:5,25,replace=T)
j <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
j[] <- sample(1:5,25,replace=T)
k <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
k[] <- sample(1:5,25,replace=T)
l <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
l[] <- sample(1:5,25,replace=T)
)
list_abcd <- list(a,b,c,d)
list_efgh <- list(e,f,g,h)
list_ijkl <- list(i,j,k,l)
list_all <- list(list_abcd,list_efgh,list_ijkl)
I want to perform calculations on the rasters to obtain the averages of the respective 1st rasters in the lists (average of a,e and i),
2nd (b,f and j), 3rd (c,g,k) and 4th (d, h and l).
The result will be a list containing 4 rasters.
In reality, I have a list structured like this for 10 groups, each with over 100 rasters, so the solution needs to be scalable.
Additional info I work with NetCDF files and their structure looks following (below). I create multi-level lists to handle it, but then have a headache how to handle it well without too many laborious for loops.
3 variables:
double var1[lon,lat,years,irr]
double var2[lon,lat,years,irr]
double var3[lon,lat,years,irr]
4 dimensions:
lon Size:720
units: degrees_east
long_name: lon
lat Size:360
units: degrees_north
long_name: lat
years Size:100
units: mapping
long_name: years
irr Size:2
units: mapping
long_name: rainfed/irrigated
CodePudding user response:
Your example data
library(raster)
r <- raster(xmn=0,xmx=5,ymn=0,ymx=5,res=1)
set.seed(123)
list_abcd <- replicate(4, setValues(r, sample(1:5,25,replace=T)))
list_efgh <- replicate(4, setValues(r, sample(1:5,25,replace=T)))
list_ijkl <- replicate(4, setValues(r, sample(1:5,25,replace=T)))
list_all <- list(list_abcd,list_efgh,list_ijkl)
names(list_all) <- c("abcd", "efgh", "ijkl")
Solution, create a single RasterStack and use stackApply
x <- stack(unlist(list_all))
i <- rep(1:4, 3)
s <- stackApply(x, i, mean)
s
#class : RasterBrick
#dimensions : 5, 5, 25, 4 (nrow, ncol, ncell, nlayers)
#resolution : 1, 1 (x, y)
#extent : 0, 5, 0, 5 (xmin, xmax, ymin, ymax)
#crs : proj=longlat datum=WGS84 no_defs
#source : memory
#names : index_1, index_2, index_3, index_4
#min values : 1.333333, 2.000000, 1.000000, 1.666667
#max values : 4.666667, 5.000000, 4.666667, 4.333333
Now let's do this with terra (the replacement of raster). We start with three SpatRasters (similar to a RasterStack)
library(terra)
rr <- rast(xmin=0,xmax=5,ymin=0,ymax=5,res=1)
set.seed(123)
abcd <- replicate(4, setValues(rr, sample(1:5,25,replace=T))) |> rast()
efgh <- replicate(4, setValues(rr, sample(1:5,25,replace=T))) |> rast()
ijkl <- replicate(4, setValues(rr, sample(1:5,25,replace=T))) |> rast()
# the above is equivalent to: abcd <- c(a,b,c,d)
Solution 1: Combine into a single SpatRaster and use tapp
rall <- c(abcd, efgh, ijkl)
z1 <- tapp(rall, rep(1:4, 3), "mean")
Solution 2: Combine into a SpatRasterDataset and use app
rsd <- sds(abcd, efgh, ijkl)
z2 <- app(rsd, "mean")
Later: from your comment I gather the mistake you make is indeed in creating these lists. If I understand you well you have n files representing models, each with 2*100 layers; and you want to average the values of each layer (year & water status) across models; so that you end up with a single raster dataset of 200 layers (or 2 datasets with 100 layers). You can achieve that with something like this:
library(terra)
ff <- list.files(pattern="nc$")
sd <- sds(ff)
x <- app(sd, mean)
Or, if irrigated/rainfed are separate sub-datasets, something like
library(terra)
ff <- list.files(pattern="nc$")
sd <- lapply(ff, \(i) rast(i, "irrigated")) |> sds()
# or sd <- lapply(ff, \(i) rast(i, 1)) |> sds()
x <- app(sd, mean)
With some example data
f <- system.file("ex/logo.tif", package="terra")
sd <- sds(c(f,f,f))
x <- app(sd, mean)
