I'm working with multi-temporal, multispectral satellite imagery. The data are stored as GeoTIFFs with shape ((n_bands x n_timesteps), height, width). I need to reshape the array for ML model training, where each pixel in the image is a "sample". The training array would therefore have shape (n_samples x n_timesteps x n_bands).
Assume the following array and associated variables. Also assume the number of bands in the dataset is 12 and the number of time steps is 4. Therefore, the first dimension of the array is 12*4 = 48.
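import numpy as np
x = np.random.rand(48, 512, 512)
# The array is (layers, height, width), so height is axis 1 and width is axis 2
height = x.shape[1]
width = x.shape[2]
n_samples = width * height
n_bands = 12
# Integer division keeps n_timesteps an int so it can be used as a shape
n_timesteps = x.shape[0] // n_bands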
What is the best way to reshape the array x to x_reshape such that x_reshape.shape returns:
(n_samples, n_timesteps, n_bands)
The ordering of the data must be preserved, so that x_reshape[0] is a single "sample" of the dataset with shape (n_timesteps x n_bands), where the bands are the features.
CodePudding user response:
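You can use numpy.reshape() followed by numpy.moveaxis(). Note that this assumes the 48 layers are stored time-major, i.e. all 12 bands of the first time step, then all 12 bands of the second, and so on: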
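import numpy as np
x = np.random.rand(48, 512, 512)
# The array is (layers, height, width), so height is axis 1 and width is axis 2
height = x.shape[1]
width = x.shape[2]
n_samples = width * height
n_bands = 12
n_timesteps = x.shape[0] // n_bands  # integer division keeps this an int
# Split the layer axis into (time, band) and flatten the pixels into one axis
x_reshape = np.reshape(x, (n_timesteps, n_bands, n_samples))
# Move the pixel axis to the front: (n_samples, n_timesteps, n_bands)
x_reshape = np.moveaxis(x_reshape, -1, 0)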
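Whether the reshape above preserves the ordering depends on how the 48 layers are stacked in the GeoTIFF, which the question does not state. The following is a minimal sketch for checking this on a small array whose values encode their own position. The dimensions, the names `cube`, `to_samples_time_major`, and `to_samples_band_major` are all hypothetical and only serve the illustration; the band-major variant covers the case where the file stores all time steps of band 0 first, then all time steps of band 1, and so on.

```python
import numpy as np

# Hypothetical small dimensions so the result is easy to inspect.
n_timesteps, n_bands, height, width = 4, 12, 3, 5

# Reference cube indexed as (time, band, row, col); every value is unique.
cube = np.arange(n_timesteps * n_bands * height * width).reshape(
    n_timesteps, n_bands, height, width
)

# Two possible on-disk layer orderings of the same data.
x_time_major = cube.reshape(n_timesteps * n_bands, height, width)
x_band_major = cube.transpose(1, 0, 2, 3).reshape(
    n_bands * n_timesteps, height, width
)


def to_samples_time_major(x, n_bands):
    """Layers ordered t0b0, t0b1, ..., t1b0, ... (same as reshape + moveaxis)."""
    n_layers, h, w = x.shape
    out = x.reshape(n_layers // n_bands, n_bands, h * w)
    return np.moveaxis(out, -1, 0)


def to_samples_band_major(x, n_bands):
    """Layers ordered b0t0, b0t1, ..., b1t0, ...; split as (band, time) first."""
    n_layers, h, w = x.shape
    out = x.reshape(n_bands, n_layers // n_bands, h * w)
    # Reorder (band, time, sample) to (sample, time, band).
    return out.transpose(2, 1, 0)


samples_tm = to_samples_time_major(x_time_major, n_bands)
samples_bm = to_samples_band_major(x_band_major, n_bands)

# Pixel (row, col) ends up at sample index row * width + col.
row, col, t, b = 1, 2, 3, 7
print(samples_tm.shape)
print(samples_tm[row * width + col, t, b] == cube[t, b, row, col])
print(np.array_equal(samples_tm, samples_bm))
```

If both variants are applied to the same real file, only the one that matches the actual layer ordering will give physically sensible spectra per time step, so it is worth checking a known pixel against the source raster before training.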
