Loop over netCDF datetime format and calculate mean based on month

Time:01-18

I have a dataset (input_file) with dimensions (504, 720, 500), where the first dimension is time, stored as datetime values:

0     1979-01-15
1     1979-02-15
2     1979-03-15
3     1979-04-15
4     1979-05-15
         ...    
499   2020-08-15
500   2020-09-15
501   2020-10-15
502   2020-11-15
503   2020-12-15
Length: 504, dtype: datetime64[ns]

The dataset has a variable whose values I want to average per calendar month. So ultimately I would like 12 values: the average of the variable for each month of the year, based on the month in the first dimension.

I tried looping over it like this:

# empty dataframe
df = pd.DataFrame(columns = ['Month', 'Value'])

for i in range(size(df['time'])):
    month = input_file['time'][i].month # get the current month
    avg = np.average(input_file['values'][i, :, :]) # average for the month of that year

    # append to df
    df = df.append(pd.DataFrame({'Month' : month,
                                 'Value' : avg})   

But at this point I am a bit lost: this doesn't work (invalid syntax), and I would still need to loop over the values again to get the average for each month separately.

Answer:

Assuming the 2nd and 3rd dimensions are lat and lon, it seems what you are trying to do is just:

input_file.mean(dim = ['lat', 'lon'])

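This averages over space and leaves one value per time step (504 here); you can then convert to a dataframe with .to_dataframe(). To reduce that to 12 values, one per calendar month, group by month before converting, for example with .groupby('time.month').mean().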
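
The mean over lat and lon still leaves one value per time step, so a grouping by calendar month is needed to get down to 12 values. Here is a minimal sketch of that, assuming xarray's groupby with the 'time.month' accessor. The small random dataset is a hypothetical stand-in for input_file, and the names 'values', 'lat' and 'lon' are assumptions carried over from the question and from the guess about the dimensions:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for input_file: two years of mid-month time steps
# on a tiny 4 x 5 grid (the real data would be 504 x 720 x 500).
time = pd.date_range("1979-01-01", periods=24, freq="MS") + pd.Timedelta(days=14)
rng = np.random.default_rng(0)
data = rng.random((time.size, 4, 5))

input_file = xr.Dataset(
    {"values": (("time", "lat", "lon"), data)},  # variable name is an assumption
    coords={"time": time, "lat": np.arange(4), "lon": np.arange(5)},
)

# Step 1: average over space, leaving one value per time step.
spatial_mean = input_file["values"].mean(dim=["lat", "lon"])

# Step 2: group the time steps by calendar month and average across years.
monthly_mean = spatial_mean.groupby("time.month").mean()

# Step 3: convert to a dataframe indexed by month (1..12).
monthly = monthly_mean.to_dataframe()
print(monthly)
```

Note that this is an unweighted mean: every grid cell and every time step counts equally. On a regular lat/lon grid you may want to weight by cell area (for instance by the cosine of latitude) before averaging.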
