How get the average from three different datasets in R-CodePudding

I've started using R 3 months ago, and working with information from weather stations. Now I want to get the average from three different datasets of the station located in the same place.

My data frames look like this :

Date_Measure	Id	Tmin	Tmax	Tmedia	Smin	Smax	Smedia	Hrmin	Hrmax	Hrmedia
2020-06-21 06:00:00	M03100245	19.10	19.97	19.335	25.04	71.14	44.930	83.78	84.28	84.280
2020-06-21 07:00:00	M03100245	20.31	21.69	20.915	89.69	144.34	108.220	81.76	83.78	82.510
2020-06-21 08:00:00	M03100245	21.87	24.64	22.840	115.56	263.31	183.035	76.70	81.76	79.985

Date_Measure	Id	Tmin	Tmax	Tmedia	Smin	Smax	Smedia	Hrmin	Hrmax	Hrmedia
2020-06-21 06:00:00	M03100247	19.43	20.20	19.560	17.46	68.28	37.880	91.69	92.72	92.720
2020-06-21 07:00:00	M03100247	20.64	21.81	21.050	87.55	139.98	105.535	87.07	91.18	88.865
2020-06-21 08:00:00	M03100247	21.85	24.75	23.015	112.42	266.65	183.440	75.24	86.55	82.695

Date_Measure	Id	Tmin	Tmax	Tmedia	Smin	Smax	Smedia	Hrmin	Hrmax	Hrmedia
2020-06-21 06:00:00	M03100296	19.39	20.16	19.560	29.18	77.94	48.900	89.61	90.69	90.690
2020-06-21 07:00:00	M03100296	20.55	21.53	20.805	95.95	154.43	116.065	85.82	89.07	86.370
2020-06-21 08:00:00	M03100296	21.62	24.88	23.035	122.26	280.18	194.635	71.22	85.28	79.870

I want to take an average of each date and each hour for them in a single general table that combines the three, how can I do this process?

CodePudding user response：

You can combine the 3 dataframes, extract date and hour from Date_Measure variable and take average of numeric variables using across.

library(dplyr)
library(lubridate)

bind_rows(d1, d2, d3) %>%
  #Or instead of writing them individually, we can also use
  #bind_rows(mget(paste0('d', 1:3))) %>%
  mutate(Date_Measure = ymd_hms(Date_Measure), 
         Date = as.Date(Date_Measure),
         Hour = hour(Date_Measure)) %>%
  group_by(Date, Hour) %>%
  summarise(across(Tmin:Hrmedia, mean), .groups = 'drop')

#   Date        Hour  Tmin  Tmax Tmedia  Smin  Smax Smedia Hrmin Hrmax Hrmedia
#  <date>     <int> <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>   <dbl>
#1 2020-06-21     6  19.3  20.1   19.5  23.9  72.5   43.9  88.4  89.2    89.2
#2 2020-06-21     7  20.5  21.7   20.9  91.1 146.   110.   84.9  88.0    85.9
#3 2020-06-21     8  21.8  24.8   23.0 117.  270.   187.   74.4  84.5    80.8

CodePudding user response：

First make the number of dataset entries to be the same in all of them.

Then create another dataset by iterating over the column of date and hour.

I guess you need to generate the remaining columns by appropriate formulas.