I'm having trouble building a data table that matches numbers based on two conditions (ID and date). Below is an example of a table snippet containing batch data.
| ID | Power | Fuel | Starting_date | Shutting_down_date |
|---|---|---|---|---|
| El_Bel | 344 | WB | 1983 | 2030 |
| El_Opo | 256 | WK | 1987 | 2027 |
| El_Tur | 400 | WB | 2019 | 2049 |
The question is how do I effectively match this data so that the data in the "Power" column is matched until the last year of operation by column "Shutting_down_date" is reached.
| Date | El_Bel | El_Opo | El_Tur |
|---|---|---|---|
| 2017 | |||
| 2018 | |||
| 2019 | |||
| 2020 | |||
| 2021 |
Many thanks for any suggestions.
CodePudding user response:
Let us call the first dataframe x and the second data frame y and let us further assume that each ID only occurs once in the first table. The problem is that you have a different number of years for each ID which means that they cannot be stored in a data.frame (requires all columns to have the same length). You can use a list, though:
result <- list()
for (i in 1:nrow(x)) {
id <- x[i,"ID"]
end_date <- x[i,"Shutting_down_date"]
result[[id]] <- subset(y[,c("Date",id)], Date <= end_date)
}
Then you can query the results as result[["El_Bel"]] or result$El_Bel etc.
