I have two datasets with different numbers of rows but the same variables. I want to sum the "value" column of the two datasets by "Date".
Dataset 1:
| Date | value |
|---|---|
| 1/1/2000 | 1 |
| 2/1/2000 | 1 |
| 3/1/2000 | 2 |
| 4/1/2000 | 3 |
| 5/1/2000 | 4 |
| 6/1/2000 | 5 |
| 7/1/2000 | 2 |
Dataset 2:
| Date | value |
|---|---|
| 2/1/2000 | 5 |
| 3/1/2000 | 7 |
| 5/1/2000 | 2 |
| 7/1/2000 | 9 |
Expected outcome:
| Date | value |
|---|---|
| 1/1/2000 | 1 |
| 2/1/2000 | 6 |
| 3/1/2000 | 9 |
| 4/1/2000 | 3 |
| 5/1/2000 | 6 |
| 6/1/2000 | 5 |
| 7/1/2000 | 11 |
CodePudding user response:
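The safest option would be a powerjoin. A full join keeps the dates that appear in only one dataset, and a row-wise conflict function sums the two "value" columns while ignoring the missing side:
library(powerjoin)
power_full_join(
  df1, df2,
  by = "Date",
  conflict = rw ~ sum(.x, .y, na.rm = TRUE)
)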
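But here, because every date in df2 also appears in df1, a simple match should suffice as well. Dates without a partner in df2 get 0 added instead of NA:
m <- match(df1$Date, df2$Date)
df1$value <- df1$value + ifelse(is.na(m), 0, df2$value[m])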
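For reference, the match idea can be sketched end to end in base R. The data frames are typed in from the question, with the dates read as day/month/year (an assumption), and the helper name `add_matching` is hypothetical:

```r
# Data typed in from the question; dates assumed to be day/month/year.
df1 <- data.frame(
  Date  = as.Date(c("1/1/2000", "2/1/2000", "3/1/2000", "4/1/2000",
                    "5/1/2000", "6/1/2000", "7/1/2000"), format = "%d/%m/%Y"),
  value = c(1, 1, 2, 3, 4, 5, 2)
)
df2 <- data.frame(
  Date  = as.Date(c("2/1/2000", "3/1/2000", "5/1/2000", "7/1/2000"),
                  format = "%d/%m/%Y"),
  value = c(5, 7, 2, 9)
)

# Hypothetical helper: add y's values to x's values on matching dates.
# Dates in x with no partner in y keep their value; dates that occur
# only in y are dropped, so this assumes y's dates are a subset of x's.
add_matching <- function(x, y) {
  idx <- match(x$Date, y$Date)  # position of each x date in y, NA if absent
  add <- y$value[idx]           # NA where there is no match
  add[is.na(add)] <- 0          # treat a missing partner as zero
  x$value <- x$value + add
  x
}

result <- add_matching(df1, df2)
print(result)
```

If df2 could contain dates that df1 lacks, this helper would silently lose them, so a full join or the aggregate approach would be the safer choice in that case.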
CodePudding user response:
You can aggregate the combined data frames by sum:
df1 <- structure(list(Date = structure(c(10957, 10958, 10959, 10960,
10961, 10962, 10963), class = "Date"), value = c(1, 1, 2, 3,
4, 5, 2)), class = "data.frame", row.names = c(NA, -7L))
df2 <- structure(list(Date = structure(c(10958, 10959, 10961, 10963), class = "Date"),
value = c(5, 7, 2, 9)), class = "data.frame", row.names = c(NA, -4L))
aggregate(value ~ Date, rbind(df1, df2), sum)
        Date value
1 2000-01-01     1
2 2000-01-02     6
3 2000-01-03     9
4 2000-01-04     3
5 2000-01-05     6
6 2000-01-06     5
7 2000-01-07    11
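If the two source values should stay visible side by side before they are summed, a base R full outer join is another possibility. This is only a sketch and not part of the answers above: the data frames are rebuilt so the block runs on its own, and the `value.x` / `value.y` column names rely on the default suffixes of `merge`.

```r
# Same data as in the question, rebuilt so the block is self-contained.
df1 <- data.frame(Date  = as.Date("2000-01-01") + 0:6,
                  value = c(1, 1, 2, 3, 4, 5, 2))
df2 <- data.frame(Date  = as.Date("2000-01-01") + c(1, 2, 4, 6),
                  value = c(5, 7, 2, 9))

# all = TRUE keeps dates found in either data frame (full outer join);
# the clashing "value" columns become value.x and value.y.
merged <- merge(df1, df2, by = "Date", all = TRUE)

# Sum the two columns row by row, treating a missing side as zero.
merged$value <- rowSums(merged[c("value.x", "value.y")], na.rm = TRUE)

result <- merged[c("Date", "value")]
print(result)
```

Unlike the match approach, this keeps dates that occur only in the second data frame, at the cost of an intermediate wide table.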
