I have a data frame delineated by ownership, private(50) and state(30). Looking to create 5 new rows that are the sum of ownership 50 and ownership 30 as long as they have a matching area value. Desired result is below.
naics <- c(611,611,611,611,611,611,611,611,611,611)
ownership <- c(50,50,50,50,50,30,30,30,30,10)
area <- c(001,003,005,009,011,001,003,005,011,001)
d200201 <- c(14,17,20,23,26,3,5,7,9,100)
d200202 <- c(15,18,21,24,28,9,11,13,15,105)
private <- data.frame(naics,ownership,area,d200201,d200202)
naics ownership area d200201 d200202
611 50 001 17 24
611 50 003 22 29
611 50 005 27 34
611 50 009 23 24 (no sum because no 30 value)
611 50 011 35 43
CodePudding user response:
Is this what you are looking for?
library(dplyr)
private %>%
group_by(naics, area) %>%
summarize(
across(c(d200201, d200202), ~sum(.x[ownership %in% c(30, 50)])),
ownership = 50, .groups = "drop"
)
Output
# A tibble: 5 x 5
naics area d200201 d200202 ownership
<dbl> <dbl> <dbl> <dbl> <dbl>
1 611 1 17 24 50
2 611 3 22 29 50
3 611 5 27 34 50
4 611 9 23 24 50
5 611 11 35 43 50
CodePudding user response:
library(tidyverse)
private %>%
filter(ownership %in% c(50, 30)) %>%
group_by(area) %>%
summarize(across(starts_with("d200"), sum))
#> # A tibble: 5 × 3
#> area d200201 d200202
#> <dbl> <dbl> <dbl>
#> 1 1 17 24
#> 2 3 22 29
#> 3 5 27 34
#> 4 9 23 24
#> 5 11 35 43
Created on 2022-01-08 by the reprex package (v2.0.1)
