How to perform a conditional count in R? (Equivalent of "COUNTIFS" function in Excel)-CodePudding

I have a data set that looks like the following:

Member ID #	Location	Donation Amount	Donation Month
1	A	5 Toys	Jan
1	B	2 Toys	Jan
2	A	6 Toys	Mar
3	B	4 Toys	Feb
3	B	9 Toys	June

I want to create a column that will sum the number of locations that a member donated toys per period. An example output would be as follows:

Member ID #	Location	Donation Amount	Donation Month	Count
1	A	5 Toys	Jan	2
1	B	2 Toys	Jan	2
2	A	6 Toys	Mar	1
3	B	4 Toys	Feb	1
3	B	9 Toys	June	1

This output lets me know that Member 1 was the only one to donate toys to two different centers in the same month.

I've tried the following code to accomplish this output:

output2 <- output1 %>%
           group_by(`Member ID #`, `Location`, `Donation Amount`, `Donation Month`, 
           .groups='Keep') %>%
           mutate(count = n())

But this just counts every single entry and not entries per the conditions I listed above. How can I create a conditional count to accomplish my goal?

CodePudding user response：

try to do this

df %>% 
  group_by(`Member ID #`) %>% 
  summarise(Count = n_distinct(Location))

CodePudding user response：

We may use split() to split our data by ID and Duration_month. Then determine the unique amount of locations per split.

out <- lapply(split(df, f = ~ df$ID   df$Donation_month), \(x) {
  cbind(x, no_of_locations = rep(length(unique(x$Location)), times = nrow(x)))
})
df_new <- do.call(rbind, out)
rownames(df_new) <- NULL

Output

> df_new
  ID Location Donation_amount Donation_month no_of_locations
1  3        B          4 Toys            Feb               1
2  1        A          5 Toys            Jan               2
3  1        B          2 Toys            Jan               2
4  3        B          9 Toys           June               1
5  2        A          6 Toys            Mar               1

Data

df <- data.frame(ID = c(1,1,2,3,3), Location = c('A','B','A','B','B'),
                 Donation_amount = paste0(c(5,2,6,4,9),' Toys'),
                 Donation_month = c('Jan','Jan','Mar','Feb','June'))