Home > Back-end >  How to perform a conditional count in R? (Equivalent of "COUNTIFS" function in Excel)
How to perform a conditional count in R? (Equivalent of "COUNTIFS" function in Excel)

Time:01-15

I have a data set that looks like the following:

Member ID # Location Donation Amount Donation Month
1 A 5 Toys Jan
1 B 2 Toys Jan
2 A 6 Toys Mar
3 B 4 Toys Feb
3 B 9 Toys June

I want to create a column that will sum the number of locations that a member donated toys per period. An example output would be as follows:

Member ID # Location Donation Amount Donation Month Count
1 A 5 Toys Jan 2
1 B 2 Toys Jan 2
2 A 6 Toys Mar 1
3 B 4 Toys Feb 1
3 B 9 Toys June 1

This output lets me know that Member 1 was the only one to donate toys to two different centers in the same month.

I've tried the following code to accomplish this output:

output2 <- output1 %>%
           group_by(`Member ID #`, `Location`, `Donation Amount`, `Donation Month`, 
           .groups='Keep') %>%
           mutate(count = n())

But this just counts every single entry and not entries per the conditions I listed above. How can I create a conditional count to accomplish my goal?

CodePudding user response:

try to do this

df %>% 
  group_by(`Member ID #`) %>% 
  summarise(Count = n_distinct(Location))

CodePudding user response:

We may use split() to split our data by ID and Duration_month. Then determine the unique amount of locations per split.

out <- lapply(split(df, f = ~ df$ID   df$Donation_month), \(x) {
  cbind(x, no_of_locations = rep(length(unique(x$Location)), times = nrow(x)))
})
df_new <- do.call(rbind, out)
rownames(df_new) <- NULL

Output

> df_new
  ID Location Donation_amount Donation_month no_of_locations
1  3        B          4 Toys            Feb               1
2  1        A          5 Toys            Jan               2
3  1        B          2 Toys            Jan               2
4  3        B          9 Toys           June               1
5  2        A          6 Toys            Mar               1

Data

df <- data.frame(ID = c(1,1,2,3,3), Location = c('A','B','A','B','B'),
                 Donation_amount = paste0(c(5,2,6,4,9),' Toys'),
                 Donation_month = c('Jan','Jan','Mar','Feb','June'))
  •  Tags:  
  • Related