I have a list of a few companies with each line as a deal. My table indicates whether they had a Initial Public Offering (IPO) or not, and what bank lead the deal:
df <- data.frame(Issuer=c("ABC Inc.", "ABC Inc.","ABC Inc.", "XYZ Co.","XYZ Co","123 Inc.","123 Inc."),
IPO=c("Yes","No","No","Yes","No","Yes","No"),
Bank =c("RBC","RBC","CIBC","RBC","Goldman","Citi","Citi"))
I want to create a list of all Issuers where they have the same Bank when IPO=="Yes" AND IPO=="No" like so:
| Issuer | Bank |
|---|---|
| ABC Inc. | RBC |
| 123 Inc. | Citi |
I've been circling the problem like this:
temp = df %>% group_by(Issuer) %>% filter(Bank[IPO=="Yes"] == Bank[IPO=="No"])
Thank you!
CodePudding user response:
The following groups the data by "Bank" and "Issuer", and checks to see if that combination has exactly two unique values for "IPO" (i.e., Yes and No). In that case, the Issuer/Bank combination meets your criteria. The rest of the dplyr chain filters down to the requested output information.
library(tidyverse)
df <- data.frame(Issuer=c("ABC Inc.", "ABC Inc.","ABC Inc.", "XYZ Co.","XYZ Co","123 Inc.","123 Inc."),
IPO=c("Yes","No","No","Yes","No","Yes","No"),
Bank =c("RBC","RBC","CIBC","RBC","Goldman","Citi","Citi"))
df.new <- df %>%
group_by(Issuer, Bank) %>%
mutate(
both_ipo = length(unique(IPO)) == 2
) %>%
filter(both_ipo) %>%
select(Issuer, Bank) %>%
distinct
Issuer Bank
<chr> <chr>
1 ABC Inc. RBC
2 123 Inc. Citi
You don't necessarily need to create the both_ipo flag, and could use it directly in filter like below, but I usually prefer to create any new attributes in a separate step so that I can check them.
df.new <- df %>%
group_by(Issuer, Bank) %>%
filter(length(unique(IPO)) == 2) %>%
select(Issuer, Bank) %>%
distinct
CodePudding user response:
Here's another way -
library(dplyr)
df %>%
group_by(Issuer, Bank) %>%
filter(all(c('Yes', 'No') %in% IPO)) %>%
ungroup %>%
distinct(Issuer, Bank)
# Issuer Bank
# <chr> <chr>
#1 ABC Inc. RBC
#2 123 Inc. Citi
