I would like to make a grouped boxplot from a data frame with three columns: an ID, an issuer bank (issuer_name) and a risk score for every transaction.
My starting data frame looks like this:
| ID | issuer_name | risk_score |
|---|---|---|
| 1 | POSTEPAY | 0 |
| 2 | BANCA SELLA | 10 |
| 3 | BANCA SELLA | 10 |
| 4 | BANCA SELLA | 10 |
| 5 | BANCA SELLA | 33 |
| 6 | POSTEPAY | 0 |
| 7 | POSTEPAY | 0 |
| 8 | POSTEPAY | 10 |
| 9 | CASSA CENTRALE BANCA - CREDITO | 0 |
| 10 | BANCA NAZIONALE DEL LAVORO SP | 0 |
| 11 | BANCO BPM SOCIETA PER AZIONI | 0 |
| 12 | NEXI PAYMENTS | 10 |
| 13 | UNICREDIT | 10 |
| 14 | CREDITO EMILIANO | 0 |
| 15 | POSTEPAY | 0 |
| 16 | POSTEPAY | 10 |
| 17 | POSTEPAY | 0 |
| 18 | N26 GMBH | 10 |
| 19 | BANCO BPM SOCIETA PER AZIONI | 25 |
| 20 | NEXI PAYMENTS | 10 |
| 21 | POSTEPAY | 10 |
| 22 | INTESA SANPAOLO | 0 |
| 23 | CREDIT AGRICOLE CARIPARMA | 10 |
| 24 | CREDIT AGRICOLE CARIPARMA | 10 |
| 25 | FINECOBANK | 0 |
| 26 | UNICREDIT | 44 |
| 27 | BANCA MONTE DEI PASCHI DI SIENA | 0 |
| 28 | UNICREDIT | 0 |
| 29 | FINECOBANK | 25 |
| 30 | CASSA CENTRALE BANCA - CREDITO | 10 |
| 31 | NEXI PAYMENTS | 10 |
| 32 | INTESA SANPAOLO | 25 |
| 33 | INTESA SANPAOLO | 0 |
| 34 | POSTEPAY | 10 |
| 35 | INTESA SANPAOLO | 10 |
| 36 | INTESA SANPAOLO | 20 |
| 37 | POSTEPAY | 10 |
| 38 | INTESA SANPAOLO | 10 |
| 39 | POSTEPAY | 0 |
| 40 | POSTEPAY | 0 |
| 41 | UNICREDIT | 10 |
| 42 | POSTEPAY | 10 |
| 43 | BIBANCA | 0 |
| 44 | POSTEPAY | 0 |
| 45 | INTESA SANPAOLO | 10 |
| 46 | FINDOMESTIC BANCA | 10 |
| 47 | POSTEPAY | 0 |
| 48 | POSTEPAY | 10 |
I would like to group by issuer_name, draw a boxplot of the risk score for each issuer, and show them all in a single chart.
The manual process I have used so far, one issuer at a time, is the following:
df_risk_score_single <- df_risk_score %>% filter(issuer_name == "POSTEPAY")
ggplot(data = df_risk_score_single, aes(x = issuer_name, y = risk_score)) + geom_boxplot()
Any suggestions?
CodePudding user response:
I think you are looking for something like the following code snippet, which draws one boxplot per issuer in a single chart.
I used the sample data from your question, editing one value. Of course, you'll have to adapt this code to your requirements and run it on your complete data.
ggplot(data = df_risk_score, aes(x = issuer_name, y = risk_score, group = issuer_name, fill = issuer_name)) +
  geom_boxplot() +
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank()) +
  xlab("Issuer") +
  ylab("Risk score") +
  guides(fill = guide_legend(title = "Legend Title: Issuer"))
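
If you want something you can paste and run as-is, here is a minimal sketch of the same idea. It assumes ggplot2 is installed and rebuilds only a small subset of the table from the question (the rows for POSTEPAY, BANCA SELLA and UNICREDIT with IDs 1 to 8, 13, 26, 28 and 41), so the resulting boxes will not match the ones from the full data. The variable name `p` is just an illustrative choice.

```r
library(ggplot2)

# Subset of the sample data from the question (not the full table).
df_risk_score <- data.frame(
  ID = c(1, 2, 3, 4, 5, 6, 7, 8, 13, 26, 28, 41),
  issuer_name = c("POSTEPAY", "BANCA SELLA", "BANCA SELLA", "BANCA SELLA",
                  "BANCA SELLA", "POSTEPAY", "POSTEPAY", "POSTEPAY",
                  "UNICREDIT", "UNICREDIT", "UNICREDIT", "UNICREDIT"),
  risk_score = c(0, 10, 10, 10, 33, 0, 0, 10, 10, 44, 0, 10)
)

# Mapping issuer_name to x already gives one box per issuer;
# mapping it to fill as well colours each box and adds a legend.
p <- ggplot(df_risk_score,
            aes(x = issuer_name, y = risk_score, fill = issuer_name)) +
  geom_boxplot() +
  labs(x = "Issuer", y = "Risk score", fill = "Issuer") +
  # Hide the x-axis labels, since the legend already names the issuers.
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())

print(p)
```

Because `issuer_name` is a discrete variable on the x axis, no `filter()` per issuer is needed: ggplot2 splits the data into groups on its own. With your complete data, replace the `data.frame(...)` call with your own `df_risk_score`.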


