I have a data frame that looks like this:
| Nicknames | Names |
|---|---|
| Fonse, Fons | Alfons |
| Fonse, Fonsi | Alfons |
| Gustel, Gustl, Guste, | August |
| Baldi | Balthasar |
| Hausl, Baldi | Balthasar |
| Flore, Flori | Florian |
I would like to merge the rows with duplicated names so that the result is:
| Nicknames | Names |
|---|---|
| Fonse, Fons, Fonse, Fonsi | Alfons |
| Gustel, Gustl, Guste, | August |
| Baldi, Hausl, Baldi | Balthasar |
| Flore, Flori | Florian |
I was able to create a subset of the duplicates, but I don't know how to combine them:
nick2 <- subset(nick, any(duplicated(nick$Names)))
Here is the data as a CSV file: https://github.com/Garybertrand/nick
CodePudding user response:
This should solve your problem:
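library(data.table)
library(dplyr)
# Convert df to a data.table, then paste together the nicknames of each name
setDT(df)[, list(Nicknames = paste(Nicknames, collapse = ', ')),
          by = c('Names')] %>%
  select(Nicknames, Names)  # put the columns back in the original order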
CodePudding user response:
You can also use base R.
aggregate(Nicknames ~ Names, unique(df), paste, collapse = ", ")
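For anyone who wants to try this without downloading the CSV, here is a minimal sketch that rebuilds the six rows from the question by hand and runs the same `aggregate()` call. The names `df` and `merged` are my own assumptions, not something from the asker's file.

```r
# Sample data typed in from the table in the question (df is an assumed name).
df <- data.frame(
  Nicknames = c("Fonse, Fons", "Fonse, Fonsi", "Gustel, Gustl, Guste,",
                "Baldi", "Hausl, Baldi", "Flore, Flori"),
  Names = c("Alfons", "Alfons", "August", "Balthasar", "Balthasar", "Florian"),
  stringsAsFactors = FALSE
)

# Drop fully identical rows, then paste together the nickname strings
# of all rows that share the same name.
merged <- aggregate(Nicknames ~ Names, data = unique(df),
                    FUN = paste, collapse = ", ")

# aggregate() puts the grouping column first; restore the question's order.
merged <- merged[, c("Nicknames", "Names")]
print(merged)
```

Note that this only joins the strings, so a nickname that occurs in two rows (such as Baldi) appears twice in the result, which matches the output asked for in the question.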
CodePudding user response:
The short tidyverse solution would be like this:
library(tidyverse)
df %>%
  group_by(Names) %>%
  summarize(Nicknames = paste0(Nicknames, collapse = ", "))
