Let's say I have two dataframes like below, (real dataset has many more rows and cols)
df = data.frame("Worker" = c("JBB","JDD","MB","JBB"),
"Age" = c(4,5,6,4))
df2 = data.frame("Initials" = c("JBB","JDD","MB","JOD"),
"Worker" = c("Joe Bloggs/JBB", "Jane Doe/JDD",
"Mr. Big/MB", "John Doe/JOD"))
I would like to replace the Worker col in df with the Worker col from df2
In the future more workers will be added to both dataframes so it would be nice if there was a quick and easy way to do this rather than manually doing something like this for each set of initials
df$Worker<-paste(gsub("JBB", "Joe Bloggs/JBB", df$Worker, perl=TRUE))
Perhaps a loop or simply some kind of tidyverse::replace solution
I have tried various joins but they don't work for me.
Have also tried
df %>%
mutate(new_Worker = case_when(df$Worker == df2$Initials ~ df2$Worker)
This gives errors too.
CodePudding user response:
A possible solution:
library(dplyr)
inner_join(df, df2, by = c("Worker" = "Initials"))
#> Worker Age Worker.y
#> 1 JBB 4 Joe Bloggs/JBB
#> 2 JDD 5 Jane Doe/JDD
#> 3 MB 6 Mr. Big/MB
#> 4 JBB 4 Joe Bloggs/JBB
CodePudding user response:
A simple solution is to rename the Worker column in df as you do the join
left_join(rename(df, Initials = Worker),
df2)
this results in the data.frame with columns 'Initials', 'Worker' and 'Age'. It also assumes that df is the data and that df2 is the lookup list.
I don't think you would use case_when for this example. Presumably there is a large number of Initials.
The other option is to filter and pull the values from df2.
df |>
mutate(Worker = map_chr(Worker,
~ filter(df2, Initials == .x) |>
pull(Worker)
)
)
The map_chr above is needed otherwise the nested filter does not work
