Home > Software design >  Change purrr reduce column naming to numeric?
Change purrr reduce column naming to numeric?

Time:01-14

When using purrr::reduce() to left join a large list of data frames the default is to names the columns with .x, .y, .x.x, .y.y etc..

Is it possible to change this behaviour to obtain column names with numeric endings? e.g.

.1, .2, .3, .4 ?

# test data
data.frame(id= rep(letters[1:5],3,20), visits = rep(1:4,each=5), data =rep(1:5,3,20)) %>%
  # split df into a list for each visit
  group_split(visits) %>%
  # left join each nhs_joint_side
  reduce(left_join, by = c("id"))


# A tibble: 5 x 9
  id    visits.x data.x visits.y data.y visits.x.x data.x.x visits.y.y data.y.y
  <chr>    <int>  <int>    <int>  <int>      <int>    <int>      <int>    <int>
1 a            1      1        2      1          3        1          4        1
2 b            1      2        2      2          3        2          4        2
3 c            1      3        2      3          3        3          4        3
4 d            1      4        2      4          3        4          4        4
5 e            1      5        2      5          3        5          4        5

# ideal data
# A tibble: 5 x 9
  id    visits_1  data_1  visits_2  data_2 visits_3    data_3    visits_4  data_4
  <chr>    <int>  <int>    <int>  <int>      <int>    <int>      <int>    <int>
1 a            1      1        2      1          3        1          4        1
2 b            1      2        2      2          3        2          4        2
3 c            1      3        2      3          3        3          4        3
4 d            1      4        2      4          3        4          4        4
5 e            1      5        2      5          3        5          4        5

CodePudding user response:

You can rename the columns first, and then merge:

df %>%
  group_split(visits) %>%
  imap(function(x, y) x %>% 
         rename_with(~paste(., y, sep = '_'), -id)) %>%
  reduce(left_join, by = 'id')

output

  id    visits_1 data_1 visits_2 data_2 visits_3 data_3 visits_4 data_4
  <chr>    <int>  <int>    <int>  <int>    <int>  <int>    <int>  <int>
1 a            1      1        2      1        3      1        4      1
2 b            1      2        2      2        3      2        4      2
3 c            1      3        2      3        3      3        4      3
4 d            1      4        2      4        3      4        4      4
5 e            1      5        2      5        3      5        4      5

data

df <- data.frame(id= rep(letters[1:5],3,20), visits = rep(1:4,each=5), data =rep(1:5,3,20))
  •  Tags:  
  • Related