Conditionally assign values from one dataframe to another [R]-CodePudding

I am learning R. I know how to join dataframes based on various criteria and using various methods - all of that if they have something in common (e.g. column). But what if they do not?

I have a two example dataframes: df_data and df_categories (code provided below). I would like to modify df_data by adding new columns based on the values in df_categories. The dataframes do not share a common column.

#first dataframe (df_category):
group_vec <- c(rep("nerd", 5),
               rep("bum", 5),
               rep("hipster", 5),
               rep("metalhead", 5),
               rep("geek", 5),
               rep("hooligan", 5))
sample_vec <- c("n1", "n2", "n3", "n4", "n5",
                "b1", "b2", "b3", "b4", "b5",
                "h1", "h2", "h3", "h4", "h5",
                "m1", "m2", "m3", "m4", "m5",
                "g1", "g2", "g3", "g4", "g5",
                "ho1", "ho2", "ho3", "ho4", "ho5")
df_category <- data.frame(group_vec, sample_vec)

#second dataframe (df_data):
group <- c(rep("A", 2), 
           rep("B", 5), 
           rep("C", 4), 
           rep("D", 3))
value <- c(20, 19, 11, 8, 9, 13, 10, 7, 6, 7, 5, 17, 16, 18)
sample <- c("one1", "two1", 
            "one2", "two2","thr2", "fou2", "fiv2", 
            "one3", "two3","thr3", "fou3",
            "one4", "two4","thr4")

df_data <- data.frame(group, sample, value)

In a df_data, I would like to create a column named "gr_vec", which would contain values from df_category$group_vec - for each unique value in df_data$group.

In a df_data, I would also like to create a column named "sam_vec", which would contain values from df_category$sample_vec - for each unique value in df_data$sample.

Here is the desired output:

#output dataframe: 
group <- c(rep("A", 2), 
           rep("B", 5), 
           rep("C", 4), 
           rep("D", 3))
gr_vec <- c(rep("nerd", 2),
            rep("bum", 5),
            rep("hipster", 4),
            rep("metalhead", 3))
sample <- c("one1", "two1", 
            "one2", "two2","thr2", "fou2", "fiv2", 
            "one3", "two3","thr3", "fou3",
            "one4", "two4","thr4")
sam_vec <- c("n1", "n2",
                "b1", "b2", "b3", "b4", "b5",
                "h1", "h2", "h3", "h4",
                "m1", "m2", "m3")
value <- c(20, 19, 11, 8, 9, 13, 10, 7, 6, 7, 5, 17, 16, 18)

df_out <- data.frame(group, gr_vec, sample, sam_vec, value)

I tried with match, rle and case_when, but was unable to assign the values correctly. I did not find a similar question, so I posted it here.

CodePudding user response：

I think you can try this approach with merging the two data.frames. Since they have no columns in common, you can create columns to merge on, based on your groups and position within each group.

First, use rleid to provide unique id's for each group in both data.frames. Then, use rowid to enumerate rows within each group. Then merge.

library(data.table)

df_data$group_num <- rleid(df_data$group)
df_data$sub_num <- rowid(df_data$group)

df_category$group_num <- rleid(df_category$group_vec)
df_category$sub_num <- rowid(df_category$group_vec)

merge(df_data, df_category, by = c("group_num", "sub_num"))

Output

   group_num sub_num group sample value group_vec sample_vec
1          1       1     A   one1    20      nerd         n1
2          1       2     A   two1    19      nerd         n2
3          2       1     B   one2    11       bum         b1
4          2       2     B   two2     8       bum         b2
5          2       3     B   thr2     9       bum         b3
6          2       4     B   fou2    13       bum         b4
7          2       5     B   fiv2    10       bum         b5
8          3       1     C   one3     7   hipster         h1
9          3       2     C   two3     6   hipster         h2
10         3       3     C   thr3     7   hipster         h3
11         3       4     C   fou3     5   hipster         h4
12         4       1     D   one4    17 metalhead         m1
13         4       2     D   two4    16 metalhead         m2
14         4       3     D   thr4    18 metalhead         m3