Home > Mobile >  How to expand dataframe based on two columns?
How to expand dataframe based on two columns?

Time:01-04

I am not sure how to solve the following. I am having a dataframe which looks like this:

df <- structure(list(category = c(1, 2, 3, 4, 5), f1 = c(2, 3, 2, 3, 1), 
           f2 = c(1, 2, 4 ,1, 2)),  row.names = c(NA, -5L),
           class = c("tbl_df", "tbl", "data.frame"))

The numbers in the variables (except the variable named "category") represent subjects which belong in one of those categories and f1 and f2 are a variable split into these two variables. I would like to "melt" them and those 2 in f1 variable for example would become 2 rows within a variable called "f" and they will have the number 1 while keeping the category next to them which is 1 in that case. F2, category 1 has 1 observation and would become 1 row within again the variable "f" and category 1 but it will have the number 2 this time.

So f1 will always produce rows which will have 1 in the column "f" and f2 will produce rows which will always have "2" in column "f".

Below is a visual example of what I am trying to achieve:

      category    f 
    1     1       1
    2     1       1
    3     1       2
    4     2       1       
    5     2       1
    6     2       1
    7     2       2      
    8     2       2
    9     3       1
   10     3       1
   11     3       2
   12     3       2
   13     3       2
   14     3       2

Thank you in advance.

CodePudding user response:

df %>% 
  pivot_longer(cols=c("f1", "f2")) %>% 
  mutate(name=replace(name, name=="f1", 1)) %>%
  mutate(name=replace(name, name=="f2", 2)) %>%
  group_by(category,name,value) %>%
  expand(value=seq(1:value)) %>%
  rename(f=name) %>%
  as.data.frame() %>%
  select(-value)

    category f
1         1 1
2         1 1
3         1 2
4         2 1
5         2 1
6         2 1
7         2 2
8         2 2
9         3 1
10        3 1
11        3 2
12        3 2
13        3 2
14        3 2
15        4 1
16        4 1
17        4 1
18        4 2
19        5 1
20        5 2
21        5 2

CodePudding user response:

This is a perfect task for tidyr::uncount:

library(tidyr)
df %>% 
  pivot_longer(cols= c("f1", "f2"), names_to = "f", names_transform = list(f = seq)) %>% 
  uncount(value)

Output:

df
   category f
1         1 1
2         1 1
3         1 2
4         2 1
5         2 1
6         2 1
7         2 2
8         2 2
9         3 1
10        3 1
11        3 2
12        3 2
13        3 2
14        3 2
15        4 1
16        4 1
17        4 1
18        4 2
19        5 1
20        5 2
21        5 2

CodePudding user response:

With base R:

nGroups = length(df$category)
df <- data.frame(category = rep(seq(nGroups), df$f1   df$f2),
                 f = rep(rep(1:2, nGroups), c(rbind(df$f1, df$f2))))

   category f
1         1 1
2         1 1
3         1 2
4         2 1
5         2 1
6         2 1
7         2 2
8         2 2
9         3 1
10        3 1
11        3 2
12        3 2
13        3 2
14        3 2
15        4 1
16        4 1
17        4 1
18        4 2
19        5 1
20        5 2
21        5 2
  •  Tags:  
  • Related