Is it possible to merge columns with the same identifier and stack these on top of one another from the same dataframe. Also to then remove NA values as seen in the Desired DataFrame output?
Current DataFrame
id Col1 Col2 Col3
1 A NA C
1 NA B NA
2 NA B NA
2 A NA C
3 A NA C
Desired DataFrame
id NewCol
1 A
1 B
1 C
2 A
2 B
2 C
3 A
3 C
CodePudding user response:
Reshape from 'wide' to 'long' with pivot_longer while removing the NAs with values_drop_na and then do the arrange on all the columns
library(tidyr)
library(dplyr)
pivot_longer(df1, cols = starts_with('Col'), values_to = 'NewCol',
names_to = NULL, values_drop_na = TRUE) %>%
arrange(across(everything()))
-output
# A tibble: 8 × 2
id NewCol
<int> <chr>
1 1 A
2 1 B
3 1 C
4 2 A
5 2 B
6 2 C
7 3 A
8 3 C
data
df1 <- structure(list(id = c(1L, 1L, 2L, 2L, 3L), Col1 = c("A", NA,
NA, "A", "A"), Col2 = c(NA, "B", "B", NA, NA), Col3 = c("C",
NA, NA, "C", "C")), class = "data.frame", row.names = c(NA, -5L
))
CodePudding user response:
As an alternative we could use splitshakeshape packackge:
library(splitstackshape)
library(dplyr)
df %>%
merged.stack(df, id.vars="id",
var.stubs="Col", sep = ".") %>%
na.omit() %>%
select(-.time_1)
id Col
1: 1 A
2: 1 B
3: 1 C
4: 2 A
5: 2 B
6: 2 C
7: 3 A
8: 3 C
