I have a data frame that looks like this:
library(tidyverse)
data <- data.frame(POS = c(172367, 10), SNP = c("ATCG", "AG"), QUAL = c(30, 20))
data
#>      POS  SNP QUAL
#> 1 172367 ATCG   30
#> 2     10   AG   20
Created on 2022-02-02 by the reprex package (v2.0.1)
and I want to make it look like this
POS SNP QUAL
172367 A 30
172368 T 30
172369 C 30
172370 G 30
10 A 20
11 G 20
I want to split each multi-character SNP string into one row per character, and increment POS by one for each successive character so that every base gets its own position.
Any help is highly appreciated.
CodePudding user response:
You can do:
library(dplyr)
library(tidyr)
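data %>%
  # split after every base, giving one row per character
  separate_rows(SNP, sep = "(?<=[ACGT])") %>%
  # within each original POS, add 0, 1, 2, ... to get consecutive positions
  # (the \(x) lambda shorthand needs R >= 4.1)
  mutate(POS = ave(POS, POS, FUN = \(x) x + seq_along(x) - 1))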
# A tibble: 6 x 3
     POS SNP    QUAL
   <dbl> <chr> <dbl>
1 172367 A        30
2 172368 T        30
3 172369 C        30
4 172370 G        30
5     10 A        20
6     11 G        20
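Note that grouping by POS with ave() treats two input rows that share the same starting POS as a single group, so their offsets would keep counting across both rows. If that can happen in your data, a base R variant that computes the offset per original row may be safer. This is only a sketch under the assumption that SNP contains plain character strings with no separators; the names bases, n and result are made up for illustration.

```r
# Same example data as in the question
data <- data.frame(POS = c(172367, 10), SNP = c("ATCG", "AG"), QUAL = c(30, 20))

bases <- strsplit(data$SNP, "")   # list with one vector of single characters per row
n <- lengths(bases)               # number of characters in each original row

result <- data.frame(
  POS  = rep(data$POS, n) + sequence(n) - 1,  # start position plus a 0-based offset per row
  SNP  = unlist(bases),
  QUAL = rep(data$QUAL, n)
)
result
```

sequence(n) restarts at 1 for every original row, so the offsets do not depend on whether POS values are unique.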
