Consider the following data frame:
> df=tibble(x1=c(0,100,250,500,1000),x2=lead(x1),y1=c(-20,-10,20,40,60),y2=lead(y1),p=(y2-y1)/(x2-x1))
> df
# A tibble: 5 x 5
     x1    x2    y1    y2     p
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     0   100   -20   -10  0.1
2   100   250   -10    20  0.2
3   250   500    20    40  0.08
4   500  1000    40    60  0.04
5  1000    NA    60    NA NA
## alternatively
df <- structure(list(x1 = c(0, 100, 250, 500, 1000), x2 = c(100, 250, 500, 1000, NA), y1 = c(-20, -10, 20, 40, 60), y2 = c(-10, 20, 40, 60, NA), p = c(0.1, 0.2, 0.08, 0.04, NA)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"))
where each row can be interpreted as the segment from (x1, y1) to (x2, y2), with slope p.
I need to modify df so that whenever y1<0 and y2>0 the row is replaced by two rows that split the segment at the point where it crosses y=0:
# A tibble: 6 x 5
     x1    x2    y1    y2     p
  <dbl> <dbl> <dbl> <dbl> <dbl>
1     0   100   -20   -10  0.1
2   100   150   -10     0  0.2
3   150   250     0    20  0.2
4   250   500    20    40  0.08
5   500  1000    40    60  0.04
6  1000    NA    60    NA NA
Namely: x2 in row 2 and x1 in row 3 of the final data frame are both equal to 150, which is given by x1-y1/p (or some other equivalent formula) applied to row 2 of the original data frame.
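As a quick check of that formula, here is a minimal sketch for row 2 of the original data. The names seg and x0 are only illustrative and do not appear elsewhere in the thread. On a segment with slope p that starts at (x1, y1), y reaches 0 at x0 = x1 - y1/p.

```r
# Row 2 of the original df: the only segment with y1 < 0 and y2 > 0
seg <- list(x1 = 100, x2 = 250, y1 = -10, y2 = 20)

p  <- (seg$y2 - seg$y1) / (seg$x2 - seg$x1)  # slope of the segment
x0 <- seg$x1 - seg$y1 / p                    # x at which the segment crosses y = 0
x0
# [1] 150
```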
The difficult part is that I cannot find an efficient way (possibly in a pipe with dplyr) to extract a row, duplicate it, manipulate the duplicated rows, and substitute them for the original row.
CodePudding user response:
Here is one way you could do it:
library(dplyr)
df %>%
  mutate(id = row_number()) %>%                   # add a row id
  filter(y1 < 0 & y2 > 0) %>%                     # keep only the segments that cross y = 0
  slice(rep(1:n(), each = 2)) %>%                 # duplicate each of those rows
  group_by(id) %>%                                # group by id
  mutate(x2 = ifelse(row_number() == 1, x1 - y1/p, x2),  # first copy ends at the zero crossing
         y2 = ifelse(row_number() == 1, 0, y2)) %>%
  mutate(x1 = ifelse(row_number() == 2, lag(x2), x1),    # second copy starts where the first ends
         y1 = ifelse(row_number() == 2, lag(y2), y1)) %>%
  bind_rows(df %>% mutate(id = row_number())) %>% # bind with the original df, adding id again
  arrange(id) %>%                                 # arrange by id
  slice(-3) %>%                                   # drop the third row of each group (the unsplit original)
  ungroup()
     x1    x2    y1    y2     p    id
  <dbl> <dbl> <dbl> <dbl> <dbl> <int>
1     0   100   -20   -10  0.1      1
2   100   150   -10     0  0.2      2
3   150   250     0    20  0.2      2
4   250   500    20    40  0.08     3
5   500  1000    40    60  0.04     4
6  1000    NA    60    NA NA        5
CodePudding user response:
Not knowing how you calculate the new y1 and y2, here's a start:
library(dplyr)
df %>%
  mutate(nextx1 = lead(x1)) %>%
  filter(y1 < 0, y2 > 0) %>%
  rowwise() %>%
  summarize(                      # dplyr >= 1.1.0 warns here and suggests reframe()
    x1 = c( x1    , x1 - y1/p ),
    x2 = c( x1[2] , nextx1[1] ),  # x1 now refers to the new two-element x1
    y1 = c( y1    , NA ),         # you need to fix these, idk
    y2 = c( y2    , NA ),         #
    p
  ) %>%
  bind_rows(filter(df, y1 >= 0 | y2 <= 0)) %>%
  arrange(x1, x2)
# # A tibble: 6 x 5
#      x1    x2    y1    y2     p
#   <dbl> <dbl> <dbl> <dbl> <dbl>
# 1     0   100   -20   -10  0.1
# 2   100   150   -10    20  0.2
# 3   150   250    NA    NA  0.2
# 4   250   500    20    40  0.08
# 5   500  1000    40    60  0.04
# 6  1000    NA    60    NA NA
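The NA placeholders above can be filled in once you note that the split point lies on y = 0: the left half ends at y2 = 0 and the right half starts at y1 = 0. Below is a minimal sketch of that completion. It swaps the rowwise()/summarize() step for plain filter()/mutate()/bind_rows(), and the helper columns crosses and x0 are hypothetical names introduced only for this example.

```r
library(dplyr)

df <- tibble(x1 = c(0, 100, 250, 500, 1000), x2 = lead(x1),
             y1 = c(-20, -10, 20, 40, 60),   y2 = lead(y1),
             p  = (y2 - y1) / (x2 - x1))

flagged <- df %>%
  mutate(crosses = coalesce(y1 < 0 & y2 > 0, FALSE),  # NA comparisons count as "no crossing"
         x0      = x1 - y1 / p)                       # x where the segment hits y = 0

result <- bind_rows(
  flagged %>% filter(!crosses),                             # rows left untouched
  flagged %>% filter(crosses) %>% mutate(x2 = x0, y2 = 0),  # left half: ends at (x0, 0)
  flagged %>% filter(crosses) %>% mutate(x1 = x0, y1 = 0)   # right half: starts at (x0, 0)
) %>%
  arrange(x1) %>%
  select(-crosses, -x0)

result
```

Both halves keep the original slope p, so for the example data the single crossing row (100 to 250) becomes two rows meeting at x = 150, as in the desired output of the question.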
