Identify an increase in value in a sequence and revert it based on a condition in R

Time:01-25

I am working with a sequence of entries that occurred over a span of days. A slice of the data for a given id looks like the example below. Where event_order increases and the type_of_event in the same row is 6, 7 or 8, I would like event_order to keep its previous value (e.g. if event_order increases from 2 to 3 but the type_of_event is 7, then change the 3 back to 2).

id <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
event_order <- c(1,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4)
type_of_event <- c(0,1,3,4,0,0,6,7,7,7,8,7,7,7,8,8)

df <- data.frame(id, event_order, type_of_event)
df

id        event_order    type_of_event
1          1              0
1          1              1
1          1              3
1          1              4
1          2              0
1          2              0
1          2              6
1          2              7
1          3              7    <--
1          3              7
1          3              8
1          3              7
1          3              7
1          4              7    <--
1          4              8
1          4              8

The desired output is below. The change that occurs is in the event_order column.

id        event_order    type_of_event
1          1              0
1          1              1
1          1              3
1          1              4
1          2              0
1          2              0
1          2              6
1          2              7
1          2              7   
1          2              7
1          2              8
1          2              7
1          2              7
1          2              7
1          2              8
1          2              8

I searched for a similar Q&A but could not find one that fits my specific problem. Any help or guidance would be appreciated.

Edit: I tried the following code, provided in one of the answers:

df[df$type_of_event %in% c(6:8),]$event_order <- head(df[df$type_of_event %in% c(6:8), ]$event_order, 1)

The code works for the example data I gave. However, I don't want the first value of event_order to be applied, but rather the most recent (this may not have been clear in my initial wording). Below is an example of an instance in the data where the code does not necessarily provide the desired output.

Data:

id <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
event_order <- c(1,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4)
type_of_event <- c(0,1,3,4,0,0,6,4,0,7,8,7,7,7,8,8)

df <- data.frame(id, event_order, type_of_event)
df

id        event_order    type_of_event
1          1              0
1          1              1
1          1              3
1          1              4
1          2              0
1          2              0
1          2              6
1          2              4
1          3              0    <-- changed to non-6/7/8 value
1          3              7
1          3              8
1          3              7
1          3              7
1          4              7    <--
1          4              8
1          4              8

When I apply the code, the following occurs:

id        event_order    type_of_event
1          1              0
1          1              1
1          1              3
1          1              4
1          2              0
1          2              0
1          2              6
1          2              4
1          3              0    <--
1          2              7
1          2              8
1          2              7
1          2              7
1          2              7
1          2              8
1          2              8

This is the desired output:

id        event_order    type_of_event
1          1              0
1          1              1
1          1              3
1          1              4
1          2              0
1          2              0
1          2              6
1          2              4
1          3              0
1          3              7
1          3              8
1          3              7
1          3              7
1          3              7
1          3              8
1          3              8

CodePudding user response:

Here is a possible base R solution.
First, create a vector marking the rows where event_order changes. Use it as a logical index to find the segments whose first row has a type_of_event of 6, 7 or 8. Each of those segments then takes the value of the previous segment, which may itself have been reverted already.

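# Flag the rows where event_order changes (the first row counts as a start).
d <- c(1, diff(df$event_order))
# Segments whose first row has type_of_event 6, 7 or 8.
inx <- which(df$type_of_event[as.logical(d)] %in% 6:8)
# One list element per segment (assumes event_order never returns to an earlier value).
sp <- split(df$event_order, df$event_order)
for(i in inx){
  if(i > 1L){
    # Overwrite in place so the segment keeps its own length.
    sp[[i]][] <- sp[[i - 1]][1]
  }
}
df$event_order <- unlist(sp, use.names = FALSE)
df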
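
The same idea can also be sketched with rle(), which works on runs of consecutive equal values instead of splitting by value, and which can be applied per id. This is only a sketch under a few assumptions: the helper name revert_event_order and its revert_types argument are made up for illustration, rows are assumed to be in chronological order within each id, and only a run that starts with an increase is reverted. The data is the second example from the question.

```r
# Hypothetical helper (name and arguments are illustrative).
# Reverts every run of event_order that starts with an increase on a row
# whose type_of_event is one of revert_types.
revert_event_order <- function(event_order, type_of_event, revert_types = 6:8) {
  r <- rle(event_order)                          # runs of consecutive equal values
  starts <- cumsum(c(1L, head(r$lengths, -1L)))  # row index where each run begins
  increased <- c(FALSE, diff(r$values) > 0)      # the first run has no predecessor
  revert <- increased & type_of_event[starts] %in% revert_types
  vals <- r$values
  for (k in which(revert)) {
    vals[k] <- vals[k - 1L]                      # previous run, possibly already reverted
  }
  rep(vals, r$lengths)
}

# Second example from the question.
id <- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
event_order <- c(1,1,1,1,2,2,2,2,3,3,3,3,3,4,4,4)
type_of_event <- c(0,1,3,4,0,0,6,4,0,7,8,7,7,7,8,8)
df <- data.frame(id, event_order, type_of_event)

# Apply the helper separately to each id, then restore the original row order.
df$event_order <- unsplit(
  Map(revert_event_order,
      split(df$event_order, df$id),
      split(df$type_of_event, df$id)),
  df$id
)
print(df$event_order)
# [1] 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3
```

Because each reverted run copies the value of the run before it, which may itself have been reverted, the most recent value is carried forward rather than the first one.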

CodePudding user response:

Is this what you're looking for? It takes the event_order of the first row whose type_of_event is 6, 7 or 8 and applies it to every such row.

df[df$type_of_event %in% c(6:8),]$event_order <- head(df[df$type_of_event %in% c(6:8), ]$event_order, 1)

   id event_order type_of_event
1   1           1             0
2   1           1             1
3   1           1             3
4   1           1             4
5   1           2             0
6   1           2             0
7   1           2             6
8   1           2             7
9   1           2             7
10  1           2             7
11  1           2             8
12  1           2             7
13  1           2             7
14  1           2             7
15  1           2             8
16  1           2             8