Home > Mobile >  pandas conditionally shift or forward fill another column by group
pandas conditionally shift or forward fill another column by group

Time:02-05

I have this data:

# df
ID     Col1 
A      -4   
A      -3
A      -2
A      -1
A       1
A       2
A    9992   
B      -4
B      -3   
B      -2
B      -1   
B       9   
B      13
B      21
B      38   
B    9992   

I want to create a new column Col_NEW with forward fill or rolling calculation based on Col1 by ID.

The final result:

# df
ID     Col1     Col_NEW
A      -4            -4
A      -3            -3
A      -2            -2
A      -1            -1
A       1             0
A       2             2
A    9992             3
B      -4            -4
B      -3            -3
B      -2            -2
B      -1            -1 
B       9             0
B      13            10
B      21            14
B      38            22
B    9992            39

The logic is as below:

  1. If Col is -4, -3, -2, -1, then Col_NEW are the same.
  2. Otherwise, Col_NEW will be the Col1 forward filling and plus 1.

I tried this but is there any way to create Col_NEW directly?

df["Col_NEW"] = df.groupby('ID')['Col1'].transform(lambda x: (x.shift(1)   1))

Any idea?

CodePudding user response:

You can first do the shift plus one

df["Col_NEW"] = df.groupby('ID')['Col1'].transform(lambda x: (x.shift(1)   1))

Then you correct for the negative values in 'Col1' as:

df['Col_NEW'] = df.apply(
    lambda x: x['Col1'] if xx['Col1']<0 else x['Col_NEW'], axis=1)

I assume the only negative values in your dataframe are [-1, -2, -3, -4]. If it is not the case, you will need to edit the condition in the last apply()call.

CodePudding user response:

Something like this, without any apply or any other iterations:

retain_mask = df['Col1'].isin([-4,-3,-2,-1])
df["Col_NEW"] =df.groupby('ID')['Col1'].shift() 1
df.loc[retain_mask,"Col_NEW"] = df.loc[retain_mask,'Col1']

output:

    ID      Col1    Col_NEW
--  ----  ------  ---------
 0  A         -4         -4
 1  A         -3         -3
 2  A         -2         -2
 3  A         -1         -1
 4  A          1          0
 5  A          2          2
 6  A       9992          3
 7  B         -4         -4
 8  B         -3         -3
 9  B         -2         -2
10  B         -1         -1
11  B          9          0
12  B         13         10
13  B         21         14
14  B         38         22
15  B       9992         39

Using where

Same idea, slightly more concise:

retain_mask = df['Col1'].isin([-4,-3,-2,-1])
df["Col_NEW"] = df['Col1'].where(retain_mask, df.groupby('ID')['Col1'].shift() 1)

CodePudding user response:

Combine groupby.shift with Series.where. This keeps the values where cond is met, and otherwise uses other:

cond = df['Col1'].isin([-1, -2, -3, -4])
other = df.groupby('ID')['Col1'].shift().add(1)

df['Col_NEW'] = df['Col1'].where(cond, other)

Output:

   ID  Col1  Col_NEW
0   A    -4       -4
1   A    -3       -3
2   A    -2       -2
3   A    -1       -1
4   A     1        0
5   A     2        2
6   A  9992        3
7   B    -4       -4
8   B    -3       -3
9   B    -2       -2
10  B    -1       -1
11  B     9        0
12  B    13       10
13  B    21       14
14  B    38       22
15  B  9992       39
  •  Tags:  
  • Related