Pandas get min / max value of last X rows based on condition-CodePudding

I have a table of market data where each row represents a bar. One column is Accumulated Volume and the other is the Trend during the bar. My goal here is to create a third column to hold the min/max of the Accumulated Volume column based on the following criteria:

For consecutive 'Up' Trend bars, get the Max Accumulated Volume for those X number of consecutive bars.

For consecutive 'Down' Trend bars, get the Min Accumulated Volume for those X number of consecutive bars.

Here's an example with the original two columns and the expected outcome for the third column.

Expected Output

I tried to use group by with min / max aggregation but did not get the expected resulted.

CodePudding user response：

You need to create a new column to group by:

df['group'] = (df.Trend.shift(0) != df.Trend.shift(1)).cumsum()

Result:

Trend   group
Up      1
Up      1
Up      1
Up      1
Down    2
Down    2
Down    2
Down    2
Up      3
Up      3
Down    4
Up      5
Up      5

Then use those group values to min/max Accumulated volume based on the Trend value:

group_values = df.groupby(group).apply(
    lambda tdf: tdf['Accumulated Volume'].max() if tdf['Trend'].unique()[0] == 'Up' else tdf['Accumulated Volume'].min())
df = df.merge(
    group_values.to_frame('min/max'),
    how='left',
    left_on='group',
    right_index=True
)
df.drop('group',axis=1, inplace=True)

Final result is equal to your desired output:

Accumulated Volume  Trend   min/max
11                  Up      55
8                   Up      55
55                  Up      55
15                  Up      55
125                 Down    -140
-87                 Down    -140
-121                Down    -140
-140                Down    -140
-160                Up      86
86                  Up      86
8                   Down    8
119                 Up      119
-8                  Up      119