I have a table of market data where each row represents a bar. One column is Accumulated Volume and the other is the Trend during the bar. My goal here is to create a third column to hold the min/max of the Accumulated Volume column based on the following criteria:
For consecutive 'Up' Trend bars, get the Max Accumulated Volume for those X number of consecutive bars.
For consecutive 'Down' Trend bars, get the Min Accumulated Volume for those X number of consecutive bars.
Here's an example with the original two columns and the expected outcome for the third column.
I tried to use group by with min / max aggregation but did not get the expected resulted.
CodePudding user response:
You need to create a new column to group by:
df['group'] = (df.Trend.shift(0) != df.Trend.shift(1)).cumsum()
Result:
Trend group
Up 1
Up 1
Up 1
Up 1
Down 2
Down 2
Down 2
Down 2
Up 3
Up 3
Down 4
Up 5
Up 5
Then use those group values to min/max Accumulated volume based on the Trend value:
group_values = df.groupby(group).apply(
lambda tdf: tdf['Accumulated Volume'].max() if tdf['Trend'].unique()[0] == 'Up' else tdf['Accumulated Volume'].min())
df = df.merge(
group_values.to_frame('min/max'),
how='left',
left_on='group',
right_index=True
)
df.drop('group',axis=1, inplace=True)
Final result is equal to your desired output:
Accumulated Volume Trend min/max
11 Up 55
8 Up 55
55 Up 55
15 Up 55
125 Down -140
-87 Down -140
-121 Down -140
-140 Down -140
-160 Up 86
86 Up 86
8 Down 8
119 Up 119
-8 Up 119
