I have data on cooking oils and their boiling temperatures, and I am trying to rank the oils from highest to lowest boiling temperature. I'm using the code below:
df['ranking']=df['boil_temp'].rank(ascending=False)
df = df.set_index('ranking')
df_ranked = df.sort_values(by=['ranking'],ascending=True)
print(df_ranked)
oil boil_temp
Ranking
1.0 avocado 270
2.0 sunflower 252
4.0 beef_tallow 250
4.0 butter_clarified 250
4.0 mustard 250
6.0 palm 235
7.0 corn 230
8.0 grapeseed 216
9.0 canola 204
10.0 coconut 200
11.0 olive_ev 160
12.0 butter 150
But I want the ranking to look like this:
oil boil_temp
Ranking
1.0 avocado 270
2.0 sunflower 252
3.0 beef_tallow 250
3.0 butter_clarified 250
3.0 mustard 250
4.0 palm 235
5.0 corn 230
6.0 grapeseed 216
7.0 canola 204
8.0 coconut 200
9.0 olive_ev 160
10.0 butter 150
What should I do?
CodePudding user response:
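By default, rank uses method='average': when N records tie on the value in the column, each of them receives the average of the positions they occupy, and the next distinct value continues from the position after the tied group. That is why the three oils at 250 all get 4.0 (the average of positions 3, 4 and 5) and palm gets 6.0, so ranks 3 and 5 never appear.
With method='dense', tied records share a single rank and the next distinct value gets the very next rank, so no ranks are skipped.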
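To make the difference concrete, here is a small sketch that ranks the first six temperatures from the question with three tie-breaking methods side by side. The variable names (temps, comparison) are only illustrative and not part of the original code.

```python
import pandas as pd

# First six boiling temperatures from the question's table.
temps = pd.Series(
    [270, 252, 250, 250, 250, 235],
    index=["avocado", "sunflower", "beef_tallow",
           "butter_clarified", "mustard", "palm"],
)

comparison = pd.DataFrame({
    # Default: tied rows get the average of the positions they occupy.
    "average": temps.rank(ascending=False),
    # Tied rows get the lowest position in the group; later ranks still skip.
    "min": temps.rank(ascending=False, method="min"),
    # Tied rows share a rank and the next value gets the next integer.
    "dense": temps.rank(ascending=False, method="dense"),
})
print(comparison)
```

Only the dense column gives palm rank 4; the other two methods leave a gap after the tied group.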
In addition to what @bb1 suggested:
>>> df['Ranking'] = df['boil_temp'].rank(ascending=False, method='dense')
>>> df = df.sort_values(by='Ranking', ascending=True)
>>> df.reset_index(drop=True, inplace=True)
>>> print(df)
Ranking Oil boil_temp
0 1.0 avocado 270
1 2.0 canola 260
2 3.0 sunflower 252
3 4.0 beef_tallow 250
4 4.0 butter_clarified 250
5 4.0 mustard 250
6 5.0 palm 235
7 6.0 corn 230
8 7.0 grapeseed 216
9 8.0 coconut 200
10 9.0 olive_ev 160
11 10.0 butter 150
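For reference, a self-contained sketch using the values from the question's own table (canola at 204) may be easier to compare against the desired output. Casting the rank to int and using the secondary sort key "oil" to order tied rows are optional choices made here for illustration, not something the question asked for.

```python
import pandas as pd

# Data copied from the table in the question.
df = pd.DataFrame({
    "oil": ["avocado", "sunflower", "beef_tallow", "butter_clarified",
            "mustard", "palm", "corn", "grapeseed", "canola", "coconut",
            "olive_ev", "butter"],
    "boil_temp": [270, 252, 250, 250, 250, 235, 230, 216, 204, 200, 160, 150],
})

# Dense ranking: ties share a rank and no rank numbers are skipped.
# astype(int) is safe here because boil_temp has no missing values.
df["Ranking"] = df["boil_temp"].rank(ascending=False, method="dense").astype(int)

# Sort by rank (then by name within a tie) and use the rank as the index.
df_ranked = df.sort_values(by=["Ranking", "oil"]).set_index("Ranking")
print(df_ranked)
```

With this data the three oils at 250 all get rank 3 and butter ends at rank 10, matching the layout requested in the question.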
