Is there an easy way to sort a DataFrame by a combination of two columns (for example, their product) without creating a new column for that value? Given
df = pd.DataFrame([[4,1],[2,3]], columns=list('AB'))
|   | A | B |
|---|---|---|
| 0 | 4 | 1 |
| 1 | 2 | 3 |
I want to sort df by a function of columns A and B (e.g. the product A*B). Calling sort_values with a key function does not work, because the key is applied to each column in by separately and only ever receives one Series at a time.
Ideally, I would do something like:
df.sort_values(by=['A','B'], key=lambda a,b: a*b) # does not work
Right now I create an extra column named sort, as follows, and I am wondering whether that is necessary:
df['sort'] = df['A']*df['B']
df.sort_values(['sort'])
Thanks in advance.
Answer:
Use DataFrame.sort_index and pass the .get method of the multiplied Series as the key, so that each index label is mapped to its product:
df1 = df.sort_index(key=(df.A*df.B).get)
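A minimal sketch of why this works, assuming the index labels are unique: sort_index hands the index to the key function, and Series.get returns the product for each label, so the rows end up ordered by A*B. The names product and df_desc are only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[4, 1], [2, 3]], columns=list("AB"))

# Product of the two columns, carrying the same index labels as df.
product = df["A"] * df["B"]  # label 0 -> 4, label 1 -> 6

# The key receives df's index and returns the product for each label.
df1 = df.sort_index(key=product.get)

# The same key works for descending order.
df_desc = df.sort_index(key=product.get, ascending=False)

print(df1)
print(df_desc)
```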
Or use Series.argsort to get the row positions in sorted order and pass them to DataFrame.iloc:
df1 = df.iloc[(df.A*df.B).argsort()]
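As a rough check, the argsort approach can be compared with the helper-column approach from the question. The third row is added here only to make the reordering visible, and the names order and expected are illustrative.

```python
import pandas as pd

# Third row is an illustrative addition so that the order actually changes.
df = pd.DataFrame([[4, 1], [2, 3], [1, 2]], columns=list("AB"))

# argsort returns the integer positions that would sort the products (4, 6, 2).
order = (df["A"] * df["B"]).argsort()

# iloc selects rows by position, so no helper column is needed.
df1 = df.iloc[order]

# Reference result using a temporary column, dropped again after sorting.
expected = (
    df.assign(sort=df["A"] * df["B"])
      .sort_values("sort")
      .drop(columns="sort")
)

print(df1)
print(df1.equals(expected))
```

Unlike the sort_index variant, this one works purely by position, so it does not depend on the index labels being unique.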
