Given this sample dataframe named df:
df = pd.DataFrame({'name': ['Mary', 'Joe', 'Jessie'], 'score': [10, 3, 13]})
     name  score
     Mary     10
      Joe      3
   Jessie     13
I am trying to sort the dataframe by a column's position rather than by its name.
The typical way to sort a dataframe:
df = df.sort_values['score']
Trying to sort it like this (which is not working):
df = df.sort_values[df.iloc[:, 1]]
This raises a "KeyError" with no explanation of what it refers to.
I need to do this because the function containing this code will see a different name for the second column each time it runs. I cannot hard-code a column name for sorting, so I need to sort by whatever the second column is, regardless of its name.
Thanks for taking a moment to check this out.
CodePudding user response:
sort_values is a method, not an indexer, so it has to be called with () rather than []. You wrote it with [], but that does not seem to be the underlying problem.
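To illustrate the difference between subscripting and calling the method, here is a minimal sketch using the sample dataframe from the question. The variable names error_name and sorted_df are only illustrative. Subscripting a bound method fails in Python itself with a TypeError, before pandas does any sorting.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Mary', 'Joe', 'Jessie'], 'score': [10, 3, 13]})

# Square brackets try to index the method object itself, which Python rejects.
try:
    df.sort_values['score']
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

print(error_name)

# Parentheses call the method and return a sorted copy.
sorted_df = df.sort_values('score')
print(sorted_df)
```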
If you want to sort your dataframe by the second column whatever the name, use:
>>> df.sort_values(df.columns[1])
     name  score
1     Joe      3
0    Mary     10
2  Jessie     13
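Since the question mentions that this runs inside a function where the second column's name changes, the lookup could be wrapped roughly as follows. This is only a sketch: sort_by_position is a hypothetical helper name, the column name 'points' is made up to show that the name does not matter, and the code assumes the column labels are unique.

```python
import pandas as pd


def sort_by_position(frame: pd.DataFrame, position: int = 1) -> pd.DataFrame:
    """Sort by the column at a zero-based position (hypothetical helper)."""
    # Translate the position into the label, then sort by that label.
    column_label = frame.columns[position]
    return frame.sort_values(column_label)


# The second column is deliberately named differently from the question.
df = pd.DataFrame({'name': ['Mary', 'Joe', 'Jessie'], 'points': [10, 3, 13]})
result = sort_by_position(df)
print(result)
```

The original row labels are kept, so the index of the result reads 1, 0, 2; passing ignore_index=True to sort_values would renumber the rows instead.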
CodePudding user response:
One way could be to set_index with the desired column, sort_index, and then reset the index to the default integer range:
df = df.set_index(df.iloc[:,1]).sort_index().reset_index(drop=True)
As @Neither suggests, we could pass ignore_index=True to sort_index to skip the separate reset_index call:
df = df.set_index(df.iloc[:,1]).sort_index(ignore_index=True)
Output:
     name  score
0     Joe      3
1    Mary     10
2  Jessie     13
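As a quick check, the two variants can be run side by side on the sample data. This is a sketch under the assumption that the installed pandas supports ignore_index in sort_index (added in pandas 1.0, as far as I know); the names key, via_reset and via_ignore are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'name': ['Mary', 'Joe', 'Jessie'], 'score': [10, 3, 13]})

# The second column, selected by position, becomes a temporary index.
key = df.iloc[:, 1]

# Variant 1: sort on the temporary index, then drop it for a fresh 0..n-1 index.
via_reset = df.set_index(key).sort_index().reset_index(drop=True)

# Variant 2: let sort_index discard the temporary index directly.
via_ignore = df.set_index(key).sort_index(ignore_index=True)

print(via_reset)
print(via_ignore)
```

Because the index is set from a Series rather than a column label, the score column itself stays in the frame in both variants.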
