Here is an example of the dataset I'm working with:
df = pd.DataFrame({"competitorname": ["3 Musketeers", "Almond Joy"], "winpercent": [67.602936, 50.347546] }, index = [1, 2])
I am trying to see whether 3 Musketeers or Almond Joy has a higher winpercent. The code I wrote is:
more_popular = '3 Musketeers' if df.loc[df["competitorname"] == '3 Musketeers', 'winpercent'].values[0] > df.loc[df["competitorname"] == 'Almond Joy', 'winpercent'].values[0] else 'Almond Joy'
My question is
Can I select the values I am interested in without pandas returning a Series? Is there a way to just do
df[df["competitorname"] == 'Almond Joy', 'winpercent']
and then it would return a simple
50.347546
?
I know this doesn't make my code significantly shorter but I feel like I am missing something about getting values from pandas that would help me avoid constantly adding
.values[0]
CodePudding user response:
How about simply sorting the dataframe by "winpercent" and then taking the top row?
df.sort_values(by="winpercent", ascending=False, inplace=True)
then to see the winner's row
df.head(1)
or to get the values
df.iloc[0]["winpercent"]
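If re-sorting the DataFrame in place is not desirable, the same idea can be sketched without mutating it. This is only an illustration: the names winner_label and more_popular are made up here, and it assumes the index labels are unique so that .at returns a scalar.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "competitorname": ["3 Musketeers", "Almond Joy"],
        "winpercent": [67.602936, 50.347546],
    },
    index=[1, 2],
)

# idxmax returns the index label of the row with the largest winpercent
winner_label = df["winpercent"].idxmax()

# .at does scalar access by label, so no Series comes back here
more_popular = df.at[winner_label, "competitorname"]
print(more_popular)  # 3 Musketeers
```

Unlike sort_values(..., inplace=True), this leaves the row order of df untouched.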
CodePudding user response:
If you're sure that the returned Series has a single element, you can simply use .item() to get it:
import pandas as pd
df = pd.DataFrame({
"competitorname": ["3 Musketeers", "Almond Joy"],
"winpercent": [67.602936, 50.347546]
}, index = [1, 2])
s = df.loc[df["competitorname"] == 'Almond Joy', 'winpercent'] # a pandas Series
print(s)
# output
# 2 50.347546
# Name: winpercent, dtype: float64
v = df.loc[df["competitorname"] == 'Almond Joy', 'winpercent'].item() # a scalar value
print(v)
# output
# 50.347546
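When it is not certain that exactly one row matches, .item() raises a ValueError, so it may be worth wrapping the lookup. The helper below is a hypothetical sketch (single_winpercent is not a pandas function), and returning None on failure is just one possible choice.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "competitorname": ["3 Musketeers", "Almond Joy"],
        "winpercent": [67.602936, 50.347546],
    },
    index=[1, 2],
)

def single_winpercent(frame, name):
    """Return the winpercent for `name`, or None unless exactly one row matches."""
    matches = frame.loc[frame["competitorname"] == name, "winpercent"]
    try:
        # .item() only succeeds when the Series holds exactly one element
        return matches.item()
    except ValueError:
        # zero matches or more than one match
        return None

print(single_winpercent(df, "Almond Joy"))  # 50.347546
print(single_winpercent(df, "Snickers"))    # None
```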
CodePudding user response:
The underlying issue is that there could be multiple matches, so we will always need to extract the match(es) at some point in the pipeline:
Use Series.idxmax on the boolean mask
Since False is 0 and True is 1, using Series.idxmax on the boolean mask will give you the index of the first True:
df.loc[df['competitorname'].eq('Almond Joy').idxmax(), 'winpercent'] # 50.347546
This assumes there is at least 1 True match, otherwise it will return the first False.
Or use Series.item on the result
This is basically just an alias for Series.values[0]:
df.loc[df['competitorname'].eq('Almond Joy'), 'winpercent'].item() # 50.347546
This assumes there is exactly 1 True match, otherwise it will throw a ValueError.
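The idxmax caveat is easy to miss, because a lookup with no match still returns a number. Here is a small sketch of the pitfall and one possible guard; "Snickers" is an assumed name that is not in the example data, and falling back to None is an arbitrary choice.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "competitorname": ["3 Musketeers", "Almond Joy"],
        "winpercent": [67.602936, 50.347546],
    },
    index=[1, 2],
)

mask = df["competitorname"].eq("Snickers")  # no row matches

# With an all-False mask, idxmax still returns the first index label,
# so this silently reads the "3 Musketeers" row.
misleading = df.loc[mask.idxmax(), "winpercent"]
print(misleading)  # 67.602936

# Checking mask.any() first avoids the silent wrong answer.
value = df.loc[mask.idxmax(), "winpercent"] if mask.any() else None
print(value)  # None
```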
