I don't understand why the dataframe name has to be written twice when selecting rows with a condition in pandas. For example, if I have a dataframe df:
| name | age |
|---|---|
| Alice | 31 |
| Bob | 21 |
When I want to select the rows for people over 30, I have to write
over_thirty = df[df.age > 30]. Why not simply df['age' > 30]?
CodePudding user response:
Use .query
over_thirty = df.query("age > 30")
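For anyone who wants to try this, here is a minimal runnable sketch. It assumes the dataframe from the question is built by hand with pd.DataFrame; the variable same_rows is only there for comparison and is not part of the original answer.

```python
import pandas as pd

# Rebuild the example dataframe from the question (assumed construction).
df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [31, 21]})

# query() evaluates the string against the frame's own columns,
# so the dataframe name is written only once.
over_thirty = df.query("age > 30")

# The boolean-mask form from the question, for comparison.
same_rows = df[df.age > 30]

print(over_thirty)
```

Both forms select the same rows; query just moves the column lookup inside the string expression.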
CodePudding user response:
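If you write df.age > 30 on its own, the output is a Series of True/False values, one per row, which I'm sure is not what you need. Wrapping it in df[...] keeps only the rows where the value is True.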
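A small sketch of what happens step by step, assuming the example dataframe from the question. The names mask and plain_comparison_failed are made up for illustration. It also shows why df['age' > 30] cannot work: Python evaluates 'age' > 30 as a plain string-versus-integer comparison before pandas ever sees it, and in Python 3 that raises a TypeError.

```python
import pandas as pd

# Rebuild the example dataframe from the question (assumed construction).
df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [31, 21]})

# Step 1: the comparison alone yields a boolean Series, one value per row.
mask = df.age > 30
print(mask)

# Step 2: indexing the frame with that Series keeps the rows marked True.
over_thirty = df[mask]
print(over_thirty)

# The shorthand from the question fails before pandas is involved:
# the str-to-int comparison is evaluated by Python itself.
try:
    "age" > 30
    plain_comparison_failed = False
except TypeError:
    plain_comparison_failed = True
print(plain_comparison_failed)
```

So the first df names the frame to filter, and the second df is needed to build the True/False mask from its age column.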
