List comprehension with multiple conditions on different columns

Time:01-24

I have the following DataFrame:

import pandas as pd

data = [['Male', 'Agree'], ['Male', 'Agree'], ['Male', 'Disagree'], ['Female', 'Neutral']]

df = pd.DataFrame(data, columns=['Sex', 'Opinion'])
df

I would like to get the total number of Males who either Agree or Disagree. I expect the answer to be 3 but instead get 9.

sum([True for x in df['Opinion'] for y in df['Sex'] if x in ['Agree','Disagree'] if y=='Male'])

I have already done this with other methods; I'm trying to understand list comprehensions better.

CodePudding user response:

Let's unpack this a bit. The original statement

total = sum([True for x in df['Opinion'] for y in df['Sex'] if x in ['Agree','Disagree'] if y=='Male'])

is equivalent to

total = 0
for x in df['Opinion']:
    for y in df['Sex']:
        if x in ['Agree', 'Disagree']:
            if y=='Male':
                total += 1

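Written this way, it should be clear why you get 9: the nested loops visit all 4 × 4 = 16 combinations of an Opinion with a Sex, not just the 4 rows. Three opinions are Agree or Disagree and three sexes are Male, so 3 × 3 = 9 combinations pass both tests.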
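To check that count, here is a minimal sketch that uses plain lists holding the same values as the question's two columns (so pandas is not needed). The names opinions, sexes, nested_pairs and nested_total are made up for illustration.

```python
from itertools import product

# Same values as the question's columns, as plain lists (illustrative names).
opinions = ['Agree', 'Agree', 'Disagree', 'Neutral']
sexes = ['Male', 'Male', 'Male', 'Female']

# Two `for` clauses visit every (opinion, sex) combination, not row-wise pairs.
nested_pairs = [(x, y) for x in opinions for y in sexes]
assert nested_pairs == list(product(opinions, sexes))

# Count how many values pass each test on its own.
matching_opinions = sum(x in ['Agree', 'Disagree'] for x in opinions)
matching_sexes = sum(y == 'Male' for y in sexes)

# The question's comprehension, rewritten against the plain lists.
nested_total = sum(
    1 for x in opinions for y in sexes
    if x in ['Agree', 'Disagree'] if y == 'Male'
)

print(len(nested_pairs), matching_opinions * matching_sexes, nested_total)  # 16 9 9
```

The two independent tests multiply because every matching opinion gets paired with every matching sex.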

What you actually want is to consider only the corresponding pairs from two equal-sized iterables. Python's built-in zip does just this:

total = 0
for x,y in zip(df['Opinion'], df['Sex']):
    if x in ['Agree', 'Disagree'] and y=='Male':
        total += 1

or, as a comprehension (here a generator expression passed to sum):

total = sum(1 for x,y in zip(df['Opinion'], df['Sex']) if x in ['Agree', 'Disagree'] and y=='Male')

CodePudding user response:

Use:

In [109]: df[df.Sex.eq('Male') & df.Opinion.isin(['Agree', 'Disagree'])]['Sex'].count()
Out[109]: 3
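
As a sketch of why this works, the same filter can be split into named steps. The variable names is_male, has_opinion, mask and total are illustrative, and summing the mask is an alternative to indexing and then calling count(), not something the answer above uses.

```python
import pandas as pd

# Rebuild the question's DataFrame so the example is self-contained.
df = pd.DataFrame(
    [['Male', 'Agree'], ['Male', 'Agree'], ['Male', 'Disagree'], ['Female', 'Neutral']],
    columns=['Sex', 'Opinion'],
)

# Each condition is a boolean Series aligned on the row index.
is_male = df['Sex'].eq('Male')
has_opinion = df['Opinion'].isin(['Agree', 'Disagree'])

# `&` combines the two conditions row by row, so no cross-pairing can occur.
mask = is_male & has_opinion

# True counts as 1 when summed, which gives the number of matching rows.
total = int(mask.sum())
print(total)  # 3
```

Because both conditions are evaluated per row, this behaves like the zip version rather than the nested loops.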