Using for loop with pandas DataFrame

Time:01-24

I'm trying to make a loop like the following:

x_list = df['Column1'].unique()
for x in x_list:
    y = df.query('Column1 == "x" and Column2 == "No"')
    y_count = y['Column1'].count()
    print ('Total number of {} is {}.' .format(x, y_count))

However, y_count always comes out as zero:

e.g.,
Total number of x1 is 0.
Total number of x2 is 0.
Total number of x3 is 0.
etc.

What would be the problem?

Thanks in advance.

CodePudding user response:

I seldom use query, but my guess is that 'Column1 == "x"' is interpreted as selecting the rows where Column1 equals the literal string "x", not the value of your x variable.

Try this instead:

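x_list = df['Column1'].unique()
for x in x_list:
    # Quote the substituted value so query compares against a string
    # literal (assumes Column1 holds strings); without the quotes,
    # query would look for a column named after the value of x.
    y = df.query('Column1 == "{}" and Column2 == "No"'.format(x))
    y_count = y['Column1'].count()
    print('Total number of {} is {}.'.format(x, y_count))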
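The difference between a quoted literal and a variable reference can be checked on a small frame. This is a minimal sketch, assuming hypothetical sample data that reuses the column names from the question. It uses the `@` prefix that query supports for referring to a local Python variable, which is one alternative to building the string with format.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Column1": ["x1", "x1", "x2", "x3", "x3", "x3"],
    "Column2": ["No", "Yes", "No", "No", "No", "Yes"],
})

x = "x1"

# "x" in quotes is a string literal, so no row matches it.
literal = df.query('Column1 == "x" and Column2 == "No"')

# @x refers to the local Python variable x.
by_variable = df.query('Column1 == @x and Column2 == "No"')

literal_count = len(literal)
variable_count = len(by_variable)
print(literal_count, variable_count)
```

With this sample data, only the second query finds the single "x1" row whose Column2 is "No".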
Or consider this:
for x, sub_df in df[df['Column2']=='No'].groupby('Column1'):
    y_count = len(sub_df)
    print('Total number of {} is {}.' .format(x, y_count))
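Note that grouping the filtered frame only yields the Column1 values that have at least one "No" row, so values with no match are not printed at all. If a zero count is wanted for those, one possible approach is sketched here, assuming hypothetical sample data; it counts with value_counts and then reindexes against every unique Column1 value.

```python
import pandas as pd

# Hypothetical sample data; "x4" has no "No" rows on purpose.
df = pd.DataFrame({
    "Column1": ["x1", "x1", "x2", "x3", "x3", "x4"],
    "Column2": ["No", "Yes", "No", "No", "No", "Yes"],
})

# Count the "No" rows for each Column1 value.
mask = df["Column2"] == "No"
counts = df.loc[mask, "Column1"].value_counts()

# Add the Column1 values that had no "No" rows, with a count of 0.
counts = counts.reindex(df["Column1"].unique(), fill_value=0)

for x, y_count in counts.items():
    print("Total number of {} is {}.".format(x, y_count))
```

This avoids calling query once per value, and the reindex step is only needed when the zero rows matter.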