If I do a conditional match on my df, the .sum() function works perfectly:
print(((df_sheet6[df_sheet6.columns[2]].isin(["Strongly agree","Agree"])) & (df_sheet6[df_sheet6.columns[1]] =='Female')).sum(skipna=True))
Whereas if I do a .sum() without any conditional match on my df, it does not. Instead, it just prints out a concatenation like this: MaleFemaleFemale... etc.
print(df_sheet6[df_sheet6.columns[1]].sum())
I got the result I wanted by using count, shown below, but I would like to learn the reason why.
print(df_sheet6[df_sheet6.columns[1]].count())
Thank you!
CodePudding user response:
It seems that the second column of your DataFrame, df_sheet6.columns[1], is composed of strings (e.g. 'Male' or 'Female'). Calling .sum() on strings does not fail; it adds them with +, which for strings means concatenation, so you get MaleFemaleFemale... instead of a number. .count(), on the other hand, returns the number of non-null values in the column, which is why it gave you what you wanted. As for your first example, .isin() and the == comparison each return a boolean Series of True and False values. As Henry Ecker already pointed out in the comments, those are interpreted as 1 and 0, so summing them counts the rows where the condition holds.
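Here is a minimal sketch of the three behaviours side by side. The DataFrame is made-up sample data standing in for df_sheet6 (the column names and values are assumptions, not taken from your sheet), so treat it as an illustration rather than your exact case:

```python
import pandas as pd

# Hypothetical stand-in for df_sheet6: column 1 holds gender, column 2 holds the answer.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "gender": ["Male", "Female", "Female", None, "Female"],
    "answer": ["Agree", "Strongly agree", "Disagree", "Agree", "Agree"],
})

gender = df[df.columns[1]]
answer = df[df.columns[2]]

# Boolean Series: summing it counts the True values (True == 1, False == 0).
mask = answer.isin(["Strongly agree", "Agree"]) & (gender == "Female")
matches = mask.sum()

# count() returns the number of non-null entries, whatever their type.
non_null = gender.count()

# sum() on strings concatenates them; missing values are dropped first here.
concatenated = gender.dropna().sum()

# If the goal is a tally per category, value_counts() is usually the better fit.
per_category = gender.value_counts()

print(matches, non_null, concatenated)
print(per_category)
```

If what you actually want is the number of rows per gender, value_counts() is probably closer to your intent than either sum() or count().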
