I have the DataFrame below and want to find the count of y and n for each column:
| ID | var1 | var2 |
|---|---|---|
| 1 | y | |
| 2 | n | y |
| 3 | y | n |
| 4 | y | n |
| 5 | y |
the result would be like this:
| var1_N | var2_N | |
|---|---|---|
| y | 3 | 2 |
| n | 1 | 2 |
I used transform function but was wondering there is a better way to get the results.
Thanks!
CodePudding user response:
You can just do value_counts on all columns you need to count using apply method, the results will be automatically joined on the index (or var value in your case):
df.filter(like='var').apply(lambda s: s.value_counts())
var1 var2
y 3 2
n 1 2
Or use pd.value_counts directly:
df.filter(like='var').apply(pd.value_counts)
var1 var2
y 3 2
n 1 2
CodePudding user response:
You can use melt to flatten your dataframe then use value_counts and unstack the variable column:
>>> df.melt('ID').value_counts(['variable', 'value']).unstack('variable')
variable var1 var2
value
n 1 2
y 3 2
You can remove index and column names: by appending ``
Output:
>>> df.melt('ID').value_counts(['variable', 'value']).unstack('variable') \
.rename_axis(index=None, columns=None)
var1 var2
n 1 2
y 3 2
