Is there a way to have pandas.get_dummies output the numerical representation in one column rather than a separate column for each option?
Currently, pandas.get_dummies gives me a column for every option:
| Size | Size_Big | Size_Medium | Size_Small |
|---|---|---|---|
| Big | 1 | 0 | 0 |
| Medium | 0 | 1 | 0 |
| Small | 0 | 0 | 1 |
But I'm looking for output more like the following:
| Size | Size_Numerical |
|---|---|
| Big | 1 |
| Medium | 2 |
| Small | 3 |
CodePudding user response:
You don't want dummies, you want factors/categories.
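One way to get factor-style codes is pd.factorize. This is a minimal sketch, assuming the data is a Series of size labels (the variable names are made up for illustration). The codes are 0-based and follow order of first appearance, so 1 is added to match the numbering in the question:

```python
import pandas as pd

# Hypothetical sample data mirroring the question
sizes = pd.Series(["Big", "Medium", "Small", "Big"])

# factorize returns integer codes plus the unique labels they refer to
codes, uniques = pd.factorize(sizes)

# Codes start at 0; shift by 1 to get 1, 2, 3 as in the question
size_numerical = codes + 1
```

Here `uniques` holds the labels, so `uniques[code]` maps a 0-based code back to its original value.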
CodePudding user response:
You can convert the values to the Categorical type and read the integer codes:
import pandas as pd
pd.Categorical(['A', 'B', 'C', 'A', 'C']).codes
Output:
array([0, 1, 2, 0, 2], dtype=int8)
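Applied to the table in the question, this might look like the sketch below. The DataFrame and the explicit `categories` list are assumptions for illustration; passing `categories` fixes which label gets which code instead of relying on sorted order. Since the codes start at 0, 1 is added to get the 1, 2, 3 numbering asked for:

```python
import pandas as pd

# Hypothetical frame matching the question's table
df = pd.DataFrame({"Size": ["Big", "Medium", "Small"]})

# Explicit category order: Big -> 0, Medium -> 1, Small -> 2
size_cat = pd.Categorical(df["Size"], categories=["Big", "Medium", "Small"])

# Shift the 0-based codes so numbering starts at 1
df["Size_Numerical"] = size_cat.codes + 1
```

Note that any value not listed in `categories` gets the code -1, which would become 0 after the shift.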

