| Name | yr |
|---|---|
| Dave | 2019 |
| Dave | 2020 |
| stu | 2019 |
| stu | 2021 |
and I want the result to look like this:
| Name | 2019_yr | 2020_yr | 2021_yr |
|---|---|---|---|
| Dave | 1 | 1 | 0 |
| stu | 1 | 0 | 1 |
I tried using get_dummies, but the result looks like this:
| Name | yr_2018 | yr_2019 | yr_2021 | yr_2022 |
|---|---|---|---|---|
| Dave | 0 | 1 | 0 | 0 |
| Dave | 1 | 0 | 0 | 0 |
| stu | 0 | 0 | 0 | 1 |
| stu | 0 | 0 | 1 | 0 |
CodePudding user response:
If you need 0/1 output, you have to aggregate with max after get_dummies so that each Name collapses to a single row:
# if Name is a column, move it to the index first
df = pd.get_dummies(df.set_index('Name')).groupby('Name').max()
# if Name is already the index
df = pd.get_dummies(df).groupby('Name').max()
Or use crosstab, which counts each Name/yr pair and, with add_suffix, gives column names like 2019_yr:
df = pd.crosstab(df['Name'], df['yr']).add_suffix('_yr').clip(upper=1)
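For reference, here is a minimal runnable sketch of both approaches on the sample data from the question. It assumes `yr` is stored as integers, which is why `columns=['yr']` is passed explicitly (by default get_dummies only encodes object, string, and category columns), and it passes `dtype=int` so the result is 0/1 rather than booleans. The variable names `dummies` and `counts` are just for illustration.

```python
import pandas as pd

# Sample data from the question (yr assumed to be integers)
df = pd.DataFrame({
    "Name": ["Dave", "Dave", "stu", "stu"],
    "yr": [2019, 2020, 2019, 2021],
})

# Approach 1: one-hot encode yr, then keep the max per Name,
# so a 1 in any of a person's rows survives the aggregation.
dummies = (
    pd.get_dummies(df.set_index("Name"), columns=["yr"], dtype=int)
      .groupby("Name")
      .max()
)

# Approach 2: count Name/yr pairs, then cap the counts at 1.
counts = (
    pd.crosstab(df["Name"], df["yr"])
      .add_suffix("_yr")
      .clip(upper=1)
)

print(dummies)
print(counts)
```

Note that the get_dummies version names the columns `yr_2019`, `yr_2020`, ... while the crosstab version gives `2019_yr`, `2020_yr`, ..., so the second one matches the layout asked for in the question.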
