I had a dataframe with 2 categorical features like this:
Then I used get_dummies to one-hot encode these columns:
Now I want to go back from the one-hot encoded columns to the original columns, i.e. the reverse of get_dummies. Is there any way to do this?
CodePudding user response:
Use DataFrame.melt to unpivot, keep only the rows equal to 1 with DataFrame.query, then split the variable column and reshape with DataFrame.set_index and Series.unstack:
df = pd.get_dummies(df1.astype(str))
df = df.melt(ignore_index=False).query('value == 1')
df[['a','b']] = df['variable'].str.rsplit('_', n=1, expand=True)
df = df.set_index('a', append=True)['b'].unstack().rename_axis(None, axis=1)
Or use DataFrame.stack, filter with Series.loc, convert the MultiIndex to a DataFrame with MultiIndex.to_frame, then split and pivot with DataFrame.pivot:
df = df.stack().loc[lambda x: x.eq(1)].index.to_frame()
df[['a','b']] = df[1].str.rsplit('_', n=1, expand=True)
df = df.pivot(index=0, columns='a', values='b').rename_axis(index=None, columns=None)
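For reference, the melt-based approach above can be sketched end to end as follows. The sample frame is hypothetical (the question's data is not shown), and the names `dummies`, `long`, `parts`, `feature`, `category` and `restored` are only illustrative. Note that splitting at the last underscore assumes the category values themselves contain no underscore.

```python
import pandas as pd

# Hypothetical sample data (the original frame is not shown in the question).
df1 = pd.DataFrame({
    "EstateTypes": [1, 1, 1, 1, 1],
    "AdverTypes": [1, 2, 2, 3, 2],
})

# One-hot encode; dtype=int keeps the indicators as 0/1 rather than booleans.
dummies = pd.get_dummies(df1.astype(str), dtype=int)

# Unpivot to long form and keep only the active indicator per row and feature.
long = dummies.melt(ignore_index=False)
long = long[long["value"] == 1].copy()

# Split "Feature_category" at the last underscore.
parts = long["variable"].str.rsplit("_", n=1, expand=True)
long["feature"] = parts[0]
long["category"] = parts[1]

# Pivot back to one column per feature.
restored = (
    long.set_index("feature", append=True)["category"]
    .unstack()
    .rename_axis(None, axis=1)
)

# unstack sorts the columns alphabetically and the categories come back as
# strings, so restore the original column order and integer dtype.
restored = restored[df1.columns].astype("int64")
print(restored)
```

The last step matters if you need an exact round trip: without it the columns come back in alphabetical order and the values are strings.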
CodePudding user response:
Use from_dummies (pandas 1.5+):
df_original = pd.from_dummies(df_dummies, sep='_')
Output:
   EstateTypes  AdverTypes
0            1           1
1            1           2
2            1           2
3            1           3
4            1           2
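A minimal round-trip sketch of this, assuming pandas 1.5 or later and a hypothetical input frame chosen to match the output shown above (the question's actual data is not given). from_dummies returns the categories as the string fragments of the column names, so the cast back to integers is an extra step added here.

```python
import pandas as pd

# Hypothetical sample data, chosen to mirror the output shown above.
df1 = pd.DataFrame({
    "EstateTypes": [1, 1, 1, 1, 1],
    "AdverTypes": [1, 2, 2, 3, 2],
})

# columns= is needed because get_dummies skips numeric columns by default.
df_dummies = pd.get_dummies(df1, columns=["EstateTypes", "AdverTypes"])

# Reverse the encoding; the categories come back as strings taken from the
# column names, so cast them to integers to match the original frame.
df_original = pd.from_dummies(df_dummies, sep="_").astype("int64")
print(df_original)
```

from_dummies expects each row to have exactly one active category per feature. If some rows are all zeros (for example after drop_first=True), the default_category parameter is the documented way to supply the missing value.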


