I want to add MultiIndex columns to my DataFrame. What I got:

What I would like to get:

Full code:
import pandas as pd

dictionary = {1: 'row1', 2: 'row2', 3: 'row2', 4: 'row2', 5: 'row3'}
dictionary1 = {1: 'row4', 2: 'row5', 3: 'row6', 4: 'row7', 5: 'row3'}
dictionary2 = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}
# DataFrame.append was removed in pandas 2.0, so the frame is built
# from a list of row dicts instead of appending one row at a time.
data = pd.DataFrame([dictionary, dictionary1, dictionary2, dictionary2, dictionary2],
                    columns=[1, 2, 3, 4, 5])
What I did:
arrays = [['row1', 'row2', 'row2', 'row2', 'row3'],
          ['row4', 'row5', 'row6', 'row7', 'row3']]
data.columns = pd.MultiIndex.from_arrays(arrays)
data
CodePudding user response:
First, note that hiding a label is probably a bad idea if you're going to work with the data. It will prevent you from selecting your data by its logical label.
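To make that caveat concrete, here is a minimal sketch. It assumes the column labels from the question; the single row of values and the names `df` and `hidden` are placeholders introduced only for illustration.

```python
import pandas as pd

# Same column labels as in the question; the values are placeholders.
cols = pd.MultiIndex.from_arrays([
    ['row1', 'row2', 'row2', 'row2', 'row3'],
    ['row4', 'row5', 'row6', 'row7', 'row3'],
])
df = pd.DataFrame([[5, 4, 3, 2, 1]], columns=cols)

# With the full labels, the last column is addressed by its real key.
full_key_value = df[('row3', 'row3')].iloc[0]

# Blank the repeated second-level label by hand.
hidden = df.copy()
hidden.columns = pd.MultiIndex.from_arrays([
    ['row1', 'row2', 'row2', 'row2', 'row3'],
    ['row4', 'row5', 'row6', 'row7', ''],
])

# The original key is gone from the columns ...
key_still_exists = ('row3', 'row3') in hidden.columns

# ... and the column now has to be addressed with the empty string.
hidden_key_value = hidden[('row3', '')].iloc[0]
```

The data is unchanged, but any code that selects the column by `('row3', 'row3')` would have to be rewritten to use `('row3', '')`.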
That said, if you really want to do this, you could convert the MultiIndex to a DataFrame and use duplicated to mask the labels that repeat across levels:
idx = data.columns.to_frame()
data.columns = pd.MultiIndex.from_frame(
    idx.mask(idx.apply(pd.Series.duplicated, axis=1)).fillna(''),
    names=[None] * data.columns.nlevels,
)
Output:
   row1  row2              row3
   row4  row5  row6  row7
0  row1  row2  row2  row2  row3
1  row4  row5  row6  row7  row3
2     5     4     3     2     1
3     5     4     3     2     1
4     5     4     3     2     1
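For readers who want to see the intermediate steps, the masking can be split up roughly as follows. This is a sketch that works on the column labels alone; the names `cols`, `dup`, `masked` and `new_cols` are introduced here for illustration, and `index=False` is used only to keep the intermediate frames easy to read.

```python
import pandas as pd

cols = pd.MultiIndex.from_arrays([
    ['row1', 'row2', 'row2', 'row2', 'row3'],
    ['row4', 'row5', 'row6', 'row7', 'row3'],
])

# One row per original column, one column per index level.
idx = cols.to_frame(index=False)

# True where a label repeats an earlier level within the same row.
dup = idx.apply(pd.Series.duplicated, axis=1)

# Replace the repeated labels with NaN, then with empty strings.
masked = idx.mask(dup).fillna('')

# Rebuild the MultiIndex without level names.
new_cols = pd.MultiIndex.from_frame(masked, names=[None] * cols.nlevels)
print(list(new_cols))
```

Only the second-level 'row3' is actually replaced, because duplicates are checked within each column's own pair of labels. The repeated 'row2' in the top level is still stored in the index; it appears once in the output only because pandas sparsifies repeated outer labels when displaying a MultiIndex (the `display.multi_sparse` option).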
