Is there a way to add a new column to a pandas multiindex that only aligns with one level?-CodePudding

I'm looking for a clean way to add a column to a multiindex dataframe, where the value is only repeated once per level=0.

For example,

I want to add a column to this:

Index level=0	Index level=1	Value (x)
A	1	300
	2	850
	3	2000
B	1	100
	2	70
	3	400

In order to get to this:

Index level=0	Index level=1	Value (x)	Value (y)
A	1	300	Yellow
	2	850
	3	2000
B	1	100	Red
	2	70
	3	400

I do NOT want this:

Index level=0	Index level=1	Value (x)	Value (y)
A	1	300	Yellow
	2	850	Yellow
	3	2000	Yellow
B	1	100	Red
	2	70	Red
	3	400	Red

I'm not sure how best to create a table here that shows what I'm hoping for, but the important part to me is that y corresponds to all rows of index level=0, but is not repeated for every increment of index level=1. I'm sure that I could the additional rows in the y column with null values but I thought there might be a more elegant way.

CodePudding user response：

Use pd.IndexSlice:

idx = pd.IndexSlice
df.loc[idx[:, 1], 'Color'] = ['Yellow', 'Red']
print(df)

# Output
     Value   Color
A 1    300  Yellow
  2    850     NaN
  3   2000     NaN
B 1    100     Red
  2     70     NaN
  3    400     NaN

Or only with slice:

df.loc[(slice(None), 1), 'Color'] = ['Yellow', 'Red']
print(df)

# Output
     Value   Color
A 1    300  Yellow
  2    850     NaN
  3   2000     NaN
B 1    100     Red
  2     70     NaN
  3    400     NaN

CodePudding user response：

You haven't specified how you High / Low are determined so I'm hard coding them in a dictionary:

# Create the data frame
index = pd.MultiIndex.from_product([['A', 'B'], [1,2,3]])
df = pd.DataFrame({
    'x': [300, 850, 2000, 100, 70, 400]
}, index=index)

# The transformation
def f(col):
    y_values = {
        'A': 'High',
        'B': 'Low'
    }
    level0 = col.index[0][0]
    return [y_values[level0]]   [''] * (len(col) - 1)

df['y'] = df.groupby(level=0).transform(f)

CodePudding user response：

iterate thru the index0 and assign value to new column for every new type of data in index0:

c=0
for i in df.index:
    if i[0] != c:
        #put your code here to add values to new column
        c = i[0]