Home > Back-end >  Is there a way to add a new column to a pandas multiindex that only aligns with one level?
Is there a way to add a new column to a pandas multiindex that only aligns with one level?

Time:02-04

I'm looking for a clean way to add a column to a multiindex dataframe, where the value is only repeated once per level=0.

For example,

I want to add a column to this:

Index level=0 Index level=1 Value (x)
A 1 300
2 850
3 2000
B 1 100
2 70
3 400

In order to get to this:

Index level=0 Index level=1 Value (x) Value (y)
A 1 300 Yellow
2 850
3 2000
B 1 100 Red
2 70
3 400

I do NOT want this:

Index level=0 Index level=1 Value (x) Value (y)
A 1 300 Yellow
2 850 Yellow
3 2000 Yellow
B 1 100 Red
2 70 Red
3 400 Red

I'm not sure how best to create a table here that shows what I'm hoping for, but the important part to me is that y corresponds to all rows of index level=0, but is not repeated for every increment of index level=1. I'm sure that I could the additional rows in the y column with null values but I thought there might be a more elegant way.

CodePudding user response:

Use pd.IndexSlice:

idx = pd.IndexSlice
df.loc[idx[:, 1], 'Color'] = ['Yellow', 'Red']
print(df)

# Output
     Value   Color
A 1    300  Yellow
  2    850     NaN
  3   2000     NaN
B 1    100     Red
  2     70     NaN
  3    400     NaN

Or only with slice:

df.loc[(slice(None), 1), 'Color'] = ['Yellow', 'Red']
print(df)

# Output
     Value   Color
A 1    300  Yellow
  2    850     NaN
  3   2000     NaN
B 1    100     Red
  2     70     NaN
  3    400     NaN

CodePudding user response:

You haven't specified how you High / Low are determined so I'm hard coding them in a dictionary:

# Create the data frame
index = pd.MultiIndex.from_product([['A', 'B'], [1,2,3]])
df = pd.DataFrame({
    'x': [300, 850, 2000, 100, 70, 400]
}, index=index)

# The transformation
def f(col):
    y_values = {
        'A': 'High',
        'B': 'Low'
    }
    level0 = col.index[0][0]
    return [y_values[level0]]   [''] * (len(col) - 1)

df['y'] = df.groupby(level=0).transform(f)

CodePudding user response:

iterate thru the index0 and assign value to new column for every new type of data in index0:

c=0
for i in df.index:
    if i[0] != c:
        #put your code here to add values to new column
        c = i[0]
  •  Tags:  
  • Related