How can two pandas.IndexSlice s be combined into one?
Set up of the problem:
import pandas as pd
import numpy as np
idx = pd.IndexSlice
cols = pd.MultiIndex.from_product([['A', 'B', 'C'], ['x', 'y'], ['a', 'b']])
df = pd.DataFrame(np.arange(len(cols)*2).reshape((2, len(cols))), columns=cols)
df:
A B C
x y x y x y
a b a b a b a b a b a b
0 0 1 2 3 4 5 6 7 8 9 10 11
1 12 13 14 15 16 17 18 19 20 21 22 23
How can the two slices idx['A', 'y', :] and idx[['B', 'C'], 'x', :], be combined to show in one dataframe?
Separately they are:
df.loc[:, idx['A', 'y',:]]
A
y
a b
0 2 3
1 14 15
df.loc[:, idx[['B', 'C'], 'x', :]]
B C
x x
a b a b
0 4 5 8 9
1 16 17 20 21
Simply combining them as a list does not play nicely:
df.loc[:, [idx['A', 'y',:], idx[['B', 'C'], 'x',:]]]
....
TypeError: unhashable type: 'slice'
My current solution is incredibly clunky, but gives the sub df that I'm looking for:
df.loc[:, df.loc[:, idx['A', 'y', :]].columns.to_list() df.loc[:,
idx[['B', 'C'], 'x', :]].columns.to_list()]
A B C
y x x
a b a b a b
0 2 3 4 5 8 9
1 14 15 16 17 20 21
However this doesn't work when one of the slices is just a series (as expected), which is less fun:
df.loc[:, df.loc[:, idx['A', 'y', 'a']].columns.to_list() df.loc[:,
idx[['B', 'C'], 'x', :]].columns.to_list()]
...
AttributeError: 'Series' object has no attribute 'columns'
Are there any better alternatives to what I'm currently doing that would ideally work with dataframe slices and series slices?
CodePudding user response:
General solution is join together both slice:
a = df.loc[:, idx['A', 'y', 'a']]
b = df.loc[:, idx[['B', 'C'], 'x', :]]
df = pd.concat([a, b], axis=1)
print (df)
A B C
y x x
a a b a b
0 2 4 5 8 9
1 14 16 17 20 21
