Suppose the below simplified dataframe. (The actual df is much, much bigger.) How does one assign values to a new column f such that f is a function of another column (e.,g. e)? I'm pretty sure one needs to use apply or map but never done this with a dataframe that has multiindex columns?
df = pd.DataFrame([[1,2,3,4], [5,6,7,8], [9,10,11,12], [13,14,15,16]])
df.columns = pd.MultiIndex.from_tuples((("a", "d"), ("a", "e"), ("b", "d"), ("b","e")))
df
a b
d e d e
0 1 2 3 4
1 5 6 7 8
2 9 10 11 12
3 13 14 15 16
Desired output:
a b
d e f d e f
0 1 2 1 3 4 1
1 5 6 1 7 8 -1
2 9 10 -1 11 12 -1
3 13 14 -1 15 16 -1
Would like to be able to apply the following lines and assign them to a new column f. Two problems: First, the last line that contains the apply doesn't work but hopefully my intent is clear. Second, I'm unsure how to assign values to a new column of a dataframe with a multi index column structure. Would like to be able use functional programming methods.
lt = df.loc(axis=1)[:,'e'] < 8
gt = df.loc(axis=1)[:,'e'] >= 8
conditions = [lt, gt]
choices = [1, -1]
df.loc(axis=1)[:,'f'] = df.loc(axis=1)[:,'e'].apply(np.select(conditions, choices))
CodePudding user response:
nms = [(i, 'f')for i, j in df.columns if j == 'e']
df[nms] = (df.iloc[:, [j == 'e' for i, j in df.columns]] < 8) * 2 - 1
df = df.sort_index(axis=1)
df
a b
d e f d e f
0 1 2 1 3 4 1
1 5 6 1 7 8 -1
2 9 10 -1 11 12 -1
3 13 14 -1 15 16 -1
EDIT:
for a custom ordering:
d = {i:j for j, i in enumerate(df.columns.levels[0])}
df1 = df.loc[:, sorted(df.columns, key = lambda x: d[x[0]])]
IF the whole data is in a way symmetric, you could do:
df.stack(0).assign(f = lambda x: 2*(x.e < 8) - 1).stack().unstack([1,2])
Out[]:
a b
d e f d e f
0 1 2 1 3 4 1
1 5 6 1 7 8 -1
2 9 10 -1 11 12 -1
3 13 14 -1 15 16 -1
