Create a DataFrame from another DataFrame, keeping only the columns that hold the minimum value of each row

I have a DataFrame:

In = pd.DataFrame([["V",13,33,45,75],["W",13,0.00,45,65], ["X",0.00,45,12,78], ["Y",12,0.00,56,89],["Z",18,0.00,86,89]],columns=["A","B","C","D","E"])

Ignoring zeros, row W has a minimum of 13, row X a minimum of 12, and row Y a minimum of 12. I want to keep only the columns that contain the minimum value of at least one of those rows. The operation should also exclude rows V and Z.

Expected Output:

Out = pd.DataFrame([["W",13,45], ["X",0,12], ["Y",12,56]],columns=["A","B","D"])

How can I do this?

CodePudding user response：

Use:

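import numpy as np

# move the label column A into the index and treat 0 as missing
df1 = In.set_index('A').replace(0, np.nan)
# mark the cells equal to their row minimum (m1) and to their column minimum (m2)
m1 = df1.eq(df1.min(axis=1), axis=0)
m2 = df1.eq(df1.min())
# keep rows that hold a column minimum and columns that hold a row minimum
df = df1.loc[m2.any(axis=1), m1.any()].reset_index()
print(df)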
   A     B   D
0  V  13.0  45
1  W  13.0  45
2  X   NaN  12
3  Y  12.0  56
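
Row V is still present in the output above, while row Z is gone. Printing the intermediate minima shows why. This is only a sketch, assuming the same In frame as in the question; the names row_min, col_min, keep_cols and keep_rows are illustrative and not part of the answer.

```python
import numpy as np
import pandas as pd

In = pd.DataFrame(
    [["V", 13, 33, 45, 75], ["W", 13, 0.00, 45, 65], ["X", 0.00, 45, 12, 78],
     ["Y", 12, 0.00, 56, 89], ["Z", 18, 0.00, 86, 89]],
    columns=["A", "B", "C", "D", "E"],
)

# Same preparation as the answer: labels into the index, zeros treated as missing
df1 = In.set_index("A").replace(0, np.nan)

row_min = df1.min(axis=1)  # smallest non-zero value in each row
col_min = df1.min()        # smallest non-zero value in each column

# Columns that hold at least one row minimum
keep_cols = df1.eq(row_min, axis=0).any()
# Rows that hold at least one column minimum
keep_rows = df1.eq(col_min).any(axis=1)

print(row_min)
print(col_min)
print(keep_cols)
print(keep_rows)
```

Row V is kept because its value 33 is the minimum of column C, even though column C itself is dropped afterwards. Row Z is dropped because none of its values is a column minimum.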

CodePudding user response：

I don't understand the logic behind removing V and Z, but you can try:

Out = In.set_index('A').drop(['V', 'Z']).replace(0, np.nan) \
        .loc[:, lambda x: x.idxmin(1).unique()].reset_index()
print(Out)

# Output
   A     B   D
0  W  13.0  45
1  X   NaN  12
2  Y  12.0  56
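
Both answers show NaN for row X in column B, whereas the expected output in the question keeps the original 0 there. One possible variation, sketched under the assumption that the rows to keep are simply given (as in the question), hides the zeros only while locating each row's minimum and then slices the unmodified data. The names rows, sub and cols are illustrative.

```python
import pandas as pd

In = pd.DataFrame(
    [["V", 13, 33, 45, 75], ["W", 13, 0.00, 45, 65], ["X", 0.00, 45, 12, 78],
     ["Y", 12, 0.00, 56, 89], ["Z", 18, 0.00, 86, 89]],
    columns=["A", "B", "C", "D", "E"],
)

# Assumption: the rows to keep are known in advance, as stated in the question
rows = ["W", "X", "Y"]
sub = In.set_index("A").loc[rows]

# Hide zeros only for the idxmin lookup; sub itself keeps its zeros
cols = sub.mask(sub.eq(0)).idxmin(axis=1).unique()

# Slice the unmodified data so the original 0 values survive
Out = sub.loc[:, cols].reset_index()
print(Out)
```

Column B stays a float column here because the original frame mixes 0.00 with integers in that column, so the values match the expected output but the dtype may not.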