Pandas: turn the last N columns into NA based on another DataFrame

I have the following dataframes:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(data={'col1': ['a', 'd', 'g', 'j'],
                        'col2': ['b', 'c', 'i', np.nan],
                        'col3': ['c', 'f', 'i', np.nan],
                        'col4': ['x', np.nan, np.nan, np.nan]},
                index=pd.Series(['ind1', 'ind2', 'ind3', 'ind4'], name='index'))

index	col1	col2	col3	col4
ind1	a	b	c	x
ind2	d	c	f	NaN
ind3	g	i	i	NaN
ind4	j	NaN	NaN	NaN

df2 = pd.Series(data=[True, False, True, False],
                index=pd.Series(['ind1', 'ind2', 'ind3', 'ind4']))


ind1	True
ind2	False
ind3	True
ind4	False

How do I set the values in the last 2 columns of df1 to NA for each row where the corresponding value in df2 is True?

In this case, ind1 and ind3 are True in df2, so only those rows of df1 are affected. Expected output:

index	col1	col2	col3	col4
ind1	a	b	NaN	NaN
ind2	d	c	f	NaN
ind3	g	i	NaN	NaN
ind4	j	NaN	NaN	NaN

CodePudding user response:

A possible solution, based on pandas.DataFrame.mask:

df1[['col3', 'col4']] = df1[['col3', 'col4']].mask(df2)

Output:

      col1 col2 col3 col4
index                    
ind1     a    b  NaN  NaN
ind2     d    c    f  NaN
ind3     g    i  NaN  NaN
ind4     j  NaN  NaN  NaN
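
The same result can be reached without hard-coding the column names. The following is only a sketch, not part of the answer above: it swaps mask for a label-based .loc assignment, and the names N and cols are introduced here for illustration. It assumes df2 carries the same index labels as df1.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(
    data={
        "col1": ["a", "d", "g", "j"],
        "col2": ["b", "c", "i", np.nan],
        "col3": ["c", "f", "i", np.nan],
        "col4": ["x", np.nan, np.nan, np.nan],
    },
    index=pd.Series(["ind1", "ind2", "ind3", "ind4"], name="index"),
)
df2 = pd.Series([True, False, True, False],
                index=["ind1", "ind2", "ind3", "ind4"])

N = 2                    # number of trailing columns to blank out (illustrative name)
cols = df1.columns[-N:]  # labels of the last N columns: col3 and col4 here

# .loc matches the boolean Series to df1 by index label,
# then sets the selected rows of the last N columns to NaN.
df1.loc[df2, cols] = np.nan

print(df1)
```

Because the selection is label-based, the rows of df2 do not have to be in the same order as the rows of df1, as long as every label of df1 is present in df2.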

CodePudding user response:

You can use boolean indexing:

N = 2
df1.iloc[df2, -N:] = np.nan

NB. What you call df2 is actually a Series, so a name such as s or ser might be more appropriate.

Output:

      col1 col2 col3 col4
index                    
ind1     a    b  NaN  NaN
ind2     d    c    f  NaN
ind3     g    i  NaN  NaN
ind4     j  NaN  NaN  NaN
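
Since iloc is purely positional, a slightly more defensive variant is to align the boolean Series to df1 first and hand iloc a plain NumPy array. This is a sketch under assumptions: the helper name blank_last_n and its parameters are hypothetical, it returns a modified copy rather than changing df1 in place, and it treats labels missing from the mask as False.

```python
import numpy as np
import pandas as pd


def blank_last_n(df, mask, n):
    """Return a copy of df with the last n columns set to NaN where mask is True.

    Hypothetical helper for illustration; mask is a boolean Series indexed like df.
    """
    out = df.copy()
    if n <= 0:
        return out  # nothing to blank out
    # Align the mask to df's row order; labels absent from mask count as False.
    aligned = mask.reindex(out.index, fill_value=False).to_numpy(dtype=bool)
    # Positional assignment: masked rows, last n columns.
    out.iloc[aligned, -n:] = np.nan
    return out


df1 = pd.DataFrame(
    data={
        "col1": ["a", "d", "g", "j"],
        "col2": ["b", "c", "i", np.nan],
        "col3": ["c", "f", "i", np.nan],
        "col4": ["x", np.nan, np.nan, np.nan],
    },
    index=pd.Series(["ind1", "ind2", "ind3", "ind4"], name="index"),
)
ser = pd.Series([True, False, True, False],
                index=["ind1", "ind2", "ind3", "ind4"])

result = blank_last_n(df1, ser, 2)
print(result)
```

Returning a copy keeps df1 available for comparison; assign the result back to df1 if in-place behaviour is wanted.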