Get the inverse of a dataframe column in terms of rows with NaN values

I have an original dataframe, df0, with a number of values. Based on this dataframe I have a second dataframe, df1, where some of the original values are NaN.

import pandas as pd
df0 = pd.DataFrame({'col1': [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]})
df1 = pd.DataFrame({'col1': [1,2,None,4,5,6,None,8,None,10,11,None,13,None,None]})

I need df2 to be the inverse of df1 in terms of rows with NaN: every row that is NaN in df1 should get its original value back from df0, and every row that is not NaN should become NaN, so that I get the following dataframe:

df2 = pd.DataFrame({'col1': [None,None,3,None,None,None,7,None,9,None,None,12,None,14,15]})

What is the best way to go about this for a larger dataset?

CodePudding user response：

masking all columns

If you need to mask all columns, use DataFrame.mask with notna, or DataFrame.where with isna:

df2 = df0.mask(df1['col1'].notna())
# or
df2 = df0.where(df1['col1'].isna())

output:

    col1
0    NaN
1    NaN
2    3.0
3    NaN
4    NaN
5    NaN
6    7.0
7    NaN
8    9.0
9    NaN
10   NaN
11  12.0
12   NaN
13  14.0
14  15.0

masking only "col1"

If you just need to replace col1 and leave any other columns intact, use assign with Series.mask instead:

df2 = df0.assign(col1=df0['col1'].mask(df1['col1'].notna()))
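
To make the difference between the two variants concrete, here is a minimal sketch. The second column, col2, and the shortened data are hypothetical and not part of the question; the sketch also assumes df0 and df1 share the same default index.

```python
import pandas as pd

# Hypothetical data: col2 exists only to illustrate the difference.
df0 = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': ['a', 'b', 'c', 'd']})
df1 = pd.DataFrame({'col1': [1, None, 3, None]})

# True for the rows where df1 has lost its value.
keep = df1['col1'].isna()

# Variant 1: the row mask is applied to every column of df0.
all_cols = df0.where(keep)

# Variant 2: only col1 is masked; col2 is carried over unchanged.
only_col1 = df0.assign(col1=df0['col1'].where(keep))

print(all_cols)
print(only_col1)
```

In the first result both columns are NaN wherever df1 still had a value; in the second only col1 is affected.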

CodePudding user response：

Use Series.where with Series.isna to replace values in one column based on another DataFrame; the only requirement is that both DataFrames share the same index:

# note: this overwrites col1 of df0 in place
df0['col1'] = df0['col1'].where(df1['col1'].isna())
print(df0)
    col1
0    NaN
1    NaN
2    3.0
3    NaN
4    NaN
5    NaN
6    7.0
7    NaN
8    9.0
9    NaN
10   NaN
11  12.0
12   NaN
13  14.0
14  15.0

Alternative with DataFrame.loc and Series.notna:

import numpy as np
df0.loc[df1['col1'].notna(), 'col1'] = np.nan
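
The same-index requirement exists because pandas aligns the boolean mask with df0 by index label, not by position. As a rough sketch of what that implies: the shifted index on df1 below is made up for illustration, and dropping the labels with Series.to_numpy() is only one possible workaround, valid only when both frames have the same length and row order.

```python
import pandas as pd

df0 = pd.DataFrame({'col1': [1, 2, 3, 4]})
# Hypothetical: same rows as df0, but labelled 10..13 instead of 0..3.
df1 = pd.DataFrame({'col1': [1, None, 3, None]}, index=[10, 11, 12, 13])

# Label-based: none of the mask's labels match df0's index, so no row
# can be paired with a True condition and nothing is kept.
by_label = df0['col1'].where(df1['col1'].isna())

# Position-based: a plain NumPy boolean array carries no labels,
# so it is applied row by row in order.
by_position = df0['col1'].where(df1['col1'].isna().to_numpy())

print(by_label)
print(by_position)
```

If the indexes differ only because of earlier filtering or sorting, resetting or reindexing one frame to match the other is usually the safer fix than applying the mask by position.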