How to replace nan or empty strings (e.g. "") with zero if it exists in any column. the values in any column can be a combination of lists and scalar values as follows
col1 col2 col3 col4
nan Jhon [nan, 1, 2] ['k', 'j']
1 nan [1, 1, 5] 3
2 "" nan nan
3 Samy [1, 1, nan] ['b', '']
CodePudding user response:
You have to handle the three cases (empty string, NaN, NaN in list) separately.
For the NaN in list you need to loop over each occurrence and replace the elements one by one.
NB. applymap is slow, so if you know in advance the columns to use you can subset them
For the empty string, replace them to NaN, then fillna.
sub = 'X'
(df.applymap(lambda x: [sub if (pd.isna(e) or e=='')
else e for e in x]
if isinstance(x, list) else x)
.replace('', float('nan'))
.fillna(sub)
)
Output:
col1 col2 col3 col4
0 X Jhon [X, 1, 2] [k, j]
1 1.0 X [1, 1, 5] 3
2 2.0 X X X
3 3.0 Samy [1, 1, X] [b, X]
Used input:
from numpy import nan
df = pd.DataFrame({'col1': {0: nan, 1: 1.0, 2: 2.0, 3: 3.0},
'col2': {0: 'Jhon', 1: nan, 2: '', 3: 'Samy'},
'col3': {0: [nan, 1, 2], 1: [1, 1, 5], 2: nan, 3: [1, 1, nan]},
'col4': {0: ['k', 'j'], 1: '3', 2: nan, 3: ['b', '']}})
