Home > Mobile >  Replacing nan values in a Pandas data frame with lists
Replacing nan values in a Pandas data frame with lists

Time:01-09

How to replace nan or empty strings (e.g. "") with zero if it exists in any column. the values in any column can be a combination of lists and scalar values as follows

col1  col2    col3            col4 
nan   Jhon    [nan, 1, 2]    ['k', 'j']
1     nan     [1, 1, 5]       3
2     ""      nan             nan
3     Samy    [1, 1, nan]    ['b', '']

CodePudding user response:

You have to handle the three cases (empty string, NaN, NaN in list) separately.

For the NaN in list you need to loop over each occurrence and replace the elements one by one.

NB. applymap is slow, so if you know in advance the columns to use you can subset them

For the empty string, replace them to NaN, then fillna.

sub = 'X'
(df.applymap(lambda x: [sub if (pd.isna(e) or e=='')
                        else e for e in x]
                       if isinstance(x, list) else x)
   .replace('', float('nan'))
   .fillna(sub)
 )

Output:

  col1  col2       col3    col4
0    X  Jhon  [X, 1, 2]  [k, j]
1  1.0     X  [1, 1, 5]       3
2  2.0     X          X       X
3  3.0  Samy  [1, 1, X]  [b, X]

Used input:

from numpy import nan
df = pd.DataFrame({'col1': {0: nan, 1: 1.0, 2: 2.0, 3: 3.0},
                   'col2': {0: 'Jhon', 1: nan, 2: '', 3: 'Samy'},
                   'col3': {0: [nan, 1, 2], 1: [1, 1, 5], 2: nan, 3: [1, 1, nan]},
                   'col4': {0: ['k', 'j'], 1: '3', 2: nan, 3: ['b', '']}})
  •  Tags:  
  • Related