Home > database >  Choose empty rows only from a dataframe
Choose empty rows only from a dataframe

Time:02-02

I have got the following df

lst=[['01012021','A',10,''],['01012021','B',20,''],['02012021','A',12,'ss'],['02012021','B',23,'sv']]
df2=pd.DataFrame(lst,columns=['Date','FN','AuM','Alpha'])

Is there a loc / iloc function which can return the following dataframe (empty rows only)

lst=[['01012021','A',10,''],['01012021','B',20,'']]
df3=pd.DataFrame(lst,columns=['Date','FN','AuM','Alpha'])

Thank you for your help.

CodePudding user response:

See below for a table summarizing various methods to identify null/False/empty elements in pandas

Depending on what other values you can have in "Alpha" you could use:

Keep only spaces

df2.loc[df2['Alpha'].eq('')]

Keep all "Falsy" values

df2.loc[~df2['Alpha'].astype(bool)]

output:

       Date FN  AuM Alpha
0  01012021  A   10      
1  01012021  B   20      

table of empty/falsy/zero-length values

df = pd.DataFrame({'data': [1, 'abc', True, 0, '', None,
                            float('nan'), False, [],
                            {}, set(), tuple([])]
                    })

df['type']   = df['data'].apply(lambda x: type(x).__name__)
df['isna']   = df['data'].isna().map({True: 'X', False: ''})
df['isnull'] = df['data'].isnull().map({True: 'X', False: ''})
df['~bool']  = (~df['data'].astype(bool)).map({True: 'X', False: ''})
df['len']    = df['data'].str.len().convert_dtypes()
df['len==0'] = df['data'].str.len().eq(0).map({True: 'X', False: ''})
print(df)

output:

     data      type isna isnull ~bool   len len==0
0       1       int                    <NA>       
1     abc       str                       3       
2    True      bool                    <NA>       
3       0       int                 X  <NA>       
4               str                 X     0      X
5    None  NoneType    X      X     X  <NA>       
6     NaN     float    X      X        <NA>       
7   False      bool                 X  <NA>       
8      []      list                 X     0      X
9      {}      dict                 X     0      X
10     {}       set                 X     0      X
11     ()     tuple                 X     0      X

CodePudding user response:

You can use .loc to filter on where Alpha equals ''

df2.loc[df2['Alpha'] == '']

Returns:

       Date FN  AuM Alpha
0  01012021  A   10      
1  01012021  B   20      

This will work as long as your blanks is alpha are always ''

  •  Tags:  
  • Related