removing rows from dataframe if column value condition is met

Time:01-19

stables=["usdt","usdc","tusd"]

my dataframe called "df" looks like this (actual dataframe has hundreds of rows):

            pairs           apy  network           pool name
0     [eth, wbtc]  1.150699e+01   cometh     cometh-eth-wbtc
1    [usdt, usdc]  1.814333e+01   cometh    cometh-usdt-usdc
2  [must, pickle]  2.175891e+01   cometh  cometh-must-pickle
3     [usdt, eth]  1.237610e+02   cometh     cometh-usdt-eth
4    [eth, matic]  2.181968e+01   cometh    cometh-eth-matic

the 'pairs' column has lists of items. Going row by row, I want to remove the whole row if any of the items in the 'pairs' column list is not included in the 'stables' list provided above.

in this case only the second row (indexed 1) would stay in the dataframe as both items are in the 'stables' list provided above.

any help is welcome.

thx

CodePudding user response:

Assuming that the pairs column actually contains lists of strings, you could explode that column, use isin to test which elements are members of the stables list, and then use groupby with agg(np.all) to test whether all elements of each initial row are members of stables:

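import numpy as np

# one row per list element; the original index label is repeated
temp = df.explode('pairs')
temp['keep'] = temp['pairs'].isin(stables)
# regroup by the original index: True only if every element is in stables
mask = temp.groupby(level=0)['keep'].agg(np.all)

to_keep = df[mask]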
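
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the five rows printed in the question, so the exact dtypes of the real data are an assumption. The names mask, allowed and mask_alt are only illustrative. The second mask is a different technique that is not part of the answer above (a per-row set.issubset test through apply), included only as a cross-check.

```python
import pandas as pd

stables = ["usdt", "usdc", "tusd"]

# Sample frame rebuilt from the question's printout (first five rows only).
df = pd.DataFrame({
    "pairs": [["eth", "wbtc"], ["usdt", "usdc"], ["must", "pickle"],
              ["usdt", "eth"], ["eth", "matic"]],
    "apy": [11.50699, 18.14333, 21.75891, 123.7610, 21.81968],
    "network": ["cometh"] * 5,
    "pool name": ["cometh-eth-wbtc", "cometh-usdt-usdc", "cometh-must-pickle",
                  "cometh-usdt-eth", "cometh-eth-matic"],
})

# Same idea as the answer: explode, test membership, then regroup by the
# original index and require every element of a row to pass.
mask = df["pairs"].explode().isin(stables).groupby(level=0).all()
to_keep = df[mask]

# Alternative, not from the answer: test each list as a set.
allowed = set(stables)
mask_alt = df["pairs"].apply(lambda pair: set(pair).issubset(allowed))

print(to_keep)
```

On this sample both masks select only the row with index 1. The two approaches can disagree on a row whose list is empty: explode turns it into NaN, which fails isin, while an empty set counts as a subset of anything. The groupby(level=0) step also relies on the index labels being unique.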