pandas: create dataframe based on two conditions (is my solution optimal?)

Time:02-01

I'm having a hard time putting into words what I'm trying to do (apologies for the generic title), so I'll show the code first:

I've got this dataframe "mydf":

import pandas as pd
d = {'email': ['[email protected]', '[email protected]', None], 'code':[100, 101, 102], 'filtercode':[None, None, 100]}
mydf=pd.DataFrame(data=d)

From this dataframe I need to create a new dataframe based on two conditions. First, I have a list of emails, "emails", stored in a dataframe "match", which I use to select rows from "mydf".

emails={'email':['[email protected]']}
match=pd.DataFrame(data=emails)
out = mydf[mydf['email'].isin([x for sublist in match.values.tolist() for x in sublist])]
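As a side note on this first step, `isin` accepts a Series directly, so the nested list comprehension is probably not needed. The sketch below compares both forms. The addresses `a@example.com` and `b@example.com` are placeholders I made up, because the real ones are obscured in the post.

```python
import pandas as pd

# Placeholder addresses (assumption): the originals are obscured in the post.
mydf = pd.DataFrame({
    'email': ['a@example.com', 'b@example.com', None],
    'code': [100, 101, 102],
    'filtercode': [None, None, 100],
})
match = pd.DataFrame({'email': ['a@example.com']})

# Original approach: flatten every cell of `match` into one list.
flat = [x for sublist in match.values.tolist() for x in sublist]
out_flat = mydf[mydf['email'].isin(flat)]

# Shorter variant: hand the column itself to isin().
out_col = mydf[mydf['email'].isin(match['email'])]

print(out_col)
```

Both variants select the same rows here. The column form also keeps working if "match" later gains extra columns, whereas flattening `match.values` would then mix unrelated values into the list.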

The second condition: if a row in my original dataframe "mydf" has a "filtercode" that appears in the "code" column of my new dataframe "out", that row is appended as well:

# DataFrame.append was removed in pandas 2.0, so pd.concat is used here instead
out = pd.concat([out, mydf[mydf['filtercode'].isin(out['code'])]])

This results in the intended dataframe which contains the rows 0 and 2 from the original dataframe. Had I filtered by "[email protected]" it should have only shown row 1 in the dataframe "out".
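To check both cases described above, the two steps can be wrapped in a small helper. This is only a sketch: `select_rows` is a name invented for illustration, the email addresses are placeholders for the obscured originals, and `pd.concat` is used because `DataFrame.append` no longer exists in pandas 2.0 and later.

```python
import pandas as pd

def select_rows(df, wanted):
    """Rows whose email is in `wanted`, plus rows whose filtercode
    matches the code of one of those rows."""
    direct = df[df['email'].isin(wanted)]
    linked = df[df['filtercode'].isin(direct['code'])]
    # pd.concat stands in for the removed DataFrame.append.
    return pd.concat([direct, linked])

# Placeholder addresses (assumption): the originals are obscured in the post.
mydf = pd.DataFrame({
    'email': ['a@example.com', 'b@example.com', None],
    'code': [100, 101, 102],
    'filtercode': [None, None, 100],
})

first = select_rows(mydf, ['a@example.com'])
second = select_rows(mydf, ['b@example.com'])

print(first.index.tolist())   # [0, 2]
print(second.index.tolist())  # [1]
```

One thing to keep in mind: a row that satisfies both conditions would appear twice in the result, since the two selections are simply stacked.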

Now, I'm new to pandas and this code works, but I wonder whether it is the most elegant solution or whether there is a simpler way. My solution feels a little clunky, and maybe both steps can be done in one go instead of first creating the output dataframe and then appending rows from the original dataframe. Any feedback would be appreciated!

CodePudding user response:

The first step could be done somewhat more elegantly with a merge. Not much to do with the second step, although we can combine the two steps into one:

df1 = mydf.merge(match.assign(matched = True), how = 'left', on = 'email')
out = df1[(df1['matched'] == True) | (df1['filtercode'].isin(mydf['code']))]

out looks like this:

    email              code    filtercode    matched
--  ---------------  ------  ------------  ---------
 0  [email protected]     100           nan          1
 2                      102           100        nan
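One caveat, as far as I can tell: the condition above tests "filtercode" against every code in "mydf", not only against the codes of the matched rows. With the second address as the filter, row 2 would therefore still be selected, which is not what the question asks for. A possible variant without the merge builds two boolean masks and restricts the second one to the matched rows. The addresses are placeholders for the obscured originals, and `by_email` and `by_link` are names chosen for illustration.

```python
import pandas as pd

# Placeholder addresses (assumption): the originals are obscured in the post.
mydf = pd.DataFrame({
    'email': ['a@example.com', 'b@example.com', None],
    'code': [100, 101, 102],
    'filtercode': [None, None, 100],
})
match = pd.DataFrame({'email': ['a@example.com']})

# Condition 1: the row's email appears in the match list.
by_email = mydf['email'].isin(match['email'])

# Condition 2: the row's filtercode equals the code of a row
# selected by condition 1 (not just any code in mydf).
by_link = mydf['filtercode'].isin(mydf.loc[by_email, 'code'])

out = mydf[by_email | by_link]
print(out)
```

Because a single mask selects the rows, the result keeps the original row order and cannot contain duplicates, and no helper column like "matched" is left behind.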