Subsetting pandas dataframe if column contains only ONE instance of string

Time:01-28

I have the following data frame and want to keep only the rows where the summary column contains exactly ONE instance of '->'. How can I do this in pandas?

Input:

idx  summary
0    McDonalds -> Wendys -> Popeyes
1    Popeyes -> Taco Bell
2    Carls Jr -> Arbys
3    Arbys -> Popeyes -> Taco Bell -> KFC
4    KFC -> Popeyes -> Boston Market

Expected Output:

idx  summary
1    Popeyes -> Taco Bell
2    Carls Jr -> Arbys

CodePudding user response:

df["summary"].str.count('->') == 1 builds a boolean mask that is True for rows whose summary contains exactly one '->'. Passing that mask to loc returns the matching rows themselves, instead of a column of True or False values.

# Keep only the rows whose summary contains exactly one '->'
df_new = df.loc[df["summary"].str.count('->') == 1]
print(df_new)
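
To check this end to end, the filter can be run against the sample data from the question. This is a minimal sketch: it assumes the idx column is simply the default integer index, and the names arrow_counts and result are only illustrative.

```python
import pandas as pd

# Sample data from the question; the default integer index plays the role of idx
df = pd.DataFrame({"summary": [
    "McDonalds -> Wendys -> Popeyes",
    "Popeyes -> Taco Bell",
    "Carls Jr -> Arbys",
    "Arbys -> Popeyes -> Taco Bell -> KFC",
    "KFC -> Popeyes -> Boston Market",
]})

# Number of '->' occurrences in each row
arrow_counts = df["summary"].str.count("->")

# Boolean mask selects rows with exactly one arrow
result = df.loc[arrow_counts == 1]
print(result)
```

This should leave only the rows with index 1 and 2, matching the expected output in the question.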

CodePudding user response:

If your input is saved as variable df, this would produce that result:

ct_arrow = df.apply(lambda x: x.summary.count('->'), axis=1)
df = df.loc[ct_arrow==1]
print(df)

CodePudding user response:

You can do that with a boolean mask:

df[df["summary"].str.count('->')==1]
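
One thing to keep in mind: str.count interprets its argument as a regular expression. '->' contains no regex metacharacters, so it works as-is, but a separator such as '|' would need escaping. A small sketch, assuming a hypothetical '|' separator and made-up sample rows that are not from the question:

```python
import re
import pandas as pd

# Hypothetical data using '|' as the separator (not from the question)
df = pd.DataFrame({"summary": ["A | B", "A | B | C", "A"]})

sep = "|"

# re.escape makes the separator match literally instead of as regex alternation
mask = df["summary"].str.count(re.escape(sep)) == 1
result = df[mask]
print(result)
```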
