How to get rows of Pandas Dataframe where the column value starts with any of given characters

Time:01-31

I have the following Pandas Dataframe:

import pandas as pd

d = {'col1': ["aaa", "bbb", "ccc", "ddd", "acc", "bcc"]}
df = pd.DataFrame(data=d)
df

Output of the above code:

    col1
0   aaa
1   bbb
2   ccc
3   ddd
4   acc
5   bcc

I need to get the rows where the column value starts with, say, either "a" or "c". After filtering, the result should look like the following:

    col1
0   aaa
1   ccc
2   acc

How can I achieve this without using a for loop?

CodePudding user response:

To match only values that start with a or c:

df = df[df["col1"].str.match('^[ac]')]

You can include as many letters in the square brackets as you need to match on.

df = df[df["col1"].str.match('^[abcdefg]')]
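If the starting characters come from a list rather than being typed by hand, the pattern can be built programmatically. A minimal sketch, assuming a hypothetical list named prefixes (not part of the question); re.escape guards against characters that are special in a regex, and .str.match already anchors at the start of each string:

```python
import re
import pandas as pd

df = pd.DataFrame({"col1": ["aaa", "bbb", "ccc", "ddd", "acc", "bcc"]})

# Hypothetical list of allowed starting strings
prefixes = ["a", "c"]

# Build an alternation such as (?:a|c); escape each entry so that
# characters like "." or "[" are matched literally
pattern = "(?:" + "|".join(re.escape(p) for p in prefixes) + ")"

# str.match only matches at the beginning of each value
result = df[df["col1"].str.match(pattern)]
print(result)
```

This should select the same rows as the '^[ac]' pattern above, and it also works when the entries are longer than one character.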

CodePudding user response:

Something like this?

filtered = df[df['col1'].str.startswith('a') | df['col1'].str.startswith('c')]

Output:

>>> filtered
  col1
0  aaa
2  ccc
4  acc

Edit: as @Mustafa Aydın pointed out, you can pass a tuple to .str.startswith:

df[df['col1'].str.startswith(('aaa', 'bbb', 'ccc', 'ddd', 'acc', 'bcc'))]
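One thing to watch for: if the column contains missing values, .str.startswith returns a missing value for those rows, and the mask can then fail when used for indexing. A small sketch, assuming made-up data with one None in it (not from the question) and a pandas version that accepts a tuple here (1.5 or later, as far as I know); na=False treats missing entries as non-matches:

```python
import pandas as pd

# Hypothetical data with a missing value, not taken from the question
df_na = pd.DataFrame({"col1": ["aaa", None, "ccc", "bcc"]})

# na=False makes missing entries count as "does not match",
# so the mask is purely True/False and safe to index with
mask = df_na["col1"].str.startswith(("a", "c"), na=False)
filtered_na = df_na[mask]
print(filtered_na)
```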

CodePudding user response:

Try passing a tuple to .str.startswith:

out = df.loc[df.col1.str.startswith(('a', 'c'))]
out

Output:

  col1
0  aaa
2  ccc
4  acc
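
Note that the filtered rows keep their original index labels (0, 2, 4). If you want the 0, 1, 2 numbering shown in the question, a reset_index call afterwards should do it. A minimal sketch using the question's data:

```python
import pandas as pd

df = pd.DataFrame({"col1": ["aaa", "bbb", "ccc", "ddd", "acc", "bcc"]})

# Keep rows whose value starts with "a" or "c"
out = df.loc[df["col1"].str.startswith(("a", "c"))]

# drop=True discards the old labels instead of adding them as a column
out = out.reset_index(drop=True)
print(out)
```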