How to get rows of a Pandas DataFrame where the column value starts with any of the given characters

I have the following Pandas Dataframe:

import pandas as pd

d = {'col1': ["aaa", "bbb", "ccc", "ddd", "acc", "bcc"]}
df = pd.DataFrame(data=d)
df

Output of the above code:

    col1
0   aaa
1   bbb
2   ccc
3   ddd
4   acc
5   bcc

I need to get the rows where the column value starts with, say, either "a" or "c". After filtering, the result should look like the following:

    col1
0   aaa
1   ccc
2   acc

How can I achieve this without using a for loop?

CodePudding user response：

To match just a and c:

df = df[df["col1"].str.match('^[ac]')]

You can include as many letters as you need in the square brackets.

df = df[df["col1"].str.match('^[abcdefg]')]
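If the set of starting characters is only known at run time, the character class can be built from a list. The following is a minimal sketch, not a definitive implementation: the name first_chars is hypothetical, and the list is assumed to be non-empty (an empty list would produce an invalid pattern).

```python
import re
import pandas as pd

df = pd.DataFrame({"col1": ["aaa", "bbb", "ccc", "ddd", "acc", "bcc"]})

# Hypothetical list of allowed first characters (assumed non-empty)
first_chars = ["a", "c"]

# re.escape protects characters such as ^, ] or - that are special in a regex
pattern = "[" + "".join(re.escape(ch) for ch in first_chars) + "]"

# str.match only matches at the start of each string, so a leading ^ is optional
filtered = df[df["col1"].str.match(pattern)]
print(filtered)
```

Because str.match already anchors at the beginning of the string, '[ac]' and '^[ac]' select the same rows here.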

CodePudding user response：

Something like this?

filtered = df[df['col1'].str.startswith('a') | df['col1'].str.startswith('c')]

Output:

>>> filtered
  col1
0  aaa
2  ccc
4  acc

Edit: as @Mustafa Aydın pointed out, you can pass a tuple to .str.startswith:

df[df['col1'].str.startswith(('a', 'c'))]
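For reference, here is a self-contained sketch of the tuple form applied to the question's data. It assumes a reasonably recent pandas (tuple arguments to .str.startswith were added in 1.5, as far as I know), and the names prefixes and result are only illustrative. The reset_index call is an optional extra for getting the 0, 1, 2 index shown in the question's expected result.

```python
import pandas as pd

df = pd.DataFrame({"col1": ["aaa", "bbb", "ccc", "ddd", "acc", "bcc"]})

# Illustrative tuple of prefixes; each entry may be longer than one character
prefixes = ("a", "c")

# Boolean mask: True where the value starts with any of the prefixes
mask = df["col1"].str.startswith(prefixes)

# Filtering keeps the original index labels; drop them to renumber from 0
result = df[mask].reset_index(drop=True)
print(result)
```

Without reset_index(drop=True), the filtered rows keep their original labels 0, 2 and 4.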

CodePudding user response：

Try this:

out = df.loc[df.col1.str.startswith(('a','c'))]
Out[11]: 
  col1
0  aaa
2  ccc
4  acc
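One caveat, sketched under the assumption that the column may contain missing values (the None entry below is made up for illustration and is not in the question's data): .str.startswith returns a missing value for such rows, which boolean indexing rejects. Passing na=False treats them as non-matching.

```python
import pandas as pd

# Hypothetical variant of the data with one missing value
df = pd.DataFrame({"col1": ["aaa", None, "ccc", "ddd", "acc", "bcc"]})

# na=False makes missing entries count as "does not start with a or c"
mask = df["col1"].str.startswith(("a", "c"), na=False)

out = df.loc[mask]
print(out)
```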