I have the following pandas DataFrame:
import pandas as pd
d = {'col1': ["aaa", "bbb", "ccc", "ddd", "acc", "bcc"]}
df = pd.DataFrame(data=d)
df
Output of the above code:
col1
0 aaa
1 bbb
2 ccc
3 ddd
4 acc
5 bcc
I need to get the rows where the column value starts with, say, either "a" or "c". After filtering, the result should look like this:
col1
0 aaa
1 ccc
2 acc
How can I achieve this without using a for loop?
CodePudding user response:
For just a and c:
df = df[df["col1"].str.match('^[ac]')]
You can include as many letters in the square brackets as you need to match on:
df = df[df["col1"].str.match('^[abcdefg]')]
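The character class above only covers single-letter prefixes. As a rough sketch, assuming you want to build the pattern from a list (the names `prefixes` and `pattern` are made up for illustration), something like this should also handle longer prefixes. Since `.str.match` anchors at the start of each string, the leading `^` is optional here.

```python
import re

import pandas as pd

df = pd.DataFrame({"col1": ["aaa", "bbb", "ccc", "ddd", "acc", "bcc"]})

# Hypothetical list of prefixes; re.escape protects any regex metacharacters.
prefixes = ["a", "c"]
pattern = "|".join(re.escape(p) for p in prefixes)

# .str.match uses re.match, so each alternative must match at the start of the value.
regex_filtered = df[df["col1"].str.match(pattern)]
print(regex_filtered)
```

This keeps the rows with index labels 0, 2 and 4, the same as the character-class version.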
CodePudding user response:
Something like this?
filtered = df[df['col1'].str.startswith('a') | df['col1'].str.startswith('c')]
Output:
>>> filtered
col1
0 aaa
2 ccc
4 acc
Edit: as @Mustafa Aydın pointed out, you can pass a tuple to .str.startswith:
df[df['col1'].str.startswith(('a', 'c'))]
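One thing that may be worth guarding against: if the column can contain missing values (an assumption, the data in the question has none), the mask can end up with missing entries and boolean indexing may fail. A minimal sketch using the `na` parameter of `.str.startswith`, with a made-up frame `df_with_nan`, and assuming a pandas version that accepts a tuple here:

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the question's data with one missing value added.
df_with_nan = pd.DataFrame({"col1": ["aaa", "bbb", np.nan, "ccc", "ddd", "acc", "bcc"]})

# na=False makes missing values count as "does not start with the prefix".
mask = df_with_nan["col1"].str.startswith(("a", "c"), na=False)
filtered_nan_safe = df_with_nan[mask]
print(filtered_nan_safe)
```

The missing row is simply dropped along with the non-matching ones.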
CodePudding user response:
Try with
out = df.loc[df.col1.str.startswith(('a','c'))]
Out[11]:
col1
0 aaa
2 ccc
4 acc
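Note that the filtered frame keeps the original index labels (0, 2, 4), while the expected output in the question is numbered 0, 1, 2. If that renumbering matters, a sketch along these lines should do it, chaining `reset_index(drop=True)` after the filter:

```python
import pandas as pd

df = pd.DataFrame({"col1": ["aaa", "bbb", "ccc", "ddd", "acc", "bcc"]})

# Filter first, then throw away the old index labels and renumber from 0.
out = df.loc[df["col1"].str.startswith(("a", "c"))].reset_index(drop=True)
print(out)
```

Without `drop=True` the old labels would be kept as an extra column named `index`.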
