I have a DataFrame column of strings, each of which is either a date (e.g. "12-10-2020") or a string starting with 4 digits (e.g. "4030 - random name"). I would like to write an if statement that captures the strings starting with 4 digits, similar to this code:
string[0].isdigit()
but instead of isdigit, it should be something like:
is string which starts with 4 digits
I hope my question is clear; let me know if it is not. I am working in pandas, by the way.
CodePudding user response:
Use str.contains with a pattern anchored to the start of the string (^):
df[df["col"].str.contains(r'^[0-9]{4}')]
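If the column can contain missing values, a slightly fuller sketch may help. The sample data and the names mask and starts_with_4_digits are made up for illustration; na=False is a real str.contains parameter that treats missing values as non-matches.

```python
import pandas as pd

# Illustrative data (not from the question): one match, one date,
# one missing value, and one string with only 3 leading digits
df = pd.DataFrame({"col": ["4030 - random name", "12-10-2020", None, "123 - too short"]})

# ^ anchors the pattern to the start of the string;
# na=False makes missing values count as "no match" instead of NaN
mask = df["col"].str.contains(r"^\d{4}", na=False)

# Keep only the rows whose string starts with 4 digits
starts_with_4_digits = df[mask]
print(starts_with_4_digits)
```

Without na=False the mask would contain a missing value for the None row, and using it for boolean indexing could raise an error.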
CodePudding user response:
You can use str.match, which is anchored to the start of the string by default:
Example:
import pandas as pd
df = pd.DataFrame({'col': ['4030 - random name', 'other', '07-02-2022']})
# raw string so the backslash in \d reaches the regex engine unchanged
df[df['col'].str.match(r'\d{4}')]
output:
                  col
0  4030 - random name
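Since the question asks for an if statement on a single string (like string[0].isdigit()), here is a rough plain-Python equivalent using re.match, which also only matches at the start of the string. The helper name starts_with_four_digits and the sample values are hypothetical.

```python
import re

# Compiled once; re.match only tries to match at the beginning of the string
FOUR_DIGITS = re.compile(r"\d{4}")

def starts_with_four_digits(s):
    # Hypothetical helper: True only for strings that begin with 4 digits;
    # non-string values (e.g. None or NaN) are treated as non-matches
    return isinstance(s, str) and FOUR_DIGITS.match(s) is not None

for value in ["4030 - random name", "12-10-2020", "other"]:
    if starts_with_four_digits(value):
        print("starts with 4 digits:", value)
    else:
        print("something else:", value)
```

Note that this pattern also accepts strings that begin with more than 4 digits, and dates written year-first (e.g. "2020-10-12") would match too. If that matters for your data, the pattern needs to be tightened.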
