If statement: String starts with exactly 4 digits in Python/pandas

Time:02-07

I have a dataframe column of strings, each of which is either a date (e.g. "12-10-2020") or a string starting with 4 digits (e.g. "4030 - random name"). I would like to write an if statement that captures the strings starting with 4 digits, similar to this code:

string[0].isdigit()

but instead of isdigit, it should be something like:

is string which starts with 4 digits

I hope the question is clear; let me know if it is not. I am working in pandas, by the way.

CodePudding user response:

Use str.contains:

df[df["col"].str.contains(r'^[0-9]{4}')]
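Note that the pattern above also matches strings with five or more leading digits, and that missing values in the column produce NaN in the mask. A minimal sketch of a stricter variant follows, assuming "exactly 4 digits" means the fifth character must not be a digit; the sample data is made up for illustration, and the column name "col" is taken from the answer above.

```python
import pandas as pd

# Hypothetical sample data; the column name "col" follows the answer above.
df = pd.DataFrame(
    {"col": ["4030 - random name", "12-10-2020", "40301 - five digits", None]}
)

# ^\d{4} requires four digits at the start; (?!\d) rejects a fifth digit.
# na=False treats missing values as non-matches instead of propagating NaN.
mask = df["col"].str.contains(r"^\d{4}(?!\d)", na=False)

matches = df[mask]
print(matches)
```

If longer digit prefixes should also count, drop the (?!\d) lookahead and keep the rest.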

CodePudding user response:

You can use str.match, which is anchored to the start of the string by default:

Example:

import pandas as pd

df = pd.DataFrame({'col': ['4030 - random name', 'other', '07-02-2022']})

# raw string, so \d is passed to the regex engine unchanged
df[df['col'].str.match(r'\d{4}')]

output:

                  col
0  4030 - random name
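Since the question asks for an if statement on a single string, the same idea can be sketched with the standard re module, whose re.match is anchored to the start of the string just like str.match. The helper name starts_with_four_digits is hypothetical, and the (?!\d) lookahead is an added assumption that "exactly 4 digits" excludes a fifth digit.

```python
import re

# Compiled once; (?!\d) is an added assumption that "exactly 4 digits"
# means the fifth character must not be a digit.
FOUR_DIGITS = re.compile(r"\d{4}(?!\d)")


def starts_with_four_digits(value):
    """Return True if value is a string starting with exactly four digits."""
    # re.match only matches at the start of the string, like Series.str.match.
    return isinstance(value, str) and FOUR_DIGITS.match(value) is not None


for s in ["4030 - random name", "12-10-2020", "40301 - other"]:
    if starts_with_four_digits(s):
        print(s)
```

The isinstance check is there so that missing values (for example NaN in a pandas column) are treated as non-matches rather than raising an error.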