I'm trying to find a regular expression that matches either a floating-point number or a string.
For example, the text to match might look like this:
ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range
My current version is:
fp_word = r"(?:[-+]?\d+.\d+|\w+\?)"
but it's not matching the ?Error or Range case.
It should match
3.101
5.0
?Error (including the question mark)
1.0
Range
CodePudding user response:
You can use
(?<= ).+
See this regex demo. It matches any one or more chars other than line break chars till the end of a line after the first space.
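As a quick check, here is a minimal Python sketch of that lookbehind pattern applied to the sample text from the question. It assumes the quantifier is `+` (as "one or more chars" implies); the names `sample` and `rest_of_line` are only illustrative.

```python
import re

sample = """ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range"""

# (?<= ) requires a space immediately before the match position;
# .+ then consumes the rest of the line ('.' does not match a newline by default).
rest_of_line = re.findall(r"(?<= ).+", sample)
print(rest_of_line)
```

Note that this takes everything after the first space on each line, so a value that itself contains spaces would be returned whole.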
If your regex should only match a number, or a word optionally preceded by a ? char, and you want to keep your regex but only match at a (non-)word boundary, you can use
(?:\b(?=\w)|\B(?=\W))(?!^)(?:[-+]?\d+(?:\.\d+)?|\??\w+)
See the regex demo. Here,
(?:\b(?=\w)|\B(?=\W)) - an adaptive dynamic word boundary of Type 2 (YouTube video explanation): it matches a word boundary if the next char is a word char, else, the position must be a non-word boundary position
(?!^) - not the start of string position
(?:[-+]?\d+(?:\.\d+)?|\??\w+) - either
    [-+]?\d+(?:\.\d+)? - an optional + or - and then one or more digits followed with an optional sequence of a . and one or more digits
    | - or
    \??\w+ - an optional ? and one or more word chars.
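A rough Python sketch of the boundary-based pattern follows. Two assumptions are mine, not the answer's: the quantifiers are `+`, and `re.MULTILINE` is enabled so that `(?!^)` rejects the start of every line rather than only the start of the whole string. Without that flag, the labels on the second and later lines (DEF, HIJ, ...) would be matched too.

```python
import re

sample = """ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range"""

boundary_pattern = re.compile(
    r"(?:\b(?=\w)|\B(?=\W))"              # adaptive boundary: \b before a word char, \B before a non-word char
    r"(?!^)"                              # not at the start of a line (because of re.MULTILINE)
    r"(?:[-+]?\d+(?:\.\d+)?|\??\w+)",     # a signed number, or a word with an optional leading ?
    re.MULTILINE,
)

# All groups are non-capturing, so findall returns the full matches.
boundary_matches = boundary_pattern.findall(sample)
print(boundary_matches)
```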
CodePudding user response:
Your regex is this:
(?:[-+]?\d+.\d+|\w+\?)
It does not match the non-numeric strings because \w+\? matches one or more word characters followed by a literal ?, i.e. a ? after the string. In your input, one value starts with ? and the other has no ? at all, so both fail to match.
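To see the failure concretely, here is a small sketch that runs the question's pattern against the sample text. It assumes the quantifiers are `+` (that is, `\d+` and `\w+`); the variable names are illustrative.

```python
import re

sample = """ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range"""

# The question's pattern: the second alternative wants the ? AFTER the word.
original_matches = re.findall(r"(?:[-+]?\d+.\d+|\w+\?)", sample)
print(original_matches)  # only the numbers are found

# \w+\? matches a trailing question mark, not a leading one.
trailing_only = re.findall(r"\w+\?", "Error? ?Error")
print(trailing_only)
```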
If I understand your requirements correctly you can just use this regex:
[ ]([-+]?\d+.\d+|\S+)
It starts matching at a space and then matches either a signed floating-point number or one or more non-whitespace characters, i.e. \S+.
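A short Python sketch of this approach, assuming the quantifiers are `+`. Because the pattern contains exactly one capturing group, `re.findall` returns the contents of that group, so the leading space is not part of the results.

```python
import re

sample = """ABC 3.101
DEF 5.0
HIJ ?Error
KLM 1.0
NOP Range"""

# [ ] consumes the separating space; group 1 holds the value itself.
captured_values = re.findall(r"[ ]([-+]?\d+.\d+|\S+)", sample)
print(captured_values)
```

If the numeric values are needed as floats, each captured string could then be passed through `float()` inside a `try`/`except ValueError`, leaving entries such as ?Error and Range as strings.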
