Find integers in a string using regular expressions without matching the digits in floats

Time:01-18

So I have this text: 11 + 3.2 / 5 ^ 2 % 6 * 2.0 - 10.1

My regex for floats works fine. However, when I try to use my regex for ints, [0-9]+, it also selects the digits inside the floats.

The code should look something like this, but with a different regex in the INT finditer call:

import re

# token.INT and token.FLOAT are my own token kinds (definition not shown)
text = "11 + 3.2 / 5 ^ 2 % 6 * 2.0 - 10.1"  # or it can be "3  2" or "5*2"

allToks = []
allToks.extend([{"type": token.INT, "match": i} for i in re.finditer(r"[0-9]+", text)])
allToks.extend([{"type": token.FLOAT, "match": i} for i in re.finditer(r"([0-9]+([.][0-9]*))", text)])

allToks.sort(key=lambda x: x["match"].start())
print(allToks)

In other words, I want my INT regex not to overlap with the FLOAT regex.

Any help would be appreciated!

CodePudding user response:

Here is a regex approach using re.findall:

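import re

inp = "11 + 3.2 /5^ 2 % 6 * 2.0 - 10.1"
# Match operators or whole numbers (int or float), then keep only the pure-digit matches
parts = [x for x in re.findall(r'[*/%^+-]|\d+(?:\.\d+)?', inp) if re.search(r'^\d+$', x)]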
print(parts)  # ['11', '5', '2', '6']

The strategy is to try to match operator symbols first, then integers/floats as whole numbers, so a float is always consumed in one piece. The list comprehension then removes anything that is not a pure integer.
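Since the question keeps match positions and a token type for each number, the same "consume floats whole" idea can be sketched as a single pass with named groups. This is only an illustration under assumptions: the names TOKEN_RE and tokenize_numbers are made up here, and plain strings "INT" and "FLOAT" stand in for the asker's own token kinds.

```python
import re

# The FLOAT alternative is listed first, so "3.2" is consumed whole
# before the INT alternative gets a chance to match its digits.
# TOKEN_RE and tokenize_numbers are hypothetical names for this sketch.
TOKEN_RE = re.compile(r'(?P<FLOAT>\d+\.\d*)|(?P<INT>\d+)')

def tokenize_numbers(text):
    """Return (kind, value, start) for each number, in order of appearance."""
    return [(m.lastgroup, m.group(), m.start()) for m in TOKEN_RE.finditer(text)]

tokens = tokenize_numbers("11 + 3.2 / 5 ^ 2 % 6 * 2.0 - 10.1")
ints = [value for kind, value, _ in tokens if kind == "INT"]
print(ints)  # ['11', '5', '2', '6']
```

Because finditer scans left to right, the tokens already come out sorted by position, so no separate sort step is needed.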

CodePudding user response:

This may be a brute force solution but how about:

import re
text = '11 + 3.2 / 5 ^2 % 6* 2.0 - 10.1'

# Pad every number (int or float) with spaces, then pick digit-only runs between spaces
m = [x.strip() for x in re.findall(r' \d+ ', re.sub(r'([\d.]+)', r' \1 ', text))]
print(m)

Output:

['11', '5', '2', '6']

It first wraps every number (integer or float) in spaces, then finds runs of digits (without dots) that are surrounded by spaces, and finally strips the spaces.
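A variation that skips the padding step is to let lookarounds reject any digit run that touches a dot or another digit. This is a sketch, not part of the answer above; INT_ONLY and find_ints are hypothetical names, and it assumes a number like "5." should count as a float, as the asker's float regex already does.

```python
import re

# (?<![\d.])  the run must not be preceded by a digit or a dot
# (?![\d.])   the run must not be followed by a digit or a dot
# INT_ONLY and find_ints are hypothetical names for this sketch.
INT_ONLY = re.compile(r'(?<![\d.])\d+(?![\d.])')

def find_ints(text):
    """Return the integer literals in text, skipping digits that belong to floats."""
    return [m.group() for m in INT_ONLY.finditer(text)]

print(find_ints('11 + 3.2 / 5 ^2 % 6* 2.0 - 10.1'))  # ['11', '5', '2', '6']
```

Using INT_ONLY.finditer directly also gives match objects with start positions, which fits the sort by x["match"].start() in the question.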
