So I have this text: 11 3.2 / 5 ^ 2 % 6 * 2.0 - 10.1
My regex for floats works fine. However, when I use my regex for ints, [0-9]+, it also matches the digits inside the floats.
The code should look something like this, but with a different regex in the INT finditer:
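import re
# `token` is my own type with INT and FLOAT members (definition not shown)
text = "11 3.2 / 5 ^ 2 % 6 * 2.0 - 10.1"  # it can also be "3 2" or "5*2"
allToks = []
# Problem: this INT pattern also matches the digits inside each float
allToks.extend([{"type": token.INT, "match": i} for i in re.finditer(r"[0-9]+", text)])
allToks.extend([{"type": token.FLOAT, "match": i} for i in re.finditer(r"([0-9]+([.][0-9]*))", text)])
# Order the tokens by their position in the input
allToks.sort(key=lambda x: x["match"].start())
print(allToks)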
In other words, I want my INT regex to not overlap with the FLOAT regex.
Any help would be appreciated!
CodePudding user response:
Here is a regex approach using re.findall:
import re
inp = "11 3.2 /5^ 2 % 6 * 2.0 - 10.1"
# Match operators or whole numbers (int or float), then keep only the pure integers
parts = [x for x in re.findall(r'[*/%^+-]|\d+(?:\.\d+)?', inp) if re.search(r'^\d+$', x)]
print(parts) # ['11', '5', '2', '6']
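The strategy is to match every token in a single pass: operator symbols first, then whole numbers (integers or floats). Because a float is consumed as one token, its digits can never be matched separately as integers. The list comprehension then discards anything that is not a pure integer.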
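If the token type and position are needed as well, as in the question's finditer code, the same single-pass idea can be sketched with named groups. This is only a sketch under assumptions: the names TOKEN_RE and tokenize and the group names are made up for illustration, and the FLOAT pattern follows the question's definition (digits, a dot, optional digits). FLOAT is listed before INT in the alternation so that a float is always consumed whole.

```python
import re

# Hypothetical combined pattern: alternatives are tried left to right,
# so FLOAT gets the first chance at every digit run.
TOKEN_RE = re.compile(r"(?P<FLOAT>\d+\.\d*)|(?P<INT>\d+)|(?P<OP>[*/%^+-])")

def tokenize(text):
    """Return (type, lexeme, start) tuples in order of appearance."""
    return [(m.lastgroup, m.group(), m.start()) for m in TOKEN_RE.finditer(text)]

tokens = tokenize("11 3.2 / 5 ^ 2 % 6 * 2.0 - 10.1")
ints = [lexeme for kind, lexeme, _ in tokens if kind == "INT"]
print(ints)  # ['11', '5', '2', '6']
```

Since finditer scans left to right, the tokens already come out in input order and no separate sort is needed.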
CodePudding user response:
This may be a brute force solution but how about:
import re
text = '11 3.2 / 5 ^2 % 6* 2.0 - 10.1'
# Pad every number with spaces, then keep the digit-only runs that sit between spaces
m = [x.strip() for x in re.findall(r' \d+ ', re.sub(r'([\d.]+)', r' \1 ', text))]
print(m)
Output:
['11', '5', '2', '6']
It first wraps every number (integer or float) in whitespace, then finds the runs of digits (without dots) that are surrounded by whitespace, and finally strips the whitespace from each match.
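A possible variation, not part of the answer above, is to check the same boundaries with lookarounds instead of rewriting the string, which keeps the question's two-finditer structure and the original match positions. In this sketch INT_RE and FLOAT_RE are assumed names, and the INT pattern simply rejects any digit run that touches another digit or a dot on either side.

```python
import re

text = "11 3.2 / 5 ^2 % 6* 2.0 - 10.1"

# Assumed INT pattern: a digit run neither preceded nor followed by a digit or a dot
INT_RE = re.compile(r"(?<![\d.])\d+(?![\d.])")
# FLOAT pattern as in the question: digits, a dot, optional digits
FLOAT_RE = re.compile(r"\d+\.\d*")

int_matches = list(INT_RE.finditer(text))
float_matches = list(FLOAT_RE.finditer(text))

print([m.group() for m in int_matches])    # ['11', '5', '2', '6']
print([m.group() for m in float_matches])  # ['3.2', '2.0', '10.1']
```

The two result lists no longer overlap, so they can be merged and sorted by m.start() as in the question.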
