I use the following regex to extract values that appear before certain units:
([.\d]+)\s*(?:kg|gr|g)
What I want is to include the unit along with each value. For example, from this string:
"some text 5kg another text 3 g more text 11.5gr end"
I should be getting:
["5kg", "3 g", "11.5gr"]
I can't wrap my head around how to modify the above expression to get the wanted result. Thank you.
CodePudding user response:
import re
p = re.compile(r'(?<!\d|\.)\d+(?:\.\d+)?\s*?(?:gr|kg|g)(?!\w)')
print(p.findall("some text 5kg another text 3 g more text 11.5gr end"))
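For context on why the pattern from the question drops the unit: when a pattern contains exactly one capturing group, re.findall returns the text of that group rather than the whole match. The sketch below assumes the quantifier lost from the question's regex was "+", and shows the difference when the group is widened to cover the unit. The variable names are only for illustration.

```python
import re

text = "some text 5kg another text 3 g more text 11.5gr end"

# Pattern from the question (assuming the missing quantifier was "+").
# It has one capturing group around the number, so findall returns
# only what that group captured.
numbers_only = re.findall(r"([.\d]+)\s*(?:kg|gr|g)", text)

# Same pattern, but the capturing group now also covers the unit.
with_units = re.findall(r"([.\d]+\s*(?:kg|gr|g))", text)

print(numbers_only)  # ['5', '3', '11.5']
print(with_units)    # ['5kg', '3 g', '11.5gr']
```

Dropping the capturing group entirely, as the pattern above does, has the same effect, because findall then returns the full match.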
CodePudding user response:
Another solution (regex demo):
(?i)\b\d+\.?\d*\s*(?:kg|gr?)\b
(?i) - case insensitive
\b - word boundary
\d+\.?\d* - match the amount
\s* - any number of spaces
(?:kg|gr?) - match kg, g or gr
\b - word boundary
import re
p = re.compile(r"(?i)\b\d+\.?\d*\s*(?:kg|gr?)\b")
print(p.findall("some text 5kg another text 3 g more text 11.5gr end"))
Prints:
['5kg', '3 g', '11.5gr']
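To see what the (?i) flag and the trailing \b add, here is a small check on a made-up string (the sample text is an assumption for illustration, not from the question). Upper-case units still match, while a unit that runs straight into a longer word such as "grams" is skipped.

```python
import re

p = re.compile(r"(?i)\b\d+\.?\d*\s*(?:kg|gr?)\b")

# Hypothetical sample: mixed-case units plus a longer word ("grams").
sample = "5KG 7grams 2.5 Gr"
matches = p.findall(sample)

# "7grams" is rejected because no word boundary follows "g" or "gr".
print(matches)  # ['5KG', '2.5 Gr']
```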
