I have a text example like
0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87
I want to detect the consecutive sequences of words that start with 0s.
So the expected output is two matches: 0s11 0s12 0s33 and 0sgfh 0s1 0s22 0s87
I tried using the regex
(0s\w+)
but that detects each of 0s11, 0s12, 0s33, etc. individually.
Any idea on how to modify the pattern?
CodePudding user response:
To get those 2 matches where there are at least 2 consecutive parts:
\b0s\w+(?:\s+0s\w+)+
Explanation
\b  A word boundary to prevent a partial word match
0s\w+  Match 0s and 1 or more word chars
(?:\s+0s\w+)+  Repeat 1 or more times: whitespace chars followed by 0s and 1 or more word chars
If you also want to match a single occurrence:
\b0s\w+(?:\s+0s\w+)*
Note that \w+ matches 1 or more word characters, so it would not match 0s on its own.
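As a quick check, here is a minimal Python sketch of the second pattern applied to the sample text from the question. The variable names (text, pattern, runs) are just for illustration, and it assumes Python's re module, which the question did not specify.

```python
import re

text = "0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87"

# One 0s token, then zero or more further 0s tokens separated by whitespace.
# The group is non-capturing, so findall() returns the whole match each time.
pattern = re.compile(r"\b0s\w+(?:\s+0s\w+)*")

runs = pattern.findall(text)
print(runs)  # ['0s11 0s12 0s33', '0sgfh 0s1 0s22 0s87']
```

Changing the final * to + would keep only runs of at least two tokens, which gives the same result for this input.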
CodePudding user response:
Should be doable with re.findall(). Your pattern was correct! :)
import re
testString = "0s11 0s12 0s33 my name is 0sgfh 0s1 0s22 0s87"
print(re.findall(r'0s\w+', testString))
['0s11', '0s12', '0s33', '0sgfh', '0s1', '0s22', '0s87']
Hope this helps!
