How to split a string with parentheses and spaces into a list

Time:01-14

I want to split strings like:

(so) what (are you trying to say)
 
what (do you mean)

Into lists like:

[(so), what, (are you trying to say)]

[what, (do you mean)]

The code I tried is below. On regexr, the expression matches the parts I want but gives a warning. I'm not an expert in regex, so I don't know what I'm doing wrong.

import re
string = "(so) what (are you trying to say)?"

rx = re.compile(r"((\([\w \w]*\)|[\w]*))")

print(re.split(rx, string ))

CodePudding user response:

Using [\w \w]* is the same as [\w ]* and also matches an empty string.
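
Both points can be checked quickly. The snippet below is a minimal sketch, not part of the original answer: the sample inputs are made up for illustration, and the last line uses a simple comma pattern only to show how re.split treats capture groups.

```python
import re

original_class = re.compile(r"[\w \w]*")   # character class from the question
simplified_class = re.compile(r"[\w ]*")   # same class with the duplicate \w removed

# Hypothetical sample inputs, chosen only for this check.
samples = ["so", "are you trying to say", "", "what?"]

for text in samples:
    # Both classes accept exactly the same characters, so the matches agree.
    same = original_class.findall(text) == simplified_class.findall(text)
    print(repr(text), same)

# The * quantifier allows zero repetitions, so the empty string matches too.
print(original_class.fullmatch("") is not None)

# re.split also returns the text of every capture group, which is why
# splitting on a pattern wrapped in groups gives a cluttered result.
print(re.split(r"(,)", "a,b"))
```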

Instead of using split, you can use re.findall without any capture groups and write the pattern like:

\(\w+(?:[^\S\n]+\w+)*\)|\w+
  • \( Match (
    • \w+ Match 1+ word chars
    • (?:[^\S\n]+\w+)* Optionally repeat matching 1+ whitespace chars (excluding newlines) and 1+ word chars
  • \) Match )
  • | Or
  • \w+ Match 1+ word chars


import re
string = "(so) what (are you trying to say)? what (do you mean)"

rx = re.compile(r"\(\w+(?:[^\S\n]+\w+)*\)|\w+")

print(re.findall(rx, string))

Output

['(so)', 'what', '(are you trying to say)', 'what', '(do you mean)']

CodePudding user response:

For your two examples you can write:

re.split(r'(?<=\)) +| +(?=\()', string)


This does not work, however, for the string defined in the OP's code. That string contains a question mark, which is inconsistent with the two examples given in the question.

The regular expression can be broken down as follows.

(?<=\)) # positive lookbehind asserts that location in the
        # string is preceded by ')'
[ ]+    # match one or more spaces
|       # or
[ ]+    # match one or more spaces
(?=\()  # positive lookahead asserts that location in the
        # string is followed by '('

In the above I've put each of the two space characters in a character class merely to make them visible.
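
As a minimal runnable sketch of this split-based approach, the snippet below assumes a `+` quantifier after each space, as the breakdown describes. The names SPLIT_RX and split_groups are hypothetical, introduced only for illustration.

```python
import re

# Split on runs of spaces that directly follow ')' or directly precede '('.
SPLIT_RX = re.compile(r"(?<=\)) +| +(?=\()")

def split_groups(text):
    """Hypothetical helper: split text around parenthesised groups."""
    return SPLIT_RX.split(text)

# The two examples from the question.
print(split_groups("(so) what (are you trying to say)"))
print(split_groups("what (do you mean)"))

# The string from the OP's code: the trailing '?' stays attached to the
# last item, because nothing in the pattern splits on it.
print(split_groups("(so) what (are you trying to say)?"))
```

Unlike the findall approach, this keeps any characters that are not adjacent to a split point, so punctuation such as the question mark remains in the result.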
