I want to remove everything except #, newlines, and complete bracket pairs that contain only numbers.
Sample input:
# (1296) {20} [529] [1496] [411]
# (MONDAY ) (1296)
# (646) {20} (BEACH 7) [20 Mtrs] { 03 Foot }
# {19} [455] [721] (1296) (SUNDAY ) [2741] (MONDAY (WEDNESDAY {20}
# {19} (1296)
Code which does not work:
$re = '/(?:\[[^][]*]|\([^()]*\)|{[^{}]*})(*SKIP)(*F)|[^][(){}@#]+/m';
$result = preg_replace($re, '', $input);
Incorrect output:
#(1296){20}[529][1496][411]
#(1296)
#(646){20}(BEACH 7)[20 Mtrs]{ 03 Foot }
#{19}[455][721](1296)[2741](({20}
#{19}(1296)
Desired output:
#(1296) {20} [529] [1496] [411]
#(1296)
#(646) {20}
#{19} [455] [721] (1296) [2741] {20}
#{19} (1296)
CodePudding user response:
You could match at least 1 digit between the brackets and then skip that match.
Then match any char except a newline or # to be replaced with an empty string.
(?:\[\h*\d[\h\d]*]|\(\h*\d[\h\d]*\)|{\h*\d[\h\d]*})\h*(*SKIP)(*F)|[^#\n]
Explanation
(?: Non-capture group
  \[\h*\d[\h\d]*] Match at least 1 digit between square brackets, where \h matches horizontal whitespace characters (no newlines)
  | Or
  \(\h*\d[\h\d]*\) At least 1 digit between parentheses
  | Or
  {\h*\d[\h\d]*} At least 1 digit between curly braces
)\h* Close the non-capture group and match optional trailing spaces
(*SKIP)(*F) Skip and fail the match (to leave it untouched in the output)
| Or
[^#\n] Match any character except # or a newline
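As a minimal sketch of how this pattern could be applied, the snippet below runs it through preg_replace on the sample input from the question. The final trim step is an addition of mine, not part of the answer: because the skipped part includes trailing spaces, a kept bracket that ends up last on a line may leave a space behind.

```php
<?php
// Sample input copied from the question.
$input = <<<'TXT'
# (1296) {20} [529] [1496] [411]
# (MONDAY ) (1296)
# (646) {20} (BEACH 7) [20 Mtrs] { 03 Foot }
# {19} [455] [721] (1296) (SUNDAY ) [2741] (MONDAY (WEDNESDAY {20}
# {19} (1296)
TXT;

// Left side: bracket pairs holding only digits and spaces, plus trailing
// spaces; (*SKIP)(*F) discards the match so that text stays in the output.
// Right side: any single character except # or a newline, which is removed.
$re = '/(?:\[\h*\d[\h\d]*]|\(\h*\d[\h\d]*\)|{\h*\d[\h\d]*})\h*(*SKIP)(*F)|[^#\n]/';

$result = preg_replace($re, '', $input);

// Assumption: trailing spaces left at the end of a line are unwanted.
$result = preg_replace('/\h+$/m', '', $result);

echo $result, PHP_EOL;
```

On the sample input this should reproduce the desired output listed in the question.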
CodePudding user response:
You may match using this regex:
(?:(\()|({)|\[)[\h\d]*+([^])}\s\d])(?(1)[^()]*\)|(?(2)[^{}]*}|[^][]*]))\h*|(?<=#)\h+|\([^\s)]+\h+
and replace with an empty string.
RegEx Details:
(?:(\()|({)|\[)[\h\d]*+([^])}\s\d])(?(1)[^()]*\)|(?(2)[^{}]*}|[^][]*])): Match (...) or {...} or [...] if they contain at least one non-digit
\h*: Match 0 or more whitespaces
|: OR
(?<=#)\h+: Match 1+ whitespaces after #
|: OR
\([^\s)]+\h+: Match ( and 1+ of non-whitespace text followed by 1+ whitespaces
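A hedged sketch of this second approach in PHP follows. It assumes the pattern is written with its + quantifiers ([\h\d]*+, \h+ and [^\s)]+), and it adds a trim step of my own, since removing a bracket at the end of a line can leave the space before it behind.

```php
<?php
// Sample input copied from the question.
$input = <<<'TXT'
# (1296) {20} [529] [1496] [411]
# (MONDAY ) (1296)
# (646) {20} (BEACH 7) [20 Mtrs] { 03 Foot }
# {19} [455] [721] (1296) (SUNDAY ) [2741] (MONDAY (WEDNESDAY {20}
# {19} (1296)
TXT;

// 1st alternative: a bracket pair containing at least one non-digit; the
//   conditionals (?(1)...) and (?(2)...) pick the matching closing bracket.
// 2nd alternative: whitespace directly after #.
// 3rd alternative: an unclosed "(" followed by a word and whitespace.
$re = '/(?:(\()|({)|\[)[\h\d]*+([^])}\s\d])(?(1)[^()]*\)|(?(2)[^{}]*}|[^][]*]))\h*'
    . '|(?<=#)\h+|\([^\s)]+\h+/';

$result = preg_replace($re, '', $input);

// Assumption: trailing spaces left at the end of a line are unwanted.
$result = preg_replace('/\h+$/m', '', $result);

echo $result, PHP_EOL;
```

Unlike the (*SKIP)(*F) version, this pattern matches the unwanted parts directly, so text outside any bracket (other than the spaces after #) would be left in place.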
