Original line in file sed.txt:
outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)PATTERN_outer_string(i_PATTERN_inner)_outer_string
I only need to replace PATTERN with pattern where it appears inside brackets. The replacement is not necessarily the lowercase form; it could be any other word.
Expected result:
outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
I can use the pattern ([^)]*) to find the substrings in which some words should be replaced. But I can't use this pattern to restrict the substitution to those substrings: used as an address, it makes sed replace every PATTERN on the line with pattern.
:/tmp$ sed 's/([^)]*)/---/g' sed.txt
outer_string_PATTERN_string---PATTERN_outer_string---_outer_string
:/tmp$ sed '/([^)]*)/s/PATTERN/pattern/g' sed.txt
outer_string_pattern_string(pattern_And_pattern_pattern_i)pattern_outer_string(i_pattern_inner)_outer_string
I also tried to use regex groups in sed to capture and replace the words, but I couldn't figure out the command.
Can sed implement that? And how to achieve that? Thanks.
CodePudding user response:
Can sed implement that?
Yes, but you do not want to do it in sed. Use another programming language, such as Python, Perl, or awk.
how to achieve that?
Implementing non-greedy regex is not simple in sed. Generally, it consists of:
- taking a chunk of the input
- processing the chunk
- putting it in hold space
- shuffling hold space with pattern space to separate what has already been processed from what has not
- repeating
- shuffling with hold space again
- outputting
Anyway, the following script:
#!/bin/bash
sed <<<'outer_string_PATTERN_string(PATTERN_i_PATTERN_PATTERN_i)PATTERN_outer_string(i_PATTERN_inner)_outer_string' '
:loop;
/\([^(]*\)\(([^)]*)\)\(.*\)/{
# Lowercase the second part.
s//\1\L\2\E\n\3/;
# Mix with hold space.
G;
s/\(.*\)\n\(.*\)\n\(.*\)/\3\1\n\2/;
# Put processed stuff into hold space
h; s/\n.*//; x;
# Process the other stuff again.
s/.*\n//;
bloop;
};
# Is hold space empty?
x; /^$/!{
# Pattern space has trailing stuff - add it.
G; s/\n//;
# We will print it.
h;
# Clear hold space
s/.*//
};x;
'
outputs:
outer_string_PATTERN_string(pattern_i_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
CodePudding user response:
As an alternative, it is easier to do this in GNU awk with an RS that matches each (...) substring:
awk -v RS='\\([^)]+\\)' '{gsub(/PATTERN/, "pattern", RT); ORS=RT} 1' file
outer_string_PATTERN_string(pattern_i_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
Steps:
- RS='\\([^)]+\\)' captures a (...) string as the record separator
- the gsub function then replaces PATTERN with pattern in the matched text, i.e. RT
- ORS=RT sets ORS to the newly modified RT
- 1 prints each record to stdout
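RT is a gawk extension, so if only a POSIX awk is available, roughly the same idea can be sketched with match() and substr(). This is only a sketch under assumptions: the function name lower_in_parens_awk and the sample variable are made up for illustration, and the parentheses are assumed not to be nested.

```shell
# Hypothetical helper (not from the answer): reads stdin, writes stdout.
lower_in_parens_awk() {
    awk '{
        rest = $0
        out = ""
        # Peel off one (...) group at a time, left to right.
        while (match(rest, /\([^()]*\)/)) {
            seg = substr(rest, RSTART, RLENGTH)
            gsub(/PATTERN/, "pattern", seg)          # replace only inside the group
            out = out substr(rest, 1, RSTART - 1) seg
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest                               # append whatever follows the last group
    }'
}

sample='outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)PATTERN_outer_string(i_PATTERN_inner)_outer_string'
printf '%s\n' "$sample" | lower_in_parens_awk
# prints outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
```

Unlike the RS/RT version, this works line by line, so a (...) group that spans a newline would not be handled.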
Another alternative solution using lookahead assertion in a perl regex:
perl -pe 's/PATTERN(?=[^()]*\))/pattern/g' file
CodePudding user response:
Solved by this:
:/tmp$ sed 's/(/\n(/g' sed.txt | sed 's/)/)\n/g' | sed '/([^)]*)/s/PATTERN/pattern/g' | sed ':a;N;$!ba;s/\n//g'
outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
- put each () group on a line of its own
- find the () lines and replace PATTERN with pattern
- merge the multiple lines back into one line
Thanks to the question "How can I replace a newline (\n) using sed?"
CodePudding user response:
Can sed implement that?
It can be done using GNU sed and basic regular expressions (BRE):
sed '
s/)/)\n/g
:1
s/\(([^)]*\)PATTERN\([^)]*)\n\)/\1pattern\2/
t1
s/\n//g
' < file
where
- 1st s inserts a newline after each )
- 2nd s replaces the last (* is greedy) PATTERN inside ()s with pattern
- t loops back if a substitution was made
- 3rd s strips all inserted newlines
EDIT
The 2nd substitute command was edited according to the OP's suggestion, since there is no need to match \n inside ().
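For what it's worth, the same loop can probably be written without the newline marker, by excluding both ( and ) from the bracket expressions so a match can never leave the group. This is a variant sketch, not the command above: the function name lower_in_parens is made up, it assumes the brackets are not nested, and, like the original loop, it would never terminate if the replacement word itself contained PATTERN.

```shell
# Hypothetical helper (not from the answer): reads stdin, writes stdout.
lower_in_parens() {
    # :a / ta loop: each pass replaces the last PATTERN inside one (...) group,
    # and t jumps back to :a for as long as a substitution was made.
    sed -e ':a' \
        -e 's/\(([^()]*\)PATTERN\([^()]*)\)/\1pattern\2/' \
        -e 'ta'
}

sample='outer_string_PATTERN_string(PATTERN_And_PATTERN_PATTERN_i)PATTERN_outer_string(i_PATTERN_inner)_outer_string'
printf '%s\n' "$sample" | lower_in_parens
# prints outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
```

Since no \n appears in the replacement, this form should not depend on GNU extensions.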
CodePudding user response:
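You can try this sed (GNU sed is needed, since \L is a GNU extension)
sed -E 's/\([^()]*PATTERN[^()]*\)/\L&/g'
Here, we match a whole bracketed group that contains the word PATTERN and lowercase the entire match with \L&. Note that any other uppercase letters inside the brackets (such as the A in And) are lowercased as well, so this only fits when the replacement is the lowercase form.
Output
outer_string_PATTERN_string(pattern_i_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
New Example Output
echo "outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string" | sed -E 's/\([^()]*PATTERN[^()]*\)/\L&/g'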
outer_string_PATTERN_string(pattern_And_pattern_pattern_i)PATTERN_outer_string(i_pattern_inner)_outer_string
