I have many text files that contain a series of strings as shown below:
(JJJ_gef_14775:0.0802204549,Abc_gef_9331:0.0755012077, Abc_abc_9331:0.0755012077)
I need to add "#2" after each number that follows gef, so I end up with
(JJJ_gef_14775#2:0.0802204549,Abc_gef_9331#2:0.0755012077, Abc_abc_9331:0.0755012077)
Is there a way I can do this using a bash script?
I thought maybe I could use sed to replace the ":" with "#2:" following the gef pattern, but I am unsure how to skip the numbers in between. The length of the number in between varies.
CodePudding user response:
This might work for you (GNU sed):
sed 's/gef_[0-9]\+/&#2/g' file
This replaces every gef_ followed by one or more digits with itself (the & in the replacement) plus #2.
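Since the question mentions many files, here is a minimal sketch of how the same substitution could be checked against the sample line and then applied in place. The function names add_suffix and rewrite_files are made up for illustration, and -E with a plain + is used as an equivalent spelling of \+ that also works outside GNU sed.

```shell
#!/usr/bin/env bash

# Hypothetical helper: reads stdin and appends "#2" after every gef_<digits>.
add_suffix() {
  sed -E 's/gef_[0-9]+/&#2/g'
}

# Hypothetical helper: edits the given files in place, keeping a .bak copy of each.
rewrite_files() {
  sed -i.bak -E 's/gef_[0-9]+/&#2/g' "$@"
}

# Try the substitution on the sample line from the question first.
sample='(JJJ_gef_14775:0.0802204549,Abc_gef_9331:0.0755012077, Abc_abc_9331:0.0755012077)'
result=$(printf '%s\n' "$sample" | add_suffix)
printf '%s\n' "$result"
```

Once the output looks right, something like rewrite_files *.txt would process every matching file, assuming the files end in .txt; the .bak copies can be deleted after checking the results.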
CodePudding user response:
Use this Perl one-liner:
perl -pe 's{(gef\w+):}{$1#2:}g' infile > outfile
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration.
The regex uses these elements:
/g : Match the pattern repeatedly.
\w+ : any word character (letter, digit or underscore), repeated 1 or more times.
$1 : capture group 1 = whatever was captured between parentheses.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlre: Perl regular expressions (regexes)
