I have a log file in a format similar to this:
test {
seq-cont {
0,
67,
266
},
grp-id 505
}
}
test{
test1{
val
}
}
Here is the echo command that produces this content:
$ echo -e "test {\nseq-cont {\n\t\t\t0,\n\t\t\t67,\n\t\t\t266\n\t\t\t},\n\t\tgrp-id 505\n\t}\n}\ntest{\n\ttest1{\n\t\tval\n\t}\n}\n"
The question is how to remove all whitespace between seq-cont { and the next }. Such blocks may occur multiple times in the file.
I want the output to look like this, preferably produced with sed.
test{seq-cont{0,67,266},
grp-id 505
}
}
test{
test1{
val
}
}
Efforts by OP: Here is one attempt that somewhat worked, but not exactly as I wanted:
sed ':a;N;/{/s/[[:space:]]\+//;/}/s/}/}/;ta;P;D' logfile
CodePudding user response:
It can be done using gnu-awk with a custom RS regex that matches from a { up to the next closing }:
awk -v RS='{[^}]+}' 'NR==1 {gsub(/[[:space:]]+/, "", RT)} {ORS=RT} 1' file
test {seq-cont{0,67,266},
grp-id 505
}
}
test{
test1{
val
}
}
Here:
NR==1 {gsub(/[[:space:]]+/, "", RT)}: For the first record, remove all whitespace (including line breaks) from the text matched by RS, which gawk stores in RT.
{ORS=RT}: Set ORS to that matched text, so it is printed back after each record.
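PS: Remove NR==1 if you want to do this for the entire file. Note that RS matches any { ... } span, so this compacts every such span, not only the seq-cont ones.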
CodePudding user response:
With your shown samples, please try the following awk program, written and tested in GNU awk.
awk -v RS= '
match($0,/{\nseq-cont {\n[^}]*/){
val=substr($0,RSTART,RLENGTH)
gsub(/[[:space:]]+/,"",val)
print substr($0,1,RSTART-1) val substr($0,RSTART+RLENGTH)
}
' Input_file
Explanation: Setting RS to the null string puts awk in paragraph mode, so the whole sample (which has no blank lines) is read as one record. The match function then finds the text from the { before seq-cont up to, but not including, the next }. All spaces and newlines are removed from that matched value, and finally the text before the match, the edited value, and the text after the match are printed together to get the output the OP expects.
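The program above rewrites only the first match in a record. Since the question mentions that seq-cont blocks may occur several times, here is a minimal sketch of the same match/substr idea wrapped in a loop. The function name compact_seq_cont_awk and the variable names are hypothetical, and the sketch assumes the file has no blank lines (paragraph mode would otherwise split it into several records and collapse the blank lines between them).

```shell
# Hypothetical helper: reads the log on stdin and compacts every
# "seq-cont { ... }" block, including the whitespace right before it.
compact_seq_cont_awk() {
  awk -v RS= '
  {
    out = ""; rest = $0
    # Repeatedly find the next seq-cont block in the unprocessed text
    while (match(rest, /[ \t\n]*seq-cont[ \t\n]*[{][^}]*[}]/)) {
      block = substr(rest, RSTART, RLENGTH)
      gsub(/[ \t\n]+/, "", block)                   # strip all whitespace in the block
      out = out substr(rest, 1, RSTART - 1) block   # keep the text before it unchanged
      rest = substr(rest, RSTART + RLENGTH)         # continue after the block
    }
    print out rest
  }'
}
```

Usage would be along the lines of compact_seq_cont_awk < Input_file. Unlike the original program, this sketch also prints records that contain no seq-cont block at all.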
CodePudding user response:
You can do that much more easily with perl:
perl -0777 -i -pe 's/\s+(seq-cont\s*\{[^}]*\})/$1=~s|\s+||gr/ge' logfilepath
The -0777 option tells perl to slurp the file into a single string, and -i saves the changes in place. The \s+(seq-cont\s*\{[^}]*\}) regex matches one or more whitespace characters, then captures into Group 1 ($1) the text seq-cont, zero or more whitespace characters, and the substring from the leftmost { to the next } ([^}]* matches zero or more characters other than }). Because of the e flag, the replacement is evaluated as Perl code: $1=~s|\s+||gr removes every run of whitespace from the Group 1 value and returns the result (the r flag). All occurrences are handled due to the outer g flag (next to e).
See the online demo:
#!/bin/bash
s=$(echo -e "test {\nseq-cont {\n\t\t\t0,\n\t\t\t67,\n\t\t\t266\n\t\t\t},\n\t\tgrp-id 505\n\t}\n}\ntest{\n\ttest1{\n\t\tval\n\t}\n}\n")
perl -0777 -pe 's/\s+(seq-cont\s*\{[^}]*\})/$1=~s|\s+||gr/ge' <<< "$s"
Output:
test {seq-cont{0,67,266},
grp-id 505
}
}
test{
test1{
val
}
}
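Since the question prefers sed, the same substitution can probably be approximated with GNU sed. This is only a sketch under assumptions: it relies on the GNU-specific -z option to read the whole file as one record (the file must contain no NUL bytes), and the function name compact_seq_cont_sed is made up for illustration.

```shell
# Hypothetical helper: reads the log on stdin, GNU sed assumed.
compact_seq_cont_sed() {
  # Loop b: delete one run of whitespace between "seq-cont" and the next "}"
  # per pass, and repeat until no such whitespace is left in any block.
  # Final command: drop the whitespace that precedes each "seq-cont".
  sed -z -E \
    -e ':b' \
    -e 's/(seq-cont[^}]*)[[:space:]]+([^}]*[}])/\1\2/' \
    -e 'tb' \
    -e 's/[[:space:]]+(seq-cont)/\1/g'
}
```

It could be called as compact_seq_cont_sed < logfile. sed cannot run a nested replacement inside a match the way perl's e flag does, which is why the sketch falls back to a loop.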
