I'm trying to trim whitespace at the start and end of a file, without touching the whitespace in between, using perl inside a bash script.
The file has the following format
<newline>
<space><newline>
<tab><newline>
<space><tab><newline>
START<newline><newline>
<space>INDENTED<newline><newline>
END<newline>
<space><tab><newline>
<tab><newline>
<space><newline>
<newline>
NOTE: <newline> is \n, <space> is a single space character & <tab> is \t
So the original file looks like
START
INDENTED
END
I need the content of the file to be
START<newline><newline>
<space>INDENTED<newline><newline>
END
i.e. the final file should look like this
START
INDENTED
END
I tried both of the following commands, but they trim the intermediate whitespace as well. Both trim spaces & newlines from the whole document rather than just from the start of the document
perl -pi -e 's/^\s*//gs' sample.txt
perl -pi -e 's/\A\s*//gs' sample.txt
Both collapsed all internal space
START<newline>
INDENTED<newline>
END<newline>
I tried this. It collapsed newlines
perl -pi -e 's/\s*$//gs' sample.txt
perl -pi -e 's/\s*\Z//gs' sample.txt
Both collapsed newlines
START<space>INDENTEDEND<newline>
Here are my assumptions
- \A matches just the start of the document & \Z matches the end of the document (as opposed to ^ & $)
- s in the gs flags ensures the whole document is treated as a single line, with newlines replaced by the character \n
I am new to perl. I'd appreciate it if someone could help me understand where I went wrong
CodePudding user response:
You may use this perl in slurp mode:
perl -0777 -pe 's/^\s+|(\R)\s+$/$1/g' file
Output:
START
INDENTED
END
Details:
- -0777: Enables slurp mode to make perl read the full file
- ^\s+: Match 1+ whitespaces at the start of the file
- (\R)\s+$: Match a line break followed by 1+ whitespaces at the end of the file
- We use $1 in the replacement to put the line break back; otherwise you will get the file content without an ending line break
CodePudding user response:
Not perl, but ed is useful for editing files:
$ printf '%s\n' '1,/START/-1d' '/END/+1,$d' w | ed -s sample.txt
$ cat sample.txt
START
INDENTED
END
This deletes everything in the ranges of lines from the first to the line before the one matching START, and from the line after END to the end of the file, and then writes the changed file back to disk.
Or a similar perl approach, which only prints lines in the range you want to keep:
perl -i -ne 'print if /START/../END/' sample.txt
CodePudding user response:
Here is a short sed version:
sed -n '/START/,/END/p'
or with the negated logic:
sed '1,/START/{/START/!d}; /END/,${/END/!d}'
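Both sed commands read standard input or a file argument; neither edits the file in place as written. A small sketch on the question's sample, using a scratch file that is assumed here, suggests the two forms produce the same output. The closing braces are written with a preceding semicolon ({/START/!d;}), which is an adjustment for sed implementations that require it, not part of the answer above.

```shell
#!/usr/bin/env bash
# Scratch copy of the sample from the question (hypothetical location).
seddir=$(mktemp -d)
printf '\n \n\t\n \t\nSTART\n\n INDENTED\n\nEND\n \t\n\t\n \n\n' > "$seddir/sample.txt"

# Print only the lines from START through END.
sed -n '/START/,/END/p' "$seddir/sample.txt" > "$seddir/keep.txt"

# Negated form: delete everything before START and everything after END.
sed '1,/START/{/START/!d;}; /END/,${/END/!d;}' "$seddir/sample.txt" > "$seddir/drop.txt"

cmp -s "$seddir/keep.txt" "$seddir/drop.txt" && echo "both forms agree"
```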
