I need to find dates in XML files in the format 2021-06-25T21:17:51Z and replace them with the format 2021-06-25T21:17:51.001Z
I thought about using a regexp with sed, but back-references do not work.
1.xml could look like this, though the real files have many more fields, and some fields are already in the correct format.
<Doc>
<PUB_DATE>2021-06-25T21:17:51Z</PUB_DATE><!-- to change -->
<DATE_COLLECT_100>2021-06-25T21:17:51Z</DATE_COLLECT_100><!-- to change -->
<DATE_CREATION>2021-06-25T21:17:51.001Z</DATE_CREATION><!-- keep it like this -->
</Doc>
The desired output is:
<Doc>
<PUB_DATE>2021-06-25T21:17:51.001Z</PUB_DATE><!-- to change -->
<DATE_COLLECT_100>2021-06-25T21:17:51.001Z</DATE_COLLECT_100><!-- to change -->
<DATE_CREATION>2021-06-25T21:17:51.001Z</DATE_CREATION><!-- keep it like this -->
</Doc>
Here is my sed command:
$ sed -Ee 's#<(PUB_DATE|DATE_COLLECT_100){1}>([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}T[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2})Z</\1>#<\1>\2.001Z</\1>#' 1.xml
Are back-references allowed in sed when they are used in the search portion?
Am I missing something about sed?
Is there a bug?
sed version: I don't know; sed --version, sed -v and man sed don't show it. I'm on OSX.
CodePudding user response:
BSD sed, which is what OSX ships, doesn't support a back-reference such as \1 inside the search pattern of this extended (-E) regex.
One option is perl:
perl -pe 's#<(PUB_DATE|DATE_COLLECT_100)>(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z</\1>#<\1>\2.001Z</\1>#' 1.xml
Or install GNU sed with Homebrew (brew install gnu-sed) and then use:
gsed -E 's#<(PUB_DATE|DATE_COLLECT_100)>([[:digit:]]{4}-[[:digit:]]{2}-[[:digit:]]{2}T[[:digit:]]{2}:[[:digit:]]{2}:[[:digit:]]{2})Z</\1>#<\1>\2.001Z</\1>#' 1.xml
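If installing extra tools is not an option, a possible workaround is to drop the back-reference from the search part altogether. The sketch below is an assumption on my side, not part of the original commands: it matches the closing tag only up to "</" and relies on the XML being well formed, so the closing tag name is left untouched rather than checked. \1 and \2 then appear only in the replacement. The temporary directory and the result variable are made up for the demonstration.

```shell
# Build a small sample file matching the example in the question.
tmpdir=$(mktemp -d)
cat > "$tmpdir/1.xml" <<'EOF'
<Doc>
<PUB_DATE>2021-06-25T21:17:51Z</PUB_DATE>
<DATE_COLLECT_100>2021-06-25T21:17:51Z</DATE_COLLECT_100>
<DATE_CREATION>2021-06-25T21:17:51.001Z</DATE_CREATION>
</Doc>
EOF

# No back-reference in the search part: the closing tag is matched only
# up to "</", and \1 / \2 are used only in the replacement.
result=$(sed -E 's#<(PUB_DATE|DATE_COLLECT_100)>([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2})Z</#<\1>\2.001Z</#' "$tmpdir/1.xml")

# Show the rewritten document, then clean up the temporary directory.
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

The trade-off is that the pattern no longer verifies that the closing tag has the same name as the opening one, which only matters if the input can contain mismatched tags.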

