I'm trying to port some code from bash 5.1 to 4.2.46. One function which tries to strip color codes from a specifically formatted string stopped working.
Here is a sample string in that format. I turn on extended globbing for the patterns below.
text="$(printf -- "%b%s%b" "\[\e[31m\]" "hello" "\[\e[0m\]")"
shopt -s extglob
In bash 5.1, this parameter expansion works to remove all the color codes and escape characters
bash-5.1$ echo "${text//$'\[\e'\[/}"
31m\]hello0m\]
bash-5.1$ echo "${text//$'\[\e'\[+([0-9])/}"
m\]hellom\]
bash-5.1$ echo "${text//$'\[\e'\[+([0-9])m$'\]'/}"
hello
In bash 4.2.46, I start getting a different behavior as I build up the parameter expansion.
bash-4.2.46$ echo "${text//$'\[\e'\[/}"
\31m\]hello\0m\]
bash-4.2.46$ echo "${text//$'\[\e'\[+([0-9])/}"
\[\]hello\[\] ## no longer matches because `+([0-9])` doesn't follow `\[`
The difference comes from this line: echo "${text//$'\[\e'\[/}"
bash-5.1: 31m\]hello0m\]
bash-4.2.46: \31m\]hello\0m\]
Here's what printf "%q" "${text//$'\[\e'\[/}" shows:
bash-5.1: 31m\\\]hello0m\\\]
bash-4.2.46: \\31m\\\]hello\\0m\\\]
Where is the extra \ coming from in 4.2.46?
Even when I try to remove it, the pattern stops matching:
bash-4.2.46$ echo "${text//$'\[\e'\[\\/}"
\[\]hello\[\] ## no longer matches because `\\` doesn't follow `\[`
I'm guessing there may be a bug related to parameter expansion, backslash escaping, and extended globbing.
I am aiming to write code that works on bash 4.0 onward, so I'm primarily looking for a workaround. An explanation (bug report, etc.) of why the behavior differs would be great, though.
CodePudding user response:
Seems like a bug in bash. By bisecting the available versions, I found that 4.2.53(1)-release was the last version with this bug. Version 4.3.0(1)-release fixed the problem.
The list of changes mentions a few bug fixes in this area. It may have been one of the following:
This document details the changes between this version, bash-4.3-alpha, and the previous version, bash-4.2-release.
[...]
zz. When using the pattern substitution word expansion, bash now runs the replacement string through quote removal, since it allows quotes in that string to act as escape characters. This is not backwards compatible, so it can be disabled by setting the bash compatibility mode to 4.2.
[...]
eee. Fixed a logic bug that caused extended globbing in a multibyte locale to cause failures when using the pattern substititution word expansions.
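To check whether a particular bash is affected without bisecting, a small probe along these lines might help. This is only a sketch: `patsub_probe` is a made-up name, and the expected string is simply the bash 5.1 output quoted in the question, so anything else is reported as affected.

```bash
#!/usr/bin/env bash
# Sketch only: patsub_probe is a hypothetical helper, not part of bash.
patsub_probe() {
    # Same sample as in the question: \[ ESC[31m \] hello \[ ESC[0m \]
    local sample=$'\\[\e[31m\\]hello\\[\e[0m\\]'
    # The expansion whose result differs between 4.2.x and 5.1.
    local got="${sample//$'\[\e'\[/}"
    # Assumption: the 5.1 result from the question is the "good" one.
    if [[ $got == '31m\]hello0m\]' ]]; then
        echo unaffected
    else
        echo affected
    fi
}

patsub_probe
```

A script that has to run on 4.0 onward could call such a probe once and pick one of the workarounds if it reports `affected`.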
Workaround
Instead of using parameter expansions with extglobs, use bash pattern matching with actual regexes (available in bash 3.0.0 and higher):
text=$'\[\e[31m\]hello\[\e[0m\]'
while [[ "$text" =~ (.*)$'\[\e['[0-9]*'m\]'(.*) ]]; do
text="${BASH_REMATCH[1]}${BASH_REMATCH[2]}"
done
echo "$text"
or rely on an external (but POSIX-standardized) tool like sed:
text=$'\[\e[31m\]hello\[\e[0m\]'
text=$(sed $'s#\\\[\e[[0-9]*m\\\]##g' <<< "$text")
echo "$text"
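The regex loop above can also be wrapped in a function with the regex kept in a variable, which avoids mixing quoting styles inside `[[ ... =~ ... ]]`. This is a sketch under assumptions: `strip_colors_re` is a hypothetical name, and it only handles the single-number `\[\e[<digits>m\]` form used in the question.

```bash
#!/usr/bin/env bash
# Sketch only: strip_colors_re is a hypothetical helper name.
strip_colors_re() {
    local esc=$'\e'
    local s=$1
    # ERE pieces: \\ = literal backslash, \[ = literal [,
    # then ESC, literal [, digits, m, literal backslash, literal ].
    local re='(.*)\\\['"$esc"'\[[0-9]*m\\](.*)'
    # Each pass removes one color code (the last one, since .* is greedy).
    while [[ $s =~ $re ]]; do
        s=${BASH_REMATCH[1]}${BASH_REMATCH[2]}
    done
    printf '%s\n' "$s"
}

text=$'\\[\e[31m\\]hello\\[\e[0m\\]'
strip_colors_re "$text"
```

Because `$re` is expanded unquoted on the right of `=~`, bash treats it as a regex rather than a literal string.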
CodePudding user response:
The problem seems to be the parsing of $'...' inside ${text//<here>} when the expansion is inside double quotes.
$ test='f() { "${text//\[$'\''\e'\''\[+([0-9])/}"; }; printf "%q\n" "$(declare -f f)"'; echo -n 'bash4.1 '; docker run bash:4.1 bash -c "$test" ; echo -n 'bash5.1 '; bash -c "$test"
bash4.1 $'f () \n{ \n "${text//\\[\E\\[+([0-9])/}"\n}'
bash5.1 $'f () \n{ \n "${text//\\[\'\E\'\\[+([0-9])/}"\n}'
Just use a variable.
esc=$'\e'
echo "${text//\\\[$esc\[+([0-9])/}"
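Extending the same idea to the full color code, a helper could look roughly like this. It is a sketch, not a tested fix for every 4.x release: `strip_prompt_colors` is a made-up name, and the pattern covers only the `\[\e[<digits>m\]` form from the question.

```bash
#!/usr/bin/env bash
# Sketch only: strip_prompt_colors is a hypothetical helper name.
shopt -s extglob   # required for the +([0-9]) pattern

strip_prompt_colors() {
    local esc=$'\e'   # keeps $'...' out of the expansion itself
    local s=$1
    # \\ matches a literal backslash, \[ and \] match literal brackets,
    # +([0-9]) matches one or more digits (extglob).
    s="${s//\\\[$esc\[+([0-9])m\\\]/}"
    printf '%s\n' "$s"
}

text=$'\\[\e[31m\\]hello\\[\e[0m\\]'
strip_prompt_colors "$text"
```

Holding the escape character in a variable means the parser never sees `$'...'` inside the double-quoted expansion, which is the part that older versions appear to mishandle.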
