I am trying to extract a word or words from a string using bash. I tried to follow https://stackoverflow.com/a/27534223/13816738 but was only partially successful. I have a string that looks like one of the following:
s = abc-rabb-123 or s = abc-xyt-ppt-abt-004-456
What I would like is to get the middle word/words, such as rabb or xyt-ppt-abt-004.
Any ideas?
Actual code, scenario 1:
s='extract-zskqxcrbdj-1823'
[[ "$s" =~ (-[^[:space:]^-]+) ]];
echo "${BASH_REMATCH[1]}"
Output: -zskqxcrbdj
I want: zskqxcrbdj
Scenario 2:
s='abc-xyt-ppt-abt-004-456'
[[ "$s" =~ (-[^[:space:]^-]+) ]];
echo "${BASH_REMATCH[1]}"
Output: -xyt
I want: xyt-ppt-abt-004
CodePudding user response:
If the sole purpose is to strip off the first and last - delimited fields, one idea would be to use bash parameter expansion/substitution; this in turn eliminates the need to spawn any subprocesses (eg, for sed/cut/awk):
for s in 'abc-rabb-123' 'abc-xyt-ppt-abt-004-456' 'extract-zskqxcrbdj-1823'
do
    echo "############ $s"
    x="${s#*-}"     # strip the first field and its trailing -
    x="${x%-*}"     # strip the last field and its leading -
    echo "${x}"
done
This generates:
############ abc-rabb-123
rabb
############ abc-xyt-ppt-abt-004-456
xyt-ppt-abt-004
############ extract-zskqxcrbdj-1823
zskqxcrbdj
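The choice of # and % (rather than ## and %%) matters here: the single forms remove the shortest matching prefix/suffix, while the doubled forms remove the longest. A minimal sketch of the difference, using one of the sample strings (the variable names are only illustrative and not part of the answer):

```shell
#!/usr/bin/env bash
s='abc-xyt-ppt-abt-004-456'

shortest_prefix="${s#*-}"    # removes 'abc-'                 -> xyt-ppt-abt-004-456
longest_prefix="${s##*-}"    # removes through the last '-'   -> 456
shortest_suffix="${s%-*}"    # removes '-456'                 -> abc-xyt-ppt-abt-004
longest_suffix="${s%%-*}"    # removes from the first '-' on  -> abc

printf '%s\n' "$shortest_prefix" "$longest_prefix" "$shortest_suffix" "$longest_suffix"
```

Chaining the two shortest-match forms, as the loop above does, is what leaves only the middle fields.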
One approach using a regex and the BASH_REMATCH[] array:
regex='^[^-]*-(.*)-[^-]*$'
for s in 'abc-rabb-123' 'abc-xyt-ppt-abt-004-456' 'extract-zskqxcrbdj-1823'
do
    echo "############ $s"
    if [[ "${s}" =~ $regex ]]
    then
        x="${BASH_REMATCH[1]}"
        echo "${x}"
    fi
done
Some comments on regex:
- I've opted to anchor the beginning/ending of the regex with ^ and $
- ^[^-]* : from the start of the string, match 0 or more characters that are not a -
- - : a literal -
- (.*) : (1st capture group) all characters
- - : a literal -
- [^-]*$ : match 0 or more characters that are not a -, up to the end of the string
- if there's a match then BASH_REMATCH[1] should contain the contents of the 1st capture group
- NOTE: add typeset -p BASH_REMATCH to see the entire contents of the array
This generates:
############ abc-rabb-123
rabb
############ abc-xyt-ppt-abt-004-456
xyt-ppt-abt-004
############ extract-zskqxcrbdj-1823
zskqxcrbdj
NOTE: OP can decide if additional checks need to be added for a string that contains fewer than three - delimited fields
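As a rough illustration of the check mentioned in the note above: the regex only matches when the string contains at least two - characters, so a failed match can double as the "fewer than three fields" test. The parameter expansion approach has no such built-in check (for example, abc-123 would come back as 123). The sketch below wraps the regex in a function; middle_fields is a hypothetical helper name, not something from the answer.

```shell
#!/usr/bin/env bash

# middle_fields (hypothetical name): print everything between the first
# and last '-' delimited fields; return 1 if there are fewer than three fields.
middle_fields() {
    local s=$1
    local regex='^[^-]*-(.*)-[^-]*$'

    if [[ $s =~ $regex ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1    # fewer than two '-' characters, so no middle field
    fi
}

for s in 'abc-rabb-123' 'abc-123' 'abc'
do
    if x=$(middle_fields "$s"); then
        echo "$s -> $x"
    else
        echo "$s -> fewer than three fields"
    fi
done
```

Note that a string such as a--b still matches, with an empty middle field; whether that should count as valid is a separate decision.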
CodePudding user response:
This can be done with the sed utility:
echo "abc-xyt-ppt-abt-004-456" | sed 's/[^-]*-\(.*\)-.*/\1/'
Output:
xyt-ppt-abt-004
CodePudding user response:
echo "abc-xyt-ppt-abt-004-456" | awk -F'-' '{{for (i=2;i<NF;i++) {d=i<NF-1?"-":"";a=a$i""d}};print a}'
CodePudding user response:
You can use the cut command:
echo abc-xyt-ppt-abt-004-456 | cut -d'-' -f2-5
Result: xyt-ppt-abt-004
echo abc-rabb-123 | cut -d'-' -f2
Result: rabb
In these cases -d sets the delimiter/separator, which is -, and -f selects a field, a list of fields, or a range. You can also do something like:
echo abc-xyt-ppt-abt-004-456 | cut -d'-' -f2,3,5
Result: xyt-ppt-004
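The field ranges above (-f2-5, -f2) are hard-coded for each input. One possible way to make the cut approach work for any number of fields is to count the - characters first and build the range from that. This is only a sketch: middle_with_cut is a hypothetical helper name, and the counting step relies on bash parameter expansion rather than anything cut provides.

```shell
#!/usr/bin/env bash

# middle_with_cut (hypothetical name): select fields 2..(N-1) with cut,
# where N is the number of '-' delimited fields in the string.
middle_with_cut() {
    local s=$1
    local hyphens=${s//[^-]/}             # delete every character that is not '-'
    local nfields=$(( ${#hyphens} + 1 ))  # N fields are separated by N-1 hyphens

    (( nfields >= 3 )) || return 1        # nothing between first and last field
    printf '%s\n' "$s" | cut -d'-' -f"2-$(( nfields - 1 ))"
}

middle_with_cut 'abc-xyt-ppt-abt-004-456'
middle_with_cut 'abc-rabb-123'
```

This still spawns a cut subprocess per call, so for the simple "strip both ends" case the parameter expansion approach is likely cheaper.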
CodePudding user response:
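If you just want to strip both ends ({m,n,g}awk is shorthand for mawk, nawk, or gawk, not literal shell syntax; NF=NF forces awk to rebuild the record with the empty OFS):
{m,n,g}awk NF=NF OFS= FS='^[^-]*-|-[^-]*$'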
xyt-ppt-abt-004
