I'm writing a bash script to get the first value from a line of comma-separated strings. I have a file that looks like this:
name
things: "water bottle","40","new phone cover",10
place
I just need to return the value in the first pair of double quotes:
water bottle
The value in the first double quotes can be one word or two words. That is, water bottle can sometimes be replaced with pen.
I tried:
awk '/:/ {print $2}'
But this just gives
"water
I wanted to split the line on commas, but there's a colon (:) after things, so I'm not sure how to separate it.
How do I get the value in the first double quotes?
EDIT:
SOLUTION: I used the code below since I particularly wanted to use awk:
awk '/:/' test.txt | cut -d\" -f2
CodePudding user response:
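A solution using the cut utility could be the following. The -s option skips lines that contain no double quote (such as name and place), which cut would otherwise print unchanged.
cut -s -d\" -f2 infile > outfile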
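To see how cut treats lines that have no delimiter, here is a minimal sketch run against the sample from the question. The temporary file and the variable names without_s and with_s are only for illustration.

```shell
# Sample input from the question, written to a temporary file.
infile=$(mktemp)
printf '%s\n' 'name' 'things: "water bottle","40","new phone cover",10' 'place' > "$infile"

# Without -s, cut passes through lines that contain no delimiter at all.
without_s=$(cut -d\" -f2 "$infile")
# With -s, only lines that contain a double quote are printed.
with_s=$(cut -s -d\" -f2 "$infile")

printf 'without -s:\n%s\n\nwith -s:\n%s\n' "$without_s" "$with_s"
rm -f "$infile"
```

Without -s the output also contains name and place; with -s only water bottle is left.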
CodePudding user response:
Using GNU awk, you could use match() with a capture group (the third, array argument is a gawk extension) and negated character classes so that the match does not cross a comma, as that is the field delimiter.
awk 'match($0, /^[^",:]*:[^",]*"([^"]*)"/, a) {print a[1]}' file
Output
water bottle
The pattern matches
^ - Start of string
[^",:]*: - Optionally match any character except " and , and :, then match :
[^",]* - Optionally match any character except " and ,
"([^"]*)" - Capture in group 1 the value between double quotes
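If gawk is not available, the same pattern can probably be used with POSIX awk, where match() only sets RSTART and RLENGTH. The following is a sketch under that assumption, not part of the original answer. The variable names input, result and s are illustrative.

```shell
# Sample input from the question.
input='name
things: "water bottle","40","new phone cover",10
place'

# POSIX awk: match() sets RSTART and RLENGTH instead of filling an array.
result=$(printf '%s\n' "$input" | awk '
  match($0, /^[^",:]*:[^",]*"[^"]*"/) {
    s = substr($0, RSTART, RLENGTH)   # the matched text, up to the closing quote
    sub(/^[^"]*"/, "", s)             # drop everything up to the opening quote
    sub(/"$/, "", s)                  # drop the closing quote
    print s
  }')
printf '%s\n' "$result"
# => water bottle
```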
If the value is always between double quotes, a shorter option is to set the field separator to " and check whether field 1 contains a colon. Note that this also prints water bottle when there is only a leading double quote and no closing one.
awk -F'"' '$1 ~ /:/ {print $2}' file
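The unclosed-quote case can be checked with a small sketch. The helper name first_quoted and the NF >= 3 guard are additions for illustration, not part of the answer above. Requiring at least three fields is one possible way to insist on a closing quote.

```shell
# first_quoted is a hypothetical helper wrapping the one-liner above.
first_quoted() {
  awk -F'"' '$1 ~ /:/ {print $2}'
}

# Normal case: the value is enclosed in a pair of double quotes.
closed=$(printf '%s\n' 'name' 'things: "water bottle","40",10' 'place' | first_quoted)
# No closing quote: $2 is simply everything after the first quote.
unclosed=$(printf '%s\n' 'things: "water bottle' | first_quoted)
# With a closing quote there are at least three fields, so NF >= 3 rejects the unclosed line.
strict=$(printf '%s\n' 'things: "water bottle' | awk -F'"' '$1 ~ /:/ && NF >= 3 {print $2}')

printf 'closed=[%s] unclosed=[%s] strict=[%s]\n' "$closed" "$unclosed" "$strict"
# => closed=[water bottle] unclosed=[water bottle] strict=[]
```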
CodePudding user response:
You can use sed:
sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' file > outfile
A quick demo:
#!/bin/bash
s='name
things: "water bottle","40","new phone cover",10
place'
sed -n 's/^[^"]*"\([^"]*\)".*/\1/p' <<< "$s"
# => water bottle
The command means
-n - the option suppresses the default line output
^[^"]*"\([^"]*\)".* - a POSIX BRE regex pattern that matches
  ^ - start of string
  [^"]* - zero or more chars other than "
  " - a " char
  \([^"]*\) - Group 1 (\1 refers to this value): any zero or more chars other than "
  ".* - a " char and the rest of the string
\1 - replaces the match with Group 1 value
p - only prints the result of a successful substitution.
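To illustrate why -n and the p flag are used together, here is a small sketch comparing the command with and without them on the same sample. The variable names all_lines and only_matches are illustrative.

```shell
s='name
things: "water bottle","40","new phone cover",10
place'

# Without -n and p: lines where the pattern does not match pass through unchanged.
all_lines=$(printf '%s\n' "$s" | sed 's/^[^"]*"\([^"]*\)".*/\1/')
# With -n and p: only lines where the substitution succeeded are printed.
only_matches=$(printf '%s\n' "$s" | sed -n 's/^[^"]*"\([^"]*\)".*/\1/p')

printf 'without -n/p:\n%s\n\nwith -n/p:\n%s\n' "$all_lines" "$only_matches"
```

The first form still prints name and place, so the two flags are what limit the output to water bottle.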
