I have a file and I only want to find lines that contain "here". Each of these lines has multiple string and integer values (see the example below). I only want the first integer of each line that matches the pattern.
I have a solution that uses a bash script, but is there a simpler method I am missing? I was hoping something like grep -w here -Eo [0-9] file would work. However, when I try that, grep treats everything that comes after "here" as a file name.
STEP 1 STAGE 1 here other info
foo
bar
STEP 2 STAGE 1 here other info
more
foo
bar
STEP 3 STAGE 1 here other info
For this file, the desired output would be:
1
2
3
CodePudding user response:
Another variant, using GNU grep with -P for Perl-compatible regular expressions (if supported):
grep -oP "^\D*\K\d+(?=.*\bhere\b)" file
The pattern matches:
^  Start of string
\D*  Match optional non-digits
\K  Forget what is matched so far
\d+  Match 1 or more digits
(?=.*\bhere\b)  Positive lookahead, assert the word here to the right
Output
1
2
3
CodePudding user response:
This simpler awk should work for you:
awk '/ here / {sub(/^[^0-9]+/, ""); print $1+0}' file
1
2
3
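For anyone wondering what each piece of that one-liner does, here is a commented sketch run against the sample from the question. The file name sample.txt and the variable awk_result are made up for illustration only.

```shell
# Recreate the sample input from the question (sample.txt is a hypothetical name).
printf '%s\n' \
  'STEP 1 STAGE 1 here other info' \
  'foo' \
  'bar' \
  'STEP 2 STAGE 1 here other info' \
  'more' \
  'foo' \
  'bar' \
  'STEP 3 STAGE 1 here other info' > sample.txt

awk_result=$(awk '
/ here / {                  # act only on lines containing " here "
    sub(/^[^0-9]+/, "")     # strip everything before the first digit
    print $1 + 0            # the first field is now the integer; + 0 coerces it to a number
}' sample.txt)

printf '%s\n' "$awk_result"
```

The + 0 is there because awk's numeric conversion keeps only the leading numeric part of a string, so a first field such as 12: would still print as 12.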
CodePudding user response:
With GNU awk you could try the following awk code. Written and tested with your shown samples.
awk '
match($0,/(^|[[:space:]]+)([0-9]+)[[:space:]]+.*here /,arr){
  print arr[2]
}
' Input_file
Explanation: This uses the match function of GNU awk with the regex (^|[[:space:]]+)([0-9]+)[[:space:]]+.*here followed by a space. The regex both checks that the line contains the here keyword and creates 2 capturing groups, whose values are stored in an array named arr at indexes 1 and 2 respectively. If the line matches, the 2nd element of that array is printed, which is the required value (the first integer of the line).
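The three-argument form of match is a GNU awk extension. If only a POSIX awk is available, a rough equivalent can be sketched with the two-argument match and the RSTART/RLENGTH variables. Note that this is a simplification, not the same regex: it takes the first run of digits on the line, whether or not it is delimited by whitespace. The names sample.txt and posix_result are hypothetical.

```shell
# Recreate the sample input from the question (sample.txt is a hypothetical name).
printf '%s\n' \
  'STEP 1 STAGE 1 here other info' \
  'foo' \
  'bar' \
  'STEP 2 STAGE 1 here other info' \
  'more' \
  'foo' \
  'bar' \
  'STEP 3 STAGE 1 here other info' > sample.txt

posix_result=$(awk '
# Select lines containing " here ", then locate the first run of digits.
/ here / && match($0, /[0-9]+/) {
    # match() sets RSTART (start position) and RLENGTH (length) of the match.
    print substr($0, RSTART, RLENGTH)
}' sample.txt)

printf '%s\n' "$posix_result"
```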
CodePudding user response:
grep is not the right command for this. I'd use sed:
sed -n '/ here /s/[^0-9]*\([0-9]*\).*/\1/p' file
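In case the sed command looks cryptic, here is the same command broken down and run on the sample from the question. The names sample.txt and sed_result are only for illustration.

```shell
# Recreate the sample input from the question (sample.txt is a hypothetical name).
printf '%s\n' \
  'STEP 1 STAGE 1 here other info' \
  'foo' \
  'bar' \
  'STEP 2 STAGE 1 here other info' \
  'more' \
  'foo' \
  'bar' \
  'STEP 3 STAGE 1 here other info' > sample.txt

# -n           : print nothing unless explicitly asked to
# / here /     : address, apply the substitution only to lines containing " here "
# [^0-9]*      : leading non-digits (dropped)
# \([0-9]*\)   : the first run of digits, captured as group 1
# .*           : the rest of the line (dropped)
# \1 and p     : replace the whole line with group 1, then print it
sed_result=$(sed -n '/ here /s/[^0-9]*\([0-9]*\).*/\1/p' sample.txt)

printf '%s\n' "$sed_result"
```

Because [0-9]* can match an empty string, a matching line that contains no digits at all would print as an empty line rather than being skipped.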
