I have this string in a file and want to extract just the relative link:
<a href="/FreeCAD/FreeCAD-Bundle/releases/download/weekly-builds/FreeCAD_weekly-builds-28909-2022-05-20-conda-Linux-x86_64-py39.AppImage" rel="nofollow" data-skip-pjax>
This works in https://regexr.com/6m4vg :
/FreeCAD/[^]*AppImage
But it returns nothing in grep:
grep -E '/FreeCAD/\[^]*AppImage' somefile
How can I make it work? Thanks.
Edit: source file:
wget https://github.com/FreeCAD/FreeCAD-Bundle/releases/tag/weekly-builds
Desired output:
/FreeCAD/FreeCAD-Bundle/releases/download/weekly-builds/FreeCAD_weekly-builds-28909-2022-05-20-conda-Linux-x86_64-py39.AppImage
CodePudding user response:
You need to use [^"]* instead of [^]*:
grep -o '/FreeCAD/[^"]*AppImage' somefile
/FreeCAD/[^]*AppImage works online because regexr tests the pattern with the ECMAScript (JavaScript) engine, whereas grep -E uses the POSIX ERE flavor. In POSIX, a negated bracket expression cannot be empty: a ] placed immediately after [^ is treated as a literal ] inside the brackets, not as the closing bracket.
In the ECMAScript flavor, [^] matches any character, including line breaks. Since grep works line by line, you can replace [^]* with .* here.
However, since the text you want to match cannot contain a " character, [^"]* is the more appropriate pattern: it matches zero or more characters other than ", so the match cannot run past the closing quote of the href attribute.
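The difference between the two patterns only shows when a line holds more than one link. Here is a minimal sketch of that case, assuming a grep that supports -o (GNU and BSD grep do). The two-link input line and the short a.AppImage / b.AppImage names are made up for illustration and are not taken from the real page:

```shell
# Hypothetical test line with two links on one line (not from the real page).
sample=$(mktemp)
printf '%s\n' '<a href="/FreeCAD/a.AppImage"> <a href="/FreeCAD/b.AppImage">' > "$sample"

# -o prints only the matched text. [^"]* cannot cross a double quote,
# so each link is matched separately.
strict=$(grep -o '/FreeCAD/[^"]*AppImage' "$sample")

# .* is greedy: it runs from the first /FreeCAD/ to the last AppImage on the line.
greedy=$(grep -o '/FreeCAD/.*AppImage' "$sample")

printf 'strict:\n%s\n\ngreedy:\n%s\n' "$strict" "$greedy"
rm -f "$sample"
```

With [^"]* each link comes out on its own line, while .* returns one match that spans both links along with the markup between them. On the single-link line from the question, both patterns print the same result.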
