How do I output only a capture group with sed

Time:01-15

I have an input file


Werkzeug==2.0.2 # https://github.com/pallets/werkzeug
ipdb==0.13.9  # https://github.com/gotcha/ipdb
psycopg2==2.9.1  # https://github.com/psycopg/psycopg2
watchgod==0.7  # https://github.com/samuelcolvin/watchgod

# Testing
# ------------------------------------------------------------------------------
mypy==0.910  # https://github.com/python/mypy
django-stubs==1.8.0  # https://github.com/typeddjango/django-stubs
pytest==6.2.5  # https://github.com/pytest-dev/pytest
pytest-sugar==0.9.4  # https://github.com/Frozenball/pytest-sugar
djangorestframework-stubs==1.4.0  # https://github.com/typeddjango/djangorestframework-stubs

# Documentation
# ------------------------------------------------------------------------------
sphinx==4.2.0  # https://github.com/sphinx-doc/sphinx
sphinx-autobuild==2021.3.14 # https://github.com/GaretJax/sphinx-autobuild

# Code quality
# ------------------------------------------------------------------------------
flake8==3.9.2  # https://github.com/PyCQA/flake8
flake8-isort==4.0.0  # https://github.com/gforcada/flake8-isort
coverage==6.0.2  # https://github.com/nedbat/coveragepy
black==21.9b0  # https://github.com/psf/black
pylint-django==2.4.4  # https://github.com/PyCQA/pylint-django
pylint-celery==0.3  # https://github.com/PyCQA/pylint-celery
pre-commit==2.15.0  # https://github.com/pre-commit/pre-commit

# Django
# ------------------------------------------------------------------------------
factory-boy==3.2.0  # https://github.com/FactoryBoy/factory_boy

django-debug-toolbar==3.2.2  # https://github.com/jazzband/django-debug-toolbar
django-extensions==3.1.3  # https://github.com/django-extensions/django-extensions
django-coverage-plugin==2.0.1  # https://github.com/nedbat/django_coverage_plugin
pytest-django==4.4.0  # https://github.com/pytest-dev/pytest-django

and I am trying to extract the parts before the # for every line beginning with pytest using this command

sed -nE "s/(^pytest.+)#/\1/p" ./requirements/local.txt

Expected output

pytest==6.2.5  
pytest-sugar==0.9.4  
pytest-django==4.4.0  

Actual output

pytest==6.2.5   https://github.com/pytest-dev/pytest
pytest-sugar==0.9.4   https://github.com/Frozenball/pytest-sugar
pytest-django==4.4.0   https://github.com/pytest-dev/pytest-django

How can I get the expected output?

These refs have not helped solve this particular problem

CodePudding user response:

Using sed:

sed -nE 's/^(pytest[^=]*=[^[:blank:]]*).*/\1/p' file

pytest==6.2.5
pytest-sugar==0.9.4
pytest-django==4.4.0

However a grep -o solution would be even simpler:

grep -o '^pytest[^=]*=[^[:blank:]]*' file

pytest==6.2.5
pytest-sugar==0.9.4
pytest-django==4.4.0

Explanation:

  • ^pytest: Match pytest at the start
  • [^=]*: Match 0 or more of any character except =
  • =: Match a =
  • [^[:blank:]]*: Match 0 or more non-blank characters (anything other than a space or tab)
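To see the pattern at work, here is a minimal sketch that runs both commands against a three-line sample held in a shell variable. The sample contents and the variable names (sample, sed_out, grep_out) are made up for illustration; the last sample line has no comment at all, to show that the pattern does not depend on the # being present.

```shell
# Hypothetical sample: a matching line, a non-matching line,
# and a matching line without a trailing comment.
sample='pytest==6.2.5  # https://github.com/pytest-dev/pytest
mypy==0.910  # https://github.com/python/mypy
pytest-django==4.4.0'

# sed: replace the whole line with the captured "name==version" part,
# and print only lines where that substitution happened.
sed_out=$(printf '%s\n' "$sample" | sed -nE 's/^(pytest[^=]*=[^[:blank:]]*).*/\1/p')

# grep -o: print only the matched text, so no capture group is needed.
grep_out=$(printf '%s\n' "$sample" | grep -o '^pytest[^=]*=[^[:blank:]]*')

printf '%s\n' "$sed_out"
```

Both variables should end up holding the same two lines, pytest==6.2.5 and pytest-django==4.4.0.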

CodePudding user response:

1st solution: With awk, you could try the following, which uses awk's match function. It was written and tested in GNU awk but should work in any awk. The regex ^pytest[^ ]* matches from a leading pytest up to the first space, and the substr function prints the matched text.

awk 'match($0,/^pytest[^ ]*/){print substr($0,RSTART,RLENGTH)}' Input_file
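As a rough illustration of what match leaves behind, the sketch below prints RSTART and RLENGTH for a single line and then uses them with substr. The variable names (line, positions, matched) are only for this example.

```shell
line='pytest-sugar==0.9.4  # https://github.com/Frozenball/pytest-sugar'

# match() sets RSTART (1-based start of the match) and RLENGTH (its length).
positions=$(printf '%s\n' "$line" | awk 'match($0, /^pytest[^ ]*/) { print RSTART, RLENGTH }')

# substr() then cuts exactly that span out of the line.
matched=$(printf '%s\n' "$line" | awk 'match($0, /^pytest[^ ]*/) { print substr($0, RSTART, RLENGTH) }')

printf '%s\n%s\n' "$positions" "$matched"
```

Because match returns 0 when nothing matches, using it as the pattern also filters out lines that do not start with pytest.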

2nd solution: With GNU awk, you could instead use the RS variable as a regex record separator and print the matched separator text from RT:

awk -v RS='(^|\n)pytest[^ ]*' 'RT{sub(/^\n*/,"",RT);print RT}' Input_file

CodePudding user response:

Your pattern does not match anything after the #, so that part of the line is left in place. Adding .* after the # fixes it:

$ sed -nE "s/(^pytest.+)#.*/\1/p" ./requirements/local.txt
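The reason is that s/// replaces only the text the pattern matched; anything outside the match is kept as-is. A small sketch on one line, with the patterns written with .+ and illustrative variable names (line, without_tail, with_tail):

```shell
line='pytest==6.2.5  # https://github.com/pytest-dev/pytest'

# Pattern stops at '#': only "pytest==6.2.5  #" is replaced by the group,
# so the URL after the '#' survives.
without_tail=$(printf '%s\n' "$line" | sed -nE 's/(^pytest.+)#/\1/p')

# Pattern also consumes everything after '#', so the replacement
# covers the whole line and only the group remains.
with_tail=$(printf '%s\n' "$line" | sed -nE 's/(^pytest.+)#.*/\1/p')

printf '[%s]\n[%s]\n' "$without_tail" "$with_tail"
```

Note that the group still includes the blanks before the #, so the result keeps its two trailing spaces, as in the expected output in the question.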

CodePudding user response:

As an alternative using awk, you might also set the field separator to a # preceded by optional blanks, and print the first field if the line starts with pytest:

awk -F"[[:blank:]]*#" '/^pytest/ {print $1}' ./requirements/local.txt

Output

pytest==6.2.5
pytest-sugar==0.9.4
pytest-django==4.4.0

If the # is not always present, you could also make the match more specific to match the number, and then print the first field:

awk '/^pytest[^[:blank:]]*==[0-9]+(\.[0-9]+)*/ {print $1}' file

CodePudding user response:

Using sed

$ sed -n '/^pytest/s/#.*//p' input_file
pytest==6.2.5
pytest-sugar==0.9.4
pytest-django==4.4.0

CodePudding user response:

A sed one-liner would be:

sed -e '/^pytest/!d' -e 's/[[:blank:]]*#.*//' file

The first expression deletes lines which don't begin with pytest. The second one deletes the comment portion (including blanks before the #), if any.
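The "if any" matters when a line has no comment. The sketch below compares this command with a -n plus s///p variant on a made-up sample; the pytest-cov line and the variable names (sample, with_p, with_d) are hypothetical and not part of the original file.

```shell
# Hypothetical sample; the last line has no '#' comment.
sample='pytest==6.2.5  # https://github.com/pytest-dev/pytest
black==21.9b0  # https://github.com/psf/black
pytest-cov==3.0.0'

# With -n and s///p, a line is printed only when the substitution
# actually happened, so the comment-less line is dropped.
with_p=$(printf '%s\n' "$sample" | sed -n '/^pytest/s/#.*//p')

# Delete non-pytest lines, then strip an optional comment;
# every remaining line is printed by default.
with_d=$(printf '%s\n' "$sample" | sed -e '/^pytest/!d' -e 's/[[:blank:]]*#.*//')

printf '[%s]\n' "$with_p" "$with_d"
```

The s///p variant also leaves the blanks before the # in place, while the two-expression form removes them.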
