Filtering lines using bash

I have a text file containing personal data from people.

BEGIN:VCARD
FN:Rene van der Harten
N:van der Harten;Rene;J.;Sir;R.D.O.N.
SORT-STRING:Harten
END:VCARD
BEGIN:VCARD
FN:Robert Pau Shou Chang
N:Pau;Shou Chang;Robert
SORT-STRING:Pau
END:VCARD
BEGIN:VCARD
FN:Osamu Koura
N:Koura;Osamu
SORT-STRING:Koura
END:VCARD

I want to sort only the last names alphabetically. I've tried

grep N: <filename>

to filter the lines that begin with N:, but it doesn't work.

CodePudding user response：

You need to specify that you want N: at the start of the line:

grep ^N: <filename>

CodePudding user response：

The problem is that grep matches the pattern anywhere in a line, not just at the beginning, so N: also matches lines such as FN:Osamu Koura and BEGIN:VCARD. You can use the ^ anchor to pin the pattern to the start of the line:

$ grep '^N:' <filename>
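
To see the difference for yourself, here is a minimal sketch. It assumes a throwaway file holding one of the cards from the question; the variable names (sample, unanchored, anchored) are only for illustration.

```shell
# Throwaway sample file with one card copied from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
BEGIN:VCARD
FN:Osamu Koura
N:Koura;Osamu
SORT-STRING:Koura
END:VCARD
EOF

# Unanchored: counts every line that contains "N:" anywhere
# (BEGIN:VCARD, FN:..., and N:...).
unanchored=$(grep -c 'N:' "$sample")

# Anchored: counts only lines that start with "N:".
anchored=$(grep -c '^N:' "$sample")

echo "unanchored=$unanchored anchored=$anchored"
# prints: unanchored=3 anchored=1

rm -f "$sample"
```

Quoting the pattern, as above, also keeps the shell from interpreting any of its characters.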

If you only want to extract the text between the first : and the first ; then you might want to opt for AWK:

$ awk -F'[;:]' ' $1 == "N" { print $2 }' <filename> 
van der Harten
Pau
Koura

The way it works is that AWK splits each input line into fields, and the -F option specifies that the fields are separated by : or ;. On lines that begin with N:, the first field ($1) is therefore N and the second ($2) is the last name.
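
To get from there to the sorted list the question asks for, the AWK output can probably just be piped into sort. This is a sketch rather than part of the original answer: the sample file is rebuilt from the question, and LC_ALL=C with -f is my own assumption to make the ordering predictable and case-insensitive.

```shell
# Throwaway sample file with the three cards from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
BEGIN:VCARD
FN:Rene van der Harten
N:van der Harten;Rene;J.;Sir;R.D.O.N.
SORT-STRING:Harten
END:VCARD
BEGIN:VCARD
FN:Robert Pau Shou Chang
N:Pau;Shou Chang;Robert
SORT-STRING:Pau
END:VCARD
BEGIN:VCARD
FN:Osamu Koura
N:Koura;Osamu
SORT-STRING:Koura
END:VCARD
EOF

# Print field 2 (the last name) of every N: line, then sort
# the names alphabetically, ignoring case.
sorted=$(awk -F'[;:]' '$1 == "N" { print $2 }' "$sample" | LC_ALL=C sort -f)

printf '%s\n' "$sorted"
# prints:
# Koura
# Pau
# van der Harten

rm -f "$sample"
```

Note that this sorts on the family-name field as written, so "van der Harten" lands under "v" even though its SORT-STRING is Harten.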

CodePudding user response：

You need to do two things, as you can see:

grep "^N:" test.txt | sort -t ":" -k 2

First, you keep only the lines that start with N: (pattern ^N:):

^N : means that the line must start with a capital 'N'
:  : means that this 'N' must be immediately followed by a colon.
     The colon is not special to "grep", so it needs no backslash

Second, you sort on the second column:

-t ":" : means that the colon is the field separator
-k 2   : means that the sort key starts at the second field and runs to the end of the line
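
Put together, the two steps can be tried out like this. It is only a sketch: the file is a temporary copy of the N: and FN: lines from the question, and LC_ALL=C plus -f are my additions (not part of the command above) so that the order does not depend on the locale or on letter case.

```shell
# Throwaway sample file with a few lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
FN:Rene van der Harten
N:van der Harten;Rene;J.;Sir;R.D.O.N.
FN:Robert Pau Shou Chang
N:Pau;Shou Chang;Robert
FN:Osamu Koura
N:Koura;Osamu
EOF

# Step 1: keep only lines starting with "N:".
# Step 2: sort them on everything after the first colon.
result=$(grep '^N:' "$sample" | LC_ALL=C sort -f -t ':' -k 2)

printf '%s\n' "$result"
# prints:
# N:Koura;Osamu
# N:Pau;Shou Chang;Robert
# N:van der Harten;Rene;J.;Sir;R.D.O.N.

rm -f "$sample"
```

Unlike the AWK approach, this keeps the whole N: line in the output rather than just the last name.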