I need your assistance to find the list of unmatched records in Employee.txt, using the following examples on AIX 6.x.
Employee.txt
1|Sam|Smith|Seatle
2|Barry|Jones|Seatle
3|Garry|Brown|Houston
4|George|Bla|LA
5|Celine|Wood|Atlanta
6|Jody|Ford|Chicago
Car.txt
100|red|1
110|green|9
120|yellow|2
130|yellow|6
140|red|8
150|white|0
bash-4.3$ awk -F"|" 'NR==FNR { empcar[$1]=$0; next } { if (empcar[$3]) print empcar[$3] "|" $1 "|" $2 > "match.txt"; else print $0 > "no_match.txt" }' Employee.txt Car.txt
110|green|9
140|red|8
150|white|0
match.txt
1|Sam|Smith|Seatle|100|red
2|Barry|Jones|Seatle|120|yellow
6|Jody|Ford|Chicago|130|yellow
no_match.txt
110|green|9
140|red|8
150|white|0
I also tried the following, which produced the same list as in no_match.txt:
bash-4.3$ awk -F"|" 'NR==FNR { empcar[$1]=$0; next } !($3 in empcar)' Employee.txt Car.txt
However, I want the no_match.txt to be as follows:
3|Garry|Brown|Houston
4|George|Bla|LA
5|Celine|Wood|Atlanta
In other words, print each row of Employee.txt whose employee no. does not appear in Car.txt. I couldn’t work out how to reference those unmatched records in the else branch.
I also encountered a lot of unexplained duplicates in match.txt when running this against my private, confidential data, which cannot be disclosed.
Many thanks, George
CodePudding user response:
print the row in Employee.txt when does not have employee no. in Car.txt.
You may use this solution:
awk -F"|" '
NR == FNR {
empcar[$3]
next
}
{
print > ($1 in empcar ? "match.txt" : "no_match.txt")
}' Car.txt Employee.txt
cat match.txt
1|Sam|Smith|Seatle
2|Barry|Jones|Seatle
6|Jody|Ford|Chicago
cat no_match.txt
3|Garry|Brown|Houston
4|George|Bla|LA
5|Celine|Wood|Atlanta
Note that we process Car.txt as the first file and store all IDs from its 3rd field as keys in the array empcar. Later, while processing Employee.txt, we redirect each row to match.txt or no_match.txt depending on whether $1 from the latter file exists in the associative array empcar.
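If only the unmatched employees are needed, the same two-pass idea can probably be reduced to a single filter. The following is a minimal sketch rather than a tested AIX recipe: it recreates the sample files from the question in a scratch directory, and the names workdir, seen and unmatched.txt are made up for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory so the sample files do not clobber real data.
workdir="${TMPDIR:-/tmp}/awk_unmatched_demo.$$"
mkdir -p "$workdir" && cd "$workdir" || exit 1

# Sample data copied from the question.
cat > Employee.txt <<'EOF'
1|Sam|Smith|Seatle
2|Barry|Jones|Seatle
3|Garry|Brown|Houston
4|George|Bla|LA
5|Celine|Wood|Atlanta
6|Jody|Ford|Chicago
EOF

cat > Car.txt <<'EOF'
100|red|1
110|green|9
120|yellow|2
130|yellow|6
140|red|8
150|white|0
EOF

# Pass 1 (Car.txt): record every employee no. found in field 3 as an array key.
# Pass 2 (Employee.txt): print only the rows whose field 1 was never recorded.
awk -F'|' 'NR == FNR { seen[$3]; next } !($1 in seen)' Car.txt Employee.txt > unmatched.txt

cat unmatched.txt
```

With the sample data this should list employees 3, 4 and 5. Because the second pass walks Employee.txt rather than Car.txt, each employee row is written at most once even if that employee no. appears on several car rows, which may or may not be the cause of the duplicates mentioned in the question.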
