I want to sort a file on Unix, and for that I am using the command
sort file --field-separator=' ' --key=7,7
But the position of this field is not fixed: it can be the 6th, 7th, or 8th field in the line.
Is it possible to sort the file based on a field name instead, something like
sort file --field-separator=' ' --keyname=<my_unique_id>
The file looks something like this; I want to sort on party_id:
status_date="2000-01-31" ref_date="2021-03-31" ead_percent="0.00365316" accounting_standard="IFRS" party_default_status_cd="NOTDFLT" party_id="36113477" v_src_system_id="ABC"
status_date="2002-12-31" ref_date="2021-03-31" ead_percent="1" accounting_standard="IFRS" orig_src_system_id="GRD" party_default_status_cd="UNLIKE" party_id="36053415" v_src_system_id="XYZ"
CodePudding user response:
sort doesn't have a concept of named keys, but you can perform a Schwartzian transform to temporarily add the key as a prefix to the line, sort on the first field, then discard it.
sed 's/\(.*\)\(party_id="[^"]*"\)/\2 \1\2/' file |
sort -t ' ' -k1,1 |
cut -f2-
(The whitespace between the first two back references in the sed replacement, and the sort -t argument, must each be a literal tab; Stack Overflow renders a tab as a sequence of spaces.)
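Because the literal tab is easy to lose when copying from a web page, here is a minimal sketch of the same decorate/sort/undecorate pipeline that builds the tab with printf instead. The function name sort_by_party_id and the variable name tab are hypothetical, chosen for illustration only; everything else is the sed/sort/cut pipeline from above, and it assumes a POSIX shell.

```shell
# Hypothetical helper (name is illustrative): sort FILE by its party_id="..." token.
sort_by_party_id() {
  # Build a real tab character so nothing depends on how whitespace was rendered.
  tab=$(printf '\t')
  # Decorate: copy the party_id="..." token to the front of the line, followed by a tab.
  sed "s/\(.*\)\(party_id=\"[^\"]*\"\)/\2${tab}\1\2/" "$1" |
    # Sort on the decorated first (tab-delimited) field only.
    sort -t "$tab" -k1,1 |
    # Undecorate: drop the prefix (cut splits on tab by default).
    cut -f2-
}
```

Called as sort_by_party_id file, this compares the whole party_id="..." token as a string rather than as a number, which gives the expected order as long as the IDs all have the same number of digits.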
CodePudding user response:
Using the decorate/sort/undecorate idiom and assuming that, like in the example you provided, your quoted strings don't contain blanks, =, or ":
$ awk -F'[ ="]+' -v OFS='\t' -v keyname='party_id' '{for (i=1; i<NF; i+=2) if ($i == keyname) { print $(i+1), $0; next} }' file
36113477 status_date="2000-01-31" ref_date="2021-03-31" ead_percent="0.00365316" accounting_standard="IFRS" party_default_status_cd="NOTDFLT" party_id="36113477" v_src_system_id="ABC"
36053415 status_date="2002-12-31" ref_date="2021-03-31" ead_percent="1" accounting_standard="IFRS" orig_src_system_id="GRD" party_default_status_cd="UNLIKE" party_id="36053415" v_src_system_id="XYZ"
$ awk -F'[ ="]+' -v OFS='\t' -v keyname='party_id' '{for (i=1; i<NF; i+=2) if ($i == keyname) { print $(i+1), $0; next} }' file |
sort -k1,1n
36053415 status_date="2002-12-31" ref_date="2021-03-31" ead_percent="1" accounting_standard="IFRS" orig_src_system_id="GRD" party_default_status_cd="UNLIKE" party_id="36053415" v_src_system_id="XYZ"
36113477 status_date="2000-01-31" ref_date="2021-03-31" ead_percent="0.00365316" accounting_standard="IFRS" party_default_status_cd="NOTDFLT" party_id="36113477" v_src_system_id="ABC"
$ awk -F'[ ="]+' -v OFS='\t' -v keyname='party_id' '{for (i=1; i<NF; i+=2) if ($i == keyname) { print $(i+1), $0; next} }' file |
sort -k1,1n | cut -d$'\t' -f2-
status_date="2002-12-31" ref_date="2021-03-31" ead_percent="1" accounting_standard="IFRS" orig_src_system_id="GRD" party_default_status_cd="UNLIKE" party_id="36053415" v_src_system_id="XYZ"
status_date="2000-01-31" ref_date="2021-03-31" ead_percent="0.00365316" accounting_standard="IFRS" party_default_status_cd="NOTDFLT" party_id="36113477" v_src_system_id="ABC"
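To get something close to the --keyname option asked about in the question, the pipeline can be wrapped in a small shell function. This is only a sketch: the name sort_by_key and its argument order are made up for illustration, and it carries the same assumptions as above (no blanks, =, or " inside the quoted values, and a numeric key).

```shell
# Hypothetical helper (name and argument order are illustrative):
#   sort_by_key KEYNAME FILE
sort_by_key() {
  awk -F'[ ="]+' -v OFS='\t' -v keyname="$1" '
    {
      # Fields alternate name, value, name, value, ...
      for (i = 1; i < NF; i += 2)
        if ($i == keyname) { print $(i+1), $0; next }
      # Note: a line that lacks the key prints nothing, so it is dropped.
    }' "$2" |
    # Numeric sort on the decorated first field.
    sort -k1,1n |
    # Remove the decoration (cut splits on tab by default).
    cut -f2-
}
```

For example, sort_by_key party_id file should produce the same output as the last pipeline above. For a non-numeric key such as status_date, replacing sort -k1,1n with sort -k1,1 would be the natural change.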
