How to sort a file based on key name instead of its position in unix?

Time:01-12

I want to sort a file in Unix, and for that I am using the command

sort file --field-separator=' ' --key=7,7

But the position of this field is not fixed: it can be the 6th, 7th, or 8th field in the line.

Is it possible to sort the file based on a field name instead, something like

sort file --field-separator=' ' --keyname=<my_unique_id>

The file looks something like this; I want to sort on party_id:

status_date="2000-01-31" ref_date="2021-03-31" ead_percent="0.00365316" accounting_standard="IFRS" party_default_status_cd="NOTDFLT" party_id="36113477" v_src_system_id="ABC"
status_date="2002-12-31" ref_date="2021-03-31" ead_percent="1" accounting_standard="IFRS" orig_src_system_id="GRD" party_default_status_cd="UNLIKE" party_id="36053415" v_src_system_id="XYZ"

CodePudding user response:

sort has no concept of named keys, but you can perform a Schwartzian transform: temporarily add the key as a prefix to each line, sort on that first field, then discard the prefix.

sed 's/\(.*\)\(party_id="[^"]*"\)/\2    \1\2/' file |
sort -t '   ' -k1,1 |
cut -f2-

(The whitespace between the first two back-references in the sed replacement, and in the sort -t argument, is a literal tab, which Stack Overflow renders as a sequence of spaces.)
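
One way to sidestep the literal-tab problem is to keep the tab in a shell variable, so nothing depends on how the command is rendered or pasted. This is a minimal sketch, assuming a POSIX shell; the function name sort_by_party_id is made up for illustration, and the pattern would also match any longer key name that ends in party_id:

```shell
# Hypothetical helper: sort FILE by its party_id="..." pair, wherever it sits on the line.
sort_by_party_id() {
    tab=$(printf '\t')    # a real tab character, held in a variable
    # Decorate: copy the party_id="..." pair to the front, then a tab, then the whole line (&).
    sed "s/.*\(party_id=\"[^\"]*\"\).*/\1${tab}&/" "$1" |
    sort -t "$tab" -k1,1 |    # sort on the decoration only (string comparison)
    cut -f2-                  # undecorate: cut splits on tab by default
}
```

Called as sort_by_party_id file. Because the comparison is a string comparison, it orders the IDs correctly only while they all have the same number of digits, as they do in the sample data.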

CodePudding user response:

Using the decorate/sort/undecorate idiom and assuming that, like in the example you provided, your quoted strings don't contain blanks, =, or ":

$ awk -F'[ ="]+' -v OFS='\t' -v keyname='party_id' '{for (i=1; i<NF; i+=2) if ($i == keyname) { print $(i+1), $0; next} }' file
36113477        status_date="2000-01-31" ref_date="2021-03-31" ead_percent="0.00365316" accounting_standard="IFRS" party_default_status_cd="NOTDFLT" party_id="36113477" v_src_system_id="ABC"
36053415        status_date="2002-12-31" ref_date="2021-03-31" ead_percent="1" accounting_standard="IFRS" orig_src_system_id="GRD" party_default_status_cd="UNLIKE" party_id="36053415" v_src_system_id="XYZ"

$ awk -F'[ ="]+' -v OFS='\t' -v keyname='party_id' '{for (i=1; i<NF; i+=2) if ($i == keyname) { print $(i+1), $0; next} }' file |
    sort -k1,1n
36053415        status_date="2002-12-31" ref_date="2021-03-31" ead_percent="1" accounting_standard="IFRS" orig_src_system_id="GRD" party_default_status_cd="UNLIKE" party_id="36053415" v_src_system_id="XYZ"
36113477        status_date="2000-01-31" ref_date="2021-03-31" ead_percent="0.00365316" accounting_standard="IFRS" party_default_status_cd="NOTDFLT" party_id="36113477" v_src_system_id="ABC"

$ awk -F'[ ="]+' -v OFS='\t' -v keyname='party_id' '{for (i=1; i<NF; i+=2) if ($i == keyname) { print $(i+1), $0; next} }' file |
    sort -k1,1n | cut -d$'\t' -f2-
status_date="2002-12-31" ref_date="2021-03-31" ead_percent="1" accounting_standard="IFRS" orig_src_system_id="GRD" party_default_status_cd="UNLIKE" party_id="36053415" v_src_system_id="XYZ"
status_date="2000-01-31" ref_date="2021-03-31" ead_percent="0.00365316" accounting_standard="IFRS" party_default_status_cd="NOTDFLT" party_id="36113477" v_src_system_id="ABC"
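
If the quoted values might contain blanks or =, the field-splitting approach above no longer lines up names and values. A possible variation is to locate the key with match() instead of splitting the line. This is only a sketch under stated assumptions: the function name sort_by_key is hypothetical, values are assumed to be numeric and to contain no embedded double quotes, and lines that lack the key get an empty sort value and so come first.

```shell
# Hypothetical helper: sort_by_key KEYNAME FILE
sort_by_key() {
    awk -v key="$1" '
        {
            val = ""
            # Look for key="value" at the start of the line or after a blank.
            if (match($0, "(^| )" key "=\"[^\"]*\"")) {
                val = substr($0, RSTART, RLENGTH)
                sub(/^[^"]*"/, "", val)   # drop everything up to the opening quote
                sub(/"$/, "", val)        # drop the closing quote
            }
            print val "\t" $0             # decorate: value, tab, original line
        }' "$2" |
    sort -t "$(printf '\t')" -k1,1n |     # numeric sort on the decoration
    cut -f2-                              # undecorate
}
```

Called as sort_by_key party_id file. Passing the key name as an argument keeps the command reusable for other fields, at the cost of treating every key as numeric.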