I'm trying to match a certain field and update its data from a file delimited with multiple characters. I'm using this to create an imitation of SQL's UPDATE. This is part of a bigger project to create a mini DBMS with bash.
What I tried:
sed "s/\^\_\^/:/g" $file_path | cut -d: -f1 | grep -nw SAR | sed "s/\^\_\^/:/g" | cut -d: -f2 | sed -i "s/$match/$update/g"
My issue is I am unable to use sed -i to update only these specific columns found as you can't pipe into it.
The delimiter being used is : ^_^
Example of data file:
'EGP'^_^'Egypt'
'SAR'^_^'Europe'
'SAR'^_^'Europe'
'SAR'^_^'Europe'
'SAR'^_^'Europe'
'Europe'^_^'SAR'
'SAR'^_^'Europe'
'MYR'^_^'Malaysia'
'MYR'^_^'Malasia'
my $match can be SAR for example and $update would be USD
Expected change to data file
'EGP'^_^'Egypt'
'USD'^_^'Europe'
'USD'^_^'Europe'
'USD'^_^'Europe'
'USD'^_^'Europe'
'Europe'^_^'SAR'
'USD'^_^'Europe'
'MYR'^_^'Malaysia'
'MYR'^_^'Malasia'
If there is a different approach that is better, that is also welcome as I am fairly new to bash scripting.
CodePudding user response:
Better to use awk here:
awk -v s="'SAR'" -v q="'USD'" -F'\\^_\\^' -v OFS='^_^' '$1==s {$1=q} 1' file
'EGP'^_^'Egypt'
'USD'^_^'Europe'
'USD'^_^'Europe'
'USD'^_^'Europe'
'USD'^_^'Europe'
'Europe'^_^'SAR'
'USD'^_^'Europe'
'MYR'^_^'Malaysia'
'MYR'^_^'Malasia'
Explanation:
-v s="'SAR'"sets command line variables='SAR'-v q="'USD'"sets command line variables='USD'-F '\\^_\\^'sets input field separator as^_^. We need to escape^as that is a special regex meta character and we need to double escape it because we're using it in a string that is later converted to a regexp (field separator).-v OFS='^_^'sets output field separator to^_^$1 == scompares first field to'SAR'$1 = qsets$1to variable'USD'
CodePudding user response:
Using sed you could use $match at the start of the string ^, and change it with $update:
match="'SAR'"
update="'USD'"
sed "s/^$match\^_\^/$update^_^/" file
Output
'EGP'^_^'Egypt'
'USD'^_^'Europe'
'USD'^_^'Europe'
'USD'^_^'Europe'
'USD'^_^'Europe'
'Europe'^_^'SAR'
'USD'^_^'Europe'
'MYR'^_^'Malaysia'
'MYR'^_^'Malasia'
See a bash demo.
CodePudding user response:
With your shown samples, please try following awk program.
awk -v matchvalue="'SAR'" -v updatedvalue="'USD'" '
match($0,"^"matchvalue"\\^_\\^"){
print updatedvalue"^_^" substr($0,RSTART RLENGTH)
next
}
1
' Input_file
Explanation: Creating 2 variables named matchvalue and updatedvalue with 'SAR' and 'USD' values shown by OP in samples. Then in main program using match function to match if a line starts with 'SAR'^_^ then print new value in its place and then print rest of the line, if line doesn't start from 'SAR'^_^ then simply print it.
CodePudding user response:
Here's what I ended up doing:
escaped_delm=$(echo $curr_delim | sed 's/[^^\\]/[&]/g; s/\^/\\^/g; s/\\/\\\\/g')
awk -F"$escaped_delm" -v a_col_update=$update_field -v pick=$p_tmp_field -v a_del="$match" -v a_ins="$insert" -v OFS="$curr_delim" '$pick==a_del {$a_col_update=a_ins} 1' $t_path > tmp && mv tmp $t_path
The biggest issue I faced is that I am passing my delimiter ^_^ as a variable to awk. My script assumes that I don't know what the variable is and that it is passed to it at the beginning of the script. This causes an issue, just like the first answer suggested the -F option for awk thinks that ^ is a regex argument.
This is where the sed command I used comes in super handy! It sanitizes the delimiter by escaping any regex meta characters. Would like to thank Ed Morton for the great answer to this question where I got the sed command from.
Hope someone else finds this useful!
My awk variables are:
curr_delim="^_^"a_col_update=<column number to be updated>pick=<column used to pick the record/row>a_del=<field to be updated from picked record/row>a_ins=<new value for the picked field>OFS="$curr_delim"
