Need to the calculate % change between consecutive values in second column of a data file using awk.
The data file looks like -
-bash-4.2$ cat coun11.csv
16-SEP-21,4218596
17-SEP-21,3747178
Since the number of rows will remain fixed, i.e., 2, I don't want to use while with read to loop over the file, instead I want to use a single line statement awk, if possible.
The more i read on it in relation with awk, it seemed possible but didn't get the desired result.
Answer should 11.2% (can be negative) i.e., rounded off by two decimal places for which we can use OFMT i believe.
% diff should always be calculated with ((oldvalue-newvalue)/oldvalue)*100, wherein oldvalue refers to first row and newvalue to second row, which is evident from dates in first column.
One of the used commands awk 'BEGIN { NF==1?a=$2:b=$2}{c=(((a-b)/a)*100)} END {OFMT="%f";print c}' coun11.csv, which didn't work
CodePudding user response:
OP's current code:
awk 'BEGIN { NF==1?a=$2:b=$2}{c=(((a-b)/a)*100)} END {OFMT="%f";print c}' coun11.csv
A few problems:
BEGIN {...}processing is performed before the file is opened/read so ...NF=0within theBEGIN {...}block so ...aandbare undefined coming out of theBEGIN {...}block (NOTE: inawkundefined variables have a default value of 0)NFrefers to the number of fields in the current line but OP is looking to process specific lines 1 & 2 so ...- I'm guessing OP wants to use the
FNR(aka line number of current record) instead - for each line read from the file
awktries to applyc=(((a-b)/a)*100), but sincea==b==0OP should be getting an error (eg,... fatal: division by zero attempted) - the calculation should only be performed once and can either be performed as part of processing the 2nd line (
FNR==2) or simply moved to theEND {...}block OFMT="%f"is not going to generate the desired output format of11.2%; while OP has mentioned 2 decimal places the expected output only shows 1 decimal place; we'll look at formats to generate both; after that OP is free to experiment withOFMTas needed
One idea for some edits (and assuming OPs calculation is correct):
awk -F',' '
FNR==1 { a=$2 }
FNR==2 { b=$2 }
END { c=(((a-b)/a)*100)
printf "%.1f%\n", c # 1 decimal place
}
' coun11.csv
# or as a single line
awk -F',' 'FNR==1{a=$2}FNR==2{b=$2} END{c=(((a-b)/a)*100); printf "%.1f%\n", c}' coun11.csv
# using ternary for the test/assignment:
awk -F',' '
{ (FNR==1) ? (a=$2) : (b=$2) } # if FNR > 2 b will be overwritten with latest field #2
END { c=(((a-b)/a)*100)
printf "%.1f%\n", c
}
' coun11.csv
# or as a single line
awk -F',' '{(FNR==1)?(a=$2):(b=$2)} END{c=(((a-b)/a)*100); printf "%.1f%\n", c}' coun11.csv
All of these generate:
11.2%
For 2 decimal places we want printf "%.2f%\n",c which generates:
11.17%
CodePudding user response:
I think your calculation is a bit off. I would do:
printf "foo %d\n" 4218596 3747178 4218596 |
awk 'NR>1 {printf "%.1f\n", ($2/d) * 100 - 100} {d=$2}'
