AWK passing loop variables

Time:01-28

So I have this file containing timestamps.

cat file
2022/01/27-00:47:05;2022/01/27-00:47:05;
2022/01/27-00:47:06;2022/01/27-00:47:06;
2022/01/27-00:48:59;2022/01/27-00:48:59;
2022/01/27-01:38:06;2022/01/27-01:38:06;
2022/01/27-01:45:17;2022/01/27-01:45:17;
2022/01/27-01:47:46;2022/01/27-01:47:47;
<bunch of lines>
2022/01/27-15:00:01;2022/01/27-15:00:01;
2022/01/27-15:00:05;2022/01/27-15:00:05;
2022/01/27-15:00:06;2022/01/27-15:00:06;

And I was trying to create a for loop to isolate all those lines whose first field is 2022/01/27-hour:.

So far, this is what I've come up with, but it's not working:

for var in {00..23}
do
awk -F ';' -v var="$var" '$1 ~2022/01/27-var"' file > $var.txt
done

I’m getting no output at all.

What I'm trying to accomplish is getting 24 files, each containing the timestamps for one hour:

00.txt: all lines whose first field matches 2022/01/27-00

01.txt: all lines whose first field matches 2022/01/27-01

…/…

23.txt: all lines whose first field matches 2022/01/27-23

I'm clearly missing something, but I don't know what, because this other thing seems to work just fine.

for kk in {00..23}
do
echo | awk -v kk="$kk" '{print kk}'
done
00
01
02
03
04
05
06
07
08
09
10
11
12
13
14
15
16
17
18
19
20
21
22
23

I must be passing the variable the wrong way.

Any help would be greatly appreciated.

CodePudding user response:

You could call match() explicitly instead of using ~, building the pattern by concatenating a string with the variable, e.g.:

awk 'match($1,"2022/01/27-" var )' var=00 FS=\; input
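For context on why the loop in the question produces nothing: awk does not expand a variable inside a regex literal, and the pattern there is not a valid literal or string in the first place (note the unbalanced double quote). Concatenating a string with the variable and using the result on the right-hand side of ~ gives a dynamic regex instead. Below is a minimal sketch of the loop rewritten that way. It assumes a hypothetical sample.txt standing in for the real file, loops over only three hours to stay short, and adds a ^ anchor and trailing : that are not in the original.

```shell
# Hypothetical sample input; the real data lives in `file`.
printf '%s\n' \
  '2022/01/27-00:47:05;2022/01/27-00:47:05;' \
  '2022/01/27-01:38:06;2022/01/27-01:38:06;' \
  '2022/01/27-15:00:01;2022/01/27-15:00:01;' > sample.txt

# Use {00..23} here, as in the question, to cover the whole day.
for var in 00 01 15
do
  # The string pieces and var are concatenated first, then the
  # resulting string is used as a dynamic regex against field 1.
  awk -F ';' -v var="$var" '$1 ~ ("^2022/01/27-" var ":")' sample.txt > "$var.txt"
done
```

Each NN.txt then holds only the lines whose first field starts with 2022/01/27-NN:. Hours with no matching lines still get an (empty) file, because the shell redirection creates it.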

CodePudding user response:

Assumptions:

  • all lines start with a datetime stamp like YYYY/MM/DD-HH:
  • datetime stamps may cover multiple days but output files are still based simply on HH (i.e., each HH.txt file could contain data from different days)

One awk idea that eliminates the need for the bash loop:

awk -F'[-:]' '{print $0 > $2".txt"}' file

NOTES:

  • -F'[-:]' - define two input field delimiters (- and :)
  • use field #2 as the prefix for the name of the output file
  • we're talking about a max of 24 output files so there should be no issues of maxing out the number of open file descriptors
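To see why field #2 is the hour, it can help to print the first few fields awk produces with that delimiter set. This is only an illustrative sketch using one line from the sample input; the variable name out is made up for the example.

```shell
# Split one sample line on - and : and show the first three fields.
out=$(echo '2022/01/27-00:47:05;2022/01/27-00:47:05;' |
  awk -F'[-:]' '{ for (i = 1; i <= 3; i++) printf "field %d = %s\n", i, $i }')
echo "$out"
```

Field 1 is the date (2022/01/27), field 2 is the hour (00) and field 3 is the minute (47), so $2".txt" evaluates to 00.txt for this line.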

For the given sample input (sans the line <bunch of lines>) this generates:

$ for fname in {00..23}.txt; do [[ -f "${fname}" ]] && echo "########### $fname" && cat $fname; done
########### 00.txt
2022/01/27-00:47:05;2022/01/27-00:47:05;
2022/01/27-00:47:06;2022/01/27-00:47:06;
2022/01/27-00:48:59;2022/01/27-00:48:59;
########### 01.txt
2022/01/27-01:38:06;2022/01/27-01:38:06;
2022/01/27-01:45:17;2022/01/27-01:45:17;
2022/01/27-01:47:46;2022/01/27-01:47:47;
########### 15.txt
2022/01/27-15:00:01;2022/01/27-15:00:01;
2022/01/27-15:00:05;2022/01/27-15:00:05;
2022/01/27-15:00:06;2022/01/27-15:00:06;
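One portability caveat, offered as a sketch and not as a correction: an unparenthesized concatenation after > is ambiguous in awk, and some implementations may parse $2".txt" differently from the intended file name. Wrapping the expression in parentheses removes the ambiguity. The sample.txt file below is a hypothetical stand-in for the real input.

```shell
# Hypothetical sample input; the real data lives in `file`.
printf '%s\n' \
  '2022/01/27-00:47:05;2022/01/27-00:47:05;' \
  '2022/01/27-00:47:06;2022/01/27-00:47:06;' \
  '2022/01/27-15:00:01;2022/01/27-15:00:01;' > sample.txt

# Same one-liner, with the output file name parenthesized so that
# every awk builds the name from $2 and ".txt" before redirecting.
awk -F'[-:]' '{ print $0 > ($2 ".txt") }' sample.txt
```

With this input, 00.txt receives two lines and 15.txt one; no file is created for hours that never appear.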