I am looking for suggestions to overcome a problem I am facing. For context, I am developing a tool for monitoring our in-house HPC clusters. Since we use the Slurm workload scheduler, I am relying on the commands it provides.
I am running the following command:
squeue -h -t R -O Partition,NumCPUs,tres-per-node
It reports, for each running job, the partition, the number of CPUs allocated, and resources such as GPUs. However, our partition names are long, so the partition and CPU-count columns run together and look like a single value.
Output:
gpu-2080ti-interacti8 gpu:1
gpu-2080ti-interacti8 gpu:1
gpu-2080ti-interacti8 gpu:1
gpu-2080ti-interacti8 gpu:1
gpu-2080ti-interacti8 gpu:1
gpu-2080ti-interacti8 gpu:1
gpu-2080ti-interacti8 gpu:1
gpu-2080ti-interacti8 gpu:1
gpu-2080ti-long 32 gpu:4
gpu-2080ti-long 16 gpu:2
gpu-v100 4 gpu:1
If I pipe the command through awk, as in
squeue -h -t R -O Partition,NumCPUs,tres-per-node | awk "{print \$1,\$2,\$3}"
the result is wrong, because gpu-2080ti-interacti8 is treated as one value when it should have been gpu-2080ti-interacti 8. I have already looked at -o/--format, but that does not work for me because tres-per-node is not among the % options that squeue provides. I am looking for a way to separate those values.
CodePudding user response:
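The -O/--Format option lets you specify a column width by appending :<width> to a field name. Without one, each field gets 20 characters, which is why the long partition name is cut off and runs straight into the CPU count. So you can try
squeue -h -t R -O Partition:30,NumCPUs,tres-per-node
Replace the 30 with a value larger than the longest partition name, so that at least one space still separates the columns.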
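If you would rather not hard-code the width, it can be derived from the partition names themselves. The following is only a sketch: max_width is a hypothetical helper (not part of Slurm) that prints the length of the longest input line plus one, and the commented usage assumes that sinfo -h -o "%R" prints one bare partition name per line on your installation.

```shell
# Hypothetical helper: read names on stdin, one per line, and print
# the length of the longest one plus 1 (room for a separating space).
max_width() {
    awk '{ if (length($0) > max) max = length($0) } END { print max + 1 }'
}

# Assumed usage on a cluster where sinfo and squeue are available:
#   width=$(sinfo -h -o "%R" | sort -u | max_width)
#   squeue -h -t R -O "Partition:${width},NumCPUs,tres-per-node" \
#       | awk '{print $1, $2, $3}'
```

With the partition column wide enough, the awk step from the question should see three whitespace-separated fields per line.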
