General Slurm and unix format suggestion

I am looking for suggestions to solve a problem I am facing. For context, I am developing a tool to monitor our in-house HPC clusters. Since we use Slurm for workload scheduling, I rely on the commands it provides.

I am running the following command: squeue -h -t R -O Partition,NumCPUs,tres-per-node. For each running job it reports the partition, the number of CPUs allocated, and resources such as GPUs. However, our partition names are long, so the partition column is truncated and runs straight into the next column, and the two are read as one value.

Output:

gpu-2080ti-interacti8                   gpu:1               
gpu-2080ti-interacti8                   gpu:1               
gpu-2080ti-interacti8                   gpu:1               
gpu-2080ti-interacti8                   gpu:1               
gpu-2080ti-interacti8                   gpu:1               
gpu-2080ti-interacti8                   gpu:1               
gpu-2080ti-interacti8                   gpu:1               
gpu-2080ti-interacti8                   gpu:1
gpu-2080ti-long     32                  gpu:4               
gpu-2080ti-long     16                  gpu:2               
gpu-v100            4                   gpu:1

Piping the above command into awk, as in squeue -h -t R -O Partition,NumCPUs,tres-per-node | awk "{print \$1,\$2,\$3}", is problematic because gpu-2080ti-interacti8 is treated as one value when it should be two: gpu-2080ti-interacti and 8. I have already looked at -o/--format, but that does not work for me because tres-per-node is not among the % options that squeue provides. I am looking for a solution that can separate those values.

User response:

The -O, --Format option lets you specify a column width by appending :width to a field name. So you can try

squeue -h -t R -O Partition:30,NumCPUs,tres-per-node

Replace 30 with a value at least one greater than the length of your longest partition name, so that a space always separates it from the next column.
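
If you would rather not hard-code the width, something along these lines might work. It is only a sketch: partition_width and split_columns are helper names made up for this example, and it assumes the NumCPUs and tres-per-node columns keep the 20-character width seen in your output. Cutting the line at fixed offsets instead of splitting on whitespace should also keep the columns aligned if one of them happens to be empty.

```shell
# Print the longest input line's length plus one, so that at least one
# space always separates the partition column from the next column.
partition_width() {
    awk '{ if (length($0) > max) max = length($0) } END { print max + 1 }'
}

# Split lines of the form "Partition:W,NumCPUs,tres-per-node" at fixed
# offsets. $1 is W; the other two columns are assumed to keep the
# 20-character width seen in the question's output.
split_columns() {
    awk -v w="$1" '{
        partition = substr($0, 1, w)
        cpus      = substr($0, w + 1, 20)
        tres      = substr($0, w + 21, 20)
        # Remove the padding squeue adds to each column.
        gsub(/ +$/, "", partition)
        gsub(/ +$/, "", cpus)
        gsub(/ +$/, "", tres)
        # Tab-separated so that an empty column stays visible downstream.
        print partition "\t" cpus "\t" tres
    }'
}

# Hypothetical usage on a cluster (assumes sinfo's %R prints bare partition names):
#   width=$(sinfo -h -o '%R' | sort -u | partition_width)
#   squeue -h -t R -O "Partition:${width},NumCPUs,tres-per-node" | split_columns "$width"
```

The usage lines are left as comments because I have not run them against your cluster; check the sinfo format option against your Slurm version before relying on it.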