I am trying to use GNU Parallel to run a script that has several boolean flags, and I would like to enable and disable them across runs.
For example, given a script named "sample.py" with two options, "--seed", which takes an integer, and "--something", which is a boolean flag that takes no value, I would like to construct a call to parallel that produces the following calls:
python sample.py --seed 1111
python sample.py --seed 1111 --something
python sample.py --seed 2222
python sample.py --seed 2222 --something
python sample.py --seed 3333
python sample.py --seed 3333 --something
I've tried things like
parallel python sample.py --seed {1} {2} ::: 1111 2222 3333 ::: "" --something
parallel python sample.py --seed {1} {2} ::: 1111 2222 3333 ::: '' --something
parallel python sample.py --seed {1} {2} ::: 1111 2222 3333 ::: \ --something
but haven't had any luck. Is what I'm trying to achieve possible with GNU parallel? I can modify my script to take explicit TRUE/FALSE values for the flag but I'd prefer to avoid that if possible.
Answer:
> bash$ cat sample.py
#!/usr/bin/python3
import sys
import time
time.sleep(0.2)
print(sys.argv)
> bash$ cat split.sh
#!/bin/sh
exec $*
> bash$ for seed in 1111 2222 3333; do \
printf "%s\0" "$seed" "$seed --something"; \
done \
| xargs -0 parallel \
sh split.sh python3 sample.py --
['sample.py', '1111', '--something']
['sample.py', '1111']
['sample.py', '2222']
['sample.py', '2222', '--something']
['sample.py', '3333', '--something']
['sample.py', '3333']
Explanation:
The first part, the loop, just creates the list of arguments:
1111
1111 --something
2222
2222 --something
3333
3333 --something
xargs reads that list from stdin and passes each entry to parallel as a command-line argument.
split.sh splits its arguments on whitespace. Here we're assuming that the arguments to your script don't contain whitespace.
So we call sh split.sh, which executes the command after splitting an argument like "2222 --something" into 2222 and --something.
Those arguments are passed to python3 sample.py, so parallel runs a shell command like python3 sample.py 2222 --something.
If we didn't use split.sh and called python directly (xargs -0 parallel python3 sample.py --), then xargs would pass 2222 --something as a single argument and parallel would have run something like python3 sample.py '2222 --something'.
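The word splitting that split.sh relies on can be seen in isolation. This is a minimal sketch that does not need parallel; count_args is a hypothetical helper introduced here only to show how many arguments a command receives.

```shell
#!/bin/sh
# count_args is a hypothetical helper, not part of the answer above:
# it prints the number of arguments it was called with.
count_args() { echo "$#"; }

arg='2222 --something'

# Quoted expansion keeps the string as one argument, which is what
# python3 would receive if it were called directly.
count_args "$arg"

# Unquoted expansion is split on whitespace, like `exec $*` in split.sh.
count_args $arg
```

The first call reports 1 argument and the second reports 2. Unquoted expansion also performs filename globbing, so this approach only suits arguments that contain neither whitespace nor wildcard characters.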
Answer:
You are so close.
GNU Parallel quotes replacement strings. That usually makes sense, because it is then safe to give it filenames like:
My brother's 12" records, all with ***.csv
which could otherwise cause no end of trouble.
However, to be consistent, GNU Parallel also quotes the empty string, and that is what is biting you here.
--dry-run shows what is going on:
$ parallel --dry-run python sample.py --seed {1} {2} ::: 1111 2222 3333 ::: '' --something
python sample.py --seed 1111 ''
python sample.py --seed 1111 --something
python sample.py --seed 2222 ''
python sample.py --seed 2222 --something
python sample.py --seed 3333 ''
python sample.py --seed 3333 --something
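The trailing '' in that output is not cosmetic: the shell passes it to the program as a real, empty argument. A minimal sketch in plain shell, where count_args is a hypothetical helper used only for illustration:

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it got.
count_args() { echo "$#"; }

# The command line GNU Parallel generates for the empty value:
# the quoted empty string counts as an argument.
count_args --seed 1111 ''

# The command line the question asks for: nothing after the seed.
count_args --seed 1111
```

The first call reports 3 arguments and the second reports 2, so a script that accepts no positional arguments would likely reject the first form.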
So how can you avoid that?
You can tell the shell to evaluate all strings:
parallel eval python sample.py --seed {1} {2} ::: 1111 2222 3333 ::: '' --something
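But that might be a blunt hammer when you need a scalpel, because eval makes the shell parse the whole command line a second time, which undoes the quoting of every replacement string and not just the empty one. From version 20190722 you can also use {=uq=}. uq() is a Perl function that tells GNU Parallel not to quote this replacement string: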
$ parallel-20190722 --dry-run python sample.py --seed {1} {=2 uq=} ::: 1111 2222 3333 ::: '' --something
python sample.py --seed 1111
python sample.py --seed 1111 --something
python sample.py --seed 2222
python sample.py --seed 2222 --something
python sample.py --seed 3333
python sample.py --seed 3333 --something
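The trade-off between the two approaches can be sketched in plain shell, without GNU Parallel. Here show_args is a hypothetical helper and 'My file.csv' is an invented value; the quoted arguments stand in for what GNU Parallel would generate.

```shell
#!/bin/sh
# show_args is a hypothetical helper: it prints each argument in brackets.
show_args() {
    for a in "$@"; do
        printf '[%s]' "$a"
    done
    echo
}

# Quoted, as GNU Parallel emits by default: the filename stays whole,
# but the empty string is passed along as an argument too.
show_args 'My file.csv' ''

# With eval the line is parsed a second time: the empty argument
# disappears, but the filename is split into two words as well.
eval show_args 'My file.csv' ''
```

The first call prints [My file.csv][] and the second prints [My][file.csv]. That is the case for {=2 uq=}: it drops the quoting for the second input source only and leaves {1} protected.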
