Good day,
I am new to Snakemake, so I am probably missing a lot. I have been trying to get something working with scripts. As the docs say, you can use an external Python script, so I created a very simple one (not a bioinformatics script, just a test). However, I am getting an error:
Command 'set -euo pipefail; returned non-zero exit status 1.
Here's the code of the external script:
def do_something(data_path, out_path, threads, myparam):
    fOpen = open(out_path, 'w')
    for line in data_path:
        print("LINE IS", line)
        for s in line:
            print(s)
            fOpen.write(s)

do_something(snakemake.input[0], snakemake.output[0], snakemake.threads, snakemake.config["myparam"])
Here's the Snakemake file:
rule test:
    input:
        "data/test1.txt"
    output:
        "data/test2.txt"
    script:
        "scripts/test.py"
Here's the -n output (snakemake -n -c1):
Building DAG of jobs...
Job stats:
job      count    min threads    max threads
-----  -------  -------------  -------------
test         1              1              1
total        1              1              1
[Mon Feb 7 12:29:48 2022]
rule test:
    input: data/test1.txt
    output: data/test2.txt
    jobid: 0
    resources: tmpdir=/var/folders/by/zt1z4xzs29580lnmwrw7f1880000gn/T
Job stats:
job      count    min threads    max threads
-----  -------  -------------  -------------
test         1              1              1
total        1              1              1
This was a dry-run (flag -n). The order of jobs does not reflect the order of execution.
What could be wrong?
Answer:
The formatting in the question makes it difficult to judge whether the error is caused by a typo or an indentation problem. Below is a simplified working version of your script.
Let this be the content of copy_file.py:
#!/usr/bin/env python3
def do_something(data_path, out_path):
    with open(out_path, 'w') as f, open(data_path, 'r') as g:
        for line in g:
            print("LINE IS", line)
            for s in line:
                print(s)
                f.write(s)

do_something(snakemake.input[0], snakemake.output[0])
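One difference from the script in the question is worth spelling out: there the loop runs over `data_path`, which is a string, so it walks the characters of the file name rather than the lines of the file. The sketch below illustrates this with a temporary file standing in for test1.txt. The helper names `iterate_path` and `iterate_file` are made up for the illustration, and this may not be the only reason the original rule failed.

```python
import os
import tempfile


def iterate_path(data_path):
    """Loop over the path string itself, as the question's script does."""
    return [item for item in data_path]


def iterate_file(data_path):
    """Open the file first, then loop over its lines."""
    with open(data_path, "r") as handle:
        return [line for line in handle]


# Temporary file standing in for test1.txt (hypothetical content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("A\nAB\n")
    demo_path = tmp.name

path_items = iterate_path(demo_path)  # single characters of the file name
file_items = iterate_file(demo_path)  # the lines stored in the file
os.remove(demo_path)

print(path_items[:3])
print(file_items)
```

Only the second variant ever touches the file contents, which is why the working version opens `data_path` before looping.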
Let this be the content of Snakefile:
rule test:
    input:
        "test1.txt"
    output:
        "test2.txt"
    script:
        "copy_file.py"
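Neither this Snakefile nor the one in the question sets any config values. As far as I know, `snakemake.config` behaves like a dictionary inside a script, so the `snakemake.config["myparam"]` lookup from the question would presumably raise a KeyError unless `myparam` is supplied, for example through a `configfile:` directive or `--config myparam=...` on the command line. Below is a minimal sketch of that behaviour, using a plain dict as a stand-in for `snakemake.config`; the helper `read_param` and the fallback value are hypothetical.

```python
def read_param(config, key, default=None):
    """Return config[key] if present, otherwise the fallback value."""
    return config.get(key, default)


empty_config = {}                     # stand-in: no config supplied
filled_config = {"myparam": "value"}  # stand-in: myparam supplied

try:
    empty_config["myparam"]           # direct indexing fails on a missing key
    lookup_failed = False
except KeyError:
    lookup_failed = True

print(lookup_failed)
print(read_param(empty_config, "myparam", "fallback"))
print(read_param(filled_config, "myparam", "fallback"))
```

Whether a silent fallback is preferable to a hard failure depends on the workflow; failing early is often easier to debug.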
Finally, let's assume that test1.txt contains the following:
A
AB
ABC
Now, running snakemake -s Snakefile -j 1 in the terminal will print:
Building DAG of jobs...
Using shell: /usr/local/bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1       test
    1
[Mon Feb 7 16:47:11 2022]
rule test:
    input: test1.txt
    output: test2.txt
    jobid: 0
LINE IS A
A
LINE IS AB
A
B
LINE IS ABC
A
B
C
[Mon Feb 7 16:47:11 2022]
Finished job 0.
1 of 1 steps (100%) done
Complete log: /Users/abc/wip/.snakemake/log/2022-02-07T164711.042034.snakemake.log
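To check the script's logic without launching Snakemake, the injected `snakemake` object can be replaced by a small stand-in. The sketch below assumes only that the script reads `snakemake.input[0]` and `snakemake.output[0]`. The `SimpleNamespace` object and the temporary directory are for illustration only and are not part of Snakemake; the `print` calls from the original function are left out to keep the output short.

```python
import os
import tempfile
from types import SimpleNamespace


def do_something(data_path, out_path):
    """Copy data_path to out_path character by character."""
    with open(out_path, "w") as f, open(data_path, "r") as g:
        for line in g:
            for s in line:
                f.write(s)


with tempfile.TemporaryDirectory() as workdir:
    in_path = os.path.join(workdir, "test1.txt")
    out_path = os.path.join(workdir, "test2.txt")
    with open(in_path, "w") as handle:
        handle.write("A\nAB\nABC\n")

    # Hypothetical stand-in for the object Snakemake injects into scripts.
    snakemake = SimpleNamespace(input=[in_path], output=[out_path])
    do_something(snakemake.input[0], snakemake.output[0])

    with open(out_path) as handle:
        copied = handle.read()

# True if the output file matches the input file.
print(copied == "A\nAB\nABC\n")
```

If this runs cleanly but the rule still fails, the problem more likely lies in the Snakefile or the environment than in the script itself.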
