Snakemake
ATLAS handles command-line execution of Snakemake; however, a user can still opt to run Snakemake outside of the ATLAS command-line interface.
When executing a command from atlas, the first line of output will include the snakemake command. For example, if we are getting started and need to download the databases, we execute:
atlas download --jobs 1 --out-dir metaomics/references
Here is the abbreviated output from the above command:
[2017-06-30 11:48 INFO] Executing: snakemake -s /people/brow015/anaconda3/lib/python3.5/site-packages/atlas/Snakefile -d /pic/projects/mint/metaomics -p -j 1 --nolock --rerun-incomplete --config db_dir='/pic/projects/mint/metaomics/references' workflow=download --
Provided cores: 1
Rules claiming more threads will be scaled down.
Job counts:
count jobs
1 all
6 transfer_files
7
rule transfer_files:
output: /pic/projects/mint/metaomics/references/silva_rfam_all_rRNAs.fa
jobid: 1
wildcards: filename=silva_rfam_all_rRNAs.fa
curl 'https://zenodo.org/record/804435/files/silva_rfam_all_rRNAs.fa' -s > /pic/projects/mint/metaomics/references/silva_rfam_all_rRNAs.fa
...
6 of 7 steps (86%) done
localrule all:
input: /pic/projects/mint/metaomics/references/adapters.fa, /pic/projects/mint/metaomics/references/phiX174_virus.fa, /pic/projects/mint/metaomics/references/silva_rfam_all_rRNAs.fa, /pic/projects/mint/metaomics/references/refseq.tree, /pic/projects/mint/metaomics/references/refseq.dmnd, /pic/projects/mint/metaomics/references/refseq.db
jobid: 0
Finished job 0.
7 of 7 steps (100%) done
All databases have downloaded and validated successfully.
When generating your configuration file, use '--database-dir /pic/projects/mint/metaomics/references'
We can see that the workflow being utilized is:
/people/brow015/anaconda3/lib/python3.5/site-packages/atlas/Snakefile
Later or more complex use cases can therefore call snakemake directly and point --snakefile at this file in their local installation.
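As a minimal sketch of that step, the Snakefile path can be pulled out of the logged "Executing:" line with plain shell tools. The variable names here are illustrative, and the final dry-run call (-n) is printed rather than executed; a real call would likely also need the --config values from the logged command.

```shell
# First line of ATLAS output, copied from the example above.
log_line="[2017-06-30 11:48 INFO] Executing: snakemake -s /people/brow015/anaconda3/lib/python3.5/site-packages/atlas/Snakefile -d /pic/projects/mint/metaomics -p -j 1 --nolock --rerun-incomplete --config db_dir='/pic/projects/mint/metaomics/references' workflow=download --"

# Drop everything up to and including "Executing: " to keep only the snakemake call.
snakemake_cmd="${log_line#*Executing: }"

# Extract the argument that follows -s, i.e. the Snakefile path.
snakefile="$(printf '%s\n' "$snakemake_cmd" | sed -n 's/.* -s \([^ ]*\).*/\1/p')"

echo "Snakefile: $snakefile"

# Print (not run) a dry-run invocation against the same Snakefile and working directory.
echo "snakemake --snakefile $snakefile --directory /pic/projects/mint/metaomics -n"
```

Printing the command first makes it easy to inspect before running it by hand.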
Provenance
Extra arguments on the command line are passed directly into the snakemake call, so even within atlas run we can do things like:
atlas run --jobs 24 -w test-dir --summary
This results in the call of:
snakemake --snakefile /path/to/atlas/Snakefile \
--directory WD \
--printshellcmds --jobs 24 --rerun-incomplete \
--configfile 'WD/config.yaml' \
--nolock --use-conda all \
--summary
For each output file, the summary lists the rule that created it, its creation date, and other information relevant to the file's creation. Snakemake's --detailed-summary adds columns for the input files and the shell command that was used.
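The pass-through behavior can be illustrated with a small shell sketch. The function build_snakemake_call is hypothetical and is not ATLAS's actual code; it only mirrors the shape of the call shown above, and it assumes the extra arguments are appended at the end of the command.

```shell
# Hypothetical helper: builds a snakemake call shaped like the one ATLAS prints.
# Usage: build_snakemake_call JOBS WORKDIR [EXTRA_ARGS...]
build_snakemake_call() {
    bsc_jobs="$1"
    bsc_workdir="$2"
    shift 2
    # Fixed part of the call; /path/to/atlas/Snakefile is a placeholder.
    printf 'snakemake --snakefile /path/to/atlas/Snakefile --directory %s --printshellcmds --jobs %s --rerun-incomplete --configfile %s/config.yaml --nolock --use-conda all' \
        "$bsc_workdir" "$bsc_jobs" "$bsc_workdir"
    # Any remaining arguments are appended unchanged (assumed position).
    for bsc_arg in "$@"; do
        printf ' %s' "$bsc_arg"
    done
    printf '\n'
}

# Request the detailed summary instead of the plain one.
build_snakemake_call 24 test-dir --detailed-summary
```

Because the extra arguments are forwarded untouched, swapping --summary for --detailed-summary requires no other change to the atlas run command.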