snakemake package¶
Subpackages¶
- snakemake.remote package
- Submodules
- snakemake.remote.EGA module
- snakemake.remote.FTP module
- snakemake.remote.GS module
- snakemake.remote.HTTP module
- snakemake.remote.NCBI module
- snakemake.remote.S3 module
- snakemake.remote.S3Mocked module
- snakemake.remote.SFTP module
- snakemake.remote.XRootD module
- snakemake.remote.dropbox module
- snakemake.remote.gfal module
- snakemake.remote.gridftp module
- snakemake.remote.iRODS module
- snakemake.remote.webdav module
- Module contents
- snakemake.report package
Submodules¶
snakemake.benchmark module¶
snakemake.checkpoints module¶
snakemake.common module¶
- class snakemake.common.Mode[source]¶ Bases: object
  Enum for the execution mode of Snakemake. This controls mode-dependent behavior, e.g. of the logger.
  - cluster = 2¶
  - default = 0¶
  - subprocess = 1¶
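As a sketch of how these constants might be consumed (the constants mirror the documented values; `log_prefix` is a hypothetical helper, not part of Snakemake):

```python
class Mode:
    # Mirrors the constants documented above for snakemake.common.Mode.
    default = 0
    subprocess = 1
    cluster = 2

def log_prefix(mode):
    # Hypothetical helper illustrating mode-dependent logger behavior.
    prefixes = {Mode.default: "", Mode.subprocess: "[sub] ", Mode.cluster: "[cluster] "}
    return prefixes[mode]
```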
snakemake.conda module¶
snakemake.cwl module¶
- snakemake.cwl.cwl(path, basedir, input, output, params, wildcards, threads, resources, log, config, rulename, use_singularity, bench_record, jobid)[source]¶ Load CWL from the given basedir + path and execute it.
snakemake.dag module¶
- class snakemake.dag.Batch(rulename: str, idx: int, batches: int)[source]¶ Bases: object
  Definition of a batch for calculating only a partial DAG.
  - get_batch(items: list)[source]¶ Return the defined batch of the given items. Items are usually input files.
  - is_final¶
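A hypothetical sketch of how such batching could partition input items (the exact rounding behavior of `Batch` may differ; this only illustrates the idea of contiguous 1-based slices):

```python
def get_batch_sketch(items, idx, batches):
    # Split items into `batches` roughly equal, contiguous slices and
    # return slice `idx` (1-based, as given on the command line).
    if not 1 <= idx <= batches:
        raise ValueError("batch index out of range")
    per_batch = len(items) // batches
    start = (idx - 1) * per_batch
    # The final batch absorbs any remainder.
    end = len(items) if idx == batches else start + per_batch
    return items[start:end]
```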
- class snakemake.dag.DAG(workflow, rules=None, dryrun=False, targetfiles=None, targetrules=None, forceall=False, forcerules=None, forcefiles=None, priorityfiles=None, priorityrules=None, untilfiles=None, untilrules=None, omitfiles=None, omitrules=None, ignore_ambiguity=False, force_incomplete=False, ignore_incomplete=False, notemp=False, keep_remote_local=False, batch=None)[source]¶ Bases: object
  Directed acyclic graph of jobs.
  - archive(path)[source]¶ Archive the workflow such that it can be re-run on a different system. Archiving includes git-versioned files (i.e. Snakefiles, config files, …), ancestral input files and conda environments.
  - bfs(direction, *jobs, stop=<function DAG.<lambda>>)[source]¶ Perform a breadth-first traversal of the DAG.
  - check_and_touch_output(job, wait=3, ignore_missing_output=False, no_touch=False, force_stay_on_remote=False)[source]¶ Raise an exception if output files of the job are missing.
  - check_directory_outputs()[source]¶ Check that no output file is contained in a directory output of the same or another rule.
  - check_incomplete()[source]¶ Check if any output files are incomplete. This is done by looking up markers in the persistence module.
  - check_periodic_wildcards(job)[source]¶ Raise an exception if a wildcard of the given job appears to be periodic, indicating a cyclic dependency.
  - checkpoint_jobs¶
  - collect_potential_dependencies(job)[source]¶ Collect all potential dependencies of a job. These might contain ambiguities. The keys of the returned dict represent the files to be considered.
  - dfs(direction, *jobs, stop=<function DAG.<lambda>>, post=True)[source]¶ Perform a depth-first traversal of the DAG.
  - downstream_of_omitfrom()[source]¶ Return the jobs downstream of --omit-from rules or files, including those rules or files themselves.
  - dynamic(job)[source]¶ Return whether a job is dynamic (i.e. it is only a placeholder for jobs that are created after the job with dynamic output has finished).
  - dynamic_output_jobs¶ Iterate over all jobs with dynamic output files.
  - filegraph_dot(node2rule=<function DAG.<lambda>>, node2style=<function DAG.<lambda>>, node2label=<function DAG.<lambda>>)[source]¶
  - finish(job, update_dynamic=True)[source]¶ Finish a given job (e.g. remove it from the ready jobs and mark depending jobs as ready).
  - finished_jobs¶ Iterate over all jobs that have finished.
  - handle_pipes()[source]¶ Use pipes to determine job groups, checking that every pipe has exactly one consumer.
  - handle_remote(job, upload=True)[source]¶ Remove local files if they are no longer needed, and upload to remote storage.
  - incomplete_external_jobid(job)[source]¶ Return the external jobid of the job if it is marked as incomplete. Returns None if the job is not incomplete, if no external jobid has been registered, or if force_incomplete is True.
  - incomplete_files¶ Return the list of incomplete files.
  - jobs¶ All jobs in the DAG.
  - level_bfs(direction, *jobs, stop=<function DAG.<lambda>>)[source]¶ Perform a breadth-first traversal of the DAG, also yielding the level together with each job.
  - local_needrun_jobs¶ Iterate over all jobs that need to be run and are marked as local.
  - needrun_jobs¶ Jobs that need to be executed.
  - new_job(rule, targetfile=None, format_wildcards=None)[source]¶ Create a new job for the given rule and (optional) targetfile. This will reuse existing jobs with the same wildcards.
  - new_wildcards(job)[source]¶ Return wildcards that are newly introduced in this job, compared to its ancestors.
  - newversion_files¶ Return the list of files where the current version is newer than the recorded version.
  - noneedrun_finished(job)[source]¶ Return whether a given job is finished or was not required to run at all.
  - postprocess()[source]¶ Postprocess the DAG. This has to be invoked after any change to the DAG topology.
  - ready_jobs¶ Jobs that are ready to execute.
  - specialize_rule(rule, newrule)[source]¶ Specialize the given rule by inserting newrule into the DAG.
  - temp_size(job)[source]¶ Return the total size of the job's temporary input files, or 0 if there are none.
  - unshadow_output(job, only_log=False)[source]¶ Move files from the shadow directory to their real output paths.
  - update(jobs, file=None, visited=None, skip_until_dynamic=False, progress=False)[source]¶ Update the DAG by adding the given jobs and their dependencies.
  - update_(job, visited=None, skip_until_dynamic=False, progress=False)[source]¶ Update the DAG by adding the given job and its dependencies.
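The bfs/dfs/level_bfs methods above traverse job dependencies in a chosen direction. A generic sketch of such a breadth-first traversal over an adjacency mapping (the names here are illustrative, not the DAG API):

```python
from collections import deque

def bfs_sketch(neighbors, *start_jobs):
    # neighbors: mapping job -> iterable of adjacent jobs in the chosen
    # direction (dependencies or dependents). Yields each reachable job once.
    seen = set(start_jobs)
    queue = deque(start_jobs)
    while queue:
        job = queue.popleft()
        yield job
        for nxt in neighbors.get(job, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
```

A depth-first variant would swap the queue for a stack; `level_bfs` would additionally yield the current distance from the start set.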
snakemake.decorators module¶
snakemake.exceptions module¶
- exception snakemake.exceptions.AmbiguousRuleException(filename, job_a, job_b, lineno=None, snakefile=None)[source]¶
- exception snakemake.exceptions.ChildIOException(parent=None, child=None, wildcards=None, lineno=None, snakefile=None, rule=None)[source]¶
- exception snakemake.exceptions.CreateCondaEnvironmentException(*args, lineno=None, snakefile=None, rule=None)[source]¶
- exception snakemake.exceptions.CreateRuleException(message=None, include=None, lineno=None, snakefile=None, rule=None)[source]¶
- exception snakemake.exceptions.IOException(prefix, rule, files, include=None, lineno=None, snakefile=None)[source]¶
- exception snakemake.exceptions.ImproperOutputException(rule, files, include=None, lineno=None, snakefile=None)[source]¶
- exception snakemake.exceptions.IncompleteCheckpointException(rule, targetfile)[source]¶ Bases: Exception
- exception snakemake.exceptions.InputFunctionException(msg, wildcards=None, lineno=None, snakefile=None, rule=None)[source]¶
- exception snakemake.exceptions.MissingInputException(rule, files, include=None, lineno=None, snakefile=None)[source]¶
- exception snakemake.exceptions.MissingOutputException(message=None, include=None, lineno=None, snakefile=None, rule=None)[source]¶
- exception snakemake.exceptions.PeriodicWildcardError(message=None, include=None, lineno=None, snakefile=None, rule=None)[source]¶
- exception snakemake.exceptions.ProtectedOutputException(rule, files, include=None, lineno=None, snakefile=None)[source]¶
- exception snakemake.exceptions.RuleException(message=None, include=None, lineno=None, snakefile=None, rule=None)[source]¶ Bases: Exception
  Base class for exceptions occurring within the execution or definition of rules.
  - messages¶
- exception snakemake.exceptions.UnexpectedOutputException(rule, files, include=None, lineno=None, snakefile=None)[source]¶
- exception snakemake.exceptions.UnknownRuleException(name, prefix='', lineno=None, snakefile=None)[source]¶
- exception snakemake.exceptions.WildcardError(*args, lineno=None, snakefile=None, rule=None)[source]¶
- exception snakemake.exceptions.WorkflowError(*args, lineno=None, snakefile=None, rule=None)[source]¶ Bases: Exception
snakemake.executors module¶
- class snakemake.executors.AbstractExecutor(workflow, dag, printreason=False, quiet=False, printshellcmds=False, printthreads=True, latency_wait=3, keepincomplete=False)[source]¶ Bases: object
- class snakemake.executors.CPUExecutor(workflow, dag, workers, printreason=False, quiet=False, printshellcmds=False, use_threads=False, latency_wait=3, cores=1, keepincomplete=False)[source]¶
- class snakemake.executors.ClusterExecutor(workflow, dag, cores, jobname='snakejob.{name}.{jobid}.sh', printreason=False, quiet=False, printshellcmds=False, latency_wait=3, cluster_config=None, local_input=None, restart_times=None, exec_job=None, assume_shared_fs=True, max_status_checks_per_second=1, disable_default_remote_provider_args=False, disable_get_default_resources_args=False, keepincomplete=False)[source]¶ Bases: snakemake.executors.RealExecutor
  - default_jobscript = 'jobscript.sh'¶
  - tmpdir¶
- class snakemake.executors.DRMAAClusterJob(job, jobid, callback, error_callback, jobscript)¶ Bases: tuple
  - callback¶ Alias for field number 2
  - error_callback¶ Alias for field number 3
  - job¶ Alias for field number 0
  - jobid¶ Alias for field number 1
  - jobscript¶ Alias for field number 4
- class snakemake.executors.DRMAAExecutor(workflow, dag, cores, jobname='snakejob.{rulename}.{jobid}.sh', printreason=False, quiet=False, printshellcmds=False, drmaa_args='', drmaa_log_dir=None, latency_wait=3, cluster_config=None, restart_times=0, assume_shared_fs=True, max_status_checks_per_second=1, keepincomplete=False)[source]¶
- class snakemake.executors.DryrunExecutor(workflow, dag, printreason=False, quiet=False, printshellcmds=False, printthreads=True, latency_wait=3, keepincomplete=False)[source]¶
- class snakemake.executors.GenericClusterExecutor(workflow, dag, cores, submitcmd='qsub', statuscmd=None, cluster_config=None, jobname='snakejob.{rulename}.{jobid}.sh', printreason=False, quiet=False, printshellcmds=False, latency_wait=3, restart_times=0, assume_shared_fs=True, max_status_checks_per_second=1, keepincomplete=False)[source]¶
- class snakemake.executors.GenericClusterJob(job, jobid, callback, error_callback, jobscript, jobfinished, jobfailed)¶ Bases: tuple
  - callback¶ Alias for field number 2
  - error_callback¶ Alias for field number 3
  - job¶ Alias for field number 0
  - jobfailed¶ Alias for field number 6
  - jobfinished¶ Alias for field number 5
  - jobid¶ Alias for field number 1
  - jobscript¶ Alias for field number 4
- class snakemake.executors.KubernetesExecutor(workflow, dag, namespace, container_image=None, jobname='{rulename}.{jobid}', printreason=False, quiet=False, printshellcmds=False, latency_wait=3, cluster_config=None, local_input=None, restart_times=None, keepincomplete=False)[source]¶
- class snakemake.executors.KubernetesJob(job, jobid, callback, error_callback, kubejob, jobscript)¶ Bases: tuple
  - callback¶ Alias for field number 2
  - error_callback¶ Alias for field number 3
  - job¶ Alias for field number 0
  - jobid¶ Alias for field number 1
  - jobscript¶ Alias for field number 5
  - kubejob¶ Alias for field number 4
- class snakemake.executors.RealExecutor(workflow, dag, printreason=False, quiet=False, printshellcmds=False, latency_wait=3, assume_shared_fs=True, keepincomplete=False)[source]¶
- class snakemake.executors.SynchronousClusterExecutor(workflow, dag, cores, submitcmd='qsub', cluster_config=None, jobname='snakejob.{rulename}.{jobid}.sh', printreason=False, quiet=False, printshellcmds=False, latency_wait=3, restart_times=0, assume_shared_fs=True, keepincomplete=False)[source]¶ Bases: snakemake.executors.ClusterExecutor
  Invocations like "qsub -sync y" (SGE) or "bsub -K" (LSF) are synchronous, blocking the foreground thread and returning the remote exit code on remote exit.
- class snakemake.executors.SynchronousClusterJob(job, jobid, callback, error_callback, jobscript, process)¶ Bases: tuple
  - callback¶ Alias for field number 2
  - error_callback¶ Alias for field number 3
  - job¶ Alias for field number 0
  - jobid¶ Alias for field number 1
  - jobscript¶ Alias for field number 4
  - process¶ Alias for field number 5
- class snakemake.executors.TibannaExecutor(workflow, dag, cores, tibanna_sfn, precommand='', tibanna_config=False, container_image=None, printreason=False, quiet=False, printshellcmds=False, latency_wait=3, local_input=None, restart_times=None, max_status_checks_per_second=1, keepincomplete=False)[source]¶
- class snakemake.executors.TibannaJob(job, jobname, jobid, exec_arn, callback, error_callback)¶ Bases: tuple
  - callback¶ Alias for field number 4
  - error_callback¶ Alias for field number 5
  - exec_arn¶ Alias for field number 3
  - job¶ Alias for field number 0
  - jobid¶ Alias for field number 2
  - jobname¶ Alias for field number 1
- class snakemake.executors.TouchExecutor(workflow, dag, printreason=False, quiet=False, printshellcmds=False, latency_wait=3, assume_shared_fs=True, keepincomplete=False)[source]¶
- snakemake.executors.change_working_directory(directory=None)[source]¶ Change the working directory in the execution context, if one is provided.
- snakemake.executors.run_wrapper(job_rule, input, output, params, wildcards, threads, resources, log, benchmark, benchmark_repeats, conda_env, container_img, singularity_args, env_modules, use_singularity, linemaps, debug, cleanup_scripts, shadow_dir, jobid)[source]¶ Wrapper around the run method that handles exceptions and benchmarking.
  Arguments:
  - job_rule – the job.rule member
  - input – list of input files
  - output – list of output files
  - wildcards – wildcards processed so far
  - threads – usable threads
  - log – list of log files
  - shadow_dir – optional shadow directory root
snakemake.gui module¶
snakemake.io module¶
- class snakemake.io.InputFiles(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]¶ Bases: snakemake.io.Namedlist
  - size¶
  - size_mb¶
- class snakemake.io.Log(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]¶ Bases: snakemake.io.Namedlist
- class snakemake.io.Namedlist(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]¶ Bases: list
  A list that additionally provides functions to name items. Further, it is hashable; however, the hash does not consider the item names.
- class snakemake.io.OutputFiles(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]¶ Bases: snakemake.io.Namedlist
- class snakemake.io.Params(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]¶ Bases: snakemake.io.Namedlist
- class snakemake.io.ReportObject(caption, category, subcategory, patterns)¶ Bases: tuple
  - caption¶ Alias for field number 0
  - category¶ Alias for field number 1
  - patterns¶ Alias for field number 3
  - subcategory¶ Alias for field number 2
- class snakemake.io.Resources(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]¶ Bases: snakemake.io.Namedlist
- class snakemake.io.Wildcards(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]¶ Bases: snakemake.io.Namedlist
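The Namedlist idea (positional items that can also be reached by name, with a name-agnostic hash) can be sketched minimally as follows; the real class offers considerably more, e.g. construction from dicts and wildcard stripping:

```python
class NamedlistSketch(list):
    # Items can additionally be reached by attribute name; the hash
    # ignores the names, as documented for snakemake.io.Namedlist.
    def __init__(self, items=(), names=None):
        super().__init__(items)
        self._names = dict(names or {})  # name -> index

    def __getattr__(self, name):
        names = self.__dict__.get("_names", {})
        if name in names:
            return self[names[name]]
        raise AttributeError(name)

    def __hash__(self):
        # Hashable like a tuple of its items; names do not contribute.
        return hash(tuple(self))
```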
- snakemake.io.ancient(value)[source]¶ A flag for an input file that shall be considered ancient; i.e. its timestamp shall have no effect on which jobs to run.
- snakemake.io.apply_wildcards(pattern, wildcards, fill_missing=False, fail_dynamic=False, dynamic_fill=None, keep_dynamic=False)[source]¶
- snakemake.io.directory(value)[source]¶ A flag to specify that an output is a directory, rather than a file or named pipe.
- snakemake.io.dynamic(value)[source]¶ A flag for a file that shall be dynamic, i.e. the multiplicity (and wildcard values) will be expanded after a certain rule has been run.
- snakemake.io.expand(*args, **wildcards)[source]¶ Expand wildcards in the given file patterns.
  Arguments:
  - *args – first arg: file patterns as a list or a single file pattern; second arg (optional): a function to combine wildcard values (itertools.product by default)
  - **wildcards – the wildcards as keyword arguments, with their values as lists. If allow_missing=True is included, wildcards in a file pattern without values stay unformatted.
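The behavior described above can be sketched as a simplified re-creation (ignoring allow_missing and the edge cases of custom combinators):

```python
from itertools import product

def expand_sketch(pattern, combinator=product, **wildcards):
    # Substitute every combination of wildcard values into the pattern
    # (itertools.product by default, as in snakemake.io.expand).
    groups = [[(name, value) for value in values]
              for name, values in wildcards.items()]
    return [pattern.format(**dict(combo)) for combo in combinator(*groups)]
```

For example, `expand_sketch("{sample}.{ext}.gz", sample=["a", "b"], ext=["txt", "csv"])` yields one path per sample/extension combination.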
- snakemake.io.get_git_root(path)[source]¶
  Parameters: path – (str) path to a directory/file that is located inside the repo
  Returns: path to the root folder of the git repo
- snakemake.io.get_git_root_parent_directory(path, input_path)[source]¶ Recursively go through parent directories until a git repository is found or no parent directories are left, in which case an error is raised. This is needed when a path is provided to a file/folder that is located on a branch/tag not currently checked out.
  Parameters:
  - path – (str) path to a directory that is located inside the repo
  - input_path – (str) origin path, used when raising a WorkflowError
  Returns: path to the root folder of the git repo
- snakemake.io.git_content(git_file)[source]¶ Extract a file from a git repository located on the filesystem. The expected format is git+file:///path/to/your/repo/path_to_file@@version
  Parameters: git_file (str) – consists of the path to the repo, @, version and file information. Example: git+file:///home/smeds/snakemake-wrappers/bio/fastqc/wrapper.py@0.19.3
  Returns: file content, or None if the expected format is not met
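The upward search these functions perform can be sketched as follows (a simplification that treats a `.git` directory as the repository root marker; the real implementation delegates to git itself):

```python
import os

def find_git_root_sketch(path):
    # Walk up from `path` until a directory containing ".git" is found.
    path = os.path.abspath(path)
    while True:
        if os.path.isdir(os.path.join(path, ".git")):
            return path
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            raise ValueError("no git repository found above the given path")
        path = parent
```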
- snakemake.io.glob_wildcards(pattern, files=None, followlinks=False)[source]¶ Glob the values of the wildcards by matching the given pattern to the filesystem. Returns a named tuple with a list of values for each wildcard.
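Its pattern matching can be sketched with regular expressions (simplified: the real function also globs the filesystem, honors per-wildcard constraints, and returns a named tuple rather than a dict):

```python
import re

def glob_wildcards_sketch(pattern, files):
    # Convert "{name}" placeholders into named regex groups, then
    # collect the matched values per wildcard across all files.
    parts = re.split(r"({\w+})", pattern)
    regex = "".join(
        "(?P<%s>.+)" % p[1:-1] if re.fullmatch(r"{\w+}", p) else re.escape(p)
        for p in parts
    ) + "$"
    names = re.findall(r"{(\w+)}", pattern)
    values = {name: [] for name in names}
    for f in files:
        match = re.match(regex, f)
        if match:
            for name in names:
                values[name].append(match.group(name))
    return values
```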
- snakemake.io.limit(pattern, **wildcards)[source]¶ Limit wildcards to the given values.
  Arguments: **wildcards – the wildcards as keyword arguments, with their values as lists
- snakemake.io.load_configfile(configpath)[source]¶ Load a JSON or YAML configfile as a dict, then check that it is a dict.
- snakemake.io.local(value)[source]¶ Mark a file as a local file. This disables the application of a default remote provider.
- snakemake.io.multiext(prefix, *extensions)[source]¶ Expand a given prefix with multiple extensions (e.g. .txt, .csv, …).
- snakemake.io.protected(value)[source]¶ A flag for a file that shall be write-protected after creation.
- snakemake.io.report(value, caption=None, category=None, subcategory=None, patterns=[])[source]¶ Flag an output file or directory to be included in reports.
  In the case of a directory, the files to include can be specified via a glob pattern (default: *).
  Arguments:
  - value – file or directory
  - caption – path to a .rst file with a textual description of the result
  - category – name of the category in which the result should be displayed in the report
  - patterns – wildcard patterns for selecting files if a directory is given (used as input for snakemake.io.glob_wildcards). The pattern shall not include the path to the directory itself.
- snakemake.io.strip_wildcard_constraints(pattern)[source]¶ Return a string that does not contain any wildcard constraints.
- snakemake.io.temp(value)[source]¶ A flag for an input or output file that shall be removed after usage.
- snakemake.io.update_wildcard_constraints(pattern, wildcard_constraints, global_wildcard_constraints)[source]¶ Update wildcard constraints.
  Parameters:
  - pattern (str) – pattern on which to update constraints
  - wildcard_constraints (dict) – dictionary of wildcard:constraint key-value pairs
  - global_wildcard_constraints (dict) – dictionary of wildcard:constraint key-value pairs
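Constraint injection of this kind can be sketched using Snakemake's `{name,regex}` wildcard syntax (the actual precedence and merge rules between per-rule and global constraints are more involved):

```python
import re

def update_constraints_sketch(pattern, wildcard_constraints, global_wildcard_constraints):
    # Attach "{name,regex}" constraints to bare "{name}" wildcards,
    # preferring per-rule constraints over global ones.
    def attach(match):
        name = match.group(1)
        constraint = wildcard_constraints.get(
            name, global_wildcard_constraints.get(name)
        )
        if constraint is None:
            return match.group(0)
        return "{%s,%s}" % (name, constraint)
    return re.sub(r"{(\w+)}", attach, pattern)
```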
snakemake.jobs module¶
- class snakemake.jobs.GroupJob(id, jobs)[source]¶ Bases: snakemake.jobs.AbstractJob
  - all_products¶
  - attempt¶
  - dag¶
  - dynamic_input¶
  - expanded_output¶ Yields the entire expanded output of all jobs.
  - groupid¶
  - input¶
  - inputsize¶
  - is_branched¶
  - is_checkpoint¶
  - is_local¶
  - is_updated¶
  - jobid¶
  - jobs¶
  - log¶
  - name¶
  - needs_singularity¶
  - output¶
  - priority¶
  - products¶
  - resources¶
  - restart_times¶
  - rules¶
  - threads¶
  - toposorted¶
- class snakemake.jobs.Job(rule, dag, wildcards_dict=None, format_wildcards=None, targetfile=None)[source]¶ Bases: snakemake.jobs.AbstractJob
  - HIGHEST_PRIORITY = 9223372036854775807¶
  - attempt¶
  - b64id¶
  - benchmark¶
  - benchmark_repeats¶
  - conda_env¶
  - conda_env_file¶
  - conda_env_path¶
  - container_img¶
  - container_img_path¶
  - container_img_url¶
  - dag¶
  - dependencies¶
  - dynamic_input¶
  - dynamic_output¶
  - dynamic_wildcards¶ Return all wildcard values determined from dynamic output.
  - empty_remote_dirs¶
  - env_modules¶
  - existing_output¶
  - existing_remote_input¶
  - existing_remote_output¶
  - expanded_output¶ Iterate over output files while dynamic output is expanded.
  - files_to_download¶
  - files_to_upload¶
  - group¶
  - input¶
  - input_maxtime¶ Return the newest input file.
  - inputsize¶ Return the size of the input files. Input files need to be present.
  - is_branched¶
  - is_checkpoint¶
  - is_cwl¶
  - is_local¶
  - is_norun¶
  - is_notebook¶
  - is_pipe¶
  - is_run¶
  - is_script¶
  - is_shadow¶
  - is_shell¶
  - is_wrapper¶
  - jobid¶
  - local_input¶
  - local_output¶
  - log¶
  - message¶ Return the message for this job.
  - missing_input¶ Return missing input files.
  - missing_remote_input¶
  - missing_remote_output¶
  - name¶
  - needs_singularity¶
  - output¶
  - output_mintime¶ Return the oldest output file.
  - output_mintime_local¶
  - outputs_older_than_script_or_notebook()[source]¶ Return output that is older than the script, i.e. the script has changed.
  - params¶
  - postprocess(upload_remote=True, handle_log=True, handle_touch=True, handle_temp=True, error=False, ignore_missing_output=False, assume_shared_fs=True, latency_wait=None)[source]¶
  - prepare()[source]¶ Prepare execution of the job. This includes creation of directories and deletion of previously created dynamic files. Creates a shadow directory for the job if specified.
  - priority¶
  - products¶
  - protected_output¶
  - remote_input¶
  - remote_input_newer_than_local¶
  - remote_input_older_than_local¶
  - remote_output¶
  - remote_output_newer_than_local¶
  - remote_output_older_than_local¶
  - remove_existing_output()[source]¶ Clean up both dynamic and regular output before rules actually run.
  - resources¶
  - restart_times¶
  - rule¶
  - rules¶
  - shadow_dir¶
  - shellcmd¶ Return the shell command.
  - subworkflow_input¶
  - targetfile¶
  - temp_output¶
  - threads¶
  - touch_output¶
  - unique_input¶
  - wildcards¶
  - wildcards_dict¶
snakemake.logging module¶
- class snakemake.logging.ColorizingStreamHandler(nocolor=False, stream=<_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'>, use_threads=False, mode=0)[source]¶ Bases: logging.StreamHandler
  - BLACK = 0¶
  - BLUE = 4¶
  - BOLD_SEQ = '\x1b[1m'¶
  - COLOR_SEQ = '\x1b[%dm'¶
  - CYAN = 6¶
  - GREEN = 2¶
  - MAGENTA = 5¶
  - RED = 1¶
  - RESET_SEQ = '\x1b[0m'¶
  - WHITE = 7¶
  - YELLOW = 3¶
  - colors = {'CRITICAL': 1, 'DEBUG': 4, 'ERROR': 1, 'INFO': 2, 'WARNING': 3}¶
  - emit(record)[source]¶ Emit a record.
    If a formatter is specified, it is used to format the record. The record is then written to the stream with a trailing newline. If exception information is present, it is formatted using traceback.print_exception and appended to the stream. If the stream has an 'encoding' attribute, it is used to determine how to do the output to the stream.
  - is_tty¶
- class snakemake.logging.Logger[source]¶ Bases: object
- snakemake.logging.format_resources(dict_like, *, omit_keys={'_cores', '_nodes'}, omit_values=[])¶
- snakemake.logging.format_wildcards(dict_like, omit_keys=[], *, omit_values={'__snakemake_dynamic__'})¶
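The color constants above are standard ANSI color offsets, and COLOR_SEQ takes an escape-code number. A sketch of how such a handler likely combines them (foreground colors occupy codes 30-37, so this is an assumption about the offset, not the handler's exact code path):

```python
COLOR_SEQ = "\x1b[%dm"
RESET_SEQ = "\x1b[0m"
colors = {"CRITICAL": 1, "DEBUG": 4, "ERROR": 1, "INFO": 2, "WARNING": 3}

def colorize(levelname, message):
    # ANSI foreground colors are 30 + offset, e.g. RED=1 -> "\x1b[31m".
    return COLOR_SEQ % (30 + colors[levelname]) + message + RESET_SEQ
```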
snakemake.output_index module¶
snakemake.parser module¶
- class snakemake.parser.AbstractCmd(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.Run
  - end_func = None¶
  - overwrite_cmd = None¶
  - start_func = None¶
- class snakemake.parser.Benchmark(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.CWL(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.Script
  - end_func = 'cwl'¶
  - start_func = '@workflow.cwl'¶
- class snakemake.parser.Cache(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶ Bases: snakemake.parser.RuleKeywordState
  - keyword¶
- class snakemake.parser.Checkpoint(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.Rule
- class snakemake.parser.Container(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.DecoratorKeywordState(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.KeywordState
  - args = []¶
  - decorator = None¶
- class snakemake.parser.EnvModules(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.Envvars(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.GlobalKeywordState
  - keyword¶
- class snakemake.parser.GlobalContainer(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.GlobalKeywordState
  - keyword¶
- class snakemake.parser.GlobalSingularity(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.GlobalKeywordState
  - keyword¶
- class snakemake.parser.GlobalWildcardConstraints(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.GlobalKeywordState
  - keyword¶
- class snakemake.parser.KeywordState(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.TokenAutomaton
  - keyword¶
  - prefix = ''¶
- class snakemake.parser.Message(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.Notebook(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.Script
  - end_func = 'notebook'¶
  - start_func = '@workflow.notebook'¶
- class snakemake.parser.OnError(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.DecoratorKeywordState
  - args = ['log']¶
  - decorator = 'onerror'¶
- class snakemake.parser.OnStart(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.DecoratorKeywordState
  - args = ['log']¶
  - decorator = 'onstart'¶
- class snakemake.parser.OnSuccess(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.DecoratorKeywordState
  - args = ['log']¶
  - decorator = 'onsuccess'¶
- class snakemake.parser.Output(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.Params(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.Priority(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.Python(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.TokenAutomaton
  - subautomata = {'checkpoint': <class 'snakemake.parser.Checkpoint'>, 'configfile': <class 'snakemake.parser.Configfile'>, 'container': <class 'snakemake.parser.GlobalContainer'>, 'envvars': <class 'snakemake.parser.Envvars'>, 'include': <class 'snakemake.parser.Include'>, 'localrules': <class 'snakemake.parser.Localrules'>, 'onerror': <class 'snakemake.parser.OnError'>, 'onstart': <class 'snakemake.parser.OnStart'>, 'onsuccess': <class 'snakemake.parser.OnSuccess'>, 'report': <class 'snakemake.parser.Report'>, 'rule': <class 'snakemake.parser.Rule'>, 'ruleorder': <class 'snakemake.parser.Ruleorder'>, 'singularity': <class 'snakemake.parser.GlobalSingularity'>, 'subworkflow': <class 'snakemake.parser.Subworkflow'>, 'wildcard_constraints': <class 'snakemake.parser.GlobalWildcardConstraints'>, 'workdir': <class 'snakemake.parser.Workdir'>}¶
- class snakemake.parser.Resources(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.Rule(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.GlobalKeywordState
  - dedent¶
  - subautomata = {'benchmark': <class 'snakemake.parser.Benchmark'>, 'cache': <class 'snakemake.parser.Cache'>, 'conda': <class 'snakemake.parser.Conda'>, 'container': <class 'snakemake.parser.Container'>, 'cwl': <class 'snakemake.parser.CWL'>, 'envmodules': <class 'snakemake.parser.EnvModules'>, 'group': <class 'snakemake.parser.Group'>, 'input': <class 'snakemake.parser.Input'>, 'log': <class 'snakemake.parser.Log'>, 'message': <class 'snakemake.parser.Message'>, 'notebook': <class 'snakemake.parser.Notebook'>, 'output': <class 'snakemake.parser.Output'>, 'params': <class 'snakemake.parser.Params'>, 'priority': <class 'snakemake.parser.Priority'>, 'resources': <class 'snakemake.parser.Resources'>, 'run': <class 'snakemake.parser.Run'>, 'script': <class 'snakemake.parser.Script'>, 'shadow': <class 'snakemake.parser.Shadow'>, 'shell': <class 'snakemake.parser.Shell'>, 'singularity': <class 'snakemake.parser.Singularity'>, 'threads': <class 'snakemake.parser.Threads'>, 'version': <class 'snakemake.parser.Version'>, 'wildcard_constraints': <class 'snakemake.parser.WildcardConstraints'>, 'wrapper': <class 'snakemake.parser.Wrapper'>}¶
- class snakemake.parser.RuleKeywordState(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.Script(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.AbstractCmd
  - end_func = 'script'¶
  - start_func = '@workflow.script'¶
- class snakemake.parser.Shadow(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.Shell(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.AbstractCmd
  - end_func = 'shell'¶
  - start_func = '@workflow.shellcmd'¶
- class snakemake.parser.Singularity(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶ Bases: snakemake.parser.RuleKeywordState
  - keyword¶
- class snakemake.parser.Subworkflow(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.GlobalKeywordState
  - subautomata = {'configfile': <class 'snakemake.parser.SubworkflowConfigfile'>, 'snakefile': <class 'snakemake.parser.SubworkflowSnakefile'>, 'workdir': <class 'snakemake.parser.SubworkflowWorkdir'>}¶
- class snakemake.parser.SubworkflowConfigfile(snakefile, base_indent=0, dedent=0, root=True)[source]¶
- class snakemake.parser.SubworkflowKeywordState(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.KeywordState
  - prefix = 'Subworkflow'¶
- class snakemake.parser.Threads(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.TokenAutomaton(snakefile, base_indent=0, dedent=0, root=True)[source]¶ Bases: object
  - dedent¶
  - effective_indent¶
  - subautomata = {}¶
- class snakemake.parser.Version(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶
- class snakemake.parser.WildcardConstraints(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]¶ Bases: snakemake.parser.RuleKeywordState
  - keyword¶
- class snakemake.parser.Wrapper(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]¶ Bases: snakemake.parser.Script
  - end_func = 'wrapper'¶
  - start_func = '@workflow.wrapper'¶
snakemake.persistence module¶
- class snakemake.persistence.Persistence(nolock=False, dag=None, conda_prefix=None, singularity_prefix=None, shadow_prefix=None, warn_only=False)[source]¶ Bases: object
  - files¶
  - input_changed(job, file=None)[source]¶ Yields output files with changed input, or a bool if a file is given.
  - locked¶
snakemake.rules module¶
-
class
snakemake.rules.
Rule
(*args, lineno=None, snakefile=None, restart_times=0)[source]¶ Bases:
object
-
apply_input_function
(func, wildcards, incomplete_checkpoint_func=<function Rule.<lambda>>, raw_exceptions=False, **aux_params)[source]¶
-
benchmark
¶
-
check_output_duplicates
()[source]¶ Check
Namedlist
for duplicate entries and raise a WorkflowError
on problems.
-
conda_env
¶
-
container_img
¶
-
static
get_wildcard_len
(wildcards)[source]¶ Return the length of the given wildcard values.
Arguments wildcards – a dict of wildcards
-
get_wildcards
(requested_output)[source]¶ Return wildcard dictionary by matching regular expression output files to the requested concrete ones.
Arguments requested_output – a concrete filepath
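Conceptually, this matching works by treating the output pattern as a regular expression with named groups. The following is a simplified, hypothetical sketch (the function name `extract_wildcards` and its internals are illustrative, not the real implementation):

```python
import re

def extract_wildcards(pattern, path):
    """Sketch: match a snakemake-style pattern such as "{sample}.txt"
    against a concrete path and return the wildcard values, or None."""
    # re.split with a capturing group alternates literal text and wildcard names.
    parts = re.split(r"\{(\w+)\}", pattern)
    regex = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            regex += re.escape(part)      # literal text between wildcards
        else:
            regex += f"(?P<{part}>.+)"    # one named group per wildcard
    m = re.fullmatch(regex, path)
    return m.groupdict() if m else None
```

For example, `extract_wildcards("{sample}.txt", "foo.txt")` yields `{"sample": "foo"}`; non-matching paths yield `None`.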
-
input
¶
-
is_cwl
¶
-
is_notebook
¶
-
is_producer
(requested_output)[source]¶ Returns True if this rule is a producer of the requested output.
-
is_run
¶
-
is_script
¶
-
is_shell
¶
-
is_wrapper
¶
-
log
¶
-
output
¶
-
params
¶
-
products
¶
-
set_input
(*input, **kwinput)[source]¶ Add a list of input files. Recursive lists are flattened.
Arguments input – the list of input files
-
set_output
(*output, **kwoutput)[source]¶ Add a list of output files. Recursive lists are flattened.
After creating the output files, they are checked for duplicates.
Arguments output – the list of output files
-
version
¶
-
wildcard_constraints
¶
-
wildcard_names
¶
-
snakemake.scheduler module¶
-
class
snakemake.scheduler.
JobScheduler
(workflow, dag, cores, local_cores=1, dryrun=False, touch=False, cluster=None, cluster_status=None, cluster_config=None, cluster_sync=None, drmaa=None, drmaa_log_dir=None, kubernetes=None, container_image=None, tibanna=None, tibanna_sfn=None, precommand='', tibanna_config=False, jobname=None, quiet=False, printreason=False, printshellcmds=False, keepgoing=False, max_jobs_per_second=None, max_status_checks_per_second=100, latency_wait=3, greediness=1.0, force_use_threads=False, assume_shared_fs=True, keepincomplete=False)[source]¶ Bases:
object
-
job_selector
(jobs)[source]¶ Select jobs for execution using the greedy heuristic from “A Greedy Algorithm for the General Multidimensional Knapsack Problem”, Akcay, Li, Xu, Annals of Operations Research, 2012.
- Args:
- jobs (list): list of jobs
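The general idea of such a greedy selection can be sketched as follows. This is a deliberately simplified illustration, not the multidimensional knapsack heuristic cited above; the job representation (`name`, `priority`, `resources` dicts) is hypothetical:

```python
def select_jobs(jobs, capacity):
    """Simplified greedy sketch: walk jobs in order of descending
    priority and pick each one whose resource demands still fit
    into the remaining capacities."""
    remaining = dict(capacity)
    selected = []
    for job in sorted(jobs, key=lambda j: j["priority"], reverse=True):
        demands = job["resources"]
        if all(remaining.get(r, 0) >= v for r, v in demands.items()):
            for r, v in demands.items():
                remaining[r] -= v           # reserve the resources
            selected.append(job["name"])
    return selected
```

The real scheduler additionally weighs job input sizes and a configurable greediness factor when scoring candidates.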
-
open_jobs
¶ Return open jobs.
-
stats
¶
-
snakemake.script module¶
-
class
snakemake.script.
JuliaEncoder
[source]¶ Bases:
object
Encodes Python data structures into Julia.
-
class
snakemake.script.
JuliaScript
(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]¶ Bases:
snakemake.script.ScriptBase
-
class
snakemake.script.
PythonScript
(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]¶ Bases:
snakemake.script.ScriptBase
-
class
snakemake.script.
RMarkdown
(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]¶ Bases:
snakemake.script.ScriptBase
-
class
snakemake.script.
RScript
(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]¶ Bases:
snakemake.script.ScriptBase
-
class
snakemake.script.
ScriptBase
(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]¶ Bases:
abc.ABC
-
class
snakemake.script.
Snakemake
(input_, output, params, wildcards, threads, resources, log, config, rulename, bench_iteration, scriptdir=None)[source]¶ Bases:
object
-
log_fmt_shell
(stdout=True, stderr=True, append=False)[source]¶ Return a shell redirection string to be used in shell() calls
This function allows scripts and wrappers to support optional log files specified in the calling rule. If no log was specified, an empty string “” is returned, regardless of the values of stdout, stderr, and append.
Parameters: - stdout (bool) – Send stdout to log
- stderr (bool) – Send stderr to log
- append (bool) – Do not overwrite the log file. Useful for sending output of multiple commands to the same log. Note however that the log will not be truncated at the start.
- The following table describes the return value (fn denotes the log file):
  stdout   stderr   append   log    return value
  -------  -------  -------  -----  ------------
  True     True     True     fn     >> fn 2>&1
  True     False    True     fn     >> fn
  False    True     True     fn     2>> fn
  True     True     False    fn     > fn 2>&1
  True     False    False    fn     > fn
  False    True     False    fn     2> fn
  any      any      any      None   ""
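The mapping above can be reproduced with a few lines of shell-redirection logic. This is a minimal sketch of the documented behavior (the standalone function `log_fmt_shell` taking the log path directly is an assumption; the real method reads the log from the rule):

```python
def log_fmt_shell(log, stdout=True, stderr=True, append=False):
    """Sketch: build the shell redirection string described in the table.
    `log` is the log file path, or None if the rule defines no log."""
    if not log:
        return ""                       # no log defined: redirect nothing
    op = ">>" if append else ">"        # append vs. overwrite
    if stdout and stderr:
        return f"{op} {log} 2>&1"
    if stdout:
        return f"{op} {log}"
    if stderr:
        return f"2{op} {log}"
    return ""
```

A typical use inside a script would be `shell("mycmd {args} " + snakemake.log_fmt_shell())`.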
-
-
snakemake.script.
script
(path, basedir, input, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]¶ Load a script from the given basedir + path and execute it.
snakemake.singularity module¶
snakemake.stats module¶
snakemake.utils module¶
-
class
snakemake.utils.
AlwaysQuotedFormatter
(quote_func=None, *args, **kwargs)[source]¶ Bases:
snakemake.utils.QuotedFormatter
Subclass of QuotedFormatter that always quotes.
Usage is identical to QuotedFormatter, except that it always acts like “q” was appended to the format spec.
-
class
snakemake.utils.
QuotedFormatter
(quote_func=None, *args, **kwargs)[source]¶ Bases:
string.Formatter
Subclass of string.Formatter that supports quoting.
Using this formatter, any field can be quoted after formatting by appending “q” to its format string. By default, shell quoting is performed using “shlex.quote”, but you can pass a different quote_func to the constructor. The quote_func simply has to take a string argument and return a new string representing the quoted form of the input string.
Note that if an element after formatting is the empty string, it will not be quoted.
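The documented behavior can be sketched as a small string.Formatter subclass. The class name `QuotedFormatterSketch` is hypothetical; the real class handles more edge cases:

```python
import shlex
import string

class QuotedFormatterSketch(string.Formatter):
    """Sketch: a trailing "q" in a field's format spec shell-quotes the
    formatted value; empty strings are left unquoted, as documented."""

    def __init__(self, quote_func=None):
        super().__init__()
        self.quote_func = quote_func or shlex.quote

    def format_field(self, value, format_spec):
        do_quote = format_spec.endswith("q")
        if do_quote:
            format_spec = format_spec[:-1]   # strip the "q" marker
        formatted = super().format_field(value, format_spec)
        if do_quote and formatted:
            formatted = self.quote_func(formatted)
        return formatted
```

For example, `QuotedFormatterSketch().format("echo {msg:q}", msg="hello world")` produces `echo 'hello world'`, keeping the argument a single shell word.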
-
snakemake.utils.
R
(code)[source]¶ Execute R code.
This is deprecated in favor of the
script
directive. This function executes the R code given as a string. The function requires rpy2 to be installed.Parameters: code (str) – R code to be executed
-
class
snakemake.utils.
SequenceFormatter
(separator=' ', element_formatter=<string.Formatter object>, *args, **kwargs)[source]¶ Bases:
string.Formatter
string.Formatter subclass with special behavior for sequences.
This class delegates formatting of individual elements to another formatter object. Non-list objects are formatted by calling the delegate formatter’s “format_field” method. List-like objects (list, tuple, set, frozenset) are formatted by formatting each element of the list according to the specified format spec using the delegate formatter and then joining the resulting strings with a separator (space by default).
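The delegation described above can be sketched like this (class name `SequenceFormatterSketch` is illustrative, not the actual implementation):

```python
import string

class SequenceFormatterSketch(string.Formatter):
    """Sketch: format list-like fields element-wise via a delegate
    formatter and join the results with a separator."""

    def __init__(self, separator=" ", element_formatter=None):
        super().__init__()
        self.separator = separator
        self.element_formatter = element_formatter or string.Formatter()

    def format_field(self, value, format_spec):
        if isinstance(value, (list, tuple, set, frozenset)):
            # Format each element, then join with the separator.
            return self.separator.join(
                self.element_formatter.format_field(v, format_spec)
                for v in value
            )
        return self.element_formatter.format_field(value, format_spec)
```

This is what lets a rule's list of input files expand to a space-separated string inside a shell command.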
-
class
snakemake.utils.
Unformattable
(errormsg='This cannot be used for formatting')[source]¶ Bases:
object
-
snakemake.utils.
argvquote
(arg, force=True)[source]¶ Returns an argument quoted in such a way that CommandLineToArgvW on Windows will return the argument string unchanged. This is the same thing Popen does when supplied with a list of arguments. Arguments in a command line should be separated by spaces; this function does not add these spaces. This implementation follows the suggestions outlined here: https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
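Following the blog post linked above, the core of such quoting doubles backslashes that precede a double quote (or the end of the string) and backslash-escapes embedded quotes. A hedged sketch (the name `argvquote_sketch` and exact edge-case handling are assumptions, not the verified implementation):

```python
def argvquote_sketch(arg, force=True):
    """Sketch of MSDN-style argument quoting for CommandLineToArgvW."""
    if not force and arg and not any(c in arg for c in ' \t"'):
        return arg                      # nothing that needs quoting
    out = '"'
    i, n = 0, len(arg)
    while i < n:
        backslashes = 0
        while i < n and arg[i] == "\\":
            backslashes += 1
            i += 1
        if i == n:
            # Backslashes before the closing quote must be doubled.
            out += "\\" * (backslashes * 2)
        elif arg[i] == '"':
            # Double preceding backslashes, then escape the quote itself.
            out += "\\" * (backslashes * 2 + 1) + '"'
            i += 1
        else:
            # Backslashes not followed by a quote are taken literally.
            out += "\\" * backslashes + arg[i]
            i += 1
    return out + '"'
```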
-
snakemake.utils.
available_cpu_count
()[source]¶ Return the number of available virtual or physical CPUs on this system. The number of available CPUs can be smaller than the total number of CPUs when the cpuset(7) mechanism is in use, as is the case on some cluster systems.
Adapted from https://stackoverflow.com/a/1006301/715090
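A cpuset-aware count can be sketched with the standard library alone (the fallback chain here is an illustration, not the exact implementation):

```python
import os

def available_cpus():
    """Sketch: prefer the scheduler affinity mask, which respects
    cpuset(7) restrictions on Linux, and fall back to the raw CPU
    count on platforms without sched_getaffinity."""
    try:
        return len(os.sched_getaffinity(0))  # CPUs this process may run on
    except AttributeError:
        return os.cpu_count() or 1
```

On a cluster node where the batch system pins the job to 4 cores, this returns 4 even if the machine has 64.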
-
snakemake.utils.
format
(_pattern, *args, stepout=1, _quote_all=False, **kwargs)[source]¶ Format a pattern in Snakemake style.
This means that keywords embedded in braces are replaced by any variable values that are available in the current namespace.
-
snakemake.utils.
linecount
(filename)[source]¶ Return the number of lines of given file.
Parameters: filename (str) – the path to the file
-
snakemake.utils.
listfiles
(pattern, restriction=None, omit_value=None)[source]¶ Yield a tuple of existing filepaths for the given pattern.
Wildcard values are yielded as the second tuple item.
Parameters: - pattern (str) – a filepattern. Wildcards are specified in snakemake syntax, e.g. “{id}.txt”
- restriction (dict) – restrict to wildcard values given in this dictionary
- omit_value (str) – wildcard value to omit
Yields: tuple – The next file matching the pattern, and the corresponding wildcards object
-
snakemake.utils.
makedirs
(dirnames)[source]¶ Recursively create the given directory or directories without reporting errors if they are present.
-
snakemake.utils.
min_version
(version)[source]¶ Require minimum snakemake version, raise workflow error if not met.
-
snakemake.utils.
read_job_properties
(jobscript, prefix='# properties', pattern=re.compile('# properties = (.*)'))[source]¶ Read the job properties defined in a snakemake jobscript.
This function is a helper for writing custom wrappers for the snakemake --cluster functionality. Applying this function to a jobscript will return a dict containing information about the job.
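The jobscript carries its properties as a JSON payload on a marker line, so the extraction can be sketched as follows (`read_job_properties_sketch` operating on the script text rather than a path is an illustrative simplification):

```python
import json

def read_job_properties_sketch(jobscript_text, prefix="# properties"):
    """Sketch: find the properties marker line in a jobscript and
    parse the JSON dict that follows the "=" sign."""
    for line in jobscript_text.splitlines():
        if line.startswith(prefix):
            return json.loads(line.split("=", 1)[1])
    return None
```

A custom cluster wrapper would typically use the returned dict to pick queue, threads, or memory for the submission command.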
-
snakemake.utils.
report
(text, path, stylesheet=None, defaultenc='utf8', template=None, metadata=None, **files)[source]¶ Create an HTML report using python docutils.
This is deprecated in favor of the --report flag.
Attention: This function needs Python docutils to be installed for the python installation you use with Snakemake.
All keywords not listed below are interpreted as paths to files that shall be embedded into the document. The keywords will be available as link targets in the text. E.g. append a file as keyword arg via F1=input[0] and put a download link in the text like this:
report('''
==============
Report for ...
==============

Some text. A link to an embedded file: F1_.

Further text.
''', outputpath, F1=input[0])

Instead of specifying each file as a keyword arg, you can also expand the input of your rule if it is completely named, e.g.:

report('''
Some text...
''', outputpath, **input)
Parameters: - text (str) – The “restructured text” as it is expected by python docutils.
- path (str) – The path to the desired output file
- stylesheet (str) – An optional path to a css file that defines the style of the document. This defaults to <your snakemake install>/report.css. Use the default to get a hint how to create your own.
- defaultenc (str) – The encoding that is reported to the browser for embedded text files, defaults to utf8.
- template (str) – An optional path to a docutils HTML template.
- metadata (str) – E.g. an optional author name or email address.
-
snakemake.utils.
update_config
(config, overwrite_config)[source]¶ Recursively update dictionary config with overwrite_config.
See https://stackoverflow.com/questions/3232943/update-value-of-a-nested-dictionary-of-varying-depth for details.
Parameters: - config (dict) – dictionary to update
- overwrite_config (dict) – dictionary whose items will overwrite those in config
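The recursive merge described above can be sketched as follows (a hypothetical reimplementation of the documented behavior, not the library source):

```python
import collections.abc

def update_config_sketch(config, overwrite_config):
    """Sketch: merge nested dicts key by key; any non-dict value in
    overwrite_config replaces the corresponding value in config."""
    for key, value in overwrite_config.items():
        if isinstance(value, collections.abc.Mapping) and isinstance(config.get(key), dict):
            update_config_sketch(config[key], value)   # descend into nested dicts
        else:
            config[key] = value                        # overwrite leaf values
    return config
```

This is what allows a --config override to change one nested key without discarding its siblings from the configfile.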
-
snakemake.utils.
validate
(data, schema, set_default=True)[source]¶ Validate data with JSON schema at given path.
Parameters: - data (object) – data to validate. Can be a config dict or a pandas data frame.
- schema (str) – Path to JSON schema used for validation. The schema can also be in YAML format. If validating a pandas data frame, the schema has to describe a row record (i.e., a dict with column names as keys pointing to row values). See https://json-schema.org. The path is interpreted relative to the Snakefile when this function is called.
- set_default (bool) – set default values defined in schema. See https://python-jsonschema.readthedocs.io/en/latest/faq/ for more information
snakemake.workflow module¶
-
class
snakemake.workflow.
Rules
[source]¶ Bases:
object
A namespace for rules so that they can be accessed via dot notation.
-
class
snakemake.workflow.
Subworkflow
(workflow, name, snakefile, workdir, configfile)[source]¶ Bases:
object
-
snakefile
¶
-
workdir
¶
-
-
class
snakemake.workflow.
Workflow
(snakefile=None, jobscript=None, overwrite_shellcmd=None, overwrite_config={}, overwrite_workdir=None, overwrite_configfiles=None, overwrite_clusterconfig={}, overwrite_threads={}, config_args=None, debug=False, verbose=False, use_conda=False, conda_prefix=None, use_singularity=False, use_env_modules=False, singularity_prefix=None, singularity_args='', shadow_prefix=None, mode=0, wrapper_prefix=None, printshellcmds=False, restart_times=None, attempt=1, default_remote_provider=None, default_remote_prefix='', run_local=True, default_resources=None, cache=None, nodes=1, cores=1, resources=None, conda_cleanup_pkgs=None)[source]¶ Bases:
object
-
apply_default_remote
(path)[source]¶ Apply the defined default remote provider to the given path and return the updated _IOFile. Asserts that default remote provider is defined.
-
concrete_files
¶
-
config
¶
-
cores
¶
-
current_basedir
¶ Basedir of currently parsed Snakefile.
-
execute
(targets=None, dryrun=False, touch=False, local_cores=1, forcetargets=False, forceall=False, forcerun=None, until=[], omit_from=[], prioritytargets=None, quiet=False, keepgoing=False, printshellcmds=False, printreason=False, printdag=False, cluster=None, cluster_sync=None, jobname=None, immediate_submit=False, ignore_ambiguity=False, printrulegraph=False, printfilegraph=False, printd3dag=False, drmaa=None, drmaa_log_dir=None, kubernetes=None, tibanna=None, tibanna_sfn=None, precommand='', tibanna_config=False, container_image=None, stats=None, force_incomplete=False, ignore_incomplete=False, list_version_changes=False, list_code_changes=False, list_input_changes=False, list_params_changes=False, list_untracked=False, list_conda_envs=False, summary=False, archive=None, delete_all_output=False, delete_temp_output=False, detailed_summary=False, latency_wait=3, wait_for_files=None, nolock=False, unlock=False, notemp=False, nodeps=False, cleanup_metadata=None, conda_cleanup_envs=False, cleanup_shadow=False, cleanup_scripts=True, subsnakemake=None, updated_files=None, keep_target_files=False, keep_shadow=False, keep_remote_local=False, allowed_rules=None, max_jobs_per_second=None, max_status_checks_per_second=None, greediness=1.0, no_hooks=False, force_use_threads=False, conda_create_envs_only=False, assume_shared_fs=True, cluster_status=None, report=None, report_stylesheet=None, export_cwl=False, batch=None, keepincomplete=False)[source]¶
-
include
(snakefile, overwrite_first_rule=False, print_compilation=False, overwrite_shellcmd=None)[source]¶ Include a snakefile.
-
inputfile
(path)[source]¶ Mark file as being an input file of the workflow.
This also means that eventual --default-remote-provider/prefix settings will be applied to this file. The file is returned as _IOFile object, such that it can e.g. be transparently opened with _IOFile.open().
-
nodes
¶
-
register_envvars
(*envvars)[source]¶ Register environment variables that shall be passed to jobs. If used multiple times, union is taken.
-
rules
¶
-
subworkflows
¶
-
-
snakemake.workflow.
format_resources
(dict_like, *, omit_keys={'_cores', '_nodes'}, omit_values=[])¶
snakemake.wrapper module¶
-
snakemake.wrapper.
wrapper
(path, input, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, prefix, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]¶ Load a wrapper from https://github.com/snakemake/snakemake-wrappers under the given path + wrapper.(py|R|Rmd) and execute it.
Module contents¶
-
snakemake.
snakemake
(snakefile, batch=None, cache=None, report=None, report_stylesheet=None, lint=None, listrules=False, list_target_rules=False, cores=1, nodes=1, local_cores=1, resources={}, overwrite_threads={}, default_resources=None, config={}, configfiles=None, config_args=None, workdir=None, targets=None, dryrun=False, touch=False, forcetargets=False, forceall=False, forcerun=[], until=[], omit_from=[], prioritytargets=[], stats=None, printreason=False, printshellcmds=False, debug_dag=False, printdag=False, printrulegraph=False, printfilegraph=False, printd3dag=False, nocolor=False, quiet=False, keepgoing=False, cluster=None, cluster_config=None, cluster_sync=None, drmaa=None, drmaa_log_dir=None, jobname='snakejob.{rulename}.{jobid}.sh', immediate_submit=False, standalone=False, ignore_ambiguity=False, snakemakepath=None, lock=True, unlock=False, cleanup_metadata=None, conda_cleanup_envs=False, cleanup_shadow=False, cleanup_scripts=True, force_incomplete=False, ignore_incomplete=False, list_version_changes=False, list_code_changes=False, list_input_changes=False, list_params_changes=False, list_untracked=False, list_resources=False, summary=False, archive=None, delete_all_output=False, delete_temp_output=False, detailed_summary=False, latency_wait=3, wait_for_files=None, print_compilation=False, debug=False, notemp=False, keep_remote_local=False, nodeps=False, keep_target_files=False, allowed_rules=None, jobscript=None, greediness=None, no_hooks=False, overwrite_shellcmd=None, updated_files=None, log_handler=[], keep_logger=False, max_jobs_per_second=None, max_status_checks_per_second=100, restart_times=0, attempt=1, verbose=False, force_use_threads=False, use_conda=False, use_singularity=False, use_env_modules=False, singularity_args='', conda_prefix=None, conda_cleanup_pkgs=None, list_conda_envs=False, singularity_prefix=None, shadow_prefix=None, conda_create_envs_only=False, mode=0, wrapper_prefix=None, kubernetes=None, container_image=None, tibanna=False, tibanna_sfn=None, 
precommand='', default_remote_provider=None, default_remote_prefix='', tibanna_config=False, assume_shared_fs=True, cluster_status=None, export_cwl=None, show_failed_logs=False, keep_incomplete=False, messaging=None)[source]¶ Run snakemake on a given snakefile.
This function provides access to the whole snakemake functionality. It is not thread-safe.
Parameters: - snakefile (str) – the path to the snakefile
- batch (Batch) – whether to compute only a partial DAG, defined by the given Batch object (default None)
- report (str) – create an HTML report for a previous run at the given path
- lint (str) – print lints instead of executing (None, “plain” or “json”, default None)
- listrules (bool) – list rules (default False)
- list_target_rules (bool) – list target rules (default False)
- cores (int) – the number of provided cores (ignored when using cluster support) (default 1)
- nodes (int) – the number of provided cluster nodes (ignored without cluster support) (default 1)
- local_cores (int) – the number of provided local cores if in cluster mode (ignored without cluster support) (default 1)
- resources (dict) – provided resources, a dictionary assigning integers to resource names, e.g. {gpu=1, io=5} (default {})
- default_resources (DefaultResources) – default values for resources not defined in rules (default None)
- config (dict) – override values for workflow config
- workdir (str) – path to working directory (default None)
- targets (list) – list of targets, e.g. rule or file names (default None)
- dryrun (bool) – only dry-run the workflow (default False)
- touch (bool) – only touch all output files if present (default False)
- forcetargets (bool) – force given targets to be re-created (default False)
- forceall (bool) – force all output files to be re-created (default False)
- forcerun (list) – list of files and rules that shall be re-created/re-executed (default [])
- prioritytargets (list) – list of targets that shall be run with maximum priority (default [])
- stats (str) – path to file that shall contain stats about the workflow execution (default None)
- printreason (bool) – print the reason for the execution of each job (default False)
- printshellcmds (bool) – print the shell command of each job (default False)
- printdag (bool) – print the dag in the graphviz dot language (default False)
- printrulegraph (bool) – print the graph of rules in the graphviz dot language (default False)
- printfilegraph (bool) – print the graph of rules with their input and output files in the graphviz dot language (default False)
- printd3dag (bool) – print a D3.js compatible JSON representation of the DAG (default False)
- nocolor (bool) – do not print colored output (default False)
- quiet (bool) – do not print any default job information (default False)
- keepgoing (bool) – keep going upon errors (default False)
- cluster (str) – submission command of a cluster or batch system to use, e.g. qsub (default None)
- cluster_config (str,list) – configuration file for cluster options, or list thereof (default None)
- cluster_sync (str) – blocking cluster submission command (like SGE ‘qsub -sync y’) (default None)
- drmaa (str) – if not None use DRMAA for cluster support, str specifies native args passed to the cluster when submitting a job
- drmaa_log_dir (str) – the path to stdout and stderr output of DRMAA jobs (default None)
- jobname (str) – naming scheme for cluster job scripts (default “snakejob.{rulename}.{jobid}.sh”)
- immediate_submit (bool) – immediately submit all cluster jobs, regardless of dependencies (default False)
- standalone (bool) – kill all processes very rudely in case of failure (do not use this if you use this API) (default False) (deprecated)
- ignore_ambiguity (bool) – ignore ambiguous rules and always take the first possible one (default False)
- snakemakepath (str) – deprecated parameter whose value is ignored. Do not use.
- lock (bool) – lock the working directory when executing the workflow (default True)
- unlock (bool) – just unlock the working directory (default False)
- cleanup_metadata (list) – just cleanup metadata of given list of output files (default None)
- conda_cleanup_envs (bool) – just cleanup unused conda environments (default False)
- cleanup_shadow (bool) – just cleanup old shadow directories (default False)
- cleanup_scripts (bool) – delete wrapper scripts used for execution (default True)
- force_incomplete (bool) – force the re-creation of incomplete files (default False)
- ignore_incomplete (bool) – ignore incomplete files (default False)
- list_version_changes (bool) – list output files with changed rule version (default False)
- list_code_changes (bool) – list output files with changed rule code (default False)
- list_input_changes (bool) – list output files with changed input files (default False)
- list_params_changes (bool) – list output files with changed params (default False)
- list_untracked (bool) – list files in the workdir that are not used in the workflow (default False)
- summary (bool) – list summary of all output files and their status (default False)
- archive (str) – archive workflow into the given tarball
- delete_all_output (bool) – remove all files generated by the workflow (default False)
- delete_temp_output (bool) – remove all temporary files generated by the workflow (default False)
- latency_wait (int) – how many seconds to wait for an output file to appear after the execution of a job, e.g. to handle filesystem latency (default 3)
- wait_for_files (list) – wait for given files to be present before executing the workflow
- list_resources (bool) – list resources used in the workflow (default False)
- summary – list summary of all output files and their status (default False). If no option is specified a basic summary will be output. If ‘detailed’ is added as an option, e.g. --summary detailed, extra info about the input and shell commands will be included
- detailed_summary (bool) – list summary of all input and output files and their status (default False)
- print_compilation (bool) – print the compilation of the snakefile (default False)
- debug (bool) – allow to use the debugger within rules
- notemp (bool) – ignore temp file flags, e.g. do not delete output files marked as temp after use (default False)
- keep_remote_local (bool) – keep local copies of remote files (default False)
- nodeps (bool) – ignore dependencies (default False)
- keep_target_files (bool) – do not adjust the paths of given target files relative to the working directory.
- allowed_rules (set) – restrict allowed rules to the given set. If None or empty, all rules are used.
- jobscript (str) – path to a custom shell script template for cluster jobs (default None)
- greediness (float) – set the greediness of scheduling. This value between 0 and 1 determines how careful jobs are selected for execution. The default value (0.5 if prioritytargets are used, 1.0 else) provides the best speed and still acceptable scheduling quality.
- overwrite_shellcmd (str) – a shell command that shall be executed instead of those given in the workflow. This is for debugging purposes only.
- updated_files (list) – a list that will be filled with the files that are updated or created during the workflow execution
- verbose (bool) – show additional debug output (default False)
- max_jobs_per_second (int) – maximal number of cluster/drmaa jobs per second, None to impose no limit (default None)
- restart_times (int) – number of times to restart failing jobs (default 0)
- attempt (int) – initial value of Job.attempt. This is intended for internal use only (default 1).
- force_use_threads – whether to force use of threads over processes. helpful if shared memory is full or unavailable (default False)
- use_conda (bool) – use conda environments for each job (defined with conda directive of rules)
- use_singularity (bool) – run jobs in singularity containers (if defined with singularity directive)
- use_env_modules (bool) – load environment modules if defined in rules
- singularity_args (str) – additional arguments to pass to singularity
- conda_prefix (str) – the directory in which conda environments will be created (default None)
- conda_cleanup_pkgs (snakemake.deployment.conda.CondaCleanupMode) – whether to clean up conda tarballs after env creation (default None), valid values: “tarballs”, “cache”
- singularity_prefix (str) – the directory to which singularity images will be pulled (default None)
- shadow_prefix (str) – prefix for shadow directories. The job-specific shadow directories will be created in $SHADOW_PREFIX/shadow/ (default None)
- conda_create_envs_only (bool) – if specified, only builds the conda environments specified for each job, then exits.
- list_conda_envs (bool) – list conda environments and their location on disk.
- mode (snakemake.common.Mode) – execution mode
- wrapper_prefix (str) – prefix for wrapper script URLs (default None)
- kubernetes (str) – submit jobs to kubernetes, using the given namespace.
- container_image (str) – Docker image to use, e.g., for kubernetes.
- default_remote_provider (str) – default remote provider to use instead of local files (e.g. S3, GS)
- default_remote_prefix (str) – prefix for default remote provider (e.g. name of the bucket).
- tibanna (str) – submit jobs to AWS cloud using Tibanna.
- tibanna_sfn (str) – Step function (Unicorn) name of Tibanna (e.g. tibanna_unicorn_monty). This must be deployed first using tibanna cli.
- precommand (str) – commands to run on AWS cloud before the snakemake command (e.g. wget, git clone, unzip, etc). Use with --tibanna.
- tibanna_config (list) – Additional Tibanna config e.g. --tibanna-config spot_instance=true subnet=<subnet_id> security_group=<security_group_id>
- assume_shared_fs (bool) – assume that cluster nodes share a common filesystem (default true).
- cluster_status (str) – status command for cluster execution. If None, Snakemake will rely on flag files. Otherwise, it expects the command to return “success”, “failure” or “running” when executing with a cluster jobid as single argument.
- export_cwl (str) – Compile workflow to CWL and save to given file
- log_handler (list) – redirect snakemake output to this list of custom log handlers, each a function that takes a log message dictionary (see below) as its only argument (default []). The log message dictionary for the log handler has the following entries:
- keep_incomplete (bool) – keep incomplete output files of failed jobs
level: the log level (“info”, “error”, “debug”, “progress”, “job_info”)
For level “info”, “error” or “debug”:
msg: the log message
For level “progress”:
done: number of already executed jobs
total: total number of jobs
For level “job_info”:
input: list of input files of a job
output: list of output files of a job
log: path to the log file of a job
local: whether the job is executed locally (i.e. ignoring cluster)
msg: the job message
reason: the job reason
priority: the job priority
threads: the threads of the job
Returns: True if workflow execution was successful.
Return type: bool
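A custom log handler consuming the log message dictionaries described above could look like the following sketch (the helper name `make_collecting_handler` is illustrative):

```python
def make_collecting_handler(store):
    """Sketch: build a handler for the log_handler parameter that
    records progress and message events into `store`."""
    def handler(msg):
        level = msg.get("level")
        if level == "progress":
            store.append(f"{msg['done']}/{msg['total']} jobs done")
        elif level in ("info", "error", "debug"):
            store.append(f"[{level}] {msg['msg']}")
    return handler
```

It would be passed as, e.g., `snakemake.snakemake("Snakefile", log_handler=[make_collecting_handler(messages)])`, with `messages` inspected after the run.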