snakemake package

Submodules

snakemake.benchmark module

snakemake.checkpoints module

class snakemake.checkpoints.Checkpoint(rule, checkpoints)[source]

Bases: object

checkpoints
get(**wildcards)[source]
rule
class snakemake.checkpoints.CheckpointJob(rule, output)[source]

Bases: object

output
rule
class snakemake.checkpoints.Checkpoints[source]

Bases: object

A namespace for checkpoints so that they can be accessed via dot notation.

register(rule)[source]
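
For example, in a Snakefile input function the checkpoint registered under a rule name can be queried via dot notation. A minimal sketch; the checkpoint name "clustering" and the file layout are hypothetical, and checkpoints, expand and glob_wildcards are assumed to be available in the Snakefile namespace:

    def aggregate_input(wildcards):
        # get() raises IncompleteCheckpointException until the checkpoint
        # "clustering" has finished for these wildcards; afterwards it
        # returns the corresponding CheckpointJob, whose output is usable.
        clustered_dir = checkpoints.clustering.get(**wildcards).output[0]
        return expand(
            "clustered/{i}.txt",
            i=glob_wildcards(clustered_dir + "/{i}.txt").i,
        )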

snakemake.common module

class snakemake.common.Mode[source]

Bases: object

Enum for execution mode of Snakemake. This handles the behavior of e.g. the logger.

cluster = 2
default = 0
subprocess = 1
class snakemake.common.TBDInt[source]

Bases: int

An integer that prints as <TBD>.

snakemake.common.async_run(coroutine)[source]
snakemake.common.bytesto(bytes, to, bsize=1024)[source]

Convert bytes to another unit. Bytes to MB: bytesto(bytes, 'm'); bytes to GB: bytesto(bytes, 'g'); etc. From https://gist.github.com/shawnbutts/3906915
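
A quick sketch of the conversion with the default bsize of 1024:

    >>> from snakemake.common import bytesto
    >>> bytesto(1048576, 'm')
    1.0
    >>> bytesto(1073741824, 'g')
    1.0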

snakemake.common.get_container_image()[source]
snakemake.common.get_file_hash(filename, algorithm='sha256')[source]

Find the SHA256 hash string of a file. We use this so that the user can choose to cache working directories in storage.

snakemake.common.get_last_stable_version()[source]
snakemake.common.get_uuid(name)[source]
snakemake.common.group_into_chunks(n, iterable)[source]

Group iterable into chunks of size at most n.

See https://stackoverflow.com/a/8998040.
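
A sketch, assuming the linked recipe yields tuples:

    >>> from snakemake.common import group_into_chunks
    >>> list(group_into_chunks(3, range(7)))
    [(0, 1, 2), (3, 4, 5), (6,)]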

class snakemake.common.lazy_property(method)[source]

Bases: property

cached
static clean(instance, method)[source]
method
snakemake.common.log_location(msg)[source]
snakemake.common.num_if_possible(s)[source]

Convert string to number if possible, otherwise return string.
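
For example:

    >>> from snakemake.common import num_if_possible
    >>> num_if_possible('42'), num_if_possible('3.5'), num_if_possible('abc')
    (42, 3.5, 'abc')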

snakemake.common.strip_prefix(text, prefix)[source]

snakemake.conda module

snakemake.cwl module

snakemake.cwl.cwl(path, basedir, input, output, params, wildcards, threads, resources, log, config, rulename, use_singularity, bench_record, jobid)[source]

Load cwl from the given basedir + path and execute it.

snakemake.cwl.dag_to_cwl(dag)[source]

Convert a given DAG to a CWL workflow, which is returned as JSON object.

snakemake.cwl.job_to_cwl(job, dag, outputs, inputs)[source]

Convert a job with its dependencies to a CWL workflow step.

snakemake.dag module

class snakemake.dag.Batch(rulename: str, idx: int, batches: int)[source]

Bases: object

Definition of a batch for calculating only a partial DAG.

get_batch(items: list)[source]

Return the defined batch of the given items. Items are usually input files.

is_final
class snakemake.dag.DAG(workflow, rules=None, dryrun=False, targetfiles=None, targetrules=None, forceall=False, forcerules=None, forcefiles=None, priorityfiles=None, priorityrules=None, untilfiles=None, untilrules=None, omitfiles=None, omitrules=None, ignore_ambiguity=False, force_incomplete=False, ignore_incomplete=False, notemp=False, keep_remote_local=False, batch=None)[source]

Bases: object

Directed acyclic graph of jobs.

archive(path)[source]

Archives workflow such that it can be re-run on a different system.

Archiving includes git versioned files (i.e. Snakefiles, config files, …), ancestral input files and conda environments.

bfs(direction, *jobs, stop=<function DAG.<lambda>>)[source]

Perform a breadth-first traversal of the DAG.

cache_job(job)[source]
check_and_touch_output(job, wait=3, ignore_missing_output=False, no_touch=False, force_stay_on_remote=False)[source]

Raise exception if output files of job are missing.

check_directory_outputs()[source]

Check that no output file is contained in a directory output of the same or another rule.

check_dynamic()[source]

Check dynamic output and update downstream rules if necessary.

check_incomplete()[source]

Check if any output files are incomplete. This is done by looking up markers in the persistence module.

check_periodic_wildcards(job)[source]

Raise an exception if a wildcard of the given job appears to be periodic, indicating a cyclic dependency.

checkpoint_jobs
clean(only_temp=False, dryrun=False)[source]

Removes files generated by the workflow.

cleanup()[source]
cleanup_workdir()[source]
close_remote_objects()[source]

Close all remote objects.

collect_potential_dependencies(job, known_producers)[source]

Collect all potential dependencies of a job. These might contain ambiguities. The keys of the returned dict represent the files to be considered.

create_conda_envs(dryrun=False, forceall=False, init_only=False, quiet=False)[source]
d3dag(max_jobs=10000)[source]
delete_job(job, recursive=True, add_dependencies=False)[source]

Delete given job from DAG.

delete_omitfrom_jobs()[source]

Removes jobs downstream of jobs specified via --omit-from.

dfs(direction, *jobs, stop=<function DAG.<lambda>>, post=True)[source]

Perform depth-first traversal of the DAG.

dot()[source]
downstream_of_omitfrom()[source]

Returns the jobs downstream of --omit-from rules or files, including those jobs themselves.

dynamic(job)[source]

Return whether a job is dynamic (i.e. it is only a placeholder for those that are created after the job with dynamic output has finished).

dynamic_output_jobs

Iterate over all jobs with dynamic output files.

file2jobs(targetfile)[source]
filegraph_dot(node2rule=<function DAG.<lambda>>, node2style=<function DAG.<lambda>>, node2label=<function DAG.<lambda>>)[source]
finish(job, update_dynamic=True)[source]

Finish a given job (e.g. remove from ready jobs, mark depending jobs as ready).

finished(job)[source]

Return whether a job is finished.

finished_jobs

Iterate over all jobs that have been finished.

get_jobs_or_groups()[source]
handle_log(job, upload_remote=True)[source]
handle_pipes()[source]

Use pipes to determine job groups. Check that every pipe has exactly one consumer.

handle_protected(job)[source]

Write-protect output files that are marked with protected().

handle_remote(job, upload=True)[source]

Remove local files if they are no longer needed and upload.

handle_temp(job)[source]

Remove temp files if they are no longer needed. Update temp_mtimes.

handle_touch(job)[source]

Touches those output files that are marked for touching.

in_omitfrom(job)[source]

Return whether given job has been specified via --omit-from.

in_until(job)[source]

Return whether given job has been specified via --until.

incomplete_external_jobid(job)[source]

Return the external jobid of the job if it is marked as incomplete.

Returns None, if job is not incomplete, or if no external jobid has been registered or if force_incomplete is True.

incomplete_files

Return list of incomplete files.

init(progress=False)[source]

Initialise the DAG.

is_batch_rule(rule)[source]

Return True if the underlying rule is to be used for batching the DAG.

is_edit_notebook_job(job)[source]
jobid(job)[source]

Return job id of given job.

jobs

All jobs in the DAG.

level_bfs(direction, *jobs, stop=<function DAG.<lambda>>)[source]

Perform a breadth-first traversal of the DAG, but also yield the level together with each job.

list_untracked()[source]

List files in the workdir that are not in the dag.

local_needrun_jobs

Iterate over all jobs that need to be run and are marked as local.

missing_temp(job)[source]

Return whether a temp file that is input of the given job is missing.

needrun(job)[source]

Return whether a given job needs to be executed.

needrun_jobs

Jobs that need to be executed.

new_job(rule, targetfile=None, format_wildcards=None)[source]

Create new job for given rule and (optional) targetfile. This will reuse existing jobs with the same wildcards.

new_wildcards(job)[source]

Return wildcards that are newly introduced in this job, compared to its ancestors.

newversion_files

Return list of files where the current version is newer than the recorded version.

noneedrun_finished(job)[source]

Return whether a given job is finished or was not required to run at all.

omitfrom_jobs()[source]

Returns a generator over the jobs specified via --omit-from.

postprocess(update_needrun=True)[source]

Postprocess the DAG. This has to be invoked after any change to the DAG topology.

priority(job)[source]

Return priority of given job.

pull_container_imgs(dryrun=False, forceall=False, quiet=False)[source]
ready_jobs

Jobs that are ready to execute.

reason(job)[source]

Return the reason of the job execution.

replace_job(job, newjob, recursive=True)[source]

Replace given job with new job.

requested_files(job)[source]

Return the files a job requests.

rule2job(targetrule)[source]

Generate a new job from a given rule.

rule_dot()[source]
rule_dot2()[source]
set_until_jobs()[source]

Removes jobs downstream of jobs specified via --until.

specialize_rule(rule, newrule)[source]

Specialize the given rule by inserting newrule into the DAG.

stats()[source]
summary(detailed=False)[source]
temp_input(job)[source]
temp_size(job)[source]

Return the total size of temporary input files of the job. If none, return 0.

unshadow_output(job, only_log=False)[source]

Move files from shadow directory to real output paths.

until_jobs()[source]

Returns a generator over the jobs specified via --until.

update(jobs, file=None, visited=None, known_producers=None, skip_until_dynamic=False, progress=False, create_inventory=False)[source]

Update the DAG by adding given jobs and their dependencies.

update_(job, visited=None, known_producers=None, skip_until_dynamic=False, progress=False, create_inventory=False)[source]

Update the DAG by adding the given job and its dependencies.

update_checkpoint_dependencies(jobs=None)[source]

Update dependencies of checkpoints.

update_checkpoint_outputs()[source]
update_dynamic(job)[source]

Update the DAG by evaluating the output of the given job that contains dynamic output files.

update_groups()[source]
update_jobids()[source]
update_needrun(create_inventory=False)[source]

Update the information whether a job needs to be executed.

update_output_index()[source]

Update the OutputIndex.

update_priority()[source]

Update job priorities.

update_ready(jobs=None)[source]

Update information whether a job is ready to execute.

Given jobs must be needrun jobs!

class snakemake.dag.PotentialDependency(file, jobs, known)

Bases: tuple

file

Alias for field number 0

jobs

Alias for field number 1

known

Alias for field number 2

snakemake.decorators module

snakemake.decorators.dec_all_methods(decorator, prefix='test_')[source]

snakemake.exceptions module

exception snakemake.exceptions.AmbiguousRuleException(filename, job_a, job_b, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.AzureFileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.CacheMissException[source]

Bases: Exception

exception snakemake.exceptions.CheckSumMismatchException(*args, lineno=None, snakefile=None, rule=None)[source]

Bases: snakemake.exceptions.WorkflowError

Should be raised to indicate that the checksum of a file does not match the known hash; typically relevant for large downloads, etc.

exception snakemake.exceptions.ChildIOException(parent=None, child=None, wildcards=None, lineno=None, snakefile=None, rule=None)[source]

Bases: snakemake.exceptions.WorkflowError

exception snakemake.exceptions.ClusterJobException(job_info, jobid)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.CreateCondaEnvironmentException(*args, lineno=None, snakefile=None, rule=None)[source]

Bases: snakemake.exceptions.WorkflowError

exception snakemake.exceptions.CreateRuleException(message=None, include=None, lineno=None, snakefile=None, rule=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.CyclicGraphException(repeatedrule, file, rule=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.DropboxFileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.FTPFileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.HTTPFileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.IOException(prefix, rule, files, include=None, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.IOFileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.ImproperOutputException(rule, files, include=None, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.IOException

exception snakemake.exceptions.ImproperShadowException(rule, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.IncompleteCheckpointException(rule, targetfile)[source]

Bases: Exception

exception snakemake.exceptions.IncompleteFilesException(files)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.InputFunctionException(msg, wildcards=None, lineno=None, snakefile=None, rule=None)[source]

Bases: snakemake.exceptions.WorkflowError

exception snakemake.exceptions.MissingInputException(rule, files, include=None, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.IOException

exception snakemake.exceptions.MissingOutputException(message=None, include=None, lineno=None, snakefile=None, rule=None, jobid='')[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.MissingRuleException(file, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.NCBIFileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.NoRulesException(lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.PeriodicWildcardError(message=None, include=None, lineno=None, snakefile=None, rule=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.ProtectedOutputException(rule, files, include=None, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.IOException

exception snakemake.exceptions.RemoteFileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.RuleException(message=None, include=None, lineno=None, snakefile=None, rule=None)[source]

Bases: Exception

Base class for exceptions occurring within the execution or definition of rules.

messages
exception snakemake.exceptions.S3FileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.SFTPFileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.SpawnedJobError[source]

Bases: Exception

exception snakemake.exceptions.TerminatedException[source]

Bases: Exception

exception snakemake.exceptions.UnexpectedOutputException(rule, files, include=None, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.IOException

exception snakemake.exceptions.UnknownRuleException(name, prefix='', lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.WebDAVFileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

exception snakemake.exceptions.WildcardError(*args, lineno=None, snakefile=None, rule=None)[source]

Bases: snakemake.exceptions.WorkflowError

exception snakemake.exceptions.WorkflowError(*args, lineno=None, snakefile=None, rule=None)[source]

Bases: Exception

static format_arg(arg)[source]
exception snakemake.exceptions.XRootDFileException(msg, lineno=None, snakefile=None)[source]

Bases: snakemake.exceptions.RuleException

snakemake.exceptions.cut_traceback(ex)[source]
snakemake.exceptions.format_error(ex, lineno, linemaps=None, snakefile=None, show_traceback=False)[source]
snakemake.exceptions.format_traceback(tb, linemaps)[source]
snakemake.exceptions.get_exception_origin(ex, linemaps)[source]
snakemake.exceptions.log_verbose_traceback(ex)[source]
snakemake.exceptions.print_exception(ex, linemaps)[source]

Print an error message for a given exception.

Arguments
ex – the exception
linemaps – a dict of dicts that maps, for each snakefile, the compiled lines to the source code lines in the snakefile.

snakemake.executors module

class snakemake.executors.AbstractExecutor(workflow, dag, printreason=False, quiet=False, printshellcmds=False, printthreads=True, latency_wait=3, keepincomplete=False, keepmetadata=True)[source]

Bases: object

cancel()[source]
get_default_remote_provider_args()[source]
get_default_resources_args()[source]
get_set_scatter_args()[source]
get_set_threads_args()[source]
handle_job_error(job)[source]
handle_job_success(job)[source]
print_job_error(job, msg=None, **kwargs)[source]
printjob(job)[source]
rule_prefix(job)[source]
run(job, callback=None, submit_callback=None, error_callback=None)[source]

Run a specific job or group job.

run_jobs(jobs, callback=None, submit_callback=None, error_callback=None)[source]

Run a list of jobs that is ready at a given point in time.

By default, this method just runs each job individually. It can be overridden to submit many jobs in a more efficient way than one by one. Note that in any case, the callback functions have to be called individually for each job!
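
A minimal sketch of a subclass honoring that contract while overriding run_jobs (the class name is hypothetical; this mirrors the documented default behavior):

    class BulkExecutor(AbstractExecutor):
        def run_jobs(self, jobs, callback=None, submit_callback=None,
                     error_callback=None):
            # Even if jobs were submitted in bulk here, the callbacks
            # must still be invoked once per job, as required above.
            for job in jobs:
                self.run(job, callback=callback,
                         submit_callback=submit_callback,
                         error_callback=error_callback)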

shutdown()[source]
class snakemake.executors.CPUExecutor(workflow, dag, workers, printreason=False, quiet=False, printshellcmds=False, use_threads=False, latency_wait=3, cores=1, keepincomplete=False, keepmetadata=True)[source]

Bases: snakemake.executors.RealExecutor

cached_or_run(job, run_func, *args)[source]

Either retrieve result from cache, or run job with given function.

cancel()[source]
handle_job_error(job)[source]
handle_job_success(job)[source]
job_args_and_prepare(job)[source]
run(job, callback=None, submit_callback=None, error_callback=None)[source]

Run a specific job or group job.

run_group_job(job)[source]

Run a pipe group job.

This lets all items run simultaneously.

run_single_job(job)[source]
shutdown()[source]
spawn_job(job)[source]
class snakemake.executors.ClusterExecutor(workflow, dag, cores, jobname='snakejob.{name}.{jobid}.sh', printreason=False, quiet=False, printshellcmds=False, latency_wait=3, cluster_config=None, local_input=None, restart_times=None, exec_job=None, assume_shared_fs=True, max_status_checks_per_second=1, disable_default_remote_provider_args=False, disable_get_default_resources_args=False, keepincomplete=False, keepmetadata=True)[source]

Bases: snakemake.executors.RealExecutor

Backend for distributed execution.

The key idea is that a job is converted into a script that invokes Snakemake again, in whatever environment is targeted. The script is submitted to some job management platform (e.g. a cluster scheduler like Slurm). This class can be specialized to generate more specific backends, also for the cloud.

cancel()[source]
cluster_params(job)[source]

Return wildcards object for job from cluster_config.

cluster_wildcards(job)[source]
default_jobscript = 'jobscript.sh'
format_job(pattern, job, **kwargs)[source]
get_jobscript(job)[source]
handle_job_error(job)[source]
handle_job_success(job)[source]
print_cluster_job_error(job_info, jobid)[source]
shutdown()[source]
tmpdir
write_jobscript(job, jobscript, **kwargs)[source]
class snakemake.executors.DRMAAClusterJob(job, jobid, callback, error_callback, jobscript)

Bases: tuple

callback

Alias for field number 2

error_callback

Alias for field number 3

job

Alias for field number 0

jobid

Alias for field number 1

jobscript

Alias for field number 4

class snakemake.executors.DRMAAExecutor(workflow, dag, cores, jobname='snakejob.{rulename}.{jobid}.sh', printreason=False, quiet=False, printshellcmds=False, drmaa_args='', drmaa_log_dir=None, latency_wait=3, cluster_config=None, restart_times=0, assume_shared_fs=True, max_status_checks_per_second=1, keepincomplete=False, keepmetadata=True)[source]

Bases: snakemake.executors.ClusterExecutor

cancel()[source]
run(job, callback=None, submit_callback=None, error_callback=None)[source]

Run a specific job or group job.

shutdown()[source]
class snakemake.executors.DryrunExecutor(workflow, dag, printreason=False, quiet=False, printshellcmds=False, printthreads=True, latency_wait=3, keepincomplete=False, keepmetadata=True)[source]

Bases: snakemake.executors.AbstractExecutor

printcache(job)[source]
printjob(job)[source]
class snakemake.executors.GenericClusterExecutor(workflow, dag, cores, submitcmd='qsub', statuscmd=None, cluster_config=None, jobname='snakejob.{rulename}.{jobid}.sh', printreason=False, quiet=False, printshellcmds=False, latency_wait=3, restart_times=0, assume_shared_fs=True, max_status_checks_per_second=1, keepincomplete=False, keepmetadata=True)[source]

Bases: snakemake.executors.ClusterExecutor

cancel()[source]
register_job(job)[source]
run(job, callback=None, submit_callback=None, error_callback=None)[source]

Run a specific job or group job.

class snakemake.executors.GenericClusterJob(job, jobid, callback, error_callback, jobscript, jobfinished, jobfailed)

Bases: tuple

callback

Alias for field number 2

error_callback

Alias for field number 3

job

Alias for field number 0

jobfailed

Alias for field number 6

jobfinished

Alias for field number 5

jobid

Alias for field number 1

jobscript

Alias for field number 4

class snakemake.executors.KubernetesExecutor(workflow, dag, namespace, container_image=None, jobname='{rulename}.{jobid}', printreason=False, quiet=False, printshellcmds=False, latency_wait=3, cluster_config=None, local_input=None, restart_times=None, keepincomplete=False, keepmetadata=True)[source]

Bases: snakemake.executors.ClusterExecutor

cancel()[source]
register_secret()[source]
run(job, callback=None, submit_callback=None, error_callback=None)[source]

Run a specific job or group job.

safe_delete_pod(jobid, ignore_not_found=True)[source]
shutdown()[source]
unregister_secret()[source]
class snakemake.executors.KubernetesJob(job, jobid, callback, error_callback, kubejob, jobscript)

Bases: tuple

callback

Alias for field number 2

error_callback

Alias for field number 3

job

Alias for field number 0

jobid

Alias for field number 1

jobscript

Alias for field number 5

kubejob

Alias for field number 4

class snakemake.executors.RealExecutor(workflow, dag, printreason=False, quiet=False, printshellcmds=False, latency_wait=3, assume_shared_fs=True, keepincomplete=False, keepmetadata=False)[source]

Bases: snakemake.executors.AbstractExecutor

format_job_pattern(pattern, job=None, **kwargs)[source]
get_additional_args()[source]

Return a string to add to self.exec_job that includes additional arguments from the command line. This is currently used in ClusterExecutor and CPUExecutor, as both were using the same code; both inherit from RealExecutor.

handle_job_error(job, upload_remote=True)[source]
handle_job_success(job, upload_remote=True, handle_log=True, handle_touch=True, ignore_missing_output=False)[source]
register_job(job)[source]
class snakemake.executors.SynchronousClusterExecutor(workflow, dag, cores, submitcmd='qsub', cluster_config=None, jobname='snakejob.{rulename}.{jobid}.sh', printreason=False, quiet=False, printshellcmds=False, latency_wait=3, restart_times=0, assume_shared_fs=True, keepincomplete=False, keepmetadata=True)[source]

Bases: snakemake.executors.ClusterExecutor

Executor for synchronous cluster submission commands such as "qsub -sync y" (SGE) or "bsub -K" (LSF), which block the foreground thread and return the remote exit code once the remote job exits.

cancel()[source]
run(job, callback=None, submit_callback=None, error_callback=None)[source]

Run a specific job or group job.

class snakemake.executors.SynchronousClusterJob(job, jobid, callback, error_callback, jobscript, process)

Bases: tuple

callback

Alias for field number 2

error_callback

Alias for field number 3

job

Alias for field number 0

jobid

Alias for field number 1

jobscript

Alias for field number 4

process

Alias for field number 5

class snakemake.executors.TibannaExecutor(workflow, dag, cores, tibanna_sfn, precommand='', tibanna_config=False, container_image=None, printreason=False, quiet=False, printshellcmds=False, latency_wait=3, local_input=None, restart_times=None, max_status_checks_per_second=1, keepincomplete=False, keepmetadata=True)[source]

Bases: snakemake.executors.ClusterExecutor

add_command(job, tibanna_args, tibanna_config)[source]
add_workflow_files(job, tibanna_args)[source]
adjust_filepath(f)[source]
cancel()[source]
handle_remote(target)[source]
make_tibanna_input(job)[source]
remove_prefix(s)[source]
run(job, callback=None, submit_callback=None, error_callback=None)[source]

Run a specific job or group job.

shutdown()[source]
split_filename(filename, checkdir=None)[source]
class snakemake.executors.TibannaJob(job, jobname, jobid, exec_arn, callback, error_callback)

Bases: tuple

callback

Alias for field number 4

error_callback

Alias for field number 5

exec_arn

Alias for field number 3

job

Alias for field number 0

jobid

Alias for field number 2

jobname

Alias for field number 1

class snakemake.executors.TouchExecutor(workflow, dag, printreason=False, quiet=False, printshellcmds=False, latency_wait=3, assume_shared_fs=True, keepincomplete=False, keepmetadata=False)[source]

Bases: snakemake.executors.RealExecutor

handle_job_success(job)[source]
run(job, callback=None, submit_callback=None, error_callback=None)[source]

Run a specific job or group job.

snakemake.executors.change_working_directory(directory=None)[source]

Change working directory in execution context if provided.

snakemake.executors.run_wrapper(job_rule, input, output, params, wildcards, threads, resources, log, benchmark, benchmark_repeats, conda_env, container_img, singularity_args, env_modules, use_singularity, linemaps, debug, cleanup_scripts, shadow_dir, jobid, edit_notebook)[source]

Wrapper around the run method that handles exceptions and benchmarking.

Arguments
job_rule – the job.rule member
input – list of input files
output – list of output files
wildcards – so far processed wildcards
threads – usable threads
log – list of log files
shadow_dir – optional shadow directory root

snakemake.executors.sleep()[source]

snakemake.gui module

snakemake.io module

class snakemake.io.AnnotatedString(value)[source]

Bases: str

class snakemake.io.ExistsDict(cache)[source]

Bases: dict

class snakemake.io.IOCache(max_wait_time)[source]

Bases: object

clear()[source]
collect_mtime(path)[source]
deactivate()[source]
mtime_inventory(jobs)[source]
snakemake.io.IOFile(file, rule=None)[source]
class snakemake.io.InputFiles(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]

Bases: snakemake.io.Namedlist

size
size_mb
class snakemake.io.Log(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]

Bases: snakemake.io.Namedlist

class snakemake.io.Mtime(local=None, local_target=None, remote=None)[source]

Bases: object

local(follow_symlinks=False)[source]
local_or_remote(follow_symlinks=False)[source]
remote()[source]
class snakemake.io.Namedlist(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]

Bases: list

A list that additionally provides functions to name items. Further, it is hashable; however, the hash does not consider the item names.

get(key, default_value=None)[source]
items()[source]
keys()[source]
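
A minimal sketch of named access, assuming fromdict wires up item names the same way rule input/output objects do:

    >>> from snakemake.io import Namedlist
    >>> files = Namedlist(fromdict={'reads': 'a.fastq', 'ref': 'genome.fa'})
    >>> files.reads                   # named access
    'a.fastq'
    >>> files[0]                      # still a plain list underneath
    'a.fastq'
    >>> files.get('missing') is None  # dict-like get with default
    True
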
class snakemake.io.OutputFiles(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]

Bases: snakemake.io.Namedlist

class snakemake.io.Params(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]

Bases: snakemake.io.Namedlist

class snakemake.io.PeriodicityDetector(min_repeat=20, max_repeat=100)[source]

Bases: object

is_periodic(value)[source]

Returns the periodic substring or None if not periodic.
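
A sketch with the defaults (min_repeat=20), so the unit must repeat at least that often; the exact substring reported depends on the underlying regex:

    >>> from snakemake.io import PeriodicityDetector
    >>> PeriodicityDetector().is_periodic('ab' * 30)  # e.g. 'ab'
    'ab'
    >>> PeriodicityDetector().is_periodic('abcdef') is None
    True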

class snakemake.io.ReportObject(caption, category, subcategory, patterns, htmlindex)

Bases: tuple

caption

Alias for field number 0

category

Alias for field number 1

htmlindex

Alias for field number 4

patterns

Alias for field number 3

subcategory

Alias for field number 2

class snakemake.io.Resources(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]

Bases: snakemake.io.Namedlist

class snakemake.io.Wildcards(toclone=None, fromdict=None, plainstr=False, strip_constraints=False, custom_map=None)[source]

Bases: snakemake.io.Namedlist

snakemake.io.ancient(value)[source]

A flag for an input file that shall be considered ancient; i.e. its timestamp shall have no effect on which jobs to run.

snakemake.io.apply_wildcards(pattern, wildcards, fill_missing=False, fail_dynamic=False, dynamic_fill=None, keep_dynamic=False)[source]
snakemake.io.checkpoint_target(value)[source]
snakemake.io.contains_wildcard(path)[source]
snakemake.io.contains_wildcard_constraints(pattern)[source]
snakemake.io.directory(value)[source]

A flag to specify that an output is a directory, rather than a file or named pipe.

snakemake.io.dynamic(value)[source]

A flag for a file that shall be dynamic, i.e. the multiplicity (and wildcard values) will be expanded after a certain rule has been run.

snakemake.io.expand(*args, **wildcards)[source]

Expand wildcards in given filepatterns.

Arguments
*args – first arg: filepatterns as list or one single filepattern; second arg (optional): a function to combine wildcard values (itertools.product per default)
**wildcards – the wildcards as keyword arguments with their values as lists. If allow_missing=True is included, wildcards in filepatterns without values will stay unformatted.
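
For example:

    >>> from snakemake.io import expand
    >>> expand('{sample}.{ext}', sample=['a', 'b'], ext=['txt', 'csv'])
    ['a.txt', 'a.csv', 'b.txt', 'b.csv']
    >>> # zip instead of itertools.product as the combinator
    >>> expand('{sample}.{ext}', zip, sample=['a', 'b'], ext=['txt', 'csv'])
    ['a.txt', 'b.csv']
    >>> expand('{sample}.{ext}', ext=['txt'], allow_missing=True)
    ['{sample}.txt']
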
snakemake.io.flag(value, flag_type, flag_value=True)[source]
snakemake.io.get_flag_value(value, flag_type)[source]
snakemake.io.get_git_root(path)[source]
Parameters: path – (str) path to a directory/file that is located inside the repo
Returns: path to the root folder of the git repo
snakemake.io.get_git_root_parent_directory(path, input_path)[source]

This function will recursively go through parent directories until a git repository is found, or until no parent directories are left, in which case an error will be raised. This is needed when providing a path to a file/folder that is located on a branch/tag not currently checked out.

Parameters:
  • path – (str) path to a directory that is located inside the repo
  • input_path – (str) origin path, used when raising a WorkflowError
Returns: path to the root folder of the git repo

snakemake.io.get_wildcard_names(pattern)[source]
snakemake.io.git_content(git_file)[source]

This function will extract a file from a git repository located on the filesystem. The expected format is git+file:///path/to/your/repo/path_to_file@version

Parameters: env_file (str) – consists of the path to the repo, @, version, and file information. Ex: git+file:////home/smeds/snakemake-wrappers/bio/fastqc/wrapper.py@0.19.3
Returns: file content, or None if the expected format isn't met
snakemake.io.glob_wildcards(pattern, files=None, followlinks=False)[source]

Glob the values of the wildcards by matching the given pattern to the filesystem. Returns a named tuple with a list of values for each wildcard.
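
For example, given files a.fastq and b.fastq in the working directory (order follows the filesystem listing):

    >>> from snakemake.io import glob_wildcards
    >>> glob_wildcards('{sample}.fastq').sample
    ['a', 'b']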

snakemake.io.is_callable(value)[source]
snakemake.io.is_flagged(value, flag)[source]
snakemake.io.lchmod(f, mode)[source]
snakemake.io.limit(pattern, **wildcards)[source]

Limit wildcards to the given values.

Arguments
**wildcards – the wildcards as keyword arguments with their values as lists
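
A sketch, assuming limit encodes the allowed values as a wildcard constraint in the returned pattern:

    >>> from snakemake.io import limit
    >>> limit('{sample}.txt', sample=['a', 'b'])
    '{sample,a|b}.txt'
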
snakemake.io.load_configfile(configpath)[source]

Loads a JSON or YAML configfile as a dict, then checks that it’s a dict.

snakemake.io.local(value)[source]

Mark a file as a local file. This disables application of a default remote provider.

snakemake.io.lutime(f, times)[source]
snakemake.io.multiext(prefix, *extensions)[source]

Expand a given prefix with multiple extensions (e.g. .txt, .csv, _peaks.bed, …).
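
For example:

    >>> from snakemake.io import multiext
    >>> multiext('results/plot', '.pdf', '.png')
    ['results/plot.pdf', 'results/plot.png']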

snakemake.io.not_iterable(value)[source]
snakemake.io.pipe(value)[source]
snakemake.io.protected(value)[source]

A flag for a file that shall be write protected after creation.

snakemake.io.regex(filepattern)[source]
snakemake.io.remove(file, remove_non_empty_dir=False)[source]
snakemake.io.repeat(value, n_repeat)[source]

Flag benchmark records with the number of repeats.

snakemake.io.report(value, caption=None, category=None, subcategory=None, patterns=[], htmlindex=None)[source]

Flag output file or directory as to be included into reports.

In case of directory, files to include can be specified via a glob pattern (default: *).

Arguments
value – file or directory.
caption – path to a .rst file with a textual description of the result.
category – name of the category in which the result should be displayed in the report.
patterns – wildcard pattern(s) for selecting files if a directory is given (this is used as input for snakemake.io.glob_wildcards). The pattern shall not include the path to the directory itself.
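
A sketch of flagging an output for the report (the paths here are hypothetical):

    from snakemake.io import report

    out = report(
        'results/plot.svg',
        caption='report/plot.rst',  # .rst text describing the result
        category='Plots',
    )
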
snakemake.io.split_git_path(path)[source]
snakemake.io.strip_wildcard_constraints(pattern)[source]

Return a string that does not contain any wildcard constraints.

snakemake.io.temp(value)[source]

A flag for an input or output file that shall be removed after usage.
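
A sketch of how the flags in this module annotate paths (all paths hypothetical); each flag merely marks the value for later inspection by the DAG machinery:

    from snakemake.io import ancient, directory, protected, temp

    bam = temp('mapped/sample1.bam')      # removed once no consumer needs it
    vcf = protected('results/final.vcf')  # write-protected after creation
    plots = directory('results/plots')    # output is a directory, not a file
    ref = ancient('genome.fa')            # mtime ignored for rerun decisions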

snakemake.io.temporary(value)[source]

An alias for temp.

snakemake.io.touch(value)[source]
snakemake.io.unpack(value)[source]
snakemake.io.update_wildcard_constraints(pattern, wildcard_constraints, global_wildcard_constraints)[source]

Update wildcard constraints.

Parameters:
  • pattern (str) – pattern on which to update constraints
  • wildcard_constraints (dict) – dictionary of wildcard:constraint key-value pairs
  • global_wildcard_constraints (dict) – dictionary of wildcard:constraint key-value pairs
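
A sketch, assuming constraints are embedded with the {name,regex} syntax and rule-level constraints take precedence over global ones:

    >>> from snakemake.io import update_wildcard_constraints
    >>> update_wildcard_constraints(
    ...     '{sample}.txt',
    ...     {'sample': '[A-Za-z0-9]+'},  # per-rule constraints
    ...     {},                          # global constraints
    ... )
    '{sample,[A-Za-z0-9]+}.txt'
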
snakemake.io.wait_for_files(files, latency_wait=3, force_stay_on_remote=False, ignore_pipe=False)[source]

Wait for given files to be present in filesystem.

snakemake.jobs module

class snakemake.jobs.AbstractJob[source]

Bases: object

download_remote_input()[source]
is_group()[source]
log_error(msg=None, **kwargs)[source]
log_info(skip_dynamic=False)[source]
properties(omit_resources=['_cores', '_nodes'], **aux_properties)[source]
remove_existing_output()[source]
class snakemake.jobs.GroupJob(id, jobs)[source]

Bases: snakemake.jobs.AbstractJob

all_products
attempt
check_protected_output()[source]
cleanup()[source]
dag
download_remote_input()[source]
dynamic_input
expanded_output

Yields the entire expanded output of all jobs.

finalize()[source]
format_wildcards(string, **variables)[source]

Format a string with variables from the job.

get_targets()[source]
get_wait_for_files()[source]
groupid
input
inputsize
is_branched
is_checkpoint
is_group()[source]
is_local
is_updated
jobid
jobs
log
log_error(msg=None, **kwargs)[source]
log_info(skip_dynamic=False)[source]
merge(other)[source]
name
needs_singularity
obj_cache = {}
output
postprocess(error=False, **kwargs)[source]
priority
products
properties(omit_resources=['_cores', '_nodes'], **aux_properties)[source]
register()[source]
remove_existing_output()[source]
resources
restart_times
rules
threads
toposorted
class snakemake.jobs.GroupJobFactory[source]

Bases: object

new(id, jobs)[source]
class snakemake.jobs.Job(rule, dag, wildcards_dict=None, format_wildcards=None, targetfile=None)[source]

Bases: snakemake.jobs.AbstractJob

HIGHEST_PRIORITY = 9223372036854775807
archive_conda_env()[source]

Archive a conda environment into a custom local channel.

attempt
b64id
benchmark
benchmark_repeats
check_protected_output()[source]
cleanup()[source]

Cleanup output files.

close_remote()[source]
conda_env
conda_env_file
conda_env_path
container_img
container_img_path
container_img_url
dag
dependencies
download_remote_input()[source]
dynamic_input
dynamic_output
dynamic_wildcards

Return all wildcard values determined from dynamic output.

env_modules
existing_output
existing_remote_input
existing_remote_output
expand_dynamic(pattern)[source]

Expand dynamic files.

expanded_output

Iterate over output files while dynamic output is expanded.

files_to_download
files_to_upload
format_wildcards(string, **variables)[source]

Format a string with variables from the job.

get_targets()[source]
get_wait_for_files()[source]
group
input
inputsize

Return the size of the input files. Input files need to be present.

is_branched
is_checkpoint
is_cwl
is_group()[source]
is_local
is_norun
is_notebook
is_pipe
is_run
is_script
is_shadow
is_shell
is_valid()[source]

Check if job is valid

is_wrapper
jobid
local_input
local_output
log
log_error(msg=None, indent=False, **kwargs)[source]
log_info(skip_dynamic=False, indent=False, printshellcmd=True)[source]
message

Return the message for this job.

missing_input

Return missing input files.

missing_output(requested)[source]
missing_remote_input
missing_remote_output
name
needs_singularity
obj_cache = {}
output
output_mintime

Return oldest output file.

outputs_older_than_script_or_notebook()[source]

Return output files that are older than the script or notebook, i.e. the script has changed.

params
postprocess(upload_remote=True, handle_log=True, handle_touch=True, handle_temp=True, error=False, ignore_missing_output=False, assume_shared_fs=True, latency_wait=None, keep_metadata=True)[source]
prepare()[source]

Prepare execution of job. This includes creation of directories and deletion of previously created dynamic files. Creates a shadow directory for the job if specified.

priority
products
properties(omit_resources=['_cores', '_nodes'], **aux_properties)[source]
protected_output
register()[source]
remote_input
remote_input_newer_than_local
remote_input_older_than_local
remote_output
remote_output_newer_than_local
remote_output_older_than_local
remove_existing_output()[source]

Clean up both dynamic and regular output before rules actually run.

resources
restart_times
rule
rules
shadow_dir
shadowed_path(f)[source]

Get the shadowed path of IOFile f.

shellcmd

Return the shell command.

subworkflow_input
targetfile
temp_output
threads
touch_output
unique_input
updated()[source]
wildcards
wildcards_dict
class snakemake.jobs.JobFactory[source]

Bases: object

new(rule, dag, wildcards_dict=None, format_wildcards=None, targetfile=None, update=False)[source]
class snakemake.jobs.Reason[source]

Bases: object

derived
finished
forced
incomplete_output
mark_finished()[source]

Called if the job has been run.

missing_output
noio
nooutput
pipe
target
updated_input
updated_input_run
snakemake.jobs.format_files(job, io, dynamicio)[source]
snakemake.jobs.jobfiles(jobs, type)[source]

snakemake.logging module

class snakemake.logging.ColorizingStreamHandler(nocolor=False, stream=<_io.TextIOWrapper name='<stderr>' mode='w' encoding='UTF-8'>, use_threads=False, mode=0)[source]

Bases: logging.StreamHandler

BLACK = 0
BLUE = 4
BOLD_SEQ = '\x1b[1m'
COLOR_SEQ = '\x1b[%dm'
CYAN = 6
GREEN = 2
MAGENTA = 5
RED = 1
RESET_SEQ = '\x1b[0m'
WHITE = 7
YELLOW = 3
can_color_tty(mode)[source]
colors = {'CRITICAL': 1, 'DEBUG': 4, 'ERROR': 1, 'INFO': 2, 'WARNING': 3}
decorate(record)[source]
emit(record)[source]

Emit a record.

If a formatter is specified, it is used to format the record. The record is then written to the stream with a trailing newline. If exception information is present, it is formatted using traceback.print_exception and appended to the stream. If the stream has an ‘encoding’ attribute, it is used to determine how to do the output to the stream.

is_tty
class snakemake.logging.Logger[source]

Bases: object

cleanup()[source]
custom_server_handler(msg)[source]

Custom server log handler.

Sends the log to the server.

Parameters: msg (dict) – the log message dictionary
d3dag(**msg)[source]
dag_debug(msg)[source]
debug(msg)[source]
error(msg)[source]
get_logfile()[source]
group_error(**msg)[source]
group_info(**msg)[source]
handler(msg)[source]
info(msg, indent=False)[source]
job_error(**msg)[source]
job_finished(**msg)[source]
job_info(**msg)[source]
location(msg)[source]
logfile_hint()[source]
progress(done=None, total=None)[source]
remove_logfile()[source]
resources_info(msg)[source]
rule_info(**msg)[source]
run_info(msg)[source]
set_level(level)[source]
set_stream_handler(stream_handler)[source]
setup_logfile()[source]
shellcmd(msg, indent=False)[source]
text_handler(msg)[source]

The default snakemake log handler.

Prints the output to the console.

Parameters: msg (dict) – the log message dictionary
warning(msg)[source]
class snakemake.logging.SlackLogger[source]

Bases: object

log_handler(msg)[source]
snakemake.logging.format_dict(dict_like, omit_keys=[], omit_values=[])[source]
snakemake.logging.format_resource_names(resources, omit_resources=['_cores', '_nodes'])[source]
snakemake.logging.format_resources(dict_like, *, omit_keys={'_cores', '_nodes'}, omit_values=[])
snakemake.logging.format_wildcards(dict_like, omit_keys=[], *, omit_values={'__snakemake_dynamic__'})
snakemake.logging.setup_logger(handler=[], quiet=False, printshellcmds=False, printreason=False, debug_dag=False, nocolor=False, stdout=False, debug=False, use_threads=False, mode=0, show_failed_logs=False, wms_monitor=None)[source]

snakemake.output_index module

class snakemake.output_index.OutputIndex(rules)[source]

Bases: object

match(targetfile)[source]

snakemake.parser module

class snakemake.parser.AbstractCmd(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.Run

args()[source]
block_content(token)[source]
decorate_end(token)[source]
end()[source]
end_func = None
is_block_end(token)[source]
overwrite_block_content(token)[source]
overwrite_cmd = None
start()[source]
start_func = None
class snakemake.parser.Benchmark(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.CWL(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.Script

args()[source]
end_func = 'cwl'
start_func = '@workflow.cwl'
class snakemake.parser.Cache(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

keyword
class snakemake.parser.Checkpoint(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.Rule

start()[source]
class snakemake.parser.Conda(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.Configfile(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

class snakemake.parser.Container(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.DecoratorKeywordState(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.KeywordState

args = []
decorator = None
end()[source]
start()[source]
class snakemake.parser.EnvModules(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.Envvars(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

keyword
class snakemake.parser.GlobalContainer(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

keyword
class snakemake.parser.GlobalKeywordState(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.KeywordState

start()[source]
class snakemake.parser.GlobalSingularity(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

keyword
class snakemake.parser.GlobalWildcardConstraints(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

keyword
class snakemake.parser.Group(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.Include(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

class snakemake.parser.Input(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.KeywordState(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.TokenAutomaton

block(token)[source]
block_content(token)[source]
colon(token)[source]
decorate_end(token)[source]
end()[source]
is_block_end(token)[source]
keyword
prefix = ''
yield_indent(token)[source]
class snakemake.parser.Localrules(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

block_content(token)[source]
class snakemake.parser.Log(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.Message(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.Notebook(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.Script

args()[source]
end_func = 'notebook'
start_func = '@workflow.notebook'
class snakemake.parser.OnError(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.DecoratorKeywordState

args = ['log']
decorator = 'onerror'
class snakemake.parser.OnStart(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.DecoratorKeywordState

args = ['log']
decorator = 'onstart'
class snakemake.parser.OnSuccess(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.DecoratorKeywordState

args = ['log']
decorator = 'onsuccess'
class snakemake.parser.Output(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.Params(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.Pepfile(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

class snakemake.parser.Pepschema(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

class snakemake.parser.Priority(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.Python(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.TokenAutomaton

python(token)[source]
subautomata = {'checkpoint': <class 'snakemake.parser.Checkpoint'>, 'configfile': <class 'snakemake.parser.Configfile'>, 'container': <class 'snakemake.parser.GlobalContainer'>, 'envvars': <class 'snakemake.parser.Envvars'>, 'include': <class 'snakemake.parser.Include'>, 'localrules': <class 'snakemake.parser.Localrules'>, 'onerror': <class 'snakemake.parser.OnError'>, 'onstart': <class 'snakemake.parser.OnStart'>, 'onsuccess': <class 'snakemake.parser.OnSuccess'>, 'pepfile': <class 'snakemake.parser.Pepfile'>, 'pepschema': <class 'snakemake.parser.Pepschema'>, 'report': <class 'snakemake.parser.Report'>, 'rule': <class 'snakemake.parser.Rule'>, 'ruleorder': <class 'snakemake.parser.Ruleorder'>, 'scattergather': <class 'snakemake.parser.Scattergather'>, 'singularity': <class 'snakemake.parser.GlobalSingularity'>, 'subworkflow': <class 'snakemake.parser.Subworkflow'>, 'wildcard_constraints': <class 'snakemake.parser.GlobalWildcardConstraints'>, 'workdir': <class 'snakemake.parser.Workdir'>}
class snakemake.parser.Report(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

class snakemake.parser.Resources(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.Rule(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

block_content(token)[source]
dedent
end()[source]
name(token)[source]
start(aux='')[source]
subautomata = {'benchmark': <class 'snakemake.parser.Benchmark'>, 'cache': <class 'snakemake.parser.Cache'>, 'conda': <class 'snakemake.parser.Conda'>, 'container': <class 'snakemake.parser.Container'>, 'cwl': <class 'snakemake.parser.CWL'>, 'envmodules': <class 'snakemake.parser.EnvModules'>, 'group': <class 'snakemake.parser.Group'>, 'input': <class 'snakemake.parser.Input'>, 'log': <class 'snakemake.parser.Log'>, 'message': <class 'snakemake.parser.Message'>, 'notebook': <class 'snakemake.parser.Notebook'>, 'output': <class 'snakemake.parser.Output'>, 'params': <class 'snakemake.parser.Params'>, 'priority': <class 'snakemake.parser.Priority'>, 'resources': <class 'snakemake.parser.Resources'>, 'run': <class 'snakemake.parser.Run'>, 'script': <class 'snakemake.parser.Script'>, 'shadow': <class 'snakemake.parser.Shadow'>, 'shell': <class 'snakemake.parser.Shell'>, 'singularity': <class 'snakemake.parser.Singularity'>, 'threads': <class 'snakemake.parser.Threads'>, 'version': <class 'snakemake.parser.Version'>, 'wildcard_constraints': <class 'snakemake.parser.WildcardConstraints'>, 'wrapper': <class 'snakemake.parser.Wrapper'>}
class snakemake.parser.RuleKeywordState(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.KeywordState

start()[source]
class snakemake.parser.Ruleorder(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

block_content(token)[source]
class snakemake.parser.Run(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.RuleKeywordState

block_content(token)[source]
end()[source]
is_block_end(token)[source]
start()[source]
class snakemake.parser.Scattergather(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

class snakemake.parser.Script(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.AbstractCmd

args()[source]
end_func = 'script'
start_func = '@workflow.script'
class snakemake.parser.SectionKeywordState(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.KeywordState

end()[source]
start()[source]
class snakemake.parser.Shadow(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.Shell(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.AbstractCmd

args()[source]
end_func = 'shell'
start_func = '@workflow.shellcmd'
class snakemake.parser.Singularity(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

keyword
class snakemake.parser.Snakefile(path, rulecount=0)[source]

Bases: object

exception snakemake.parser.StopAutomaton(token)[source]

Bases: Exception

class snakemake.parser.Subworkflow(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

block_content(token)[source]
end()[source]
name(token)[source]
subautomata = {'configfile': <class 'snakemake.parser.SubworkflowConfigfile'>, 'snakefile': <class 'snakemake.parser.SubworkflowSnakefile'>, 'workdir': <class 'snakemake.parser.SubworkflowWorkdir'>}
class snakemake.parser.SubworkflowConfigfile(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.SubworkflowKeywordState

class snakemake.parser.SubworkflowKeywordState(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.SectionKeywordState

prefix = 'Subworkflow'
class snakemake.parser.SubworkflowSnakefile(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.SubworkflowKeywordState

class snakemake.parser.SubworkflowWorkdir(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.SubworkflowKeywordState

class snakemake.parser.Threads(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.TokenAutomaton(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: object

consume()[source]
dedent
effective_indent
error(msg, token)[source]
indentation(token)[source]
subautomata = {}
subautomaton(automaton, *args, **kwargs)[source]
class snakemake.parser.Version(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

class snakemake.parser.WildcardConstraints(snakefile, base_indent=0, dedent=0, root=True, rulename=None)[source]

Bases: snakemake.parser.RuleKeywordState

keyword
class snakemake.parser.Workdir(snakefile, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.GlobalKeywordState

class snakemake.parser.Wrapper(snakefile, rulename, base_indent=0, dedent=0, root=True)[source]

Bases: snakemake.parser.Script

args()[source]
end_func = 'wrapper'
start_func = '@workflow.wrapper'
snakemake.parser.format_tokens(tokens)[source]
snakemake.parser.is_colon(token)[source]
snakemake.parser.is_comma(token)[source]
snakemake.parser.is_comment(token)[source]
snakemake.parser.is_dedent(token)[source]
snakemake.parser.is_eof(token)[source]
snakemake.parser.is_greater(token)[source]
snakemake.parser.is_indent(token)[source]
snakemake.parser.is_name(token)[source]
snakemake.parser.is_newline(token, newline_tokens={4, 58})[source]
snakemake.parser.is_op(token)[source]
snakemake.parser.is_string(token)[source]
snakemake.parser.lineno(token)[source]
snakemake.parser.parse(path, overwrite_shellcmd=None, rulecount=0)[source]

snakemake.persistence module

class snakemake.persistence.Persistence(nolock=False, dag=None, conda_prefix=None, singularity_prefix=None, shadow_prefix=None, warn_only=False)[source]

Bases: object

all_inputfiles()[source]
all_outputfiles()[source]
cleanup(job)[source]
cleanup_locks()[source]
cleanup_metadata(path)[source]
cleanup_shadow()[source]
code(path)[source]
code_changed(job, file=None)[source]

Yields output files with changed code, or a bool if a file is given.

conda_cleanup_envs()[source]
deactivate_cache()[source]
external_jobids(job)[source]
files
finished(job, keep_metadata=True)[source]
incomplete(job)[source]
input(path)[source]
input_changed(job, file=None)[source]

Yields output files with changed input, or a bool if a file is given.

lock()[source]
lock_warn_only()[source]
locked
log(path)[source]
metadata(path)[source]
migrate_v1_to_v2()[source]
noop(*args)[source]
params(path)[source]
params_changed(job, file=None)[source]

Yields output files with changed params, or a bool if a file is given.

rule(path)[source]
shellcmd(path)[source]
started(job, external_jobid=None)[source]
unlock(*args)[source]
version(path)[source]
version_changed(job, file=None)[source]

Yields output files with changed versions, or a bool if a file is given.

snakemake.persistence.pickle_code(code)[source]

snakemake.rules module

class snakemake.rules.Rule(*args, lineno=None, snakefile=None, restart_times=0)[source]

Bases: object

apply_default_remote(item)[source]
apply_input_function(func, wildcards, incomplete_checkpoint_func=<function Rule.<lambda>>, raw_exceptions=False, **aux_params)[source]
benchmark
check_caching()[source]
check_output_duplicates()[source]

Check Namedlist for duplicate entries and raise a WorkflowError on problems.

check_wildcards(wildcards)[source]
conda_env
container_img
dynamic_branch(wildcards, input=True)[source]
expand_benchmark(wildcards)[source]
expand_conda_env(wildcards)[source]
expand_group(wildcards)[source]

Expand the group given wildcards.

expand_input(wildcards)[source]
expand_log(wildcards)[source]
expand_output(wildcards)[source]
expand_params(wildcards, input, output, resources, omit_callable=False)[source]
expand_resources(wildcards, input, attempt)[source]
static get_wildcard_len(wildcards)[source]

Return the length of the given wildcard values.

Arguments
wildcards – a dict of wildcards

get_wildcards(requested_output)[source]

Return wildcard dictionary by matching regular expression output files to the requested concrete ones.

Arguments
requested_output – a concrete filepath

has_wildcards()[source]

Return True if rule contains wildcards.

input
is_cwl
is_notebook
is_producer(requested_output)[source]

Returns True if this rule is a producer of the requested output.

is_run
is_script
is_shell
is_wrapper
log
output
params
products
register_wildcards(wildcard_names)[source]
set_input(*input, **kwinput)[source]

Add a list of input files. Recursive lists are flattened.

Arguments input – the list of input files

set_log(*logs, **kwlogs)[source]
set_output(*output, **kwoutput)[source]

Add a list of output files. Recursive lists are flattened.

After creating the output files, they are checked for duplicates.

Arguments output – the list of output files

set_params(*params, **kwparams)[source]
set_wildcard_constraints(**kwwildcard_constraints)[source]
update_wildcard_constraints()[source]
version
wildcard_constraints
wildcard_names
class snakemake.rules.RuleProxy(rule)[source]

Bases: object

benchmark
input
log
output
params
class snakemake.rules.Ruleorder[source]

Bases: object

add(*rulenames)[source]

Records the order of given rules as rule1 > rule2 > rule3, …

compare(rule1, rule2)[source]

Return whether rule2 has a higher priority than rule1.
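
In a Snakefile, this object backs the ruleorder directive, which resolves ambiguity when several rules could produce the same output file (a minimal sketch; the rule names are hypothetical):

    # Prefer rule bwa_mem over rule bowtie2 whenever both match an output
    ruleorder: bwa_mem > bowtie2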

snakemake.scheduler module

class snakemake.scheduler.DummyRateLimiter[source]

Bases: contextlib.ContextDecorator

class snakemake.scheduler.JobScheduler(workflow, dag, cores, local_cores=1, dryrun=False, touch=False, cluster=None, cluster_status=None, cluster_config=None, cluster_sync=None, drmaa=None, drmaa_log_dir=None, kubernetes=None, container_image=None, tibanna=None, tibanna_sfn=None, google_lifesciences=None, google_lifesciences_regions=None, google_lifesciences_location=None, google_lifesciences_cache=False, tes=None, precommand='', preemption_default=None, preemptible_rules=None, tibanna_config=False, jobname=None, quiet=False, printreason=False, printshellcmds=False, keepgoing=False, max_jobs_per_second=None, max_status_checks_per_second=100, latency_wait=3, greediness=1.0, force_use_threads=False, assume_shared_fs=True, keepincomplete=False, keepmetadata=True, scheduler_type=None, scheduler_ilp_solver=None)[source]

Bases: object

calc_resource(name, value)[source]
exit_gracefully(*args)[source]
get_executor(job)[source]
job_reward(job)[source]
job_selector_greedy(jobs)[source]

Using the greedy heuristic from “A Greedy Algorithm for the General Multidimensional Knapsack Problem”, Akcay, Li, Xu, Annals of Operations Research, 2012

Parameters:jobs (list) – list of jobs
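
Stripped of Snakemake's internals, the idea is a reward-ordered selection against a multidimensional resource budget. The following is an illustrative sketch under assumed reward and usage callables, not the actual implementation:

    def select_greedy(jobs, budget, reward, usage):
        """Pick jobs in order of descending reward while resources last."""
        selected = []
        for job in sorted(jobs, key=reward, reverse=True):
            need = usage(job)  # dict: resource name -> amount required
            if all(amount <= budget.get(res, 0) for res, amount in need.items()):
                selected.append(job)
                for res, amount in need.items():
                    budget[res] -= amount
        return selected
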
job_selector_ilp(jobs)[source]

Schedule jobs by optimizing resource usage, solving an ILP using pulp.

job_weight(job)[source]
open_jobs

Return open jobs.

progress()[source]

Display the progress.

remaining_jobs

Return jobs to be scheduled including not yet ready ones.

required_by_job(temp_file, job)[source]
rule_weight(rule)[source]
run(jobs, executor=None)[source]
schedule()[source]

Schedule jobs that are ready, maximizing cpu usage.

stats
snakemake.scheduler.cumsum(iterable, zero=[0])[source]

snakemake.script module

class snakemake.script.JuliaEncoder[source]

Bases: object

Encoding Python data structures into Julia.

classmethod encode_dict(d)[source]
classmethod encode_items(items)[source]
classmethod encode_list(l)[source]
classmethod encode_namedlist(namedlist)[source]
classmethod encode_positional_items(namedlist)[source]
classmethod encode_value(value)[source]
class snakemake.script.JuliaScript(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]

Bases: snakemake.script.ScriptBase

execute_script(fname, edit=False)[source]
get_preamble()[source]
write_script(preamble, fd)[source]
class snakemake.script.PythonScript(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]

Bases: snakemake.script.ScriptBase

execute_script(fname, edit=False)[source]
static generate_preamble(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir, preamble_addendum='')[source]
get_preamble()[source]
write_script(preamble, fd)[source]
class snakemake.script.REncoder[source]

Bases: object

Encoding Python data structures into R.

classmethod encode_dict(d)[source]
classmethod encode_items(items)[source]
classmethod encode_list(l)[source]
classmethod encode_namedlist(namedlist)[source]
classmethod encode_numeric(value)[source]
classmethod encode_value(value)[source]
class snakemake.script.RMarkdown(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]

Bases: snakemake.script.ScriptBase

execute_script(fname, edit=False)[source]
get_preamble()[source]
write_script(preamble, fd)[source]
class snakemake.script.RScript(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]

Bases: snakemake.script.ScriptBase

execute_script(fname, edit=False)[source]
static generate_preamble(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir, preamble_addendum='')[source]
get_preamble()[source]
write_script(preamble, fd)[source]
class snakemake.script.ScriptBase(path, source, basedir, input_, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]

Bases: abc.ABC

editable = False
evaluate(edit=False)[source]
execute_script(fname, edit=False)[source]
get_preamble()[source]
local_path
write_script(preamble, fd)[source]
class snakemake.script.Snakemake(input_, output, params, wildcards, threads, resources, log, config, rulename, bench_iteration, scriptdir=None)[source]

Bases: object

log_fmt_shell(stdout=True, stderr=True, append=False)[source]

Return a shell redirection string to be used in shell() calls

This function allows scripts and wrappers to support optional log files specified in the calling rule. If no log was specified, an empty string “” is returned, regardless of the values of stdout, stderr, and append.

Parameters:
  • stdout (bool) – Send stdout to log
  • stderr (bool) – Send stderr to log
  • append (bool) – Do not overwrite the log file. Useful for sending output of multiple commands to the same log. Note however that the log will not be truncated at the start.

The following table describes the returned redirection string:

  stdout   stderr   append   log    return value
  -------- -------- -------- ------ -------------
  True     True     True     fn     >> fn 2>&1
  True     False    True     fn     >> fn
  False    True     True     fn     2>> fn
  True     True     False    fn     > fn 2>&1
  True     False    False    fn     > fn
  False    True     False    fn     2> fn
  any      any      any      None   ""
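
Typical use inside a script or wrapper, where the snakemake object is injected by Snakemake (a minimal sketch; the tool name is hypothetical):

    from snakemake.shell import shell

    # stdout carries the tool's result, so route only stderr to the log;
    # this yields "2> <logfile>", or "" if the rule defines no log file.
    log = snakemake.log_fmt_shell(stdout=False, stderr=True)
    shell("sometool {snakemake.input} > {snakemake.output[0]} {log}")
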
snakemake.script.get_language(path, source)[source]
snakemake.script.get_source(path, basedir='.', wildcards=None, params=None)[source]
snakemake.script.script(path, basedir, input, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]

Load a script from the given basedir + path and execute it.

snakemake.shell module

class snakemake.shell.shell[source]

Bases: object

classmethod check_output(cmd, **kwargs)[source]
classmethod cleanup()[source]
classmethod executable(cmd)[source]
classmethod get_executable()[source]
static iter_stdout(proc, cmd)[source]
classmethod kill(jobid)[source]
classmethod prefix(prefix)[source]
classmethod suffix(suffix)[source]
classmethod win_command_prefix(cmd)[source]

The command prefix used on Windows when specifying an explicit shell executable; this would be “-c” for bash and “/C” for cmd.exe. Note that if no explicit executable is set, commands are executed with Popen(…, shell=True), which uses COMSPEC on Windows, where this prefix is not needed.
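
Beyond running commands, the class also configures global shell behavior, typically at the top of a Snakefile (a minimal sketch; the bash settings assume a Unix system):

    from snakemake.shell import shell

    shell.executable("/bin/bash")        # use bash rather than the default shell
    shell.prefix("set -euo pipefail; ")  # prepended to every shell() invocation
    shell("echo 'strict mode is active'")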

snakemake.singularity module

snakemake.stats module

class snakemake.stats.Stats[source]

Bases: object

file_stats
overall_runtime
report_job_end(job)[source]
report_job_start(job)[source]
rule_stats
to_json(path)[source]

snakemake.utils module

class snakemake.utils.AlwaysQuotedFormatter(quote_func=None, *args, **kwargs)[source]

Bases: snakemake.utils.QuotedFormatter

Subclass of QuotedFormatter that always quotes.

Usage is identical to QuotedFormatter, except that it always acts as if “q” had been appended to the format spec, unless “u” (for unquoted) is appended.

format_field(value, format_spec)[source]
class snakemake.utils.QuotedFormatter(quote_func=None, *args, **kwargs)[source]

Bases: string.Formatter

Subclass of string.Formatter that supports quoting.

Using this formatter, any field can be quoted after formatting by appending “q” to its format string. By default, shell quoting is performed using “shlex.quote”, but you can pass a different quote_func to the constructor. The quote_func simply has to take a string argument and return a new string representing the quoted form of the input string.

Note that if an element after formatting is the empty string, it will not be quoted.

format_field(value, format_spec)[source]
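
A minimal sketch of the quoting behavior described above (the file name is illustrative):

    from snakemake.utils import QuotedFormatter

    fmt = QuotedFormatter()  # defaults to shlex.quote
    print(fmt.format("ls {f:q}", f="my file.txt"))  # ls 'my file.txt'
    print(fmt.format("ls {f}", f="my file.txt"))    # unquoted: ls my file.txt
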
snakemake.utils.R(code)[source]

Execute R code.

This is deprecated in favor of the script directive. This function executes the R code given as a string. The function requires rpy2 to be installed.

Parameters:code (str) – R code to be executed
class snakemake.utils.SequenceFormatter(separator=' ', element_formatter=<string.Formatter object>, *args, **kwargs)[source]

Bases: string.Formatter

string.Formatter subclass with special behavior for sequences.

This class delegates formatting of individual elements to another formatter object. Non-list objects are formatted by calling the delegate formatter’s “format_field” method. List-like objects (list, tuple, set, frozenset) are formatted by formatting each element of the list according to the specified format spec using the delegate formatter and then joining the resulting strings with a separator (space by default).

format_element(elem, format_spec)[source]

Format a single element

For sequences, this is called once for each element in a sequence. For anything else, it is called on the entire object. It is intended to be overridden in subclasses.

format_field(value, format_spec)[source]
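
A minimal sketch of the sequence behavior described above:

    from snakemake.utils import SequenceFormatter

    fmt = SequenceFormatter(separator=", ")
    print(fmt.format("files: {f}", f=["a.txt", "b.txt"]))  # files: a.txt, b.txt
    print(fmt.format("file: {f}", f="a.txt"))              # file: a.txt
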
class snakemake.utils.Unformattable(errormsg='This cannot be used for formatting')[source]

Bases: object

snakemake.utils.argvquote(arg, force=True)[source]

Returns an argument quoted in such a way that CommandLineToArgvW on Windows will return the argument string unchanged. This is the same thing Popen does when supplied with a list of arguments. Arguments in a command line should be separated by spaces; this function does not add these spaces. This implementation follows the suggestions outlined here: https://blogs.msdn.microsoft.com/twistylittlepassagesallalike/2011/04/23/everyone-quotes-command-line-arguments-the-wrong-way/
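
A minimal usage sketch (the argument value is illustrative; the function quotes a single argument only):

    from snakemake.utils import argvquote

    quoted = argvquote('say "hello"')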

snakemake.utils.available_cpu_count()[source]

Return the number of available virtual or physical CPUs on this system. The number of available CPUs can be smaller than the total number of CPUs when the cpuset(7) mechanism is in use, as is the case on some cluster systems.

Adapted from https://stackoverflow.com/a/1006301/715090

snakemake.utils.format(_pattern, *args, stepout=1, _quote_all=False, **kwargs)[source]

Format a pattern in Snakemake style.

This means that keywords embedded in braces are replaced by any variable values that are available in the current namespace.
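
For example, variables from the calling namespace are substituted directly (a minimal sketch):

    from snakemake.utils import format

    sample = "A"
    print(format("processing {sample}.txt"))  # processing A.txt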

snakemake.utils.linecount(filename)[source]

Return the number of lines of given file.

Parameters:filename (str) – the path to the file
snakemake.utils.listfiles(pattern, restriction=None, omit_value=None)[source]

Yield a tuple of existing filepaths for the given pattern.

Wildcard values are yielded as the second tuple item.

Parameters:
  • pattern (str) – a filepattern. Wildcards are specified in snakemake syntax, e.g. “{id}.txt”
  • restriction (dict) – restrict to wildcard values given in this dictionary
  • omit_value (str) – wildcard value to omit
Yields:

tuple – The next file matching the pattern, and the corresponding wildcards object
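
A minimal usage sketch (the pattern is illustrative); named wildcard values are available as attributes of the yielded wildcards object:

    from snakemake.utils import listfiles

    for path, wildcards in listfiles("data/{sample}.txt"):
        print(path, wildcards.sample)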

snakemake.utils.makedirs(dirnames)[source]

Recursively create the given directory or directories without reporting errors if they are present.

snakemake.utils.min_version(version)[source]

Require minimum snakemake version, raise workflow error if not met.
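
Typically placed at the top of a Snakefile (the version number is illustrative):

    from snakemake.utils import min_version

    min_version("5.32.0")  # raises a WorkflowError on older Snakemake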

snakemake.utils.os_sync()[source]

Ensure that pending writes are flushed to disk.

snakemake.utils.read_job_properties(jobscript, prefix='# properties', pattern=re.compile('# properties = (.*)'))[source]

Read the job properties defined in a snakemake jobscript.

This function is a helper for writing custom wrappers for the snakemake --cluster functionality. Applying this function to a jobscript will return a dict containing information about the job.
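
A minimal submission-wrapper sketch (the qsub flags are illustrative; adapt them to your scheduler):

    #!/usr/bin/env python3
    import sys
    from subprocess import run

    from snakemake.utils import read_job_properties

    jobscript = sys.argv[1]
    props = read_job_properties(jobscript)
    threads = props.get("threads", 1)
    run(["qsub", "-pe", "smp", str(threads), jobscript], check=True)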

snakemake.utils.report(text, path, stylesheet=None, defaultenc='utf8', template=None, metadata=None, **files)[source]

Create an HTML report using python docutils.

This is deprecated in favor of the --report flag.

Attention: This function needs Python docutils to be installed for the python installation you use with Snakemake.

All keywords not listed below are interpreted as paths to files that shall be embedded into the document. The keywords will be available as link targets in the text. E.g. append a file as keyword arg via F1=input[0] and put a download link in the text like this:

report('''
==============
Report for ...
==============

Some text. A link to an embedded file: F1_.

Further text.
''', outputpath, F1=input[0])

Instead of specifying each file as a keyword arg, you can also expand
the input of your rule if it is completely named, e.g.:

report('''
Some text...
''', outputpath, **input)
Parameters:
  • text (str) – The “restructured text” as it is expected by python docutils.
  • path (str) – The path to the desired output file
  • stylesheet (str) – An optional path to a css file that defines the style of the document. This defaults to <your snakemake install>/report.css. Use the default to get a hint how to create your own.
  • defaultenc (str) – The encoding that is reported to the browser for embedded text files, defaults to utf8.
  • template (str) – An optional path to a docutils HTML template.
  • metadata (str) – E.g. an optional author name or email address.
snakemake.utils.simplify_path(path)[source]

Return a simplified version of the given path.

snakemake.utils.update_config(config, overwrite_config)[source]

Recursively update dictionary config with overwrite_config.

See https://stackoverflow.com/questions/3232943/update-value-of-a-nested-dictionary-of-varying-depth for details.

Parameters:
  • config (dict) – dictionary to update
  • overwrite_config (dict) – dictionary whose items will overwrite those in config
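
For example, nested keys are merged rather than replaced wholesale (a minimal sketch):

    from snakemake.utils import update_config

    config = {"samples": {"A": "a.fq"}, "threads": 4}
    update_config(config, {"samples": {"B": "b.fq"}})
    # config is now {"samples": {"A": "a.fq", "B": "b.fq"}, "threads": 4}
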
snakemake.utils.validate(data, schema, set_default=True)[source]

Validate data with JSON schema at given path.

Parameters:
  • data (object) – data to validate. Can be a config dict or a pandas data frame.
  • schema (str) – Path to JSON schema used for validation. The schema can also be in YAML format. If validating a pandas data frame, the schema has to describe a row record (i.e., a dict with column names as keys pointing to row values). See https://json-schema.org. The path is interpreted relative to the Snakefile when this function is called.
  • set_default (bool) – set default values defined in schema. See https://python-jsonschema.readthedocs.io/en/latest/faq/ for more information
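
Typical use in a Snakefile, right after loading the config (the file paths are illustrative):

    from snakemake.utils import validate

    configfile: "config.yaml"
    validate(config, schema="schemas/config.schema.yaml")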

snakemake.workflow module

class snakemake.workflow.Gather[source]

Bases: object

A namespace for gather to allow items to be accessed via dot notation.

class snakemake.workflow.RuleInfo(func)[source]

Bases: object

class snakemake.workflow.Rules[source]

Bases: object

A namespace for rules so that they can be accessed via dot notation.
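
In a Snakefile this lets one rule refer to another rule's products without repeating paths (a minimal sketch; the rule names are hypothetical):

    rule a:
        output: "results/a.txt"
        shell: "touch {output}"

    rule b:
        input: rules.a.output
        output: "results/b.txt"
        shell: "cp {input} {output}"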

class snakemake.workflow.Scatter[source]

Bases: object

A namespace for scatter to allow items to be accessed via dot notation.

class snakemake.workflow.Subworkflow(workflow, name, snakefile, workdir, configfile)[source]

Bases: object

snakefile
target(paths)[source]
targets(dag)[source]
workdir
class snakemake.workflow.Workflow(snakefile=None, jobscript=None, overwrite_shellcmd=None, overwrite_config=None, overwrite_workdir=None, overwrite_configfiles=None, overwrite_clusterconfig=None, overwrite_threads=None, overwrite_scatter=None, overwrite_groups=None, group_components=None, config_args=None, debug=False, verbose=False, use_conda=False, conda_frontend=None, conda_prefix=None, use_singularity=False, use_env_modules=False, singularity_prefix=None, singularity_args='', shadow_prefix=None, scheduler_type='ilp', scheduler_ilp_solver=None, mode=0, wrapper_prefix=None, printshellcmds=False, restart_times=None, attempt=1, default_remote_provider=None, default_remote_prefix='', run_local=True, default_resources=None, cache=None, nodes=1, cores=1, resources=None, conda_cleanup_pkgs=None, edit_notebook=False, envvars=None, max_inventory_wait_time=20)[source]

Bases: object

add_rule(name=None, lineno=None, snakefile=None, checkpoint=False)[source]

Add a rule.

apply_default_remote(path)[source]

Apply the defined default remote provider to the given path and return the updated _IOFile. Asserts that a default remote provider is defined.

benchmark(benchmark)[source]
cache_rule(cache)[source]
check()[source]
check_localrules()[source]
check_source_sizes(filename, warning_size_gb=0.2)[source]

A helper function to check the file size and return the file to the calling function. Additionally, given that we encourage these packages to be small, we set a warning at 200 MB (0.2 GB).

concrete_files
conda(conda_env)[source]
config
configfile(fp)[source]

Update the global config with data from the given file.

container(container_img)[source]
cores
current_basedir

Basedir of currently parsed Snakefile.

cwl(cwl)[source]
docstring(string)[source]
envmodules(*env_modules)[source]
execute(targets=None, dryrun=False, generate_unit_tests=None, touch=False, scheduler_type=None, scheduler_ilp_solver=None, local_cores=1, forcetargets=False, forceall=False, forcerun=None, until=[], omit_from=[], prioritytargets=None, quiet=False, keepgoing=False, printshellcmds=False, printreason=False, printdag=False, cluster=None, cluster_sync=None, jobname=None, immediate_submit=False, ignore_ambiguity=False, printrulegraph=False, printfilegraph=False, printd3dag=False, drmaa=None, drmaa_log_dir=None, kubernetes=None, tibanna=None, tibanna_sfn=None, google_lifesciences=None, google_lifesciences_regions=None, google_lifesciences_location=None, google_lifesciences_cache=False, tes=None, precommand='', preemption_default=None, preemptible_rules=None, tibanna_config=False, container_image=None, stats=None, force_incomplete=False, ignore_incomplete=False, list_version_changes=False, list_code_changes=False, list_input_changes=False, list_params_changes=False, list_untracked=False, list_conda_envs=False, summary=False, archive=None, delete_all_output=False, delete_temp_output=False, detailed_summary=False, latency_wait=3, wait_for_files=None, nolock=False, unlock=False, notemp=False, nodeps=False, cleanup_metadata=None, conda_cleanup_envs=False, cleanup_shadow=False, cleanup_scripts=True, subsnakemake=None, updated_files=None, keep_target_files=False, keep_shadow=False, keep_remote_local=False, allowed_rules=None, max_jobs_per_second=None, max_status_checks_per_second=None, greediness=1.0, no_hooks=False, force_use_threads=False, conda_create_envs_only=False, assume_shared_fs=True, cluster_status=None, report=None, report_stylesheet=None, export_cwl=False, batch=None, keepincomplete=False, keepmetadata=True, executesubworkflows=True)[source]
get_rule(name)[source]

Get rule by name.

Arguments name – the name of the rule

get_sources()[source]
global_container(container_img)[source]
global_wildcard_constraints(**content)[source]

Register global wildcard constraints.
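
In a Snakefile, the corresponding directive constrains a wildcard for all rules (a minimal sketch; the regular expression is illustrative):

    wildcard_constraints:
        sample="[A-Za-z0-9]+"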

group(group)[source]
include(snakefile, overwrite_first_rule=False, print_compilation=False, overwrite_shellcmd=None)[source]

Include a snakefile.

input(*paths, **kwpaths)[source]
inputfile(path)[source]

Mark file as being an input file of the workflow.

This also means that any --default-remote-provider/prefix settings will be applied to this file. The file is returned as an _IOFile object, such that it can e.g. be transparently opened with _IOFile.open().

is_cached_rule(rule: snakemake.rules.Rule)[source]
is_local(rule)[source]
is_rule(name)[source]

Return True if name is the name of a rule.

Arguments name – a name

lint(json=False)[source]
list_resources()[source]
list_rules(only_targets=False)[source]
localrules(*rulenames)[source]
log(*logs, **kwlogs)[source]
message(message)[source]
nodes
norun()[source]
notebook(notebook)[source]
onerror(func)[source]

Register onerror function.

onstart(func)[source]

Register onstart function.

onsuccess(func)[source]

Register onsuccess function.
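
In a Snakefile, these handlers are registered via the corresponding directives (a minimal sketch):

    onstart:
        print("workflow starting")

    onsuccess:
        print("workflow finished without errors")

    onerror:
        print("workflow failed; see the log for details")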

output(*paths, **kwpaths)[source]
params(*params, **kwparams)[source]
pepfile(path)[source]
pepschema(schema)[source]
priority(priority)[source]
register_envvars(*envvars)[source]

Register environment variables that shall be passed to jobs. If used multiple times, the union is taken.

report(path)[source]

Define a global report description in .rst format.

resources(*args, **resources)[source]
rule(name=None, lineno=None, snakefile=None, checkpoint=False)[source]
ruleorder(*rulenames)[source]
rules
run(func)[source]
scattergather(**content)[source]

Register scattergather defaults.

script(script)[source]
shadow(shadow_depth)[source]
shellcmd(cmd)[source]
subworkflow(name, snakefile=None, workdir=None, configfile=None)[source]
subworkflows
threads(threads)[source]
version(version)[source]
wildcard_constraints(*wildcard_constraints, **kwwildcard_constraints)[source]
workdir(workdir)[source]

Register workdir.

wrapper(wrapper)[source]
snakemake.workflow.format_resources(dict_like, *, omit_keys={'_cores', '_nodes'}, omit_values=[])
snakemake.workflow.srcdir(path)[source]

Return the absolute path, relative to the source directory of the current Snakefile.

snakemake.wrapper module

snakemake.wrapper.find_extension(path, extensions=['.py', '.R', '.Rmd', '.jl'])[source]
snakemake.wrapper.get_conda_env(path, prefix=None)[source]
snakemake.wrapper.get_path(path, prefix=None)[source]
snakemake.wrapper.get_script(path, prefix=None)[source]
snakemake.wrapper.is_git_path(path)[source]
snakemake.wrapper.is_local(path)[source]
snakemake.wrapper.is_script(path)[source]
snakemake.wrapper.wrapper(path, input, output, params, wildcards, threads, resources, log, config, rulename, conda_env, container_img, singularity_args, env_modules, bench_record, prefix, jobid, bench_iteration, cleanup_scripts, shadow_dir)[source]

Load a wrapper from https://github.com/snakemake/snakemake-wrappers under the given path + wrapper.(py|R|Rmd) and execute it.
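
Typical use via the wrapper directive in a rule (a minimal sketch; the wrapper id, paths, and version are illustrative):

    rule fastqc:
        input:
            "reads/{sample}.fastq"
        output:
            html="qc/{sample}.html",
            zip="qc/{sample}_fastqc.zip"
        log:
            "logs/fastqc/{sample}.log"
        wrapper:
            "0.67.0/bio/fastqc"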

Module contents

snakemake.bash_completion(snakefile='Snakefile')[source]

Entry point for bash completion.

snakemake.get_appdirs()[source]
snakemake.get_argument_parser(profile=None)[source]

Generate and return argument parser.

snakemake.get_profile_file(profile, file, return_default=False)[source]
snakemake.main(argv=None)[source]

Main entry point.

snakemake.parse_batch(args)[source]
snakemake.parse_config(args)[source]

Parse config from args.

snakemake.parse_group_components(args)[source]
snakemake.parse_groups(args)[source]
snakemake.parse_key_value_arg(arg, errmsg)[source]
snakemake.parse_set_ints(arg, errmsg)[source]
snakemake.parse_set_scatter(args)[source]
snakemake.parse_set_threads(args)[source]
snakemake.snakemake(snakefile, batch=None, cache=None, report=None, report_stylesheet=None, lint=None, generate_unit_tests=None, listrules=False, list_target_rules=False, cores=1, nodes=1, local_cores=1, resources={}, overwrite_threads=None, overwrite_scatter=None, default_resources=None, config={}, configfiles=None, config_args=None, workdir=None, targets=None, dryrun=False, touch=False, forcetargets=False, forceall=False, forcerun=[], until=[], omit_from=[], prioritytargets=[], stats=None, printreason=False, printshellcmds=False, debug_dag=False, printdag=False, printrulegraph=False, printfilegraph=False, printd3dag=False, nocolor=False, quiet=False, keepgoing=False, cluster=None, cluster_config=None, cluster_sync=None, drmaa=None, drmaa_log_dir=None, jobname='snakejob.{rulename}.{jobid}.sh', immediate_submit=False, standalone=False, ignore_ambiguity=False, snakemakepath=None, lock=True, unlock=False, cleanup_metadata=None, conda_cleanup_envs=False, cleanup_shadow=False, cleanup_scripts=True, force_incomplete=False, ignore_incomplete=False, list_version_changes=False, list_code_changes=False, list_input_changes=False, list_params_changes=False, list_untracked=False, list_resources=False, summary=False, archive=None, delete_all_output=False, delete_temp_output=False, detailed_summary=False, latency_wait=3, wait_for_files=None, print_compilation=False, debug=False, notemp=False, keep_remote_local=False, nodeps=False, keep_target_files=False, allowed_rules=None, jobscript=None, greediness=None, no_hooks=False, overwrite_shellcmd=None, updated_files=None, log_handler=[], keep_logger=False, wms_monitor=None, max_jobs_per_second=None, max_status_checks_per_second=100, restart_times=0, attempt=1, verbose=False, force_use_threads=False, use_conda=False, use_singularity=False, use_env_modules=False, singularity_args='', conda_frontend='conda', conda_prefix=None, conda_cleanup_pkgs=None, list_conda_envs=False, singularity_prefix=None, shadow_prefix=None, scheduler='ilp', scheduler_ilp_solver=None, conda_create_envs_only=False, mode=0, wrapper_prefix=None, kubernetes=None, container_image=None, tibanna=False, tibanna_sfn=None, google_lifesciences=False, google_lifesciences_regions=None, google_lifesciences_location=None, google_lifesciences_cache=False, tes=None, preemption_default=None, preemptible_rules=None, precommand='', default_remote_provider=None, default_remote_prefix='', tibanna_config=False, assume_shared_fs=True, cluster_status=None, export_cwl=None, show_failed_logs=False, keep_incomplete=False, keep_metadata=True, messaging=None, edit_notebook=None, envvars=None, overwrite_groups=None, group_components=None, max_inventory_wait_time=20, execute_subworkflows=True)[source]

Run snakemake on a given snakefile.

This function provides access to the whole snakemake functionality. It is not thread-safe.

Parameters:
  • snakefile (str) – the path to the snakefile
  • batch (Batch) – whether to compute only a partial DAG, defined by the given Batch object (default None)
  • report (str) – create an HTML report for a previous run at the given path
  • lint (str) – print lints instead of executing (None, “plain” or “json”, default None)
  • listrules (bool) – list rules (default False)
  • list_target_rules (bool) – list target rules (default False)
  • cores (int) – the number of provided cores (ignored when using cluster support) (default 1)
  • nodes (int) – the number of provided cluster nodes (ignored without cluster support) (default 1)
  • local_cores (int) – the number of provided local cores if in cluster mode (ignored without cluster support) (default 1)
  • resources (dict) – provided resources, a dictionary assigning integers to resource names, e.g. {gpu=1, io=5} (default {})
  • default_resources (DefaultResources) – default values for resources not defined in rules (default None)
  • config (dict) – override values for workflow config
  • workdir (str) – path to working directory (default None)
  • targets (list) – list of targets, e.g. rule or file names (default None)
  • dryrun (bool) – only dry-run the workflow (default False)
  • touch (bool) – only touch all output files if present (default False)
  • forcetargets (bool) – force given targets to be re-created (default False)
  • forceall (bool) – force all output files to be re-created (default False)
  • forcerun (list) – list of files and rules that shall be re-created/re-executed (default [])
  • execute_subworkflows (bool) – execute subworkflows if present (default True)
  • prioritytargets (list) – list of targets that shall be run with maximum priority (default [])
  • stats (str) – path to file that shall contain stats about the workflow execution (default None)
  • printreason (bool) – print the reason for the execution of each job (default False)
  • printshellcmds (bool) – print the shell command of each job (default False)
  • printdag (bool) – print the dag in the graphviz dot language (default False)
  • printrulegraph (bool) – print the graph of rules in the graphviz dot language (default False)
  • printfilegraph (bool) – print the graph of rules with their input and output files in the graphviz dot language (default False)
  • printd3dag (bool) – print a D3.js compatible JSON representation of the DAG (default False)
  • nocolor (bool) – do not print colored output (default False)
  • quiet (bool) – do not print any default job information (default False)
  • keepgoing (bool) – keep going upon errors (default False)
  • cluster (str) – submission command of a cluster or batch system to use, e.g. qsub (default None)
  • cluster_config (str,list) – configuration file for cluster options, or list thereof (default None)
  • cluster_sync (str) – blocking cluster submission command (like SGE ‘qsub -sync y’) (default None)
  • drmaa (str) – if not None use DRMAA for cluster support, str specifies native args passed to the cluster when submitting a job
  • drmaa_log_dir (str) – the path to stdout and stderr output of DRMAA jobs (default None)
  • jobname (str) – naming scheme for cluster job scripts (default “snakejob.{rulename}.{jobid}.sh”)
  • immediate_submit (bool) – immediately submit all cluster jobs, regardless of dependencies (default False)
  • standalone (bool) – kill all processes very rudely in case of failure (do not use this if you use this API) (default False) (deprecated)
  • ignore_ambiguity (bool) – ignore ambiguous rules and always take the first possible one (default False)
  • snakemakepath (str) – deprecated parameter whose value is ignored. Do not use.
  • lock (bool) – lock the working directory when executing the workflow (default True)
  • unlock (bool) – just unlock the working directory (default False)
  • cleanup_metadata (list) – just cleanup metadata of given list of output files (default None)
  • drop_metadata (bool) – drop metadata file tracking information after job finishes (--report and --list_x_changes information will be incomplete) (default False)
  • conda_cleanup_envs (bool) – just cleanup unused conda environments (default False)
  • cleanup_shadow (bool) – just cleanup old shadow directories (default False)
  • cleanup_scripts (bool) – delete wrapper scripts used for execution (default True)
  • force_incomplete (bool) – force the re-creation of incomplete files (default False)
  • ignore_incomplete (bool) – ignore incomplete files (default False)
  • list_version_changes (bool) – list output files with changed rule version (default False)
  • list_code_changes (bool) – list output files with changed rule code (default False)
  • list_input_changes (bool) – list output files with changed input files (default False)
  • list_params_changes (bool) – list output files with changed params (default False)
  • list_untracked (bool) – list files in the workdir that are not used in the workflow (default False)
  • summary (bool) – list summary of all output files and their status (default False)
  • archive (str) – archive workflow into the given tarball
  • delete_all_output (bool) – remove all files generated by the workflow (default False)
  • delete_temp_output (bool) – remove all temporary files generated by the workflow (default False)
  • latency_wait (int) – how many seconds to wait for an output file to appear after the execution of a job, e.g. to handle filesystem latency (default 3)
  • wait_for_files (list) – wait for given files to be present before executing the workflow
  • list_resources (bool) – list resources used in the workflow (default False)
  • summary – list summary of all output files and their status (default False). If no option is specified, a basic summary will be output. If ‘detailed’ is added as an option, e.g. --summary detailed, extra info about the input and shell commands will be included
  • detailed_summary (bool) – list summary of all input and output files and their status (default False)
  • print_compilation (bool) – print the compilation of the snakefile (default False)
  • debug (bool) – allow to use the debugger within rules
  • notemp (bool) – ignore temp file flags, e.g. do not delete output files marked as temp after use (default False)
  • keep_remote_local (bool) – keep local copies of remote files (default False)
  • nodeps (bool) – ignore dependencies (default False)
  • keep_target_files (bool) – do not adjust the paths of given target files relative to the working directory.
  • allowed_rules (set) – restrict allowed rules to the given set. If None or empty, all rules are used.
  • jobscript (str) – path to a custom shell script template for cluster jobs (default None)
  • greediness (float) – set the greediness of scheduling. This value between 0 and 1 determines how carefully jobs are selected for execution. The default value (0.5 if prioritytargets are used, 1.0 otherwise) provides the best speed and still acceptable scheduling quality.
  • overwrite_shellcmd (str) – a shell command that shall be executed instead of those given in the workflow. This is for debugging purposes only.
  • updated_files (list) – a list that will be filled with the files that are updated or created during the workflow execution
  • verbose (bool) – show additional debug output (default False)
  • max_jobs_per_second (int) – maximal number of cluster/drmaa jobs per second, None to impose no limit (default None)
  • restart_times (int) – number of times to restart failing jobs (default 0)
  • attempt (int) – initial value of Job.attempt. This is intended for internal use only (default 1).
  • force_use_threads – whether to force use of threads over processes. Helpful if shared memory is full or unavailable (default False)
  • use_conda (bool) – use conda environments for each job (defined with conda directive of rules)
  • use_singularity (bool) – run jobs in singularity containers (if defined with singularity directive)
  • use_env_modules (bool) – load environment modules if defined in rules
  • singularity_args (str) – additional arguments to pass to singularity
  • conda_prefix (str) – the directory in which conda environments will be created (default None)
  • conda_cleanup_pkgs (snakemake.deployment.conda.CondaCleanupMode) – whether to clean up conda tarballs after env creation (default None), valid values: “tarballs”, “cache”
  • singularity_prefix (str) – the directory to which singularity images will be pulled (default None)
  • shadow_prefix (str) – prefix for shadow directories. The job-specific shadow directories will be created in $SHADOW_PREFIX/shadow/ (default None)
  • wms_monitor (str) – workflow management system monitor. Send POST requests to the specified server/IP (default None).
  • conda_create_envs_only (bool) – if specified, only builds the conda environments specified for each job, then exits.
  • list_conda_envs (bool) – list conda environments and their location on disk.
  • mode (snakemake.common.Mode) – execution mode
  • wrapper_prefix (str) – prefix for wrapper script URLs (default None)
  • kubernetes (str) – submit jobs to kubernetes, using the given namespace.
  • container_image (str) – Docker image to use, e.g., for kubernetes.
  • default_remote_provider (str) – default remote provider to use instead of local files (e.g. S3, GS)
  • default_remote_prefix (str) – prefix for default remote provider (e.g. name of the bucket).
  • tibanna (bool) – submit jobs to AWS cloud using Tibanna.
  • tibanna_sfn (str) – Step function (Unicorn) name of Tibanna (e.g. tibanna_unicorn_monty). This must be deployed first using tibanna cli.
  • google_lifesciences (bool) – submit jobs to Google Cloud Life Sciences (pipelines API).
  • google_lifesciences_regions (list) – a list of regions (e.g., us-east1)
  • google_lifesciences_location (str) – Life Sciences API location (e.g., us-central1)
  • google_lifesciences_cache (bool) – save a cache of the compressed working directories in Google Cloud Storage for later usage.
  • tes (str) – Execute workflow tasks on GA4GH TES server given by url.
  • precommand (str) – commands to run on AWS cloud before the snakemake command (e.g. wget, git clone, unzip, etc). Use with --tibanna.
  • preemption_default (int) – set a default number of preemptible instance retries (for Google Life Sciences executor only)
  • preemptible_rules (list) – define custom preemptible instance retries for specific rules (for Google Life Sciences executor only)
  • tibanna_config (list) – Additional tibanna config e.g. --tibanna-config spot_instance=true subnet=<subnet_id> security_group=<security_group_id>
  • assume_shared_fs (bool) – assume that cluster nodes share a common filesystem (default true).
  • cluster_status (str) – status command for cluster execution. If None, Snakemake will rely on flag files. Otherwise, it expects the command to return “success”, “failure” or “running” when executing with a cluster jobid as single argument.
  • export_cwl (str) – Compile workflow to CWL and save to given file
  • keep_incomplete (bool) – keep incomplete output files of failed jobs
  • edit_notebook (object) – “notebook.Listen” object configuring the notebook server for interactive editing of a rule notebook. If None, do not edit.
  • scheduler (str) – Select scheduling algorithm (default ilp)
  • scheduler_ilp_solver (str) – Set solver for ilp scheduler.
  • overwrite_groups (dict) – Rule to group assignments (default None)
  • group_components (dict) – Number of connected components given groups shall span before being split up (1 by default if empty)
  • log_handler (list) – redirect snakemake output to this list of custom log handlers, each a function that takes a log message dictionary (see below) as its only argument (default []). The log message dictionary has the following entries:

    level: the log level (“info”, “error”, “debug”, “progress”, “job_info”)

    For level=”info”, “error” or “debug”:
      msg: the log message

    For level=”progress”:
      done: number of already executed jobs
      total: total number of jobs

    For level=”job_info”:
      input: list of input files of a job
      output: list of output files of a job
      log: path to the log file of a job
      local: whether a job is executed locally (i.e. ignoring cluster)
      msg: the job message
      reason: the job reason
      priority: the job priority
      threads: the threads of the job
Returns:

True if workflow execution was successful.

Return type:

bool
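
A minimal programmatic invocation (the Snakefile path and options are illustrative):

    import snakemake

    ok = snakemake.snakemake("Snakefile", cores=4, dryrun=True, printshellcmds=True)
    if not ok:
        raise SystemExit(1)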

snakemake.unparse_config(config)[source]