Examples

Most of the examples below assume that Snakemake is executed in a project-specific root directory. The paths in the Snakefiles below are relative to this directory. We follow the convention of using a separate subdirectory for each kind of intermediate result, e.g., mapped/ for mapped sequence reads in .bam files.

Building a C Program

GNU Make is primarily used to build C/C++ code. Snakemake can do the same, and its rules are easier to read because they use named placeholders instead of Make's cryptic automatic variables.

The following example Makefile was adapted from http://www.cs.colby.edu/maxwell/courses/tutorials/maketutor/.

IDIR=../include
ODIR=obj
LDIR=../lib

LIBS=-lm

CC=gcc
CFLAGS=-I$(IDIR)

_HEADERS = hello.h
HEADERS = $(patsubst %,$(IDIR)/%,$(_HEADERS))

_OBJS = hello.o hellofunc.o
OBJS = $(patsubst %,$(ODIR)/%,$(_OBJS))

# build the executable from the object files
hello: $(OBJS)
        $(CC) -o $@ $^ $(CFLAGS) $(LIBS)

# compile a single .c file to an .o file
$(ODIR)/%.o: %.c $(HEADERS)
        $(CC) -c -o $@ $< $(CFLAGS)


# clean up temporary files
.PHONY: clean
clean:
        rm -f $(ODIR)/*.o *~ core $(IDIR)/*~

An equivalent Snakefile can be written as:

from os.path import join

IDIR = '../include'
ODIR = 'obj'
LDIR = '../lib'

LIBS = '-lm'

CC = 'gcc'
CFLAGS = '-I' + IDIR


_HEADERS = ['hello.h']
HEADERS = [join(IDIR, hfile) for hfile in _HEADERS]

_OBJS = ['hello.o', 'hellofunc.o']
OBJS = [join(ODIR, ofile) for ofile in _OBJS]


rule hello:
    """build the executable from the object files"""
    output:
        'hello'
    input:
        OBJS
    shell:
        "{CC} -o {output} {input} {CFLAGS} {LIBS}"

rule c_to_o:
    """compile a single .c file to an .o file"""
    output:
        temp(join(ODIR, '{name}.o'))
    input:
        src='{name}.c',
        headers=HEADERS
    shell:
        "{CC} -c -o {output} {input.src} {CFLAGS}"

rule clean:
    """clean up temporary files"""
    shell:
        "rm -f *~ core {IDIR}/*~"

As can be seen, the shell calls become more readable, e.g. "{CC} -c -o {output} {input.src} {CFLAGS}" instead of $(CC) -c -o $@ $< $(CFLAGS). Further, because the .o files are marked as temp, Snakemake automatically deletes them once they are no longer needed.
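
To make the placeholder substitution concrete, the following plain-Python sketch renders the command that rule c_to_o would run for one object file. It imitates the substitution with str.format and is only an illustration, not Snakemake's own formatting code; the helper name render_compile_command is hypothetical.

```python
import posixpath

# Same configuration values as in the Snakefile above.
IDIR = "../include"
ODIR = "obj"
CC = "gcc"
CFLAGS = "-I" + IDIR


def render_compile_command(name):
    """Return the compile command for one object file.

    Hypothetical helper: it mimics placeholder substitution with
    str.format and is not part of Snakemake.
    """
    output = posixpath.join(ODIR, name + ".o")  # plays the role of {output}
    src = name + ".c"                           # plays the role of {input.src}
    template = "{CC} -c -o {output} {src} {CFLAGS}"
    return template.format(CC=CC, output=output, src=src, CFLAGS=CFLAGS)


print(render_compile_command("hellofunc"))
# gcc -c -o obj/hellofunc.o hellofunc.c -I../include
```

Each placeholder is named after what it stands for, which is the readability gain over $@ and $< in the Makefile.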

[Figure: C workflow DAG]

Building a Paper with LaTeX

Building a scientific paper can be automated by Snakemake as well. Apart from compiling LaTeX code and invoking BibTeX, we provide a special rule to zip the needed files for online submission.

We first provide a Snakefile, tex.rules, containing rules that can be shared across any LaTeX build task:

ruleorder: tex2pdf_with_bib > tex2pdf_without_bib

rule tex2pdf_with_bib:
    input:
        '{name}.tex',
        '{name}.bib'
    output:
        '{name}.pdf'
    shell:
        """
        pdflatex {wildcards.name}
        bibtex {wildcards.name}
        pdflatex {wildcards.name}
        pdflatex {wildcards.name}
        """

rule tex2pdf_without_bib:
    input:
        '{name}.tex'
    output:
        '{name}.pdf'
    shell:
        """
        pdflatex {wildcards.name}
        pdflatex {wildcards.name}
        """

rule texclean:
    shell:
        "rm -f *.log *.aux *.bbl *.blg *.synctex.gz"

Note how we distinguish between a .tex file with and without a corresponding .bib file of the same name. Assuming that both paper.tex and paper.bib exist, an ambiguity arises: both rules are, in principle, applicable. This would lead to an AmbiguousRuleException, but since we have specified an explicit rule order in the file, the rule tex2pdf_with_bib is preferred in this case. If paper.bib does not exist, that rule is not applicable at all, and the only option is to execute rule tex2pdf_without_bib.
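
The decision described above can be sketched in plain Python. This is a simplified illustration of the two-rule case only, under the assumption that inputs are plain files on disk; Snakemake's actual rule resolution is more general (inputs may also be produced by other rules), and the function name choose_tex_rule is hypothetical.

```python
def choose_tex_rule(name, existing_files):
    """Pick the rule that would build <name>.pdf, given the files present.

    Hypothetical helper for illustration; not Snakemake's implementation.
    """
    has_tex = name + ".tex" in existing_files
    has_bib = name + ".bib" in existing_files
    if not has_tex:
        return None  # neither rule can produce the PDF
    if has_bib:
        # Both rules apply; the ruleorder directive prefers tex2pdf_with_bib.
        return "tex2pdf_with_bib"
    # Without the .bib file, only the second rule is applicable.
    return "tex2pdf_without_bib"


print(choose_tex_rule("paper", {"paper.tex", "paper.bib"}))  # tex2pdf_with_bib
print(choose_tex_rule("paper", {"paper.tex"}))               # tex2pdf_without_bib
```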

Assuming that the above file is saved as tex.rules, the actual documents are then built from a specific Snakefile that includes these common rules:

DOCUMENTS = ['document', 'response-to-editor']
TEXS = [doc+".tex" for doc in DOCUMENTS]
PDFS = [doc+".pdf" for doc in DOCUMENTS]
FIGURES = ['fig1.pdf']

include:
    'tex.rules'

rule all:
    input:
        PDFS

rule zipit:
    output:
        'upload.zip'
    input:
        TEXS, FIGURES, PDFS
    shell:
        'zip -T {output} {input}'

rule pdfclean:
    shell:
        "rm -f {PDFS}"
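
As a quick check of what these list comprehensions produce, the sketch below evaluates them in plain Python and renders the command that rule zipit would run. It assumes that {input} expands to the input files in declaration order, separated by spaces; the helper render_zip_command is hypothetical and only imitates that substitution.

```python
DOCUMENTS = ["document", "response-to-editor"]
TEXS = [doc + ".tex" for doc in DOCUMENTS]
PDFS = [doc + ".pdf" for doc in DOCUMENTS]
FIGURES = ["fig1.pdf"]


def render_zip_command(output="upload.zip"):
    """Return the zip command for the upload archive.

    Hypothetical helper: assumes the inputs are joined with spaces in
    the order TEXS, FIGURES, PDFS, as listed in rule zipit.
    """
    inputs = TEXS + FIGURES + PDFS
    return "zip -T {} {}".format(output, " ".join(inputs))


print(render_zip_command())
# zip -T upload.zip document.tex response-to-editor.tex fig1.pdf document.pdf response-to-editor.pdf
```

Because the PDFs are inputs of zipit, requesting upload.zip also triggers the LaTeX builds when the PDFs are missing or out of date.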

The user can thus perform four different tasks. Build all PDFs:

$ snakemake

Create a zip-file for online submissions:

$ snakemake zipit

Clean up all PDFs:

$ snakemake pdfclean

Clean up LaTeX temporary files:

$ snakemake texclean

The following DAG of jobs would be executed upon a full run:

[Figure: LaTeX workflow DAG]