Nextflow
Below is a simple two-stage Nextflow pipeline that runs each stage as a Slurm job and passes a file from the first stage to the second. More information: https://www.nextflow.io/docs/latest/reference/process.html
workflow {
    // Run step 1, then feed its output file to step 2
    step1_output = processStep1()
    processStep2(step1_output)
}
// Define the first process
process processStep1 {
    // Slurm resources requested for this stage
    executor 'slurm'
    queue 'short'
    cpus 1
    memory '1 GB'
    time '1h 2m'

    output:
    // 'path' replaces the deprecated 'file' qualifier in DSL2
    path 'step1output.txt'

    script:
    """
    echo "Running Step 1"
    echo "Step 1 complete" > step1output.txt
    """
}
// Define the second process
process processStep2 {
    // Slurm resources requested for this stage
    executor 'slurm'
    queue 'test'
    cpus 2
    memory '2 GB'
    time '2m'

    input:
    path step1_file

    script:
    """
    echo "Running Step 2"
    echo ""
    echo "Contents of input file:"
    cat ${step1_file}
    echo ""
    echo "Step 2 complete"
    """
}
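The Slurm settings can also be kept out of the pipeline script. The following is a minimal sketch, not a required setup: it assumes a file named nextflow.config in the same directory as the pipeline, and it reuses the queue names and resource values from the example above. If you use it, remove the executor, queue, cpus, memory and time directives from the two processes so that each setting is defined in only one place.

```groovy
// nextflow.config (assumed to sit next to the pipeline script)
process {
    // Default for every process: submit it as a Slurm job
    executor = 'slurm'

    // Resources for the first stage, selected by process name
    withName: 'processStep1' {
        queue  = 'short'
        cpus   = 1
        memory = '1 GB'
        time   = '1h 2m'
    }

    // Resources for the second stage
    withName: 'processStep2' {
        queue  = 'test'
        cpus   = 2
        memory = '2 GB'
        time   = '2m'
    }
}
```

Nextflow reads nextflow.config from the launch directory automatically, so the run command does not change.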
The main pipeline script does very little work itself; the processes it launches do the heavy lifting as Slurm jobs, so the script can be started from a login server. Nextflow creates and monitors the Slurm jobs.
Put the pipeline in a file named pipeline.nf and run:
$ module load nextflow
$ nextflow run ./pipeline.nf
N E X T F L O W ~ version 24.10.4
Launching `./pipeline.nf` [nauseous_wilson] DSL2 - revision: c96d1353cb
executor > slurm (2)
[00/5bf734] processStep1 | 1 of 1 ✔
[11/fa1948] processStep2 | 1 of 1 ✔
If the entire pipeline will run longer than our 7-day limit, you must launch it from a login server and run Nextflow in background mode:
$ nextflow run -bg ./pipeline.nf
Then you can log out without interrupting Nextflow.