Running Analyses on the Server

Facility BioData Hub

1.1 Introduction

Bioinformatics analyses require significant computational resources, and many users may be working on the server simultaneously. Therefore, every user must run their work through the scheduling system.

This guide explains how to use SLURM, one of the most widely used job schedulers on HPC clusters.


1.2 Creating a SLURM Script

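#!/bin/bash
#SBATCH --job-name=example_name   # name shown in the queue
#SBATCH --output=output_%j.txt    # standard output file (%j = job ID)
#SBATCH --error=log_%j.error      # standard error file
#SBATCH --ntasks=1                # number of tasks
#SBATCH --time=00:05:00           # time limit (HH:MM:SS)
#SBATCH --mem=1G                  # memory for the job

# The commands to run go below the #SBATCH lines (placeholder command)
echo "Running on $(hostname)"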

So, what needs to be configured? Start by assigning a name to the job, which helps you recognize it in the queue. Example:

#SBATCH --job-name=genomic_analysis

Specify where to save the standard output (stdout):

#SBATCH --output=logs/output_%j.txt

You can also separate output and errors:

#SBATCH --error=logs/error_%j.err
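The paths above point into a logs/ directory. SLURM generally does not create missing directories for output files, so it is safest to create the directory before submitting. A minimal sketch, where the job ID 12345 is a made-up value used only to preview the resulting file names:

```shell
# Create the log directory used by --output and --error before submitting
mkdir -p logs

# Preview the file names a job would write (12345 is a hypothetical job ID)
job_id=12345
echo "logs/output_${job_id}.txt"
echo "logs/error_${job_id}.err"
```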

%j is automatically replaced with the job ID, so you'll get unique files for each run.

Specify how many tasks (instances of the program) to run:

#SBATCH --ntasks=1

--ntasks=1 → serial job

--ntasks=8 → parallel job
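Inside a running job, SLURM exports the task count in the SLURM_NTASKS environment variable when --ntasks is given. The sketch below reads it and falls back to 1, so the same lines also run outside the scheduler; the variable names ntasks and mode are illustrative only:

```shell
# Read the task count exported by SLURM; default to 1 outside a job
ntasks="${SLURM_NTASKS:-1}"

if [ "$ntasks" -eq 1 ]; then
    mode="serial"
else
    mode="parallel"
fi

echo "Running as a $mode job with $ntasks task(s)"
```

Note that requesting several tasks only reserves the resources; the program itself still has to be started in parallel (for example through srun or MPI) to make use of them.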

Set the job time limit:

#SBATCH --time=02:00:00   # 2 hours
#SBATCH --time=5-00:00:00 # 5 days

If your job runs past its time limit, SLURM will terminate it.
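To make the two time formats concrete, the sketch below converts them to seconds. slurm_time_to_seconds is a hypothetical helper, not a SLURM command, and it only handles the HH:MM:SS and D-HH:MM:SS forms shown in this guide:

```shell
# Convert a SLURM time limit to seconds (hypothetical helper)
slurm_time_to_seconds() {
    echo "$1" | awk -F'[-:]' '
        NF == 4 { print $1 * 86400 + $2 * 3600 + $3 * 60 + $4 }   # D-HH:MM:SS
        NF == 3 { print $1 * 3600 + $2 * 60 + $3 }                # HH:MM:SS
    '
}

slurm_time_to_seconds 02:00:00     # 2 hours
slurm_time_to_seconds 5-00:00:00   # 5 days
```

The two calls print 7200 and 432000, the number of seconds in 2 hours and in 5 days.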

Request 1 GB of RAM for the job (--mem is counted per node, so for a single-node job this is the total):

#SBATCH --mem=1G

Alternatively, you can specify memory per CPU (use either --mem or --mem-per-cpu, not both):

#SBATCH --mem-per-cpu=2G
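With --mem-per-cpu, the total memory scales with the number of CPUs allocated. A rough calculation, assuming 8 tasks with one CPU each (the values here are illustrative, not taken from a real job):

```shell
# Hypothetical request: --mem-per-cpu=2G with --ntasks=8
mem_per_cpu_gb=2
ntasks=8
cpus_per_task=1   # one CPU per task unless --cpus-per-task says otherwise

total_gb=$(( mem_per_cpu_gb * ntasks * cpus_per_task ))
echo "Total memory requested: ${total_gb}G"
```

In this case the job asks for 16G in total, compared with the fixed 1G requested through --mem=1G.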

1.3 Useful SLURM Commands

Command   Description                           Example
squeue    Show the queue of active jobs         squeue -u $USER
sbatch    Submit a script as a job              sbatch script.sh
scancel   Cancel a running job                  scancel 12345
sinfo     Display node status                   sinfo
sacct     Show the history of completed jobs    sacct -j 12345
scontrol  Inspect or modify jobs                scontrol show job 12345
sreport   Generate cluster usage reports        sreport cluster utilization
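Before using these commands, it can help to confirm that they are available in your session. The sketch below checks each one with the shell builtin command -v; check_cmd is a hypothetical helper written for this example:

```shell
# Report whether a command is on the PATH (hypothetical helper)
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: available"
    else
        echo "$1: not found"
    fi
}

for cmd in squeue sbatch scancel sinfo sacct scontrol sreport; do
    check_cmd "$cmd"
done
```

If any command is reported as not found, you are probably not logged in to a node where the SLURM client tools are installed.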

1.4 Complete Example

Submit a job with sbatch:

sbatch my_job.sh
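On success, sbatch prints a line of the form "Submitted batch job <ID>". The sketch below extracts the job ID from such a line so it can be reused in later commands; the sample line is hard-coded here so the snippet runs without a scheduler:

```shell
# Sample line in the format sbatch prints on success (hard-coded for illustration)
submit_output="Submitted batch job 12345"

# Keep only the text after the last space, which is the job ID
job_id="${submit_output##* }"
echo "$job_id"
```

On the server you could capture the real line with submit_output=$(sbatch my_job.sh). sbatch also has a --parsable option that prints the job ID without the surrounding text.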

Check your job in the list:

squeue -u $USER

You will get something like:

JOBID PARTITION NAME    USER    STATE     TIME  NODES CPUS  MIN_MEMORY  NODELIST(REASON)
12345 longer    debug   user1   PENDING   00:00   1     1       1G          (Priority)
12346 service     map   user1   RUNNING   03:20   2     5       3G          node2
12347 service   debug   user1   RUNNING   56:09   1     1       1G          node1
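The STATE column is usually the first thing to check. As an illustration, the sketch below filters a hard-coded copy of this listing (abridged to its first five columns) and prints the IDs of the running jobs; the variable names are chosen for this example only:

```shell
# Sample squeue output, abridged to the first five columns
sample_queue='JOBID PARTITION NAME USER STATE
12345 longer debug user1 PENDING
12346 service map user1 RUNNING
12347 service debug user1 RUNNING'

# Skip the header row and print the job ID of every RUNNING job
running_ids=$(printf '%s\n' "$sample_queue" | awk 'NR > 1 && $5 == "RUNNING" { print $1 }')
echo "$running_ids"
```

On the server, squeue can apply the same filter directly with its -t option, for example squeue -u $USER -t RUNNING.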

You can copy the JOBID and view the job's details:

sacct -j 12345

Or cancel your job:

scancel 12345

You can check the output and the error files in the designated folder:

cat path/to/log/error_12345.error
cat path/to/log/output_12345.txt


