Facility BioData Hub
Bioinformatics analyses require significant computational resources, and many users may be working on the server simultaneously. Therefore, each user must run their work through the scheduling system.
This guide explains how to use SLURM, one of the most widely used job schedulers on HPC clusters.
#!/bin/bash
#SBATCH --job-name=example_name
#SBATCH --output=output_%j.txt
#SBATCH --error=log_%j.error
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --mem=1G

So, what needs to be configured?

Assign a name to the job; this helps you recognize it in the queue. Example: `#SBATCH --job-name=example_name`

Specify where to save the standard output (stdout): `#SBATCH --output=output_%j.txt`

You can also separate output and errors by sending stderr to its own file: `#SBATCH --error=log_%j.error`

%j is automatically replaced with the job ID, so you'll get unique files for each run.

Indicate how many instances of the program to run:
`--ntasks=1` → serial job
`--ntasks=8` → parallel job
Set the job time limit, here in `HH:MM:SS` format: `#SBATCH --time=00:05:00`

If your job runs longer than this limit, SLURM will terminate it.
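Since a job is terminated once it passes its limit, it can help to sanity-check the value you request. The sketch below converts an `HH:MM:SS` limit into seconds in plain Bash. The function name `time_limit_to_seconds` is made up for this illustration, and it handles only the `HH:MM:SS` form used in the script above, not the other time formats SLURM accepts.

```bash
#!/bin/bash
# Hypothetical helper (not part of SLURM): convert an HH:MM:SS limit to seconds.
time_limit_to_seconds() {
    local h m s
    # Split the value on ':' into hours, minutes, and seconds.
    IFS=: read -r h m s <<< "$1"
    # 10# forces base 10 so values such as "08" are not parsed as octal.
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# The limit requested in the example script:
time_limit_to_seconds "00:05:00"   # prints 300
```

A job submitted with `--time=00:05:00` therefore has 300 seconds before it is stopped.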
Request 1 GB of RAM for the job (SLURM counts `--mem` per node): `#SBATCH --mem=1G`

You can also specify memory per CPU, using `--mem-per-cpu` in place of `--mem`.
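The difference between the two memory options is easiest to see with a small calculation. The following sketch multiplies a per-CPU request by the number of CPUs, assuming one CPU per task (the default when `--cpus-per-task` is not set). The name `total_mem_gb` is invented here and the 2 GB figure is purely illustrative.

```bash
#!/bin/bash
# Hypothetical helper: total memory (in GB) implied by a --mem-per-cpu request.
# Arguments: number of CPUs, memory per CPU in GB.
total_mem_gb() {
    local cpus=$1 mem_per_cpu_gb=$2
    echo $(( cpus * mem_per_cpu_gb ))
}

# Example: --ntasks=8 with --mem-per-cpu=2G, one CPU per task (assumed values).
echo "Total memory requested: $(total_mem_gb 8 2) GB"
```

With these assumed values the job asks for 16 GB overall, whereas the same job with `--mem=2G` would ask for 2 GB on each node it runs on.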
| Command | Description | Example |
|---|---|---|
| squeue | Show the queue of active jobs | squeue -u $USER |
| sbatch | Submit a script as a job | sbatch script.sh |
| scancel | Cancel a running job | scancel 12345 |
| sinfo | Display node status | sinfo |
| sacct | Show the history of completed jobs | sacct -j 12345 |
| scontrol | Inspect or modify jobs | scontrol show job 12345 |
| sreport | Generate cluster usage reports | sreport cluster utilization |
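
Before relying on these commands, it can be useful to confirm that they are available on the machine you are logged into. The loop below is a minimal sketch that only checks the `PATH` with the shell builtin `command -v`; it does not contact the scheduler. The function name `check_slurm_commands` is an assumption made for this example.

```bash
#!/bin/bash
# Report which of the SLURM commands from the table are on the PATH.
check_slurm_commands() {
    local cmd
    for cmd in squeue sbatch scancel sinfo sacct scontrol sreport; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd: found"
        else
            echo "$cmd: not found"
        fi
    done
}

check_slurm_commands
```

If every line reads "not found", you are probably not on a login node of the cluster, or the SLURM environment has not been loaded.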
Submit a job with sbatch: `sbatch script.sh`
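
On success, `sbatch` prints a line of the form `Submitted batch job <id>`. The sketch below pulls the ID out of that line so it can be reused in later commands. `parse_job_id` is a hypothetical helper, and the message is hard-coded so the example runs without a cluster.

```bash
#!/bin/bash
# Hypothetical helper: extract the job ID from sbatch's confirmation message.
parse_job_id() {
    # The ID is the last whitespace-separated field of the message.
    awk '{ print $NF }' <<< "$1"
}

# Hard-coded sample message so the sketch runs without a scheduler.
message="Submitted batch job 12345"
job_id=$(parse_job_id "$message")
echo "Job ID: $job_id"   # prints: Job ID: 12345
```

On a real system, `sbatch --parsable script.sh` prints the job ID without the surrounding text, which avoids the parsing step.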
Check your job in the list: `squeue -u $USER`

And you will get something like:
JOBID PARTITION NAME USER STATE TIME NODES CPUS MIN_MEMORY NODELIST(REASON)
12345 longer debug user1 PENDING 00:00 1 1 1G node1
12346 service map user1 RUNNING 03:20 2 5 3G node2
12347 service debug user1 RUNNING 56:09 1 1 1G node1

You can copy the JOBID and view the details of it: `scontrol show job 12345`
Or kill your job: `scancel 12345`
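
When several jobs are listed, the IDs to inspect or cancel can be picked out by state instead of copied by hand. This sketch filters a listing like the sample above with `awk`, assuming the same column order (state in the fifth column). `jobs_in_state` is a name invented for the example, and the listing is embedded as text so it runs without a cluster.

```bash
#!/bin/bash
# Sample listing copied from the squeue output shown above.
listing='JOBID PARTITION NAME USER STATE TIME NODES CPUS MIN_MEMORY NODELIST(REASON)
12345 longer debug user1 PENDING 00:00 1 1 1G node1
12346 service map user1 RUNNING 03:20 2 5 3G node2
12347 service debug user1 RUNNING 56:09 1 1 1G node1'

# Hypothetical helper: print the JOBID of every job in the given state.
# Reads the listing on stdin and skips the header line (NR > 1).
jobs_in_state() {
    awk -v state="$1" 'NR > 1 && $5 == state { print $1 }'
}

# Prints 12346 and 12347, one per line.
jobs_in_state RUNNING <<< "$listing"
```

Each printed ID can then be passed to `scontrol show job` or `scancel`.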
You can check the output and error files in the designated folder. For job 12345 and the script above, these are `output_12345.txt` and `log_12345.error`.
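
Because `%j` expands to the job ID, the file names can be worked out from the ID alone. The sketch below imitates that substitution with `sed`, using the patterns from the example script. `expand_job_pattern` is a hypothetical helper written for this guide; SLURM performs the real substitution itself when the job starts.

```bash
#!/bin/bash
# Hypothetical helper: replace every %j in a filename pattern with a job ID,
# mimicking what SLURM does for --output and --error.
expand_job_pattern() {
    local pattern=$1 job_id=$2
    printf '%s\n' "$pattern" | sed "s/%j/${job_id}/g"
}

job_id=12345
out_file=$(expand_job_pattern "output_%j.txt" "$job_id")
err_file=$(expand_job_pattern "log_%j.error" "$job_id")
echo "$out_file"   # output_12345.txt
echo "$err_file"   # log_12345.error

# Show each file if it exists in the current directory.
for f in "$out_file" "$err_file"; do
    if [ -f "$f" ]; then
        cat "$f"
    else
        echo "$f: not found here"
    fi
done
```

An empty error file usually means the job wrote nothing to stderr, so start with the output file when checking results.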