Running Array Jobs

A job array lets you run a group of identical or similar jobs. The Slurm script is exactly the same for every sub-job; the only difference between sub-jobs is the value of the environment variable $SLURM_ARRAY_TASK_ID. This makes job arrays a good fit for data-level parallelization: for example, sub-job 1 (SLURM_ARRAY_TASK_ID=1) processes data chunk 1, sub-job 2 processes data chunk 2, and so on.

To do that, add the following directive to your submission script:

#SBATCH --array=1-20

Or you can set it at submission time, without modifying your submission script:

sbatch --array=1-20 job.script
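
As an illustration of the data-chunk idea described above, here is a minimal sketch of a complete submission script. The job name, the data/chunk_N.txt naming scheme, and the process_chunk command are hypothetical placeholders, and the fallback value of 1 is only there so the script can be tried outside of Slurm.

```shell
#!/bin/bash
#SBATCH --job-name=array-demo   # hypothetical job name
#SBATCH --array=1-20            # 20 sub-jobs with task IDs 1..20

# Slurm sets SLURM_ARRAY_TASK_ID for each sub-job. The default of 1 is an
# assumption so the script also runs outside of Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

# Hypothetical naming scheme: one input file per task ID.
chunk_file="data/chunk_${task_id}.txt"

echo "Task ${task_id} processing ${chunk_file}"
# ./process_chunk "${chunk_file}"   # placeholder for the real command
```

Every sub-job runs this same script; only the value of task_id, and therefore chunk_file, differs.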

In Slurm, a job array is implemented as a group of single jobs. For example, if you submit an array job with #SBATCH --array=1-4 and the first job is assigned ID 1000, the IDs of the jobs are 1000, 1001, 1002, and 1003.
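
To tell the sub-jobs apart from inside the script, Slurm also exports SLURM_ARRAY_JOB_ID (the ID of the first job in the array) and SLURM_JOB_ID (the ID of the current sub-job), and the %A and %a patterns in --output expand to the array job ID and the task ID. The sketch below only prints these values; the output file name and the fallback values are assumptions chosen so the script can be run outside of Slurm.

```shell
#!/bin/bash
#SBATCH --array=1-4
#SBATCH --output=array_%A_%a.out   # %A = array job ID, %a = array task ID

# Fallback values are assumptions for running outside of Slurm.
array_job_id="${SLURM_ARRAY_JOB_ID:-1000}"
task_id="${SLURM_ARRAY_TASK_ID:-1}"
job_id="${SLURM_JOB_ID:-1000}"

# Label in the form <array job ID>_<task ID>, e.g. 1000_1.
label="${array_job_id}_${task_id}"
echo "Sub-job ${label} runs as job ID ${job_id}"
```

The <array job ID>_<task ID> form is also how Slurm tools such as squeue and scancel refer to a single sub-job.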

Note: There is a limit of 1000 jobs per array. Slurm also rejects array IDs above this limit. This can be worked around by adding an offset in the script (e.g. for "1001-1020", use --array=1-20 and reference $((1000 + SLURM_ARRAY_TASK_ID)) instead of $SLURM_ARRAY_TASK_ID). A plain string prefix such as 10$SLURM_ARRAY_TASK_ID only works for two-digit IDs, because Slurm does not zero-pad the task ID.
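
One way to sketch the workaround from the note is an arithmetic offset, which does not depend on how many digits the task ID has. The offset of 1000 matches the "1001-1020" example; the variable names and the fallback task ID are assumptions for illustration.

```shell
#!/bin/bash
#SBATCH --array=1-20

# Offset added to every task ID so that tasks 1..20 map to items 1001..1020.
offset=1000

# Fallback of 1 is an assumption for running outside of Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"

real_id=$(( offset + task_id ))
echo "Array task ${task_id} handles item ${real_id}"
```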

