Understanding Slurm¶
Before you begin¶
- Obtain a Mila account, enable cluster access and MFA, install uv and milatools, configure SSH access, and connect to the cluster for the first time.
What this guide covers¶
- Discovering Slurm jobs, steps and tasks
- Launching multiple tasks through an interactive job
- Launching multiple tasks from a script
Key concepts¶
Jobs, steps and tasks¶
Three entities come up again and again when working with Slurm: jobs, steps and tasks. As a simplified picture:
- a job can have multiple steps
- a step can run multiple tasks.
Check the technical reference for more details.
Login nodes and compute nodes¶
We will not go into detail here, since these concepts are explained in What is a computer cluster?. In short, we focus on the following two types of nodes:
| Type of node | Use |
|---|---|
| Login node | Used to connect to the cluster and manage your jobs |
| Compute node | Where jobs run; the resources requested when a job is launched are allocated from these nodes |
Do not run jobs on login nodes
Login nodes are entry points to the cluster. You can call Slurm commands from there (sbatch, sinfo, squeue, etc.), but compute scripts should be submitted through Slurm so that they get the requested resources, rather than being run directly on the login nodes.
Commands¶
The three Slurm commands we focus on are:
| Command | Entity created | Description | Where to run the command |
|---|---|---|---|
| sbatch | Batch Slurm job | Submit a batch script to Slurm | From a login node |
| salloc | Interactive job | Obtain a Slurm job allocation (a set of nodes), execute a command, and then release the allocation when the command is finished | From a login node |
| srun | Step | Run tasks | From a job |
Submitting tasks is done in two steps:
- Request a resource allocation by submitting a job (sbatch or salloc)
- Launch commands by launching tasks from this resource allocation (srun)
Discover Slurm through an interactive job¶
1. Connect to your favorite cluster (see this section for more information on the connection)
Open a terminal and run the command ssh mila.
Here, we arbitrarily choose to connect to the Mila cluster. The command ssh mila works thanks to the configuration previously written to ~/.ssh/config, which can be set up with mila init (see the Getting Started guide).
2. Submit a job
Submitting a job is like making a booking: you specify the resources you want (GPUs, CPUs, nodes, memory), which sets the conditions of your experiment. The Slurm scheduler is then in charge of providing you with an allocation.
salloc: --------------------------------------------------------------------------------------------------
salloc: # Using default long-cpu partition (CPU-only)
salloc: --------------------------------------------------------------------------------------------------
salloc: Pending job allocation 9311988
salloc: job 9311988 queued and waiting for resources
salloc: Granted job allocation 9311988
salloc: Nodes cn-f[001-002] are ready for job
Once the allocation is granted, you get some information about your job:
- you know what your Job ID is (9311988 in this example)
- you know on which nodes your allocation is (cn-f001 and cn-f002 in this example).
Congrats! You now have a resource allocation.
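The same details can also be read from inside the allocation. The following is a minimal sketch, assuming the standard Slurm environment variables SLURM_JOB_ID, SLURM_JOB_NODELIST and SLURM_NTASKS are exported in your session; the function name describe_allocation is hypothetical and chosen for illustration only.

```shell
#!/usr/bin/env bash
# Print the allocation details that Slurm exports as environment variables.
# Outside of a Slurm job these variables are unset, so a placeholder is shown.
describe_allocation() {
    echo "Job ID:    ${SLURM_JOB_ID:-<not in a Slurm job>}"
    echo "Node list: ${SLURM_JOB_NODELIST:-<not in a Slurm job>}"
    echo "Tasks:     ${SLURM_NTASKS:-<not set>}"
}

describe_allocation
```

Run from a regular terminal, this only prints the placeholders; run from within the interactive job, it should report the job ID and node list announced by salloc.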
About the command and its parameters:
- salloc means this is an interactive job
- --ntasks means that srun will invoke 4 tasks
- --nodes means 2 nodes are requested for the previously mentioned tasks to run on
- --mem specifies the real memory required per node. We could also set --mem-per-gpu or --mem-per-cpu
- --time asks for an allocation of 30 minutes. Setting it is good practice because an interactive job can last up to one week, and forgetting to exit an interactive job is a common mistake.
See salloc documentation for more information.
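Putting the flags described above together, a request along these lines could be sketched as follows. The memory amount is not stated in this guide, so --mem=4G is an assumption, and the wrapper function request_allocation is a hypothetical name used only to keep the example safe to paste outside a cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wraps the salloc call with the flags explained above.
# --mem=4G is an assumed value; adjust it to what your tasks need.
request_allocation() {
    salloc --ntasks=4 --nodes=2 --mem=4G --time=00:30:00 "$@"
}

# Only attempt the request where salloc exists (i.e. on a cluster login node).
if command -v salloc >/dev/null 2>&1; then
    request_allocation
else
    echo "salloc not found: run this from a cluster login node."
fi
```

On a login node, this is equivalent to typing the salloc line directly; the guard simply avoids an error on machines without Slurm.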
3. Understand where we are
By running the command hostname, you can see "where" the process calling the command runs:
Now, let's try to run steps and tasks by using srun:
Each task returned its own result for the hostname command.
In this example, we can see that:
- three tasks have been launched on the node cn-f002
- one task has been launched on the node cn-f001
Tip
For more symmetrical jobs, you can use the --ntasks-per-node parameter instead of --ntasks.
(For instance, --ntasks-per-node=2 in this case.)
- Note on the command:
    - We used srun hostname, which follows the format srun <command>. srun can also take parameters, in the format srun <parameters> <command>. See srun documentation for more details.
- Notes on the result:
    - The hostname command has been called four times because we asked for four tasks when submitting the job through salloc.
    - By running our four tasks with srun, we can see that they are not necessarily evenly spread among the nodes.
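To see at a glance how tasks were spread across nodes, the hostnames can be counted. This is a small sketch using standard Unix tools; the function name count_tasks_per_node is hypothetical, and the sample input only mirrors the distribution described above (the order of the lines is illustrative).

```shell
#!/usr/bin/env bash
# Count how many tasks reported each hostname (one hostname per input line).
count_tasks_per_node() {
    sort | uniq -c | awk '{ printf "%s: %d task(s)\n", $2, $1 }'
}

# Sample input: three tasks on cn-f002 and one on cn-f001.
printf '%s\n' cn-f002 cn-f001 cn-f002 cn-f002 | count_tasks_per_node
# prints:
# cn-f001: 1 task(s)
# cn-f002: 3 task(s)
```

Inside an allocation, the same function can be fed real data with srun hostname | count_tasks_per_node.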
Launch a non-interactive job¶
In this section, we reproduce the previous example, with the same parameters and the same command (hostname), but submit the job through the sbatch command.
1. Connect to your favorite cluster
2. Write the script
You can either:
- write the script directly on the login node (in your $SCRATCH directory or one of its subdirectories), or
- write it on your local computer and copy it to your scratch directory, replacing user.name with your own username in the copy command.
The content of job.sh is:
At the beginning of the script, we add the same parameters we used when running salloc for our interactive job.
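As a rough sketch of what such a script can look like, the parameters go in #SBATCH comment lines at the top, followed by the srun call. The --mem value is an assumption (the amount is not stated here), and the main function and the srun guard are illustrative additions so that the file does not fail when executed outside Slurm.

```shell
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --nodes=2
#SBATCH --mem=4G
#SBATCH --time=00:30:00
# Note: --mem=4G above is an assumed value, not taken from the original example.

main() {
    # srun launches one step that runs hostname once per task (4 tasks here).
    srun hostname
}

if command -v srun >/dev/null 2>&1; then
    main
else
    echo "srun not found: submit this script with sbatch on a cluster." >&2
fi
```

The #SBATCH lines are ordinary comments for bash, so only sbatch interprets them; this is why the same flags can be written once in the file instead of on the command line.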
3. Launch the command
From the login node, run sbatch job.sh.
Now that the job is submitted, all that is left to do is to wait for it to be scheduled. You can check its status by running the squeue command.
Here, sbatch requests the allocation based on the script parameters. Once the allocation is ready, the script is executed automatically (i.e. the job is running), and the allocation is freed at the end of the job.
4. Retrieve the results
Once the job is finished, its output can be retrieved by reading the file slurm-<JOB_ID>.out. (In our case, the file name is slurm-9321166.out.) The file name can be changed with the --output parameter.
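The default file name can be derived from the job ID, as in this small sketch. The helper name slurm_output_file is hypothetical, and the sketch assumes the default naming scheme described above (it does not apply if --output was set).

```shell
#!/usr/bin/env bash
# Build the default Slurm output file name for a given job ID.
slurm_output_file() {
    printf 'slurm-%s.out\n' "$1"
}

slurm_output_file 9321166   # prints: slurm-9321166.out
```

From the directory where the job was submitted, cat "$(slurm_output_file 9321166)" would then display the output of the job.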
The content of the output in our example is:
Key concepts¶
- Job: the set of commands executed within a requested resource allocation.
- Task: a set of commands running on part of an allocation. A job can contain multiple tasks.
Next step¶
Now that you can launch multiple commands (such as hostname) in a Slurm job, each with its own resources and environment variables, you are ready to learn the fundamentals of distributed programs, such as distributed training in PyTorch.
- Synchronize the output of multiple tasks on different nodes.