
Run Your First Job

This guide walks through running a first job on the Mila cluster: create a minimal PyTorch project that checks CUDA and GPU availability, and develop it in VSCode on a compute node using the mila code command from milatools.

Before you begin

  • Get Started with the Cluster: obtain a Mila account, enable cluster access and MFA, install uv and milatools, configure SSH access, and connect to the cluster for the first time.

VSCode or compatible editor

The mila code command opens VSCode (or a compatible editor such as Cursor) on a compute node. Install VSCode on a personal computer before starting.

What this guide covers

  • Open VSCode on a compute node with one GPU using mila code.
  • Create a minimal PyTorch project with pyproject.toml and main.py.
  • Run the script with uv run python main.py in VSCode.

Open VSCode on a compute node

Create the project directory on the cluster

From a personal computer, create the project directory on the cluster so that mila code can open it. The path is relative to the home directory on the cluster:

ssh mila 'mkdir -p CODE/my_first_job'

Start VSCode on a GPU node

Run mila code with allocation options to request one GPU. This allocates a compute node and opens VSCode at the project path. Everything after --alloc is passed to Slurm (to salloc, as the log output shows):

mila code CODE/my_first_job --alloc --gres=gpu:1 --cpus-per-task=2 --mem=16G --time=01:00:00
[17:35:21] Checking disk quota on $HOME...                                                                                                disk_quota.py:31
[17:35:27] Disk usage: 85.34 / 100.00 GiB and 794022 / 1048576 files                                                                      disk_quota.py:211
[17:35:29] (mila) $ cd $SCRATCH && salloc --gres=gpu:1 --cpus-per-task=2 --mem=16G --time=01:00:00 --job-name=mila-code                   compute_node.py:293
salloc: --------------------------------------------------------------------------------------------------
salloc: # Using default long partition
salloc: --------------------------------------------------------------------------------------------------
salloc: Granted job allocation 8888888
[17:35:30] Waiting for job 8888888 to start.                                                                                              compute_node.py:315
[17:35:31] (localhost) $ code --new-window --wait --remote ssh-remote+cn-a003.server.mila.quebec /home/mila/u/username/CODE/my_first_job  local_v2.py:55

Wait until the allocation is granted and VSCode opens, connected to a shell on the compute node.
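
The command above can also be assembled programmatically, which makes the split between the mila code arguments and the Slurm options explicit. The following is a minimal sketch: the helper name build_mila_code_command and its parameters are hypothetical and not part of milatools, and it uses only the flags shown in this guide.

```python
import shlex


def build_mila_code_command(
    path: str,
    gpus: int = 1,
    cpus_per_task: int = 2,
    mem: str = "16G",
    time: str = "01:00:00",
) -> list[str]:
    """Return the argument list for the `mila code` call used in this guide.

    Hypothetical helper for illustration; it is not part of milatools.
    """
    # Arguments interpreted by `mila code` itself.
    command = ["mila", "code", path]
    # Everything after --alloc is forwarded to Slurm.
    slurm_options = [
        f"--gres=gpu:{gpus}",
        f"--cpus-per-task={cpus_per_task}",
        f"--mem={mem}",
        f"--time={time}",
    ]
    return command + ["--alloc"] + slurm_options


command = build_mila_code_command("CODE/my_first_job")
print(shlex.join(command))
```

With the default values, the printed command matches the one shown above. Changing a parameter such as time only affects the part forwarded to Slurm.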

VSCode stuck on "Opening Remote..."?

If VSCode appears to hang while connecting, it may be waiting for an OTP code in a hidden terminal. Enable Remote.SSH: Show Login Terminal in VSCode settings to make the prompt visible:

  1. Open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P).
  2. Run Preferences: Open Settings (UI).
  3. Search for remote.SSH.showLoginTerminal and enable it.

Re-run the mila code command after enabling the setting.

Run a script in VSCode

Create the project files

In VSCode, create the following two files in the project folder (for example, from the Explorer or via File → New File). The files are created in the project directory on the cluster, not on the personal computer.

main.py
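import torch
import torch.backends.cuda


def main():
    """Report whether PyTorch was built with CUDA and which GPUs it detects."""
    # True if this PyTorch build was compiled with CUDA support.
    cuda_built = torch.backends.cuda.is_built()
    # True if PyTorch can actually use CUDA on this node.
    cuda_avail = torch.cuda.is_available()
    # Number of GPUs visible to this process.
    device_count = torch.cuda.device_count()

    print(f"PyTorch built with CUDA:         {cuda_built}")
    print(f"PyTorch detects CUDA available:  {cuda_avail}")
    print(f"PyTorch-detected #GPUs:          {device_count}")
    if device_count == 0:
        print("    No GPU detected, not printing devices' names.")
    else:
        # List the name of each visible GPU.
        for i in range(device_count):
            print(f"    GPU {i}:      {torch.cuda.get_device_name(i)}")


if __name__ == "__main__":
    main()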

pyproject.toml
[project]
name = "pytorch-setup"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.11,<3.14"
dependencies = ["torch>=2.7.1"]
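
The requires-python field limits which interpreters may be used for the project: with the range above, Python 3.11, 3.12, and 3.13 qualify, while 3.10 and 3.14 do not. The sketch below checks that range by hand for illustration only. The function name is hypothetical, and uv applies its own, more complete version-specifier rules.

```python
import sys


def python_is_supported(version: tuple[int, int]) -> bool:
    """Check a (major, minor) version against requires-python = ">=3.11,<3.14"."""
    # Tuple comparison is lexicographic, so (3, 12) falls between the bounds.
    return (3, 11) <= version < (3, 14)


current = sys.version_info[:2]
print(f"Python {current[0]}.{current[1]} supported: {python_is_supported(current)}")
```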

Run the script in the VSCode terminal

Open the integrated terminal in VSCode (Terminal → New Terminal). You are on the compute node. From the project directory, run:

uv run python main.py
Using CPython 3.12.11
Creating virtual environment at: .venv
Installed 28 packages in 12.08s
PyTorch built with CUDA:         True
PyTorch detects CUDA available:  True
PyTorch-detected #GPUs:          1
    GPU 0:      Quadro RTX 8000

The output should confirm that PyTorch is built with CUDA and detects the GPU. When done, close VSCode, then press Ctrl+C in the terminal on the personal computer where mila code is running to end the session and relinquish the allocation.
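
The check on the script output can also be automated. As a minimal sketch, the hypothetical helper below parses the text printed by main.py and reports whether it shows CUDA support and at least one GPU. It assumes the exact labels printed by the script in this guide.

```python
def gpu_check_passed(output: str) -> bool:
    """Return True if the output of main.py reports CUDA support and a GPU.

    Hypothetical helper; it relies on the labels printed by main.py above.
    """
    values = {}
    for line in output.splitlines():
        # Lines of interest have the form "label:   value".
        label, sep, value = line.partition(":")
        if sep:
            values[label.strip()] = value.strip()
    return (
        values.get("PyTorch built with CUDA") == "True"
        and values.get("PyTorch detects CUDA available") == "True"
        and int(values.get("PyTorch-detected #GPUs", "0")) >= 1
    )


# Sample text copied from the expected output shown in this guide.
sample = """\
PyTorch built with CUDA:         True
PyTorch detects CUDA available:  True
PyTorch-detected #GPUs:          1
    GPU 0:      Quadro RTX 8000
"""
print(gpu_check_passed(sample))
```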


Key concepts

mila code
Allocates a compute node and opens VSCode on it. Use it for interactive development and for running scripts with a full editor and terminal.

Next step

 

__skill-mila-run-jobs
