
Run Your First Job

This guide walks through running a first job on the Mila cluster. You will create a minimal PyTorch project that checks CUDA and GPU availability, and develop it on a compute node in VSCode, opened with the mila code command from milatools.

Before you begin

 

VSCode or compatible editor

The mila code command opens VSCode (or a compatible editor such as Cursor) on a compute node. Install VSCode on a personal computer before starting.

What this guide covers

  • Open VSCode on a compute node with one GPU using mila code.
  • Create a minimal PyTorch project with pyproject.toml and main.py.
  • Run the script with uv run python main.py in VSCode.

Open VSCode on a compute node

Create the project directory on the cluster

From a personal computer, create the project directory on the cluster so that mila code can open it. The path is relative to the home directory on the cluster, not on the personal computer:

ssh mila 'mkdir -p CODE/my_first_job'

Start VSCode on a GPU node

Run mila code with allocation options to request one GPU. This allocates a compute node and opens VSCode at the project path. Everything after --alloc is passed to Slurm:

mila code CODE/my_first_job --alloc --gres=gpu:1 --cpus-per-task=2 --mem=16G --time=01:00:00
[17:35:21] Checking disk quota on $HOME...                                                                                                disk_quota.py:31
[17:35:27] Disk usage: 85.34 / 100.00 GiB and 794022 / 1048576 files                                                                      disk_quota.py:211
[17:35:29] (mila) $ cd $SCRATCH && salloc --gres=gpu:1 --cpus-per-task=2 --mem=16G --time=01:00:00 --job-name=mila-code                   compute_node.py:293
salloc: --------------------------------------------------------------------------------------------------
salloc: # Using default long partition
salloc: --------------------------------------------------------------------------------------------------
salloc: Granted job allocation 8888888
[17:35:30] Waiting for job 8888888 to start.                                                                                              compute_node.py:315
[17:35:31] (localhost) $ code --new-window --wait --remote ssh-remote+cn-a003.server.mila.quebec /home/mila/u/username/CODE/my_first_job  local_v2.py:55

Wait until the allocation is granted and VSCode opens, connected to a shell on the compute node.
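
The split between the project path and the Slurm options can be made explicit with a small sketch. It only assembles the command shown above and does not run it. The helper name build_mila_code_command and its parameters are hypothetical, introduced for illustration; they are not part of milatools.

```python
import shlex


def build_mila_code_command(path, gpus=1, cpus=2, mem="16G", time="01:00:00"):
    """Return the argument list for a `mila code` call (hypothetical helper)."""
    # These are ordinary Slurm options; `mila code` forwards them to salloc.
    slurm_options = [
        f"--gres=gpu:{gpus}",
        f"--cpus-per-task={cpus}",
        f"--mem={mem}",
        f"--time={time}",
    ]
    # The project path comes before --alloc; everything after it goes to Slurm.
    return ["mila", "code", path, "--alloc", *slurm_options]


if __name__ == "__main__":
    argv = build_mila_code_command("CODE/my_first_job")
    # Print the command as a single shell-quoted line.
    print(shlex.join(argv))
```

Changing gpus, cpus, mem, or time changes only the part after --alloc, which is the part Slurm interprets.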

VSCode stuck on "Opening Remote..."?

If VSCode appears to hang while connecting, it may be waiting for an OTP code in a hidden terminal. Enable Remote.SSH: Show Login Terminal in VSCode settings to make the prompt visible:

  1. Open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P).
  2. Run Preferences: Open Settings (UI).
  3. Search for remote.SSH.showLoginTerminal and enable it.

Re-run the mila code command after enabling the setting.

Run a script in VSCode

Create the project files

In VSCode, create the following two files in the project folder (e.g. in the explorer or via File → New File). The files are saved in the project directory on the cluster, not on the personal computer.

main.py
import torch
import torch.backends.cuda


def main():
    cuda_built = torch.backends.cuda.is_built()
    cuda_avail = torch.cuda.is_available()
    device_count = torch.cuda.device_count()

    print(f"PyTorch built with CUDA:         {cuda_built}")
    print(f"PyTorch detects CUDA available:  {cuda_avail}")
    print(f"PyTorch-detected #GPUs:          {device_count}")
    if device_count == 0:
        print("    No GPU detected, not printing devices' names.")
    else:
        for i in range(device_count):
            print(f"    GPU {i}:      {torch.cuda.get_device_name(i)}")


if __name__ == "__main__":
    main()

pyproject.toml
[project]
name = "pytorch-setup"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.11,<3.14"
dependencies = ["torch>=2.7.1"]
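
The requires-python line restricts which interpreters the project accepts: 3.11 or newer, and older than 3.14. As a rough illustration of that range, the sketch below checks a version tuple against the same bounds. The function name python_version_supported is hypothetical, and the check is a simplification of how packaging tools evaluate version specifiers.

```python
import sys

MIN_VERSION = (3, 11)  # inclusive lower bound, from ">=3.11"
MAX_VERSION = (3, 14)  # exclusive upper bound, from "<3.14"


def python_version_supported(version=None):
    """Return True if (major, minor) falls within the requires-python range."""
    if version is None:
        version = sys.version_info
    # Compare only major and minor; patch releases do not affect this range.
    major_minor = tuple(version[:2])
    return MIN_VERSION <= major_minor < MAX_VERSION


if __name__ == "__main__":
    print(python_version_supported((3, 12, 11)))
    print(python_version_supported((3, 10)))
```

CPython 3.12.11, the interpreter reported in the run output of this guide, falls inside the range.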

Run the script in the VSCode terminal

Open the integrated terminal in VSCode (Terminal → New Terminal). You are on the compute node. From the project directory, run:

uv run python main.py
Using CPython 3.12.11
Creating virtual environment at: .venv
Installed 28 packages in 12.08s
PyTorch built with CUDA:         True
PyTorch detects CUDA available:  True
PyTorch-detected #GPUs:          1
    GPU 0:      Quadro RTX 8000

The output should confirm that PyTorch is built with CUDA and detects the GPU. When done, close VSCode, then press Ctrl+C in the terminal on the personal computer where mila code is running to end the session and relinquish the allocation.
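
If the output differs from the example, the three status values printed by main.py usually narrow down the cause. The sketch below maps them to a likely explanation. The function name diagnose_cuda and the suggested causes are illustrative assumptions, not part of PyTorch or milatools.

```python
def diagnose_cuda(cuda_built: bool, cuda_avail: bool, device_count: int) -> str:
    """Map the three values printed by main.py to a likely explanation."""
    if not cuda_built:
        # The installed PyTorch build has no CUDA support at all.
        return "CPU-only PyTorch build installed"
    if not cuda_avail or device_count == 0:
        # CUDA-capable build, but no usable GPU is visible to this process.
        return "no GPU visible: check that the job requested one (--gres=gpu:1)"
    return f"OK: {device_count} GPU(s) usable"


if __name__ == "__main__":
    # Values matching the example output in this guide.
    print(diagnose_cuda(True, True, 1))
```

The second case is what one would expect to see when the script runs somewhere without an allocated GPU.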


Key concepts

mila code
Allocates a compute node and opens VSCode on it. Use it for interactive development and for running scripts with a full editor and terminal.

