PyTorch Setup
Prerequisites: (Make sure to read the following before using this example!)
The full source code for this example is available on the mila-docs GitHub repository.
job.sh
#!/bin/bash
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=1
#SBATCH --mem=16G
#SBATCH --time=00:15:00
set -e # exit on error.
echo "Date: $(date)"
echo "Hostname: $(hostname)"
# Use `uv run --offline` on clusters without internet access on compute nodes.
uv run python main.py
pyproject.toml
[project]
name = "pytorch-setup"
version = "0.1.0"
description = "Add your description here"
readme = "README.rst"
requires-python = ">=3.12"
dependencies = ["numpy>=2.3.1", "torch>=2.7.1"]
main.py
import torch
import torch.backends.cuda


def main():
    cuda_built = torch.backends.cuda.is_built()
    cuda_avail = torch.cuda.is_available()
    device_count = torch.cuda.device_count()

    print(f"PyTorch built with CUDA: {cuda_built}")
    print(f"PyTorch detects CUDA available: {cuda_avail}")
    print(f"PyTorch-detected #GPUs: {device_count}")
    if device_count == 0:
        print(" No GPU detected, not printing devices' names.")
    else:
        for i in range(device_count):
            print(f" GPU {i}: {torch.cuda.get_device_name(i)}")


if __name__ == "__main__":
    main()
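Once these checks pass, a script will usually pick its device from the same `torch.cuda.is_available()` call. The following is a minimal sketch of that pattern and is not part of the example's source: `select_device_name` is a hypothetical helper, and it falls back to `"cpu"` when PyTorch is not importable so that the snippet also runs outside the environment.

```python
import importlib.util


def select_device_name() -> str:
    """Return "cuda" if PyTorch is importable and detects a GPU, otherwise "cpu"."""
    # Assumption for illustration: a missing PyTorch install is treated as "cpu"
    # instead of raising an ImportError.
    if importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Selected device: {select_device_name()}")
```

The returned string can be passed to `torch.device(...)` or to `.to(...)` when moving tensors and modules.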
Running this example
This assumes that you have already installed uv on the cluster you are working on.
To create this environment, we first request resources for an interactive job. Note that we are requesting a GPU for this job, even though we’re only going to install packages. This is because we want PyTorch to be installed with GPU support, and to have all the required libraries.
# On the Mila cluster: (on DRAC/PAICE, run `uv sync` on a login node)
$ salloc --gres=gpu:1 --cpus-per-task=4 --mem=16G --time=00:10:00
salloc: --------------------------------------------------------------------------------------------------
salloc: # Using default long partition
salloc: --------------------------------------------------------------------------------------------------
salloc: Pending job allocation 2959785
salloc: job 2959785 queued and waiting for resources
salloc: job 2959785 has been allocated resources
salloc: Granted job allocation 2959785
salloc: Waiting for resource configuration
salloc: Nodes cn-g022 are ready for job
$ # Create the virtual environment and install all dependencies
$ uv sync
(...)
$ # Optional: Activate the environment and run the python script:
$ . .venv/bin/activate
$ python main.py
You can exit the interactive job once the environment has been created. Then, you can submit a job to run the example with sbatch:
$ sbatch job.sh
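On success, `sbatch` prints a line of the form `Submitted batch job <id>`. Since `job.sh` does not set `--output`, Slurm writes the job's output to `slurm-<id>.out` in the submission directory by default. The sketch below shows how a script might recover the job ID and the expected log file name from that line. The helper names are hypothetical and the sample job ID is made up for illustration.

```python
import re


def parse_sbatch_job_id(sbatch_output: str) -> int:
    """Extract the job ID from the text printed by `sbatch`."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"Could not find a job ID in: {sbatch_output!r}")
    return int(match.group(1))


def default_output_file(job_id: int) -> str:
    """Return Slurm's default output file name (slurm-%j.out) for a job ID."""
    return f"slurm-{job_id}.out"


if __name__ == "__main__":
    # Sample text for illustration only; the job ID is not from a real run.
    sample = "Submitted batch job 1234567\n"
    job_id = parse_sbatch_job_id(sample)
    print(f"Job {job_id} will write to {default_output_file(job_id)}")
```

Once the job has run, that file should contain the date, the hostname, and the output of `main.py`.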