PyTorch examples

To run these scripts, clone the jean-zay-doc repo into your $WORK directory, go to the folder of the example you want to run, and follow the instructions detailed inside it:

cd "$WORK" && \
git clone https://github.com/jean-zay-users/jean-zay-doc.git

MNIST

In this tutorial you will learn to train a basic CNN model and tune its hyperparameters on Jean Zay using Slurm batch jobs and Slurm job arrays. Two implementations are available: single-GPU and multi-GPU.
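
As a rough illustration of how a job array can drive hyperparameter tuning, the sketch below maps the array index that Slurm exposes in the SLURM_ARRAY_TASK_ID environment variable to one combination of a small grid. The search space (LEARNING_RATES, BATCH_SIZES) and the helper name config_for_task are hypothetical and are not taken from the tutorial itself, which may organize its search differently.

```python
import itertools
import os

# Hypothetical search space, chosen only for illustration.
LEARNING_RATES = [0.1, 0.01, 0.001]
BATCH_SIZES = [32, 64]


def config_for_task(task_id):
    """Return the hyperparameters assigned to one job-array index."""
    # Enumerate the full grid in a fixed order so that every array task
    # computes the same index -> configuration mapping.
    grid = list(itertools.product(LEARNING_RATES, BATCH_SIZES))
    if not 0 <= task_id < len(grid):
        raise IndexError(f"task_id {task_id} is outside the grid of size {len(grid)}")
    lr, batch_size = grid[task_id]
    return {"lr": lr, "batch_size": batch_size}


if __name__ == "__main__":
    # Slurm sets SLURM_ARRAY_TASK_ID for each task of a job array;
    # default to 0 so the script also runs outside of Slurm.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    print(config_for_task(task_id))
```

With this grid of six combinations, submitting the job with an array range of 0-5 (for example sbatch --array=0-5) would give each task its own configuration.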

Distributed STL10

This example shows you how to train a ResNet18 in a distributed setting (multi-GPU and multi-node) on the Jean Zay infrastructure using Slurm srun.
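
When a script is launched with srun, each process can discover its place in the job from environment variables that Slurm sets. The sketch below reads them with the standard library only. The helper name slurm_dist_info is hypothetical, and whether the STL10 example derives its ranks this way is an assumption; see the folder's own instructions for the actual setup.

```python
import os


def slurm_dist_info(env=None):
    """Derive distributed-training identifiers from Slurm's environment."""
    env = os.environ if env is None else env
    return {
        # Global index of this process across all nodes of the job step.
        "rank": int(env.get("SLURM_PROCID", 0)),
        # Total number of processes in the job step.
        "world_size": int(env.get("SLURM_NTASKS", 1)),
        # Index of this process on its own node, commonly used to pick a GPU.
        "local_rank": int(env.get("SLURM_LOCALID", 0)),
    }


if __name__ == "__main__":
    info = slurm_dist_info()
    # In a PyTorch script, these values would typically be passed on, e.g.
    # torch.distributed.init_process_group(backend="nccl",
    #     rank=info["rank"], world_size=info["world_size"])
    # (assumption: the actual example may initialize differently).
    print(info)
```

The defaults (rank 0, world size 1) let the same script run as a single process outside of Slurm, which is convenient for quick local checks.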