PyTorch examples
To run these scripts, clone the jean-zay-doc repo into your $WORK directory and go to the corresponding folder. Then follow the instructions detailed inside each folder:
cd $WORK &&\
git clone https://github.com/jean-zay-users/jean-zay-doc.git
MNIST
In this tutorial you will learn to train a basic CNN model and tune its hyperparameters on Jean Zay using Slurm batch jobs and Slurm job arrays. Two implementations are available: single-GPU and multi-GPU.
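As a rough illustration of how a job array can drive hyperparameter tuning, the sketch below maps the array index that Slurm exposes in `SLURM_ARRAY_TASK_ID` to one combination of a search grid. The grid values and the `config_for_task` helper are hypothetical and are not taken from the repo's scripts; the folder's own instructions remain the reference.

```python
import itertools
import os

# Hypothetical search grid; the values are illustrative, not taken from the repo.
GRID = {
    "lr": [0.1, 0.01, 0.001],
    "batch_size": [64, 128],
}


def config_for_task(task_id, grid=GRID):
    """Return the hyperparameter combination assigned to one array task index."""
    keys = sorted(grid)  # fixed key order so every task sees the same enumeration
    combos = list(itertools.product(*(grid[k] for k in keys)))
    if not 0 <= task_id < len(combos):
        raise IndexError(f"task_id {task_id} outside 0..{len(combos) - 1}")
    return dict(zip(keys, combos[task_id]))


if __name__ == "__main__":
    # Slurm sets SLURM_ARRAY_TASK_ID for each task of a job array;
    # fall back to 0 so the script also runs outside Slurm.
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    print(config_for_task(task_id))
```

With this grid there are six combinations, so the job would be submitted with an array range such as `--array=0-5`, and each task would train with its own configuration.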
Distributed STL10
This example shows how to train a ResNet18 in a distributed setting (multi-GPU and multi-node) on the Jean Zay infrastructure using Slurm srun.
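When a script is launched with srun, each process can find its place in the job from environment variables that Slurm sets. The sketch below reads them into the values a distributed PyTorch setup typically needs. The `slurm_dist_info` helper is hypothetical, and the repo's example may set things up differently.

```python
import os


def slurm_dist_info(env=None):
    """Derive rank information from the variables srun sets for each task.

    Defaults describe a single-process run, so the helper also works
    outside Slurm.
    """
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("SLURM_PROCID", 0)),         # global index of this task
        "world_size": int(env.get("SLURM_NTASKS", 1)),   # total number of tasks in the job step
        "local_rank": int(env.get("SLURM_LOCALID", 0)),  # index of this task on its node
    }


if __name__ == "__main__":
    info = slurm_dist_info()
    # In a training script these values would typically be passed on, for
    # example to torch.distributed.init_process_group(backend="nccl",
    # rank=info["rank"], world_size=info["world_size"]), with local_rank
    # used to select the GPU. That part is omitted here to keep the sketch
    # free of a PyTorch dependency.
    print(info)
```

Taking the environment as an optional argument keeps the helper easy to check on a login node without starting a job.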