Nexus/Apptainer
Running containers in a multi-tenant environment raises a number of security considerations. Docker is popular, but its most typical setups require a daemon with administrative-level privileges, which makes it untenable on shared systems. There has been a lot of work in this area, but for HPC environments Apptainer is a solution that enables container workloads in multi-tenant environments.
One consideration is that creating an image from a definition file requires administrative rights on the machine. For this reason, you can't directly create Apptainer images on our supported systems. You can, however, download or pull images from other repositories, including the Docker repositories. If you do need to create images on our supported systems, please see Podman.
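As a minimal sketch of pulling an image from a Docker repository, the helper below assembles an `apptainer pull` command line and prints it without running it. The function name `pull_command`, the image `ubuntu:22.04`, and the output file name are hypothetical, chosen only for illustration.

```python
import shlex
from typing import List


def pull_command(docker_ref: str, sif_path: str) -> List[str]:
    """Build an `apptainer pull` command that saves a Docker image as a .sif file."""
    # The docker:// prefix tells Apptainer to fetch from a Docker registry.
    return ["apptainer", "pull", sif_path, f"docker://{docker_ref}"]


# Hypothetical image and output file name, for illustration only.
pull_cmd = pull_command("ubuntu:22.04", "ubuntu_22.04.sif")
print(shlex.join(pull_cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(pull_cmd, check=True)` on a machine where Apptainer is installed.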
Bind Mounts
Apptainer containers do not automatically mount data from the host operating system other than your home directory. You need to bind mount any other file paths manually with the --bind option.
--bind /fs/nexus-scratch/username/project1:/mnt
In this scenario we are binding the directory /fs/nexus-scratch/username/project1 outside the container to the path /mnt inside the container.
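The bind option above is only a fragment of a command, so here is a sketch of a complete `apptainer exec` invocation built around it. The helper name `bind_exec_command`, the second directory `project2`, and its `/data` target are hypothetical. The sketch relies on `--bind` accepting a comma-separated list of `src:dest` pairs and assumes the container can provide the `/data` mount point. Nothing is executed; the command is only printed.

```python
import shlex
from typing import List, Sequence, Tuple


def bind_exec_command(
    image: str,
    binds: Sequence[Tuple[str, str]],
    command: Sequence[str],
) -> List[str]:
    """Build an `apptainer exec` command with one or more bind mounts."""
    # Join every (host path, container path) pair into a single --bind value.
    bind_spec = ",".join(f"{src}:{dest}" for src, dest in binds)
    return ["apptainer", "exec", "--bind", bind_spec, image, *command]


image = "/fs/nexus-containers/pytorch/pytorch_1.10.2+cu113.sif"
binds = [
    ("/fs/nexus-scratch/username/project1", "/mnt"),
    ("/fs/nexus-scratch/username/project2", "/data"),  # hypothetical second bind
]
bind_cmd = bind_exec_command(image, binds, ["ls", "/mnt", "/data"])
print(shlex.join(bind_cmd))
```

Building the command as a list rather than a single string avoids quoting problems if a path ever contains spaces.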
Portable images in the Singularity Image Format (.sif files) can be copied and shared. Nexus maintains some shared containers in /fs/nexus-containers. These are arranged by the application(s) that are installed.
GPUs
CUDA programs require a specific NVIDIA driver and matching libraries. To ensure that all appropriate devices are created inside the container and that these libraries are made available there, use the --nv flag when instantiating your container(s).
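A quick way to check that the flag is doing its job is to run `nvidia-smi` inside the container. The sketch below builds that command with and without `--nv` so the difference is visible. The helper name `gpu_exec_command` is hypothetical, and the sketch assumes `nvidia-smi` is installed on the host so that Apptainer can expose it inside the container. The commands are printed, not executed.

```python
import shlex
from typing import List, Sequence


def gpu_exec_command(image: str, command: Sequence[str], use_gpu: bool = True) -> List[str]:
    """Build an `apptainer exec` command, adding --nv when GPU access is wanted."""
    args = ["apptainer", "exec"]
    if use_gpu:
        # --nv exposes the host NVIDIA devices and driver libraries to the container.
        args.append("--nv")
    args.append(image)
    args.extend(command)
    return args


pytorch_image = "/fs/nexus-containers/pytorch/pytorch_1.10.2+cu113.sif"
with_gpu = gpu_exec_command(pytorch_image, ["nvidia-smi"])
without_gpu = gpu_exec_command(pytorch_image, ["nvidia-smi"], use_gpu=False)
print(shlex.join(with_gpu))
print(shlex.join(without_gpu))
```

On a GPU node, the first command would be expected to list the allocated devices, while the second typically fails because the NVIDIA tools and libraries are not visible inside the container.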
Example
Suppose you have the following example file saved as test.py in /fs/nexus-scratch/username/apptainer:
#!/usr/bin/env python
import torch

print(f'Torch cuda is available: {torch.cuda.is_available()}')
print(f'Torch cuda number of devices: {torch.cuda.device_count()}')
for g in range(torch.cuda.device_count()):
    # Look up each device by its own index rather than always device 0.
    print(f'Torch cuda device {g}: {torch.cuda.get_device_name(g)}')
$ apptainer exec --bind /fs/nexus-scratch/username/apptainer:/mnt --nv /fs/nexus-containers/pytorch/pytorch_1.10.2+cu113.sif python3 /mnt/test.py
Torch cuda is available: True
Torch cuda number of devices: 1
Torch cuda device 0: NVIDIA RTX A4000