Nexus/Apptainer

From UMIACS

Revision as of 18:58, 8 September 2023

Running containers in a multi-tenant environment raises a number of security considerations. Docker is popular, but its most typical setups require a daemon with administrative-level privileges, which makes it untenable on shared systems. A lot of work has gone into this area, but for HPC environments Apptainer (formerly known as Singularity) is a solution that enables container workloads in multi-tenant environments.

One consideration is that creating an image requires administrative rights on the machine. For this reason, you can't directly create Apptainer images on our supported systems. You can, however, download or pull images from other repositories, including the Docker repositories.
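
As a rough illustration of pulling an image instead of building one, the following Python sketch assembles an apptainer pull command line. The helper name pull_command, the output file name, and the python:3.11-slim image are illustrative assumptions rather than Nexus conventions. The sketch only prints the command so you can inspect it before running it yourself.

```python
import shlex


def pull_command(output_sif: str, docker_image: str) -> list[str]:
    """Build an `apptainer pull` command for an image in a Docker registry.

    Hypothetical helper: it only assembles the argument list and never runs it.
    """
    # Apptainer addresses Docker registries through the docker:// URI scheme.
    return ["apptainer", "pull", output_sif, f"docker://{docker_image}"]


# Illustrative image and file name (assumptions); substitute your own.
cmd = pull_command("python_3.11.sif", "python:3.11-slim")
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so it can be passed to subprocess.run unchanged if you later decide to execute it from a script.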

Bind Mounts

Apptainer containers do not automatically mount data from the host operating system other than your home directory. You need to bind mount any other file paths manually, for example:

--bind /fs/nexus-scratch/derek/project1:/mnt

In this scenario, the directory /fs/nexus-scratch/derek/project1 outside the container is bound to the path /mnt inside the container.
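
When several directories need to be bound, it can help to generate the --bind arguments programmatically. The sketch below is one possible way to do that in Python. The helper names bind_spec and bind_args are hypothetical, and the check that both paths are absolute is a design choice of this sketch, not a documented Apptainer requirement.

```python
from pathlib import PurePosixPath


def bind_spec(host_dir: str, container_dir: str, read_only: bool = False) -> str:
    """Format one bind mount as host_dir:container_dir, optionally read-only."""
    for path in (host_dir, container_dir):
        # Sketch-level sanity check (an assumption, not an Apptainer rule).
        if not PurePosixPath(path).is_absolute():
            raise ValueError(f"bind paths should be absolute: {path!r}")
    spec = f"{host_dir}:{container_dir}"
    # Appending :ro asks Apptainer to mount the directory read-only.
    return spec + ":ro" if read_only else spec


def bind_args(mounts: dict[str, str]) -> list[str]:
    """Turn {host_dir: container_dir} into repeated --bind arguments."""
    args: list[str] = []
    for host_dir, container_dir in mounts.items():
        args += ["--bind", bind_spec(host_dir, container_dir)]
    return args


print(bind_args({"/fs/nexus-scratch/derek/project1": "/mnt"}))
```

The returned list can be spliced into an apptainer exec or apptainer shell argument list directly after the subcommand.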

Shared Containers

Portable images in the Singularity Image Format (.sif files) can be copied and shared. Nexus maintains some shared containers in /fs/nexus-containers, arranged by the application(s) installed in them.
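
To see which shared images are available, you could walk the directory tree and group the .sif files by application. The following is a minimal sketch. The function name list_shared_images is hypothetical, and it assumes each image sits in a directory named after its application, as the PyTorch image path used elsewhere on this page suggests.

```python
from collections import defaultdict
from pathlib import Path


def list_shared_images(root: str = "/fs/nexus-containers") -> dict[str, list[str]]:
    """Group .sif files under root by the directory that contains them."""
    images: dict[str, list[str]] = defaultdict(list)
    for sif in sorted(Path(root).rglob("*.sif")):
        # Assumption: the parent directory is named after the application.
        images[sif.parent.name].append(sif.name)
    return dict(images)


if __name__ == "__main__":
    for app, files in list_shared_images().items():
        print(f"{app}: {', '.join(files)}")
```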

GPUs

NVIDIA GPUs require a specific driver and libraries to run CUDA programs. To ensure that the appropriate devices are created inside the container and that these libraries are made available there, use the --nv flag when instantiating your container(s).
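
The effect of the flag on the command line can be sketched in Python as follows. The helper name exec_command and its gpu parameter are hypothetical, and the one-line PyTorch check passed as the command is only an illustration. The sketch prints the command instead of running it.

```python
import shlex


def exec_command(image: str, argv: list[str], gpu: bool = False) -> list[str]:
    """Build an `apptainer exec` command, adding --nv when a GPU is needed."""
    cmd = ["apptainer", "exec"]
    if gpu:
        # --nv makes the host's NVIDIA devices and driver libraries
        # available inside the container.
        cmd.append("--nv")
    return cmd + [image, *argv]


image = "/fs/nexus-containers/pytorch/pytorch_1.10.2+cu113.sif"
check = ["python3", "-c", "import torch; print(torch.cuda.is_available())"]
print(shlex.join(exec_command(image, check, gpu=True)))
```

Leaving gpu at its default of False yields the same command without --nv, which is appropriate for CPU-only jobs.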

Example

Suppose you have the following example file, test.py, in /fs/nexus-scratch/derek/apptainer:

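#!/usr/bin/env python

import torch

# Report whether CUDA is usable and which GPUs are visible.
print(f'Torch cuda is available: {torch.cuda.is_available()}')
print(f'Torch cuda number of devices: {torch.cuda.device_count()}')
for g in range(torch.cuda.device_count()):
    # Query the name of device g (not always device 0).
    print(f'Torch cuda device {g}: {torch.cuda.get_device_name(g)}')

You can then run it inside the shared PyTorch container, binding the directory to /mnt and passing --nv:

$ apptainer exec --bind /fs/nexus-scratch/derek/apptainer:/mnt --nv /fs/nexus-containers/pytorch/pytorch_1.10.2+cu113.sif python3 /mnt/test.py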
Torch cuda is available: True
Torch cuda number of devices: 1
Torch cuda device 0: NVIDIA RTX A4000