Nexus/Apptainer: Difference between revisions

From UMIACS
(Redirected page to Apptainer)
Running containers in a multi-tenant environment raises a number of security considerations.  Docker is popular, but its most typical setups require a daemon with administrative-level privileges, which makes it untenable on shared systems.  A lot of work has been done in this area, but for HPC environments [[Apptainer]] is the solution that makes container workloads practical in multi-tenant environments.
 
One consideration is that creating an image from a definition file requires administrative rights on the machine.  For this reason, you cannot build Apptainer images directly on our supported systems.  You can, however, download or pull images from other repositories, including Docker repositories.
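
As a minimal sketch of pulling an image from a Docker repository, the helper below only assembles the <code>apptainer pull</code> command line and prints it; it does not run it.  The function name <code>build_pull_command</code>, the image reference <code>ubuntu:22.04</code>, and the output file name are illustrative assumptions, not part of the Nexus setup.

```python
import shlex


def build_pull_command(docker_ref: str, output_sif: str) -> list[str]:
    """Return the argv that pulls a Docker image into a local .sif file.

    Hypothetical helper for illustration; it only builds the command.
    """
    # General form: apptainer pull <output.sif> docker://<image reference>
    return ["apptainer", "pull", output_sif, f"docker://{docker_ref}"]


if __name__ == "__main__":
    # Image reference and output name are assumptions chosen for the example.
    cmd = build_pull_command("ubuntu:22.04", "ubuntu_22.04.sif")
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell on a node where <code>apptainer</code> is available, or passed to <code>subprocess.run</code> as a list.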
 
=Bind Mounts=
Apptainer containers do not automatically mount data from the host operating system, other than your home directory.  You need to bind-mount any other file paths manually with the <code>--bind</code> option.
 
<code>--bind /fs/nexus-scratch/derek/project1:/mnt</code>
 
In this example, the directory <code>/fs/nexus-scratch/derek/project1</code> outside the container is made available at the path <code>/mnt</code> inside the container.
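
When several directories need to be bound, the <code>host:container</code> specifications can be generated rather than typed by hand.  The following is a sketch under that assumption; <code>bind_args</code> is a hypothetical helper name, and the optional <code>:ro</code> suffix relies on Apptainer's <code>src:dest[:opts]</code> bind syntax.

```python
def bind_args(mounts: dict[str, str], read_only: bool = False) -> list[str]:
    """Turn {host_path: container_path} into repeated --bind arguments.

    Hypothetical helper for illustration only.
    """
    args: list[str] = []
    for host_path, container_path in mounts.items():
        spec = f"{host_path}:{container_path}"
        if read_only:
            spec += ":ro"  # optional mount option: bind read-only
        args += ["--bind", spec]
    return args


if __name__ == "__main__":
    # Same mapping as the --bind example in this section.
    print(bind_args({"/fs/nexus-scratch/derek/project1": "/mnt"}))
```

The returned list can be spliced into an <code>apptainer exec</code> or <code>apptainer shell</code> argument list before the image path.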
 
=Shared Containers=
Portable images in the Singularity Image Format (<code>.sif</code> files) can be copied and shared.  Nexus maintains some shared containers in <code>/fs/nexus-containers</code>, arranged by the application(s) installed in them.
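
To see which shared images are available, one option is to walk the directory and list every <code>.sif</code> file.  This is a small sketch; <code>list_sif_images</code> is a hypothetical helper, and it assumes only that the images end in <code>.sif</code> somewhere below <code>/fs/nexus-containers</code>.

```python
from pathlib import Path


def list_sif_images(root: str = "/fs/nexus-containers") -> list[Path]:
    """Return all .sif files found under root, sorted by path.

    Hypothetical helper for illustration only.
    """
    base = Path(root)
    if not base.is_dir():
        # For example, when run on a machine without the Nexus filesystem.
        return []
    return sorted(base.rglob("*.sif"))


if __name__ == "__main__":
    for image in list_sif_images():
        print(image)
```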
 
=GPUs=
CUDA programs require a specific NVIDIA driver and matching libraries.  To ensure that the appropriate devices are created inside the container and that these libraries are available there, use the <code>--nv</code> flag when instantiating your container(s).
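
For scripted job launches, the <code>--nv</code> flag can be added conditionally when assembling the command.  The sketch below is one way to do that; <code>build_exec_command</code> is a hypothetical helper that only builds the argument list, and the image and script paths are the ones used in the example on this page.

```python
import shlex


def build_exec_command(image, command, binds=(), use_gpu=False):
    """Assemble an `apptainer exec` argv; hypothetical helper, runs nothing."""
    argv = ["apptainer", "exec"]
    for spec in binds:
        argv += ["--bind", spec]  # one --bind per host:container pair
    if use_gpu:
        argv.append("--nv")  # expose NVIDIA devices and libraries
    argv.append(image)
    argv += list(command)
    return argv


if __name__ == "__main__":
    cmd = build_exec_command(
        "/fs/nexus-containers/pytorch/pytorch_1.10.2+cu113.sif",
        ["python3", "/mnt/test.py"],
        binds=["/fs/nexus-scratch/derek/apptainer:/mnt"],
        use_gpu=True,
    )
    print(shlex.join(cmd))
```

Options are placed before the image path, and everything after the image is the command run inside the container.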
 
=Example=
Suppose you have the following example file, <code>test.py</code>, in <code>/fs/nexus-scratch/derek/apptainer</code>:
 
<pre>
#!/usr/bin/env python

import torch

# Report whether CUDA is usable and how many GPUs are visible.
print(f'Torch cuda is available: {torch.cuda.is_available()}')
print(f'Torch cuda number of devices: {torch.cuda.device_count()}')
# Print the name of each visible device, looked up by its own index.
for g in range(torch.cuda.device_count()):
    print(f'Torch cuda device {g}: {torch.cuda.get_device_name(g)}')
</pre>

Running it in a shared PyTorch container, with the directory bound to <code>/mnt</code> and the <code>--nv</code> flag set, produces output like the following:
 
<pre>
$ apptainer exec --bind /fs/nexus-scratch/derek/apptainer:/mnt --nv /fs/nexus-containers/pytorch/pytorch_1.10.2+cu113.sif python3 /mnt/test.py
Torch cuda is available: True
Torch cuda number of devices: 1
Torch cuda device 0: NVIDIA RTX A4000
</pre>

Latest revision as of 17:47, 13 August 2024

Redirect to: [[Apptainer]]