Nvidia Container Toolkit and Podman on Ubuntu 20.04
I could not immediately find a good tutorial on using the Nvidia Container Toolkit with Podman to run GPU-accelerated containers. The solution turned out to be quite straightforward.
First, you can install Podman as usual:
sudo apt-get install -y curl wget gnupg2
source /etc/os-release
sudo sh -c "echo 'deb http://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/xUbuntu_${VERSION_ID}/ /' > /etc/apt/sources.list.d/devel:kubic:libcontainers:stable.list"
wget -nv https://download.opensuse.org/repositories/devel:kubic:libcontainers:stable/xUbuntu_${VERSION_ID}/Release.key -O- | sudo apt-key add -
sudo apt-get update && sudo apt-get install -y podman
podman -v
Afterwards, you can install the Nvidia Container Toolkit as usual:
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
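Both repository setups above depend on values read from /etc/os-release: VERSION_ID for the Podman repository and ID plus VERSION_ID for the Nvidia one. If a repository URL comes back with a 404, it can help to see what those values expand to. The following is a minimal sketch; os_release_tag is a hypothetical helper name of my own, not something either tool provides:

```shell
# Hypothetical helper: print ID and VERSION_ID from an os-release style file,
# joined the same way the "distribution" variable is built above.
os_release_tag() {
    # Source the file in a subshell so its variables do not leak into the caller.
    ( . "${1:-/etc/os-release}" && printf '%s%s\n' "${ID:-}" "${VERSION_ID:-}" )
}

# Only call it where the file actually exists.
if [ -r /etc/os-release ]; then
    os_release_tag
fi
```

On Ubuntu 20.04 this should print ubuntu20.04, which is the directory name the Nvidia repository list is fetched from.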
However, before you can start using both together you need to manually define the OCI hook:
sudo mkdir -p /usr/share/containers/oci/hooks.d/
cat << EOF | sudo tee /usr/share/containers/oci/hooks.d/oci-nvidia-hook.json
{
    "version": "1.0.0",
    "hook": {
        "path": "/usr/bin/nvidia-container-toolkit",
        "args": ["nvidia-container-toolkit", "prestart"],
        "env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
        ]
    },
    "when": {
        "always": true,
        "commands": [".*"]
    },
    "stages": ["prestart"]
}
EOF
sudo systemctl restart podman
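The hook can only fire if the JSON file is readable and its "path" entry points at an executable that exists on your system. As a rough sanity check, and only a sketch, the functions below pull the path out of the hook file with sed and test it. hook_binary_path and check_hook are hypothetical names, and the sed pattern assumes the "path" key and its value sit on the same line, as they do in the file written above:

```shell
# Hypothetical helper: extract the value of the first "path" key from a hook file.
hook_binary_path() {
    sed -n 's/.*"path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Hypothetical helper: report whether the hook file and its binary look usable.
check_hook() {
    hook_file=$1
    if [ ! -r "$hook_file" ]; then
        echo "cannot read hook file: $hook_file" >&2
        return 1
    fi
    hook_bin=$(hook_binary_path "$hook_file")
    if [ -z "$hook_bin" ]; then
        echo "no \"path\" entry found in $hook_file" >&2
        return 1
    fi
    if [ ! -x "$hook_bin" ]; then
        echo "hook binary missing or not executable: $hook_bin" >&2
        return 1
    fi
    echo "ok: $hook_bin"
}

# Check the file created above, if it is present on this machine.
hook_json=/usr/share/containers/oci/hooks.d/oci-nvidia-hook.json
if [ -e "$hook_json" ]; then
    check_hook "$hook_json" || true
fi
```

If the check reports a missing binary, look at which executables your nvidia-container-toolkit package actually installed and adjust the "path" entry accordingly.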
Once this is done you should be able to run GPU-accelerated images via Podman the way you are familiar with from nvidia-docker:
sudo podman run --rm --gpus all nvidia/cuda:11.6.0-base-ubuntu20.04 nvidia-smi
Finally, please note that if you should hit an error like
Error: OCI runtime error: error executing hook `/usr/bin/nvidia-container-toolkit` (exit code: 1)
that most likely indicates that your driver is too old for the CUDA version of the image. Check the highest CUDA version your driver supports (nvidia-smi prints it in its header) and either pick a matching image tag or try a driver upgrade. For instance, my current machine has two RTX 2080 SUPER cards with driver 510.73.08, which means I can run CUDA toolkit 11.6, but not 11.7. If you were on a 470 driver, the limit would be lower and you might be able to run 11.4, but not 11.5.
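To make the driver-to-CUDA relationship concrete, here is a small sketch that maps a driver version to the newest CUDA 11.x toolkit it should support. max_cuda_for_driver is a hypothetical helper, and the thresholds (470, 495, 510, 515) are my reading of NVIDIA's release notes for CUDA 11.4 to 11.7, so treat them as assumptions and double-check them for your setup:

```shell
# Hypothetical helper: map a driver version string to the newest CUDA 11.x
# toolkit it supports. The thresholds are assumptions based on NVIDIA's
# release notes and only cover the 11.4 to 11.7 range discussed here.
max_cuda_for_driver() {
    major=${1%%.*}   # "510.73.08" -> "510"
    case $major in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
    if   [ "$major" -ge 515 ]; then echo "11.7 or newer"
    elif [ "$major" -ge 510 ]; then echo "11.6"
    elif [ "$major" -ge 495 ]; then echo "11.5"
    elif [ "$major" -ge 470 ]; then echo "11.4"
    else echo "older than 11.4"
    fi
}

# Query the installed driver if nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    max_cuda_for_driver "$driver" || true
fi
```

With that result in hand, you can pick an nvidia/cuda image tag at or below the reported version instead of guessing.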