grycap.nvidia_driver

ansible-role-nvidia-driver

An Ansible role to install the NVIDIA driver from the CUDA repositories provided by NVIDIA.

Requirements

When installing the NVIDIA driver, this role will reboot the nodes where it runs. Therefore, it’s highly recommended to run ansible-playbook from a different machine than the GPU nodes where the driver is being installed.

If you try to run Ansible on the same machine as the one where the driver is being installed, this role will either:

  • Stop with an error like "Running reboot with local connection would reboot the control node" (if using the local connection)
  • Reboot the machine you're on, which stops the playbook execution (if using an SSH connection to localhost)
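
If the role has to run on the GPU machine itself, one possible workaround is to disable the automatic reboot with nvidia_driver_skip_reboot (documented under Role variables). The playbook below is only a sketch: the become setting is an assumption about your environment, not something this role prescribes.

```yaml
# Sketch: apply the role to the machine Ansible itself is running on.
- hosts: localhost
  connection: local
  become: yes                      # assumption: root privileges are needed to install packages
  vars:
    nvidia_driver_skip_reboot: yes # prevent the role from rebooting the control node
  roles:
    - grycap.nvidia_driver
```

With the reboot skipped, you will most likely need to reboot the machine manually afterwards before the new driver is in use.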

Installing

You can install this role using Ansible Galaxy:

$ ansible-galaxy install grycap.nvidia_driver

Role variables

| Variable | Default value | Description |
|----------|---------------|-------------|
| nvidia_driver_package_version | "" | The version of the package to install. Make sure this matches the actual version of the deb or RPM package. |
| nvidia_driver_persistence_mode_on | yes | Whether to enable persistence mode (true/false) |
| nvidia_driver_skip_reboot | no | Whether to skip rebooting the machine during installation |
| nvidia_driver_module_file | "/etc/modprobe.d/nvidia.conf" | The filename used for NVIDIA driver settings |
| nvidia_driver_module_params | "" | Parameters to pass to the NVIDIA driver |
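
As an illustration of how these variables can be overridden, the sketch below sets two of them at play level. The file name nvidia-gpu.conf is a hypothetical value chosen for the example; the variable names are the ones documented above.

```yaml
# Sketch: override role defaults for all hosts in the gpu_nodes group.
- hosts: gpu_nodes
  vars:
    nvidia_driver_persistence_mode_on: no                         # disable persistence mode
    nvidia_driver_module_file: "/etc/modprobe.d/nvidia-gpu.conf"  # hypothetical file name
  roles:
    - grycap.nvidia_driver
```

The same variables could equally be set in group_vars or host_vars instead of in the play.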

Variables for Red Hat

| Variable | Default value | Description |
|----------|---------------|-------------|
| epel_package | "https://dl.fedoraproject.org/pub/epel/epel-release-latest-{{ ansible_distribution_major_version }}.noarch.rpm" | Package to install for EPEL support |
| nvidia_driver_rhel_cuda_repo_baseurl | "https://developer.download.nvidia.com/compute/cuda/repos/{{ _rhel_repo_dir }}/" | Base URL for CUDA repository |
| nvidia_driver_rhel_cuda_repo_gpgkey | "https://developer.download.nvidia.com/compute/cuda/repos/{{ _rhel_repo_dir }}/7fa2af80.pub" | GPG key for the CUDA repository |
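
These variables make it possible to point the role at a local mirror of the CUDA repository, for example on nodes without direct internet access. The following is a sketch only: mirror.example.com and the paths under it are hypothetical placeholders, and the mirror is assumed to carry the same repository layout and GPG key as the NVIDIA site.

```yaml
# Sketch: use an internal mirror of the CUDA repository on Red Hat family hosts.
- hosts: gpu_nodes
  vars:
    # hypothetical mirror URLs; replace with your own
    nvidia_driver_rhel_cuda_repo_baseurl: "https://mirror.example.com/cuda/repos/rhel7/x86_64/"
    nvidia_driver_rhel_cuda_repo_gpgkey: "https://mirror.example.com/cuda/repos/rhel7/x86_64/7fa2af80.pub"
  roles:
    - grycap.nvidia_driver
```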

Variables for Ubuntu

For Ubuntu installs, you can choose to install from either the Canonical repositories or the NVIDIA CUDA repositories.

By default, the Canonical repositories will be used, and the driver installed will be the headless server driver.

| Variable | Default value | Description |
|----------|---------------|-------------|
| nvidia_driver_ubuntu_install_from_cuda_repo | no | Flag to indicate whether to use the CUDA repository |
| nvidia_driver_ubuntu_branch | 450 | Driver branch to use for installation |
| nvidia_driver_ubuntu_packages | ["nvidia-headless-450-server", "nvidia-headless-450-utils"] | Package names to install from Canonical repo |
| nvidia_driver_ubuntu_cuda_repo_baseurl | "http://developer.download.nvidia.com/compute/cuda/repos/{{ _ubuntu_repo_dir }}" | Base URL for CUDA repository |
| nvidia_driver_ubuntu_cuda_repo_gpgkey_url | "https://developer.download.nvidia.com/compute/cuda/repos/{{ _ubuntu_repo_dir }}/7fa2af80.pub" | GPG key for the CUDA repository |
| nvidia_driver_ubuntu_cuda_repo_gpgkey_id | "7fa2af80" | GPG key ID for the CUDA repository |
| nvidia_driver_ubuntu_cuda_package | "cuda-drivers" | Package name to install from the CUDA repository |
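
To switch an Ubuntu install from the Canonical repositories to the NVIDIA CUDA repositories, it should be enough to flip the flag from the table above. A minimal sketch, leaving every other variable at its default:

```yaml
# Sketch: install the driver from the NVIDIA CUDA repository on Ubuntu hosts.
- hosts: gpu_nodes
  vars:
    nvidia_driver_ubuntu_install_from_cuda_repo: yes
    # package installed from the CUDA repository (shown here with its documented default)
    nvidia_driver_ubuntu_cuda_package: "cuda-drivers"
  roles:
    - grycap.nvidia_driver
```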

Example playbook

- hosts: gpu_nodes
  roles:
  - grycap.nvidia_driver

Supported distributions

This role currently supports the following Linux distributions:

  • NVIDIA DGX OS 4
  • NVIDIA DGX OS 5
  • Ubuntu 18.04 LTS
  • Ubuntu 20.04 LTS
  • CentOS 7
  • CentOS 8
  • Red Hat Enterprise Linux 7

Project information

License: BSD-3-Clause
Owner: Grid y Computación de Altas Prestaciones