nvidia.nvidia_driver

ansible-role-nvidia-driver

This is an Ansible role for installing the NVIDIA driver from the NVIDIA CUDA repositories.

Requirements

When you install the NVIDIA driver using this role, the nodes will be rebooted. Therefore, it's best to run ansible-playbook from a different node than the GPU nodes where the driver is being installed.

If you try to run Ansible on the same node where you're installing the driver, you may encounter one of the following issues:

  • With the local connection, the reboot task fails with an error like "Running reboot with local connection would reboot the control node".
  • When connecting via SSH to localhost, the node reboots partway through the run, which stops the playbook execution.
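
One way to keep the control node out of the reboot path is to list only the GPU nodes in the inventory and run ansible-playbook from a separate machine. The following is a minimal sketch of a YAML inventory; the file name inventory.yml and the host names are hypothetical, and the group name gpu_nodes is chosen to match the example playbook in this README.

```yaml
# inventory.yml (hypothetical file name)
# Only the GPU nodes are listed; the machine running ansible-playbook is not.
all:
  children:
    gpu_nodes:
      hosts:
        gpu01.example.com:   # hypothetical host name
        gpu02.example.com:   # hypothetical host name
```

Assuming the playbook is saved as playbook.yml, it could then be run from the control node with ansible-playbook -i inventory.yml playbook.yml.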

Installing

You can install this role using Ansible Galaxy:

$ ansible-galaxy install nvidia.nvidia_driver
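
If you track role dependencies in a file, the same role can be listed in a requirements file instead. This is a sketch using the standard Ansible Galaxy requirements format; the file name requirements.yml is a convention, not something this role mandates.

```yaml
# requirements.yml
roles:
  # Installs the role from Ansible Galaxy under its Galaxy name.
  - name: nvidia.nvidia_driver
```

Install it with ansible-galaxy install -r requirements.yml.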

Role Variables

| Variable | Default Value | Description |
| --- | --- | --- |
| nvidia_driver_package_state | "present" | State of NVIDIA driver packages. |
| nvidia_driver_package_version | "" | Version of the package to install. Ensure it matches the actual deb or RPM package version. |
| nvidia_driver_persistence_mode_on | yes | Enable persistence mode (true or false). |
| nvidia_driver_skip_reboot | no | Skip rebooting the node during installation. |
| nvidia_driver_module_file | "/etc/modprobe.d/nvidia.conf" | File used for NVIDIA driver parameters. |
| nvidia_driver_module_params | "" | Parameters for the NVIDIA driver. |
| nvidia_driver_branch | "515" | Default driver branch to install. |
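
As an illustration of how these variables might be overridden, the sketch below sets a few of them on the role entry of a playbook. The values are examples only; skipping the reboot is shown purely to demonstrate the variable, and it is an assumption that you would then reboot the nodes yourself before using the driver.

```yaml
- hosts: gpu_nodes
  roles:
    - role: nvidia.nvidia_driver
      vars:
        # Driver branch to install (the role's default is "515").
        nvidia_driver_branch: "515"
        # Keep persistence mode enabled (the role's default).
        nvidia_driver_persistence_mode_on: yes
        # Example override: do not reboot during installation.
        nvidia_driver_skip_reboot: yes
```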

Red Hat Specific Variables

| Variable | Default Value | Description |
| --- | --- | --- |
| epel_package | "https://dl.fedoraproject.org/pub/epel/epel-release-latest-{{ ansible_distribution_major_version }}.noarch.rpm" | Package to enable EPEL |
| nvidia_driver_rhel_cuda_repo_baseurl | "https://developer.download.nvidia.com/compute/cuda/repos/{{ _rhel_repo_dir }}/" | Base URL for CUDA repo |
| nvidia_driver_rhel_cuda_repo_gpgkey | "https://developer.download.nvidia.com/compute/cuda/repos/{{ _rhel_repo_dir }}/7fa2af80.pub" | GPG key for CUDA repo |
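
If the nodes cannot reach developer.download.nvidia.com, the two repository variables could be pointed at an internal mirror. The sketch below assumes a hypothetical mirror at mirror.example.com that reproduces the upstream directory layout; the rhel8/x86_64 path segment is also an assumption standing in for the value the role computes as _rhel_repo_dir.

```yaml
- hosts: gpu_nodes
  roles:
    - role: nvidia.nvidia_driver
      vars:
        # Hypothetical internal mirror of the CUDA repository.
        nvidia_driver_rhel_cuda_repo_baseurl: "https://mirror.example.com/compute/cuda/repos/rhel8/x86_64/"
        # GPG key served from the same hypothetical mirror.
        nvidia_driver_rhel_cuda_repo_gpgkey: "https://mirror.example.com/compute/cuda/repos/rhel8/x86_64/7fa2af80.pub"
```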

Ubuntu Specific Variables

For Ubuntu installations, you can choose to install from the Canonical repositories or the NVIDIA CUDA repositories. By default, the Canonical repositories will be used, and the server driver will be installed.

| Variable | Default Value | Description |
| --- | --- | --- |
| nvidia_driver_ubuntu_install_from_cuda_repo | no | Use the CUDA repo if set to yes |
| nvidia_driver_ubuntu_cuda_repo_baseurl | "http://developer.download.nvidia.com/compute/cuda/repos/{{ _ubuntu_repo_dir }}" | Base URL for CUDA repo |
| nvidia_driver_ubuntu_cuda_package | "cuda-drivers" | Package name from CUDA repo |
| nvidia_driver_ubuntu_packages_suffix | "-server" | Suffix for apt packages during installation |
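
To switch an Ubuntu host from the Canonical repositories to the NVIDIA CUDA repositories, the first variable in the table is the one to change. A minimal sketch, with the package name left at its documented default and repeated only for clarity:

```yaml
- hosts: gpu_nodes
  roles:
    - role: nvidia.nvidia_driver
      vars:
        # Install from the NVIDIA CUDA repository instead of Canonical's.
        nvidia_driver_ubuntu_install_from_cuda_repo: yes
        # Package to install from the CUDA repository (role default).
        nvidia_driver_ubuntu_cuda_package: "cuda-drivers"
```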

Example Playbook

- hosts: gpu_nodes
  roles:
  - nvidia.nvidia_driver

Supported Distributions

This role currently supports the following Linux distributions:

  • NVIDIA DGX OS 4
  • NVIDIA DGX OS 5
  • Ubuntu 18.04 LTS
  • Ubuntu 20.04 LTS
  • CentOS 7
  • Red Hat Enterprise Linux 7
  • CentOS 8
  • Red Hat Enterprise Linux 8

License

BSD-3-Clause