Question:
When I run nvidia-smi to check GPU usage, Linux returns the following message:
Failed to initialize NVML: Driver/library version mismatch
This error can appear even if the same command worked 10 minutes earlier.
Diagnosis:
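Ubuntu automatically updated the NVIDIA GPU driver packages in the background. The updated user-space libraries no longer match the older kernel module that is still loaded, which produces this driver/library mismatch.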
Solutions:
Check the version of your GPU kernel module.
cat /proc/driver/nvidia/version
The output looks something like this:
NVRM version: NVIDIA UNIX x86_64 Kernel Module 460.106.00 Tue Sep 28 12:05:58 UTC 2021
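If you want to reuse that version number in a script, it can be pulled out of the NVRM line. The following is a minimal sketch, assuming the line has the same layout as the sample above ("Kernel Module" followed by the version). The function name nvrm_version is hypothetical, and other driver flavors may format the line differently.

```shell
# Extract the driver version from the NVRM line of /proc/driver/nvidia/version.
# Reads the file contents on stdin and prints the version, e.g. 460.106.00.
# Assumption: the version directly follows the words "Kernel Module".
nvrm_version() {
  sed -n 's/.*Kernel Module[[:space:]]*\([0-9][0-9.]*\).*/\1/p'
}

# Usage on a machine where the module is loaded:
#   nvrm_version < /proc/driver/nvidia/version
```

Lines that do not contain "Kernel Module" (such as the GCC version line in the same file) produce no output.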
Check the recommended NVIDIA driver version.
ubuntu-drivers devices
The output begins like this:
== /sys/devices/pci0000:00/0000:00:01.1/0000:01:00.0 ==
Note: You will see that the recommended driver version does not match the version of your currently installed NVIDIA driver.
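The comparison described in the note can be sketched with two small helpers. This is an illustration only: it assumes ubuntu-drivers marks one line with the word "recommended" in the form "driver : <package> - ...", and the function names recommended_driver and major_version are hypothetical.

```shell
# Print the package name from the line that ubuntu-drivers marks as recommended.
# Assumption: the line looks like "driver : <package> - ... recommended".
recommended_driver() {
  awk '$1 == "driver" && /recommended/ { print $3; exit }'
}

# Print the major version (the part before the first dot) of a version string.
major_version() {
  printf '%s\n' "${1%%.*}"
}

# Usage:
#   ubuntu-drivers devices | recommended_driver
#   major_version 460.106.00
```

Comparing the major version of the loaded kernel module with the number in the recommended package name shows whether the two have drifted apart.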
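Remove the mismatched driver (the pattern is quoted so that the shell does not expand it against files in the current directory)
sudo apt-get --purge remove 'nvidia*'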
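Because the purge is destructive, it can help to preview it first and to confirm afterwards that nothing was left behind. The sketch below is one possible way to do that. The helper name remaining_nvidia_packages is hypothetical, and it matches any package whose name contains "nvidia", which may be broader than what the purge pattern removes.

```shell
# Print the status and name of every package whose name contains "nvidia".
# Reads `dpkg -l` output on stdin; no output means no such package is listed.
remaining_nvidia_packages() {
  awk '$2 ~ /nvidia/ { print $1, $2 }'
}

# Usage:
#   apt-get --simulate --purge remove 'nvidia*'   # preview only, changes nothing
#   dpkg -l | remaining_nvidia_packages           # run after the real removal
```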
Install the recommended NVIDIA driver version (460 in this example; use the version recommended for your machine in the previous step)
sudo apt install nvidia-driver-460
Reboot your machine
sudo reboot
Then check the GPU status again with nvidia-smi.
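The final check can also be scripted, for example to run automatically after the reboot. This is a minimal sketch: the function name check_gpu is hypothetical, and it only looks at the exit status of the command rather than parsing its output.

```shell
# Report whether the given GPU query command (default: nvidia-smi) runs cleanly.
# Only the exit status is inspected; the command's own output is discarded.
check_gpu() {
  cmd="${1:-nvidia-smi}"
  if "$cmd" > /dev/null 2>&1; then
    echo "OK: $cmd ran successfully"
  else
    echo "FAIL: $cmd still reports an error"
    return 1
  fi
}

# Usage:
#   check_gpu
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader   # shows the driver version in use
```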