Message boards : Graphics cards (GPUs) : Unable to determine the device handle for GPU 0000:02:00.0: Unknown Error
Author | Message |
---|---|
After rebooting the system and restarting the boinc GPUGRID, | |
ID: 58805 | Rating: 0 | |
Which of your two machines are you talking about? | |
ID: 58807 | Rating: 0 | |
Quoting "After rebooting the system and restarting the boinc GPUGRID,": This happens when the driver crashes or the GPU has some kind of problem and drops off. Only a reboot can bring it back. Check your power and PCIe connections to make sure they are good; I mostly encountered this issue with dodgy power cables. Edit: scratch that, I see now that these are laptops, so there is not much you can do to check power connections. It could be that the cards are overheating when trying to run GPUGRID tasks. Make sure the laptops have adequate airflow and are maintaining reasonable temps, and reduce overclocks if there are any. That might be all you can do without getting into the weeds and taking it apart to replace thermal paste, etc. | |
ID: 58809 | Rating: 0 | |
Quoting "which of your two machines are you talking about?": It's the Linux one. | |
ID: 58810 | Rating: 0 | |
It's Linux, and I think that looks like a driver crash, as explained here. | |
ID: 58812 | Rating: 0 | |
I will try the thermal paste change as soon as I receive it by post. | |
ID: 58822 | Rating: 0 | |
Always best to grab Nvidia drivers straight from Nvidia. Get the Studio drivers. | |
ID: 58823 | Rating: 0 | |
ok. | |
ID: 58843 | Rating: 0 | |
Do a sudo apt purge '*nvidia*' to get rid of the existing drivers, then reboot. Quoting the pattern stops the shell from expanding it against files in the current directory. | |
ID: 58854 | Rating: 0 | |
I removed the drivers with: nvidia-installer log file '/var/log/nvidia-installer.log' creation time: Fri May 27 18:38:09 2022 installer version: 510.73.05 PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin nvidia-installer command line: ./nvidia-installer Using: nvidia-installer ncurses v6 user interface -> Detected 8 CPUs online; setting concurrency level to 8. -> Installing NVIDIA driver version 510.73.05. -> The NVIDIA driver appears to have been installed previously using a different installer. To prevent potential conflicts, it is recommended either to update the existing installation using the same mechanism by which it was originally installed, or to uninstall the existing installation before installing this driver. Please review the message provided by the maintainer of this alternate installation method and decide how to proceed: Please use the Debian packages instead of the .run file. (Answer: Continue installation) -> Running distribution scripts executing: '/usr/lib/nvidia/pre-install'... If you want to use the nvidia-installer please uninstall the Debian packages first. The two methods of installation cannot be used at the same time. Terminating nvidia-installer in 1 seconds. Killing nvidia-installer | |
ID: 58870 | Rating: 0 | |
Can't help you here. I know nothing about MX-Linux. | |
ID: 58872 | Rating: 0 | |
The command sequence found to remove the NVIDIA MXLinux driver is possibly: (1) apt purge nvidia* -y (2) apt-get purge $FORCE $(apt-cache pkgnames | grep nvidia | grep -v detect | grep -v cleanup | cut -d':' -f1) bumblebee* primus* primus*:i386 2>&1 (3) apt autoremove. After that, the new driver version 510.73.05 was installed and the system stopped crashing. nvidia-smi output from Sat May 28 12:40:23 2022: NVIDIA-SMI 510.73.05, Driver Version 510.73.05, CUDA Version 11.6. GPU 0 (NVIDIA GeForce ...): Persistence-M Off, Bus-Id 00000000:02:00.0, Disp.A Off, Fan N/A, Temp 72C, Perf P0, Pwr N/A / N/A, Memory-Usage 2291MiB / 4096MiB, GPU-Util 3%, Compute M. Default. Processes on GPU 0: PID 1418, Type G, /usr/lib/xorg/Xorg, 4MiB; PID 3344, Type C, bin/python, 2285MiB. | |
ID: 58873 | Rating: 0 | |
Congrats, well done! | |
ID: 58874 | Rating: 0 | |
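
For anyone landing here with the same error, the two checks suggested in this thread (has the GPU dropped off, and is it running hot) could be scripted roughly as below. This is only a sketch: the function names and the 85 C default limit are assumptions of mine, not something posted in the thread. The error string is the one from the thread title, and `--query-gpu=temperature.gpu` is a standard nvidia-smi query option.

```shell
#!/bin/sh
# Sketch only: function names and the 85 C default are assumptions.

# Reads nvidia-smi output on stdin; succeeds if it contains the
# "device handle" error from the thread title.
gpu_dropped() {
    grep -q 'Unable to determine the device handle'
}

# Succeeds when temperature $1 (whole degrees C) is above limit $2.
too_hot() {
    [ "$1" -gt "$2" ]
}

# Runs both checks against the real GPU. Optional $1 is the limit in C.
check_gpu() {
    limit="${1:-85}"

    if nvidia-smi 2>&1 | gpu_dropped; then
        echo "GPU dropped off; a reboot is needed to bring it back"
        return 1
    fi

    # First GPU only; prints a bare number such as 72.
    temp=$(nvidia-smi --query-gpu=temperature.gpu \
        --format=csv,noheader,nounits 2>/dev/null | head -n 1)

    # Bail out if nvidia-smi gave us nothing numeric.
    case "$temp" in
        ''|*[!0-9]*)
            echo "could not read GPU temperature"
            return 3
            ;;
    esac

    if too_hot "$temp" "$limit"; then
        echo "GPU at ${temp}C, above the ${limit}C limit"
        return 2
    fi

    echo "GPU OK at ${temp}C"
}
```

After sourcing the file, `check_gpu` (or `check_gpu 80` for a stricter limit) can be run by hand or from cron while GPUGRID tasks are running. It only reports; it does not stop BOINC or touch the driver.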
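
The package filter from the removal sequence posted above can be tried out safely before purging anything. Below is a small dry-run sketch. The function names `nvidia_pkgs` and `show_purge_cmd` are made up for illustration, and the script only prints a command rather than running it. The filter itself is copied from the pipeline in the thread.

```shell
#!/bin/sh
# Dry-run sketch: function names are hypothetical, nothing is purged.

# Filters package names on stdin the same way as the pipeline in the
# thread: keep nvidia packages, skip the detect/cleanup helpers, and
# strip any ":arch" suffix.
nvidia_pkgs() {
    grep nvidia | grep -v detect | grep -v cleanup | cut -d':' -f1
}

# Reads package names on stdin and prints the purge command that
# would be run, without executing it.
show_purge_cmd() {
    # Join the filtered names into one space-separated line.
    pkgs=$(nvidia_pkgs | paste -sd' ' -)
    if [ -z "$pkgs" ]; then
        echo "no nvidia packages found"
        return 1
    fi
    echo "apt-get purge $pkgs"
}
```

Running `apt-cache pkgnames | show_purge_cmd` shows what would be removed. Once the list looks right, the printed command can be run with sudo, followed by `apt autoremove` as in the thread.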