AMD GPUs are cursed for me
(lemmy.world)
Happy to help! Though you are right, this is a rather generic error that doesn't help much; it just confirms that the GPU is the issue.
At this point it could be a driver issue, since there are similar open bug reports. A hardware problem is still possible, since you previously said that it's unstable on Windows too, and power-related issues can also lead to this error message.
EDIT: Tentative solution: CoreCtrl
CoreCtrl allowed me to underclock my Radeon 5600XT GPU (currently set to 800MHz for the GPU and 500MHz for memory). I say "tentative" because this problem has been persistent for years, but I've been running Cyberpunk for 1 hour at 60FPS on High settings (and mostly 60FPS on Ultra, but I had some FPS drops). Even if this solution isn't 100% perfect, I think some combination of changing the GPU values is probably going to make my rig much more functional.
I found CoreCtrl based on a Reddit thread last night but didn't have time to test it until this evening after work. Seems to have made a world of difference.
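For anyone who wants to double-check that an underclock like this is actually in effect, here is a rough Python sketch. It assumes the `amdgpu` driver exposes the core clock levels in `/sys/class/drm/card0/device/pp_dpm_sclk` (the card index may differ on your machine) and that each line looks like `1: 800Mhz *`, with the `*` marking the active level. The function names are made up for illustration.

```python
import re
from pathlib import Path

# Assumed sysfs location; the card index (card0) may differ per system.
SCLK_PATH = Path("/sys/class/drm/card0/device/pp_dpm_sclk")

# Matches lines such as "1: 800Mhz *" (the asterisk marks the active level).
LEVEL_RE = re.compile(r"^(\d+):\s*(\d+)\s*MHz\s*(\*)?\s*$", re.IGNORECASE)


def parse_dpm_levels(text):
    """Return a list of (index, mhz, is_active) tuples from pp_dpm_sclk text."""
    levels = []
    for line in text.splitlines():
        match = LEVEL_RE.match(line.strip())
        if match:
            index = int(match.group(1))
            mhz = int(match.group(2))
            is_active = match.group(3) is not None
            levels.append((index, mhz, is_active))
    return levels


def active_mhz(text):
    """Return the clock (MHz) of the level marked active, or None if absent."""
    for _, mhz, is_active in parse_dpm_levels(text):
        if is_active:
            return mhz
    return None


def read_active_mhz(path=SCLK_PATH):
    """Read the sysfs file (if present) and return the active clock in MHz."""
    if not path.exists():
        return None
    return active_mhz(path.read_text())
```

Calling `read_active_mhz()` while a game is running should give a number at or below whatever cap was set in CoreCtrl, if the assumptions above hold on your system.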
Yeah I've tried just about every feasible kernel parameter for the `amdgpu` module, updated my kernel to 6.2 on Linux Mint, and I've tried several different BIOS settings. My system runs everything reasonably well. Even Cyberpunk 2077 is generally at 60FPS. But after about 5 minutes of gaming on Cyberpunk 2077, it crashes. Other games last longer, which is why I use Cyberpunk 2077 to stress test my system.
These are my system specs:
I don't really see where I might be going wrong here. I bought this all ~4 years ago and I've always had these intermittent crashes. It's admittedly worse on Linux, but it still occurred on Windows.
Anyways, I spent about 5 hours last night reading bug forums, testing various amdgpu module parameters, settings in my BIOS, and even re-configuring my fans to provide (potentially) more optimal cooling. None of this really made a difference. I run two 1080p monitors (not exactly breaking the bank here). I had a lot of hope regarding one forum thread about ring gfx_1.0.0 errors related to how AMD reads the GPU in Linux. My graphics card is detected as: `Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5600 OEM/5600 XT / 5700/5700 XT]` and apparently some machines used to accidentally use the total allocated memory for the 5700XT instead of the 5600XT. This resulted in some form of corrupt memory allocation. That sort of behavior would make sense for my system since it runs well, but just fails suddenly.
Other errors I've seen are:
^ These are all errors which occurred during various tests of `amdgpu` module settings and/or BIOS settings. The common thread is some form of `ring XXXX timeout`.
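If you want to tally which rings are timing out across a long kernel log, something like the following Python sketch might help. It only assumes that the relevant lines contain the text `ring <name> timeout`, as in the errors described above; the function name is hypothetical and the exact log format on your system may differ.

```python
import re
from collections import Counter

# Assumes each relevant log line contains "ring <name> timeout".
RING_TIMEOUT_RE = re.compile(r"ring (\S+) timeout")


def count_ring_timeouts(log_text):
    """Count timeout messages per ring name in a block of kernel log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RING_TIMEOUT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

You could feed it the saved output of `dmesg` or `journalctl -k` (for example `count_ring_timeouts(open("kernel.log").read())`, where `kernel.log` is whatever file you saved the log to) and see whether one ring shows up far more often than the others.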
These two threads seemed like my best chance, but their proposed solutions didn't help: