GPU issues again. Replace or repair?

Well, people, I’ve been here before. I know this place, I’ve walked its floor.
I had issues with browsers before, and @daniel.m.tripp correctly named it as needing to disable hardware acceleration.
But now, I am beginning to wonder if the Radeon card itself is failing me.
This time, it was while using Skype in Ubuntu MATE. The screen went dark and the keyboard stopped working, but the person at the other end could still hear and see me. I tried using the keyboard to open the terminal and enter a command, but I had to reboot with the power key.
This evening, I was using GPUTest and was able to trigger the problem: I ran PixMark Volplosion (poor numbers, but it worked), then opened PixMark Piano and the screen vanished. BUT if I closed the terminal, opened it again and ran Piano, voilà.
Now, just a short while ago, I left my computer running with no programs open, came back, opened Skype and it happened again. Without a video chat.
I ran journalctl and copied the events from awakening the computer to rebooting from the power button. If y’all can find anything informative here, tell me! If not, I may just swallow my pride and replace the card. I don’t want to, since I never play games, edit videos or do anything heavier than English classes using any of four video chat apps (Telegram, Skype, Google Meet and Zoom). Only Skype had the issue.
AMD RV630, ATI Radeon HD 2600 XT

Jun 24 22:09:06 cliff-desktop systemd-logind[1398]: Power key pressed.
Jun 24 22:08:41 cliff-desktop kernel: [drm] UVD initialized successfully.
Jun 24 22:08:41 cliff-desktop kernel: [drm] ring test on 5 succeeded in 1 usecs
Jun 24 22:08:41 cliff-desktop kernel: [drm] ring test on 0 succeeded in 1 usecs
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: fence driver on ring 5 use gpu addr 0x00000000000521d0 and cpu addr 0x0000000004dfd583
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000010000c00 and cpu addr 0x000000008c75c95a
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: WB enabled
Jun 24 22:08:41 cliff-desktop kernel: [drm] PCIE GART of 512M enabled (table at 0x0000000000142000).
Jun 24 22:08:41 cliff-desktop kernel: [drm] PCIE gen 2 link speeds already enabled
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: GPU reset succeeded, trying to resume
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80100000
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_000E50_SRBM_STATUS      = 0x200080C0
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_008014_GRBM_STATUS2     = 0x00000003
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_008010_GRBM_STATUS      = 0xA0003030
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: SRBM_SOFT_RESET=0x00000100
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEF
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_008680_CP_STAT          = 0x80018645
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_00867C_CP_BUSY_STAT     = 0x00008086
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_008678_CP_STALLED_STAT2 = 0x00008002
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_000E50_SRBM_STATUS      = 0x200000C0
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_008014_GRBM_STATUS2     = 0x00110103
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0:   R_008010_GRBM_STATUS      = 0xE4723030
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: GPU softreset: 0x00000009
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: Saved 281 dwords of commands on ring 0.
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: GPU lockup (current fence id 0x000000000002f320 last fence id 0x000000000002f329 on ring 0)
Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: ring 0 stalled for more than 10204msec
Jun 24 22:08:26 cliff-desktop mate-screensaver-dialog[11069]: gkr-pam: unlocked login keyring
Jun 24 22:08:20 cliff-desktop mate-power-mana[2753]: Source ID 125 was not found when attempting to remove it
Jun 24 22:02:51 cliff-desktop canonical-livepatch.canonical-livepatchd[1541]: Client information is recent, not refreshing.
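
For anyone wanting to pull only the GPU-related events out of a long journal dump like the one above, here is a minimal Python sketch. It assumes the short journalctl line format shown in this post; the function name `find_gpu_events` and the keyword list are my own choices for illustration, not anything the radeon driver defines.

```python
import re

# Matches radeon kernel lines in the short journalctl format used above, e.g.
# "Jun 24 22:08:41 cliff-desktop kernel: radeon 0000:01:00.0: GPU lockup (...)"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) \S+ kernel: radeon \S+ (?P<msg>.*)$"
)

# Substrings taken from the log above that mark a hang or a reset attempt.
# This list is an assumption; extend it if your log uses other wording.
KEYWORDS = ("GPU lockup", "stalled for more than", "GPU softreset", "GPU reset")


def find_gpu_events(journal_text):
    """Return (timestamp, message) pairs for radeon lockup/reset lines."""
    events = []
    for line in journal_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and any(key in match.group("msg") for key in KEYWORDS):
            events.append((match.group("stamp"), match.group("msg").strip()))
    return events
```

Saving the journal output to a text file and passing its contents to `find_gpu_events` gives a short list of hang and reset lines with their timestamps, which is easier to compare across several incidents than the full register dump.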

@cliffsloane
I have an ATI card that also will just go blank; all that can be done is hit the reset and reboot the machine. Been planning on changing the card, but have not yet.

@cliffsloane
AMD RV630, ATI Radeon HD 2600 XT
I looked it up. It's about the same rating as my GeForce 220. Not super expensive. Why not just replace it? There really is no repair option.

Before removing or even throwing away the card, I would always test it in a Windows environment first. With Linux, you can never be sure whether the card is faulty or Linux just fails to use it properly.

Many years ago, I had a graphics card doing all kinds of funny stuff on Linux. Then, I tried using it on Windows and it actually worked fine.

So, even if you replace the card, you would not need to throw it away if it works on Windows. It could be used in a different computer.


I’ve looked up the card now.

It does not even have 1GB or even 512MB memory. It’s too old anyway, even if it would still “work”.

Since it’s such an obsolete and barely useful card, I would suggest replacement would be the best choice. It does not make much sense to put time & effort into fixing this old stone from before the Roman Empire had fallen.

Any application needs some kind of GPU to display things. Video editing and games are just the mainstream buzzwords for heavy graphics card usage. In fact, though, any application with a GUI needs to use the graphics card. So, it does not mean the card is enough, even if you “only” use those lightweight applications.

I would recommend getting a fairly recent card with between 2GB and 4GB of memory. They are not too expensive, draw relatively less power and are much more powerful nonetheless.

If you are truly minimalistic, you should still have nothing below 2GB of memory on your card.

For a number of reasons, I am super cautious about getting a new card. First, I never had an issue with video chats for the last 5 years; it was only after an update on Skype, so first suspicion is there.
Second, what if it is not the card? What if there is some OS issue? Shouldn’t I rule that out before getting a different one?
Finally, I would want to see the evidence before judging a card that mostly worked well for 6 years. Replacing it is like prescribing antibiotics for everything.
That being said, @Akito had a good point about trying to replicate the problem in Windows. And, I would add, in Mint.

Last question. Is there a way to test if the card is losing functionality with use? Does the duration of a video call itself stress the card? And how can I see that?
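
One rough way to watch the card during a call is to log its temperature over time. The sketch below is only that, a sketch: it assumes the driver exposes a sensor through the standard hwmon files under `/sys/class/drm` (values in thousandths of a degree Celsius), which may not be the case for this card. The function names `read_gpu_temps` and `log_temps` are made up for illustration.

```python
import glob
import os
import time


def read_gpu_temps(drm_root="/sys/class/drm"):
    """Return {sensor_path: degrees_celsius} for each hwmon temperature file.

    Returns an empty dict if the driver exposes no sensor (an assumption
    worth checking first on an old card).
    """
    pattern = os.path.join(
        drm_root, "card*", "device", "hwmon", "hwmon*", "temp*_input"
    )
    temps = {}
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as sensor:
                # hwmon reports millidegrees Celsius as a plain integer.
                temps[path] = int(sensor.read().strip()) / 1000.0
        except (OSError, ValueError):
            continue  # unreadable or malformed sensor file: skip it
    return temps


def log_temps(samples, interval_s):
    """Print a timestamped reading every interval_s seconds, samples times."""
    for _ in range(samples):
        print(time.strftime("%H:%M:%S"), read_gpu_temps())
        time.sleep(interval_s)
```

Running something like `log_temps(samples=60, interval_s=10)` in a terminal while a video call is going would show whether the temperature climbs with call duration. If `read_gpu_temps()` comes back empty, the card has no readable sensor and this approach does not apply.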

Fully understand your points. This is also why I initially tried to find out what’s going on first. However, once I looked up that piece of hardware, I realised it’s so old that I wonder how you were working with it for the past 6 years. It was already too old 6 years ago.

Now, the point of the obsoleteness argument is that this device is so old and worthless that it simply does not make any sense trying to figure out what’s actually wrong with the card. It’s such an obsolete card that you should get a new one either way, even if you could still fix it. I’m very sure that modern integrated Intel graphics chips are still better than this dedicated graphics card.

If you, for whatever reason, want to use exactly this graphics card model, then you could at least buy a working one from a second hand shop. It’s probably extremely overpriced, like 15 to 30 bucks, but it’s still less painful than trying to figure out the issue with the card.

Basically, if you plug in any other card, you can already tell whether it’s an operating system problem (or a problem with some other origin) or whether it’s mainly the card that’s broken. Since it’s such an extremely obsolete card, I say that it makes sense to buy a new one either way. The one you have is too old, even if it still worked.

I am back with an embarrassingly basic hardware question, same domain.

Against my better judgment, I followed @Akito's suggestion and got a replacement card. Dang, I must have missed a compatibility issue somewhere, 'cause the BIOS screen doesn’t even show up.

It is an Intel DH77EB Board, which is described as taking GDDR3 and PCIe v.3.0 x16. Now, I know that can be 1x16, 2x16 or 3x16, but the online spec sheets I relied on said all three are compatible.

So I found what I THOUGHT would work, an NVIDIA GeForce GTS 240 with 1 GB and PCIe 2x16. It was inexpensive and from a reliable vendor.

I installed it and nothing showed up on the screen upon booting. No beeps either, but it sounded just like a normal boot.
So I looked at BIOS (Legacy). It reads:
Integrated graphics → Enable if primary
Primary → Auto (IGD/Ext PCIe/Manual)
IGD Port → Auto
No Video Beeps ✔

I put the old card back in and everything works as before.
So tell me. What did I get wrong?

Not sure, but I think your Linux may not have the drivers installed for the new card.
Make sure you have the non-free repository enabled in /etc/apt/sources.list,
then
look thru the repository for packages containing NVIDIA drivers and, if they are not installed, install them.

I have a GT218 Nvidia card (GeForce210) and it has always worked.

I think the issue is, Linux will always detect and install drivers when it is being installed, but yours is already installed and you have swapped hardware on it. It will not automatically install drivers in that situation; you have to do it manually.
Regards
Neville
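
Once the machine boots far enough to reach a shell, one quick check is which kernel driver is actually bound to the graphics card. Here is a small Python sketch that reads the `driver` symlink in sysfs. The PCI address `0000:01:00.0` is taken from the radeon log lines earlier in the thread and is only assumed to be the same slot for a new card; `bound_driver` is a name made up for this example.

```python
import os


def bound_driver(pci_addr="0000:01:00.0", sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None.

    The default address is the one shown in the radeon log lines in this
    thread; adjust it if the card sits at a different address.
    """
    link = os.path.join(sysfs_root, pci_addr, "driver")
    try:
        # The symlink target ends in the driver name, e.g. ".../drivers/radeon".
        return os.path.basename(os.readlink(link))
    except OSError:
        return None  # no such device, or no driver bound to it


print(bound_driver())
```

A result such as `radeon` or `nouveau` means an open-source driver has claimed the card; `None` means either the address is wrong or nothing is bound, which is the case where installing a driver by hand would be worth trying.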

But I don’t even see the BIOS opening screen, let alone GRUB. If I could get as far as opening the OS, that would be another issue.

OK, I misunderstood.
So it gets thru the POST test (the blinking cursor on a blank screen), then the BIOS splash screen does not start?
So the BIOS can't use the video card?
I don't know what to do, but I would try:

  • make sure the card is seated in its slot
  • are there any switches on the card?
  • try setting the BIOS to a very simple VGA video setting

Neville

@cliffsloane will most likely have to chroot into his installed Linux and then install the NVIDIA driver for his card. You do the chroot with either a live DVD or a bootable USB.