Pysol crashes my system? Suggestions on diagnosing problem?

I am having a problem with one of my boxes - it’s an older but respectable machine…

MOBO - MSI Z68A-GD80 (G3) series
MS-7672 (v3.x) Mainboard (probably?)
Vintage - first release 7/2011

CPU - Intel i7-2600K, 3.4 GHz clock (Sandy Bridge)

16GB Corsair RAM

The video uses the integrated graphics (Intel 2nd Generation Core Processor Family) rather than a separate card.

OS is Debian Buster / KDE

I am addicted to Pysol, a Python solitaire game collection (specifically FreeCell). For some reason, on THIS machine, I get regular crashes when playing. I feel certain that it’s a hardware issue, since I have the same software setup on at least 4 other boxes and never have any issues.

When playing and moving a card, the monitor will frequently freeze for anything from barely noticeable to several seconds. Randomly it will also just go black, and the only recovery is a hard reset / reboot.

This will happen whether Pysol is the only program open, or if I have lots of other stuff open. I’m not sure if the machine is totally hung, or if it is just the video that has gone away. It does NOT happen if I’m not running Pysol, even when I have a lot of other stuff running.

I have run Memtest86 for several days w/o errors, so presumably not RAM. Given that it isn’t happening when running other programs that presumably have higher demands on CPU / Video resources, I am confused about just what this one program is doing to cause problems on this one machine…

Any suggestions on how to troubleshoot / what the suspect hardware is likely to be would be welcome…

ex-Gooserider

Which kernel do you have? 4.19.xx or 5.10.xx?
Maybe you are a victim too:

If you are on 5.10.xx kernel, try to go back to 4.19.
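
For reference, the running kernel version can be read with `uname -r`. The sketch below is only an illustration of how to reduce that string to the 4.19 / 5.10 series asked about here; the helper name `kernel_series` is made up for this example.

```shell
#!/bin/sh
# kernel_series: hypothetical helper that reduces a kernel release string
# such as "4.19.0-17-amd64" to its major.minor series ("4.19").
kernel_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}

# Report the series of the kernel that is currently running.
kernel_series "$(uname -r)"
```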

How did you install it? Download, pip, or apt / apt-get?

I just installed it via apt (Ubuntu 20.04 gnome, Ryzen 5 16 GB RAM) and played it for half an hour or so (I only play Klondike 3 card deal) and it seemed okay to me - but that’s nothing…

I know you ran memtest86, but would it be worthwhile (I’m assuming this is probably 4 x 4 GB DDR3 DIMMs?) to run the machine on just one pair at 8 GB and see if the crash repeats, then on the other pair?

Also - maybe run a performance monitor and something showing temp sensors, while you’re playing?
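
One low-tech way to log temperatures while playing is to read the kernel's thermal zones from sysfs. This is a minimal sketch, assuming the machine exposes `/sys/class/thermal/thermal_zone*/temp` (values in millidegrees Celsius). Not every board does, in which case the `sensors` command from the lm-sensors package is the usual alternative. The function names are invented for this example.

```shell
#!/bin/sh
# millideg_to_c: convert a sysfs reading in millidegrees Celsius (e.g. 45000)
# to whole degrees Celsius (45). Hypothetical helper name.
millideg_to_c() {
    echo $(( $1 / 1000 ))
}

# log_temps: print one timestamped line per readable thermal zone, if any exist.
log_temps() {
    for zone in /sys/class/thermal/thermal_zone*/temp; do
        [ -r "$zone" ] || continue
        printf '%s %s %s C\n' "$(date +%T)" "$zone" "$(millideg_to_c "$(cat "$zone")")"
    done
}

# Example: sample every 5 seconds while playing, appending to a log file.
# while true; do log_temps >> "$HOME/temps.log"; sleep 5; done
```

If the log is written to disk, the last entries before a freeze survive the hard reset and show whether temperature was climbing.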

Sorry to take so long getting back on this. The makerspace I spend much of my time at is in the process of moving, and I’ve been busy both with my own packing and volunteer work to help with the general moving process. (Not a small effort to get all the infrastructure out of a 40,00 sq.ft. space)

kovacslt: I’m running the 4.19 kernel on Debian Buster. The way Debian does their releases, it is unlikely to go to a 5.x kernel until the next stable release. I do run updates regularly, but those are primarily security fixes.

daniel.m.tripp: I installed it from the Debian Buster repository using Synaptic, which is a GUI front end to apt…

This is the same way I’ve installed it on all my other machines, which run it with no problems. (my game of addiction is FreeCell BTW)

I will try looking at a performance monitor, although I’d be amazed if it was a temperature issue - I can’t believe that a card game is that resource intensive…

Another thing I probably ought to try is doing an rshell into the machine from a different box to see if the entire system is going down, or just the video.

ex-Gooserider

Yes, you can do it easily by installing a newer kernel from the backports repo.
I think 5.10.46 is the newest from there:
https://packages.debian.org/buster-backports/kernel/linux-image-5.10.0-0.bpo.8-amd64

Can we take a look at what is in your /etc/X11/xorg.conf.d?
Do you have some settings for your intel video?

Actually I just checked the Debian site and found they have released the new stable - haven’t upgraded to it yet…

I don’t have an /etc/X11/xorg.conf.d directory:

/etc/X11$ ls
app-defaults fonts xkb Xresources Xsession.options Xwrapper.config
cursors rgb.txt Xreset Xsession xsm
default-display-manager xinit Xreset.d Xsession.d XvMCConfig

The other text files in the directory don’t mention Intel at all…

/etc/X11$ cat Xsession.options 
# $Id: Xsession.options 189 2005-06-11 00:04:27Z branden $
#
# configuration options for /etc/X11/Xsession
# See Xsession.options(5) for an explanation of the available options.
allow-failsafe
allow-user-resources
allow-user-xsession
use-ssh-agent
use-session-dbus

/etc/X11$ cat XvMCConfig 
libXvMC.so.1

/etc/X11$ cat Xwrapper.config 
# Xwrapper.config (Debian X Window System server wrapper configuration file)
#
# This file was generated by the post-installation script of the
# xserver-xorg-legacy package using values from the debconf database.
#
# See the Xwrapper.config(5) manual page for more information.
#
# This file is automatically updated on upgrades of the xserver-xorg-legacy
# package *only* if it has not been modified since the last upgrade of that
# package.
#
# If you have edited this file but would like it to be automatically updated
# again, run the following command as root:
#   dpkg-reconfigure xserver-xorg-legacy
allowed_users=console

I haven’t found any performance monitors that seemed to track temperature, suggestions?

I have now got things set up so that I can SSH into the box, so at least I can check after the next crash whether it is just the video going out or if it’s the entire machine… When I got it there was a note on the case saying not to reboot it, so I’m wondering if it is just a problem w/ the video built into the mobo? If it is, the next test may be to install an external card and see if that helps.

ex-Gooserider

Further developments… I got an ssh session going on a different machine, logging into the problem box. I also had a terminal session going on the problem box on tty1 (along w/ KDE on tty7, the usual for X…) and played pysol until the system crashed…

The KDE X-session window disappeared, leaving the tty1 session on the display, but the keyboard didn’t do anything, so it acted like a hang.

However the ssh session to the problem machine is still up and responsive…

This seems to suggest the problem is in the video hardware, so I think the next step is to try putting in an external card and seeing if I get the same problem when using the external card instead of the motherboard card…
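
Since the ssh session survives the freeze, the kernel log can be inspected from it right afterwards. This is a minimal sketch, assuming the integrated graphics is driven by the i915 kernel driver and that a stall is logged with wording such as "GPU HANG". The helper name and the pattern list are illustrative guesses, so an empty result does not rule out a video fault.

```shell
#!/bin/sh
# gpu_hang_lines: filter kernel log text on stdin for lines that commonly
# accompany an Intel graphics hang or reset. The pattern list is a guess,
# not an exhaustive catalogue of driver messages.
gpu_hang_lines() {
    grep -iE 'gpu hang|i915.*(reset|error)|drm.*(hang|timed out)'
}

# From the ssh session, right after a freeze (dmesg may need root):
# sudo dmesg | gpu_hang_lines
# journalctl -k -b | gpu_hang_lines
```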

(I’m still a bit worried that the keyboard doesn’t work - or so it seems, but that may be just that it is giving me a bunch of ‘command not founds’ that aren’t showing on the screen…)

ex-Gooserider

Sorry, I forgot about this topic…
It seems strange to me that there isn’t an xorg.conf.d; I have always had one. However, I have never really used KDE…
So can you try to create /etc/X11/xorg.conf.d and put a file in it, named, say, 30-intel.conf?
Mine looks like this:

Section "Device"
Identifier "Intel Graphics"
Driver "intel"
Option "TearFree" "true"
Option "AccelMethod" "sna"
Option "DRI" "3"
EndSection

Most probably you won’t need DRI 3, but maybe you need to set the AccelMethod.
Try whether
Option "AccelMethod" "uxa"
or
Option "AccelMethod" "sna"
works better for you.
If none of them works any better, I’m out of ideas right now.
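
To switch between the two AccelMethod values without hand-editing each time, something like the following could be used. It is a sketch only: `write_intel_conf` is a made-up helper, it assumes the intel Xorg driver (Debian package xserver-xorg-video-intel) is installed, and it deliberately leaves out the TearFree and DRI options so that only one setting changes per test.

```shell
#!/bin/sh
# write_intel_conf: hypothetical helper that writes a minimal Device section
# for the intel Xorg driver. $1 = target directory, $2 = AccelMethod (sna or uxa).
write_intel_conf() {
    dir=$1
    method=$2
    mkdir -p "$dir" || return 1
    cat > "$dir/30-intel.conf" <<EOF
Section "Device"
    Identifier "Intel Graphics"
    Driver "intel"
    Option "AccelMethod" "$method"
EndSection
EOF
}

# Example (as root), then restart the display manager or reboot:
# write_intel_conf /etc/X11/xorg.conf.d uxa
```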

Yes, I’ve been spotty on chasing this as the Makerspace I’m active in is moving and I’ve been having to deal w/ getting all my stuff packed and moved… The moving part is mostly done, but we now have to wait for the buildout in the new location to get done before I can start unpacking. It does have the advantage that I brought my best machine home so now all my machines are sitting next to each other, which helps even if it does overload the office space…

Before I saw your reply, I tried a couple of external graphics cards w/ some really strange results, and no luck in getting a display on the external card, but I don’t know for certain that either of the cards I tried (both crufted) are actually good… (I need to test them in a known good machine…)

I don’t know if it is a KDE thing or a Debian thing, but I did some poking, given that you were surprised about my not having an xorg.conf.d:

@bill-box:~$ sudo find / -iname 'xorg.conf.d' -print
/usr/share/X11/xorg.conf.d
@bill-box:~$ cat /usr/share/X11/xorg.conf.d
cat: /usr/share/X11/xorg.conf.d: Is a directory
@bill-box:~$ ls -al /usr/share/X11/xorg.conf.d
total 28
drwxr-xr-x 2 root root 4096 Apr 20  2021 .
drwxr-xr-x 5 root root 4096 Jun  7 16:22 ..
-rw-r--r-- 1 root root   92 Feb  7  2019 10-amdgpu.conf
-rw-r--r-- 1 root root 1350 Apr 19  2021 10-quirks.conf
-rw-r--r-- 1 root root   92 Apr  6  2019 10-radeon.conf
-rw-r--r-- 1 root root 1429 Mar 30  2019 40-libinput.conf
-rw-r--r-- 1 root root 2747 Jun 26  2017 70-wacom.conf

I know there is an effort to change the locations of a lot of things in the FHS, so possibly this is part of that… Also, it is a directory rather than a file, but the individual files in it seem to use the same format as the sample you posted.

So I will try creating a file based on your sample, although I will call it 10-intel.conf since that is the format used in the other driver files.

ex-Gooserider

Tried it, w/ mixed results - it booted up and started KDE, but a bit strangely: several things like the KDE start menu opened in the middle of the screen instead of off the task bar at the bottom… I played multiple games of Pysol w/o getting a crash, but given that it’s an intermittent problem, that doesn’t prove much. I did see the occasional slow response or momentary pauses that I saw before.

However when I tried to exit, it failed to close properly and I ended up where the mouse cursor moved but there was no response to the buttons… I could switch to the other tty’s.

This was trying w/ DRI 3 and AccelMethod sna, I will try changing AccelMethod to UXA and see if that does any better…

ex-Gooserider

More mixed results… w/ AccelMethod uxa, the system booted fine and looked good until I tried to exit Pysol, where I ended up w/ the same problem of the mouse cursor moving but the mouse buttons not responding. I didn’t get any of the hesitation / momentary freezes I did w/ the original setup - but again, as an intermittent problem, it’s hard to tell…

ex-Gooserider

Try removing the DRI 3 option and try both AccelMethods again, to see if there’s any improvement.

I tried removing the DRI 3 option, and leaving it on UXA, limited testing but so far no crashes and I didn’t get the non-responsive buttons, so I can exit and switch windows OK…

ex-Gooserider

Next time I tried it things were strange… Even though I had closed the Pysol window, the top border and function menu bar was visible at the top of the screen, the rest of it may have been present but was covered by several Firefox windows. The Pysol window bar had the active highlight, but was not responsive.

The Firefox windows seemed to allow switching between tabs or on links in the tabs, but would not minimize or close…

I tried switching to SNA acceleration and got basically the same results. The system seemed fine at first, but after sitting with the screen saver (blank screen) on for an hour or two did the same thing as above…

ex-Gooserider

Dragging this back up… I decided to give up on messing with the integrated graphics, and put in an external card. I had a couple of cards of unknown history, which turned out not to work (but at least hadn’t cost me anything). So I bought a used card off of eBay - an EVGA / Nvidia GeForce GT 610 2GB Single Fan GDDR3. Specs say it is a 2012 vintage, which is about the age of the machine I’m using it in… It has dual DVI outputs so I can go double head eventually… Not fancy by any means, but I’m not a heavy graphics person, so it should work fine for my needs.

I had a problem when I first started the machine after installing. I ended up in TTY1 instead of KDE, and startx also gave errors… I made a guess that possibly since I was now using an Nvidia GPU card instead of the integrated Intel graphics, the problem might be that the Intel file I added during the earlier efforts was a problem. I went in and renamed the file to 10-intel.notconf and the system now boots into KDE as expected.

Only tried it for a few minutes but it looks OK so far, except that the KDE menu shows up about 1/3 of the way up the screen instead of just above the menu bar. Not sure if this is a major problem but it’s a bit odd…

What I’m now wondering is that there were a bunch of other conf files in the same directory, but nothing labeled for Nvidia… Do I need one? Also what about the other files for things I don’t have?

As a reminder this is what I had -

@bill-box:~$ ls -al /usr/share/X11/xorg.conf.d
total 28
drwxr-xr-x 2 root root 4096 Apr 20  2021 .
drwxr-xr-x 5 root root 4096 Jun  7 16:22 ..
-rw-r--r-- 1 root root   92 Feb  7  2019 10-amdgpu.conf
-rw-r--r-- 1 root root 1350 Apr 19  2021 10-quirks.conf
-rw-r--r-- 1 root root   92 Apr  6  2019 10-radeon.conf
-rw-r--r-- 1 root root 1429 Mar 30  2019 40-libinput.conf
-rw-r--r-- 1 root root 2747 Jun 26  2017 70-wacom.conf
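
On the question of a missing Nvidia file: as general Xorg behaviour (not verified on this machine), the server usually auto-detects a driver, and the files shipped in /usr/share/X11/xorg.conf.d are expected to apply only when matching hardware is present, so a card-specific file is often unnecessary. What can be checked directly is which kernel driver is bound to the card, using `lspci -k`. The sketch below assumes the usual `lspci -k` layout (a device line followed by indented detail lines); `vga_driver` is a made-up helper name.

```shell
#!/bin/sh
# vga_driver: read `lspci -k` output on stdin and print the kernel driver
# bound to each VGA controller (for example i915, nouveau or nvidia).
vga_driver() {
    awk '
        /VGA compatible controller/ { in_vga = 1; next }
        /^[^ \t]/                   { in_vga = 0 }
        in_vga && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use:[ \t]*/, "")
            print
        }
    '
}

# Example:
# lspci -k | vga_driver
```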

ex-Gooserider