OOM killer - searching for the trigger

A few days ago my server froze out of nowhere. Now it happens two or three times every day. Each time a different process invokes the oom-killer. After 1 to 3 minutes everything runs smoothly again.
While the server is frozen, the load on every processor climbs to 98%.

Can someone help me find out what triggers the oom-killer?

My guess is a memory-leak somewhere. A big one.

The OOM-killer only activates in an out-of-memory situation. It’ll start killing processes until it has enough memory to continue working normally.
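
To see what the kernel actually did during each freeze, the kernel log (for example the output of `dmesg`) can be scanned for OOM entries. The following is a minimal sketch: the two regular expressions are assumptions based on the wording commonly seen in kernel logs and may need adjusting for a particular kernel version, and `parse_oom_events` is a hypothetical helper name. Note that the process that "invoked" the oom-killer is only the one whose allocation could not be satisfied, which would explain why it is a different one each time; the killed process is usually the more informative entry.

```python
import re

# Assumed log wording; check a real line from your own kernel log first.
# Process names containing spaces are only partially captured by INVOKED_RE.
INVOKED_RE = re.compile(r"(?P<name>\S+) invoked oom-killer")
KILLED_RE = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]*)\)")


def parse_oom_events(log_text):
    """Return (kind, process_name, pid_or_None) tuples in log order."""
    events = []
    for line in log_text.splitlines():
        match = KILLED_RE.search(line)
        if match:
            events.append(("killed", match.group("name"), int(match.group("pid"))))
            continue
        match = INVOKED_RE.search(line)
        if match:
            events.append(("invoked", match.group("name"), None))
    return events
```

Feeding it the saved kernel log from a day with several freezes shows whether the same process keeps getting killed, which is a stronger hint than the name of the invoking process.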

I found this article, which explains more.

I also had such a case in the past. Due to a programming error of my own, a program continuously requested memory but never freed any. You should trace your memory consumption somehow.

Swap is always full. It has 4 GB, and usage is always at 3.99 GB or 4 GB.
Should I add more swap space?

Not at the moment. One of your processes is requesting too much memory, probably because of a programming bug. To find out which process, study memory usage with top or htop.
Identify the rogue process and kill it.
If the rogue process is something vital, you are going to need to look for alternatives.
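
If watching top or htop by hand is not practical, the same information can be collected from `/proc`. This is a rough sketch, assuming a Linux system where each `/proc/<pid>/status` file carries `Name` and `VmRSS` fields; `top_memory_processes` and its `proc_root` parameter are hypothetical names chosen for illustration.

```python
import os


def top_memory_processes(proc_root="/proc", limit=10):
    """Return up to `limit` (rss_kib, pid, name) tuples, largest RSS first."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories describe processes
        fields = {}
        try:
            with open(os.path.join(proc_root, entry, "status")) as status:
                for line in status:
                    key, _, value = line.partition(":")
                    fields[key] = value.strip()
        except OSError:
            continue  # the process exited or is not readable
        rss = fields.get("VmRSS")  # kernel threads have no VmRSS line
        if rss is None:
            continue
        results.append((int(rss.split()[0]), int(entry), fields.get("Name", "?")))
    results.sort(reverse=True)
    return results[:limit]
```

Calling `top_memory_processes(limit=5)` every minute or so and logging the result would show which process grows in the run-up to a freeze.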

What does your server do? What is the main service it “serves”?

i.e. what is the main server software it’s running?

Is it Java based?

I agree - don’t go adding swap just yet…

I decided to reboot the server. Now swap shows 0K used.

Users connect to this server over Remote Desktop to browse the internet.

I know Java is a resource hog, but in general its garbage collector does a good job, AFAIK.

How many users are we talking?

Which remote desktop server software?

We are talking about a maximum of 20 users.

We are using xrdp.

And how much RAM does each instance use on the server? You see, RDP runs the applications of the users on the server they’re connecting with, NOT on the client.

It depends on how heavily they use Firefox. About 1 GB per user, I think.
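
Those figures allow a back-of-the-envelope check. The sketch below uses the numbers from this thread (20 users, roughly 1 GB each, 4 GB of swap); the physical RAM size has not been stated, so the 8 GB used here is purely a hypothetical placeholder, and both function names are made up for illustration.

```python
def estimated_demand_gb(users, per_user_gb, base_system_gb=0.0):
    """Peak memory demand if every user is active at the same time."""
    return users * per_user_gb + base_system_gb


def shortfall_gb(demand_gb, ram_gb, swap_gb):
    """How much the demand exceeds RAM plus swap (0 if it fits)."""
    return max(0.0, demand_gb - (ram_gb + swap_gb))


demand = estimated_demand_gb(users=20, per_user_gb=1.0)
# ram_gb=8.0 is an assumption; replace it with the server's real RAM size.
missing = shortfall_gb(demand, ram_gb=8.0, swap_gb=4.0)
print(f"demand: {demand} GB, shortfall: {missing} GB")
```

If the real RAM size gives a shortfall above zero at peak load, no memory leak is needed to explain the oom-killer.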

I’m not sure this helps you, but I use earlyoom. It kicks in sooner than the kernel's oom-killer. In my experience the built-in oom-killer never finishes in a sane time. Earlyoom simply kills the most memory-hungry process when memory is running low, before a real OOM can happen.
Maybe you could try it.
The user whose process causes the OOM just sees that process crash, without any further notification. Unsaved data from that process may be lost, but in general the rest of the system survives and stays responsive.

It’s not my issue, but IMHO prepending another watcher or guardian sounds like symptom fighting.
If a system runs out of resources regularly, spotting and removing the cause is, for me, the only way to go.

That’s right. In my case it would mean replacing me, as the main cause.
I have some virtual machines for testing things. Some of them have a considerable amount of memory assigned. That would be fine, but because I am sometimes impatient or a bit inattentive, I occasionally launch a new VM before the previous one has completely shut down. That creates the OOM problem instantly. :)
Earlyoom comes in handy then.

Is it time to upgrade the memory, if possible, to 8 GB or more?

No, this does not answer your question, but even for a Linux server I would expect more memory, given the number of users connected.

Up to now, the OP has never mentioned the available physical memory of his server. If you are referring to the swap size, you might be right. But then something would still be blindly eating memory, so that is no solution.

He did write 4 GB early in the exchange.

I am not a server expert, so it is more of an educated guess.

These oom-killer apps seem to be suffering from a lack of intelligence.
What may be needed is some sort of extension of the nice command to cover memory usage as well as CPU usage. For example, one might want to flag a process as

  • kill me if I hog memory
  • don't kill me under any circumstances

That would help the oom-killer to distinguish between laziness, lack of foresight, and a real need to keep a process going.
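
Linux does expose a per-process knob along these lines: `/proc/<pid>/oom_score_adj`, which accepts values from -1000 (never pick this process) to 1000 (prefer this process as the victim). The following is a minimal sketch of setting it; `set_oom_score_adj` and its `proc_root` parameter are hypothetical names for illustration, and lowering the value for a process normally requires root privileges.

```python
import os

OOM_SCORE_ADJ_MIN = -1000  # process is exempt from the OOM killer
OOM_SCORE_ADJ_MAX = 1000   # process is the preferred victim


def set_oom_score_adj(pid, value, proc_root="/proc"):
    """Write `value` to the oom_score_adj file of `pid` and return its path."""
    if not OOM_SCORE_ADJ_MIN <= value <= OOM_SCORE_ADJ_MAX:
        raise ValueError(f"oom_score_adj must be within -1000..1000, got {value}")
    path = os.path.join(proc_root, str(pid), "oom_score_adj")
    with open(path, "w") as adj_file:
        adj_file.write(str(value))
    return path
```

For example, `set_oom_score_adj(os.getpid(), 500)` marks the calling process as a likelier victim, which corresponds to the "kill me if I hog memory" flag; a value of -1000 corresponds to "don't kill me".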

Can you post the output of both “free” and “free -h” commands?
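
The numbers that `free` prints come from `/proc/meminfo`, so they can also be logged by a script between freezes. This is a small sketch, assuming the usual `Key:   value kB` layout of that file; `parse_meminfo` and `memory_summary` are hypothetical helper names, and `MemAvailable` is missing on very old kernels, in which case the summary reports `None` for it.

```python
def parse_meminfo(text):
    """Map each /proc/meminfo key to its numeric value (KiB for most keys)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Key: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def memory_summary(values):
    """Pick out the figures most relevant to an out-of-memory investigation."""
    return {
        "ram_total_kib": values["MemTotal"],
        "ram_available_kib": values.get("MemAvailable"),
        "swap_used_kib": values["SwapTotal"] - values["SwapFree"],
    }
```

On the server this would be called as `memory_summary(parse_meminfo(open("/proc/meminfo").read()))`.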
