A few days ago my server froze out of nowhere. Now it happens two or three times every day, and each time a different process invokes the oom-killer. After one to three minutes everything runs smoothly again.
At the moment the server freezes, every processor's activity climbs to 98%.
Can someone help me find out what triggers the oom-killer?
I also had such a case in the past. Due to a programming fault of my own, a program continuously requested memory but never freed any. You should trace your memory consumption somehow.
Not at the moment. One of your processes is requesting too much memory… probably a programming bug. To find out which process, study memory usage with top or htop.
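For example, a quick one-shot way to spot the biggest memory consumers from a shell (this assumes a GNU/Linux `ps` that supports `--sort`):

```shell
# List the five biggest memory consumers, sorted by resident memory share (%MEM).
ps aux --sort=-%mem | head -n 6

# Or watch it live: run top and press Shift+M to sort by memory.
```

Run it a few times while memory is climbing; the rogue process will rise to the top of the list.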
Identify the rogue process and kill it.
If the rogue process is something vital, you are going to need to look for alternatives.
And how much RAM does each instance use on the server? You see, RDP runs the users' applications on the server they connect to, NOT on the client.
I’m not sure this helps you, but I use earlyoom. It kicks in sooner than the kernel’s OOM killer, which in my experience never finishes in a sane time. earlyoom kills the biggest memory-eating process before a real OOM situation can happen, once memory is running low.
Maybe you could try it.
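As a sketch of how the thresholds can be tuned, assuming a Debian/Ubuntu-style packaging where the daemon reads its flags from `/etc/default/earlyoom` (the 5% and 10% values here are examples, not recommendations):

```shell
# /etc/default/earlyoom -- assumed Debian-style config fragment.
# -m 5  : start killing when less than 5% of RAM is available
# -s 10 : ... and less than 10% of swap is free
# --prefer : regex of process names to kill first (hypothetical example)
EARLYOOM_ARGS="-m 5 -s 10 --prefer '(^|/)java$'"
```

After editing, restart the service (e.g. `systemctl restart earlyoom`) for the flags to take effect.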
The user who (or whose process) causes the OOM just experiences a crash of that process, without any further notification. This may lose any unsaved work from that process, but in general the rest of the system survives and stays responsive.
It’s not my issue, but IMHO prepending another watcher or guardian sounds like fighting symptoms.
If a system runs out of resources regularly, spotting and removing the cause is, for me, the only way to go.
That’s right. In my case it means to replace me, as the main cause.
I have some virtual machines to test things. Some of them have a sizeable amount of memory assigned. That would be fine, but as I’m sometimes impatient or a bit inattentive, I rarely, but occasionally, launch a new VM before the previous one has completely shut down. That creates the OOM problem instantly.
earlyoom comes in handy at times like that.
Up to now, the OP has never mentioned the available physical memory of his server. If you are referring to the swap size, you might be right. But then something is still blindly eating memory, so that’s no solution.
These oom-killer apps seem to be suffering from a lack of intelligence.
What may be needed is some sort of extension of the nice command to cover memory usage as well as CPU usage. … For example, one might want to flag a process as
kill me if I hog memory
don’t kill me under any circumstances
That would help the OOM killer distinguish between laziness, lack of foresight, and a real need to keep a process going.
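Linux actually ships a knob close to this: each process exposes `/proc/<pid>/oom_score_adj`, ranging from -1000 ("never kill me") to +1000 ("kill me first"). A minimal sketch, assuming a Linux system; lowering the value for another process needs root:

```shell
# Inspect the current OOM adjustment of this shell:
cat /proc/self/oom_score_adj

# Volunteer the current shell (and its children) as a preferred
# OOM victim -- the "kill me if I hog memory" flag. Raising the
# value needs no special privileges:
echo 1000 > /proc/self/oom_score_adj

# Protecting a process completely ("don't kill me") needs root, e.g.:
#   echo -1000 | sudo tee /proc/<pid>/oom_score_adj
```

So the distinction asked for above already exists per-process; what’s missing is mainly the habit of setting it.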