I don’t know which algorithm they use to decide what to kill. I think the tool is meant as a last resort to keep the whole system from going down.
Setting user quotas could probably have a similar effect, and maybe in a better way. A quota would deny any further memory allocation to a particular user, which would end up killing that user’s process, with potential data loss. I assume that a program consuming memory endlessly wouldn’t react gracefully to being denied memory.
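To show what that denial looks like from the program’s side, here is a minimal Python sketch. It assumes Linux and uses a per-process address-space limit (RLIMIT_AS) as a stand-in for a real per-user quota, which is a different mechanism. The 1 GiB cap and the names LIMIT_BYTES and run_capped_child are made up for the example.

```python
import subprocess
import sys

# Hypothetical cap for the demo: 1 GiB of address space.
LIMIT_BYTES = 1024 ** 3

# Code run in a child interpreter so the limit never touches the parent.
CHILD_CODE = f"""
import resource
cap = {LIMIT_BYTES}
# Lower both the soft and the hard address-space limit to the cap.
resource.setrlimit(resource.RLIMIT_AS, (cap, cap))
# Ask for twice the cap in one allocation; nothing handles the failure.
hog = bytearray(2 * cap)
"""


def run_capped_child():
    """Start the capped child and return its CompletedProcess."""
    return subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    result = run_capped_child()
    print("exit code:", result.returncode)
    print(result.stderr.strip().splitlines()[-1])
```

The child does not catch the failed allocation, so it simply dies with an unhandled MemoryError, which is the ungraceful reaction I would expect from most programs.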
Thank you all for your answers and discussion.
I wrote before that I rebooted the server. It is now the second day and I haven’t had any problems. I didn’t expect that.
As I wrote before, I found in the log files that each time the oom-killer killed a different process, belonging to different users. But I didn’t find the trigger, and I am not going to search any further.
I also bought 64 GB of RAM and a second processor.
This seems to be achievable in a way, according to the docs:
earlyoom v1.8
Usage: ./earlyoom [OPTION]...

  -m PERCENT[,KILL_PERCENT] set available memory minimum to PERCENT of total
                           (default 10 %).
                           earlyoom sends SIGTERM once below PERCENT, then
                           SIGKILL once below KILL_PERCENT (default PERCENT/2).
  -s PERCENT[,KILL_PERCENT] set free swap minimum to PERCENT of total (default
                           10 %).
                           Note: both memory and swap must be below minimum for
                           earlyoom to act.
  -M SIZE[,KILL_SIZE]      set available memory minimum to SIZE KiB
  -S SIZE[,KILL_SIZE]      set free swap minimum to SIZE KiB
  -n                       enable d-bus notifications
  -N /PATH/TO/SCRIPT       call script after oom kill
  -P /PATH/TO/SCRIPT       call script before oom kill
  -g                       kill all processes within a process group
  -d, --debug              enable debugging messages
  -v                       print version information and exit
  -r INTERVAL              memory report interval in seconds (default 1), set
                           to 0 to disable completely
  -p                       set niceness of earlyoom to -20 and oom_score_adj to
                           -100
  --ignore-root-user       do not kill processes owned by root
  --sort-by-rss            find process with the largest rss (default oom_score)
  --prefer REGEX           prefer to kill processes matching REGEX
  --avoid REGEX            avoid killing processes matching REGEX
  --ignore REGEX           ignore processes matching REGEX
  --dryrun                 dry run (do not kill any processes)
  --syslog                 use syslog instead of std streams
  -h, --help               this help text
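As I read the -m and -s descriptions, the decision rule could be sketched in Python roughly like this. This is my reading of the help text, not earlyoom’s actual source: the function name choose_signal and its parameters are made up, and treating "below" as a strict comparison is an assumption.

```python
import signal
from typing import Optional


def choose_signal(
    mem_avail_pct: float,
    swap_free_pct: float,
    mem_term: float = 10.0,           # -m PERCENT (default 10 %)
    mem_kill: Optional[float] = None,  # -m ...,KILL_PERCENT
    swap_term: float = 10.0,          # -s PERCENT (default 10 %)
    swap_kill: Optional[float] = None,  # -s ...,KILL_PERCENT
) -> Optional[signal.Signals]:
    """Return the signal to send, or None if no action is needed."""
    # The help text says KILL_PERCENT defaults to PERCENT/2.
    if mem_kill is None:
        mem_kill = mem_term / 2
    if swap_kill is None:
        swap_kill = swap_term / 2

    # Both memory AND swap must be below their minimum to act.
    if mem_avail_pct < mem_kill and swap_free_pct < swap_kill:
        return signal.SIGKILL
    if mem_avail_pct < mem_term and swap_free_pct < swap_term:
        return signal.SIGTERM
    return None


if __name__ == "__main__":
    print(choose_signal(8.0, 3.0))   # memory low, but above the kill level
    print(choose_signal(4.0, 2.0))   # both below the kill levels
    print(choose_signal(4.0, 50.0))  # plenty of free swap
```

The third call is the interesting one: memory is almost gone, but because plenty of swap is still free, nothing is sent.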
So pass --ignore with a REGEX that matches the process, and it is flagged as “don’t kill me under any circumstances”.
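To make the idea concrete, here is a small Python sketch of victim selection with an ignore pattern. It is only an illustration, not earlyoom’s code: the Proc record, the pick_victim function and the sample processes are invented, and I am assuming the regex is applied as an unanchored search on the process name.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Proc:
    """Hypothetical process record for the example."""
    pid: int
    name: str
    oom_score: int


def pick_victim(procs: Iterable[Proc], ignore: Optional[str] = None) -> Optional[Proc]:
    """Return the candidate with the highest oom_score, skipping ignored names."""
    pattern = re.compile(ignore) if ignore else None
    candidates = [
        p for p in procs
        # Assumption: the pattern is searched in the process name.
        if pattern is None or not pattern.search(p.name)
    ]
    if not candidates:
        return None  # everything is protected, so nothing to kill
    return max(candidates, key=lambda p: p.oom_score)


if __name__ == "__main__":
    procs = [
        Proc(101, "postgres", 900),
        Proc(202, "firefox", 700),
        Proc(303, "bash", 10),
    ]
    print(pick_victim(procs))
    print(pick_victim(procs, ignore=r"^postgres$"))
```

Without the pattern the biggest process is picked; with it, that process is never even considered and the next one in line takes the hit.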
That will help enormously. I have 64 GB of RAM… I can run several VMs simultaneously, and it does not even blink. It is a 15-year-old machine… it copes because it has RAM to spare.