killing the OOM killer
The default memory allocation scheme for Linux is to be optimistic and allow processes to overcommit, in the hope that some of them will never use the entire slot that they asked for. The official docs in Documentation/vm/overcommit-accounting assure us that "obvious overcommits of address space are refused", but what they don't tell us is what happens when the heuristics fail and more memory is needed than is available: the OOM killer kicks in. OOM (out of memory) cases are resolved by killing a process with high memory usage (not necessarily the one going haywire with a memory leak).
One of the usual suspects that gets the axe is, unfortunately, PostgreSQL, because of its memory allocation pattern. So an important piece of software is terminated instead of the Python/PHP/Perl script that caused the problem. The solution lies in changing the allocation policy to a less arbitrary one: no overcommitting at all. If a process tries to malloc() more memory than is available, it will get an error right away. This mode is set by changing the sysctl value vm.overcommit_memory to 2.
There's another value that influences memory allocation in this mode: vm.overcommit_ratio. It represents the percentage of physical RAM that, along with the entire swap, forms the total address space (allocatable memory). Yes, the naming is confusing, since there is no overcommitting. The default is 50 so, quite unintuitively, only half of the RAM counts toward the limit. Seeing that the swap is the only component that can change dynamically, it would have made sense to specify a percentage for that instead. Anyway, there are two ways to ensure that the allocatable memory will be equal to the RAM size:
1. Make the swap the same size as the RAM and set vm.overcommit_ratio=0 - that will work as long as the swap is in use, but as soon as a swapoff is done (manually or during shutdown), you're in for a nasty surprise: no more memory can be allocated. At all! New programs will fail to start.
2. Set vm.overcommit_ratio=100 and disable the swap. This is a stable setup and the preferred method on a machine with enough RAM. Web servers are slowed down by swapping anyway, and careful planning will help ensure that the memory usage limits won't be reached (by setting cache sizes, maximum number of processes, etc.).
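To make the arithmetic behind the two options concrete, here is a minimal Python sketch of the limit described above: swap plus the configured percentage of RAM. The function name commit_limit_kib and the 8 GiB machine are made up for illustration, and the formula is simplified (the kernel also subtracts huge pages from the RAM figure, and vm.overcommit_kbytes, if set, replaces the ratio).

```python
def commit_limit_kib(ram_kib: int, swap_kib: int, overcommit_ratio: int) -> int:
    """Approximate allocatable memory (KiB) when vm.overcommit_memory = 2.

    Simplified: ignores huge pages and vm.overcommit_kbytes.
    """
    return swap_kib + ram_kib * overcommit_ratio // 100


GIB = 1024 * 1024  # number of KiB in one GiB
ram = 8 * GIB      # hypothetical machine with 8 GiB of RAM

scenarios = {
    "default ratio 50, no swap": commit_limit_kib(ram, 0, 50),
    "option 1: ratio 0, swap == RAM": commit_limit_kib(ram, ram, 0),
    "option 1 after swapoff": commit_limit_kib(ram, 0, 0),
    "option 2: ratio 100, no swap": commit_limit_kib(ram, 0, 100),
}

for name, limit in scenarios.items():
    print(f"{name}: {limit / GIB:.1f} GiB")
```

Both options give a limit equal to the RAM size, but only the first one drops to zero once the swap goes away.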
Summing it all up, here's the relevant section from /etc/sysctl.conf:
vm.overcommit_ratio = 100
vm.overcommit_memory = 2
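Once the settings are loaded, the kernel reports the resulting limit and the amount already committed as the CommitLimit and Committed_AS fields of /proc/meminfo. The following is a rough Python sketch for reading them; the helper names and the sample values are invented for illustration, and on a real machine you would pass in the contents of /proc/meminfo instead of SAMPLE.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Map each /proc/meminfo field name to its integer value (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.split():
            continue  # skip blank or malformed lines
        fields[name.strip()] = int(rest.split()[0])
    return fields


def commit_headroom_kib(meminfo: dict[str, int]) -> int:
    """KiB that can still be committed before allocations start failing."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]


# Made-up sample; on a real machine use open("/proc/meminfo").read() instead.
SAMPLE = """\
MemTotal:        8388608 kB
SwapTotal:             0 kB
CommitLimit:     8388608 kB
Committed_AS:    3145728 kB
"""

info = parse_meminfo(SAMPLE)
print(f"headroom: {commit_headroom_kib(info)} kB")
```

With ratio 100 and no swap, CommitLimit should come out close to MemTotal; a shrinking headroom is the early warning that malloc() calls are about to start failing.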
I have also tested this setup on desktop machines with no unexpected behavior, even when adding swap to the mix (other than allowing the runaway process to fill all the swap space and bring the system to a crawl). But the real winner here is the web server. PostgreSQL can finally relax ;-)


