Jenkins builds are failing with error code 137 (SIGKILL received)
Summary
Every build fails with a SIGKILL when executed on the Jenkins agent.
I've been investigating this problem for the past week. From what I could gather, this exit code (128 + 9, meaning the process was terminated by signal 9, SIGKILL) is usually produced when a process requests more memory than the operating system can provide.
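To make the meaning of the exit code concrete, here is a minimal sketch of how a shell-style exit status can be decoded. It relies on the common shell convention that a status above 128 means "killed by signal (status - 128)". The function name `describe_exit_code` is made up for illustration and is not part of Jenkins or the build.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode a shell-style exit status.

    By shell convention, a status above 128 means the process was
    terminated by signal number (status - 128).
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name on the current platform.
            name = signal.Signals(signum).name
        except ValueError:
            # The platform does not define a signal with this number.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_exit_code(137))
```

On Linux, 137 decodes to SIGKILL. The status alone does not say who sent the signal, so the out-of-memory explanation remains an inference rather than something the exit code proves.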
I've been told that the agents themselves can only provide up to 4GB of memory, which is why I've tried setting an upper limit of 1.5GB for our build job: 500MB for the build job itself and 1GB for the test runner, which is executed in a separate process. I've also tried lowering the limit to 500MB for both processes, which quickly led to an OutOfMemory error, indicating that those values are indeed respected. This tells me that some other process (or multiple processes) must be consuming more than 2.5GB of the agent's physical memory, even when idle.
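The arithmetic behind that conclusion, as a rough sketch. The 4GB capacity is the figure I was told and has not been measured, and the constant and function names are only illustrative.

```python
# Figures taken from the report above. The agent capacity is hearsay, not measured.
AGENT_TOTAL_MB = 4 * 1024      # "up to 4GB"
BUILD_LIMIT_MB = 500           # limit for the build job itself
TEST_RUNNER_LIMIT_MB = 1024    # limit for the separate test-runner process


def remaining_for_other_processes(total_mb, *limits_mb):
    """Memory left on the agent if every capped process uses its full limit."""
    return total_mb - sum(limits_mb)


headroom = remaining_for_other_processes(
    AGENT_TOTAL_MB, BUILD_LIMIT_MB, TEST_RUNNER_LIMIT_MB
)
print(f"{headroom} MB ({headroom / 1024:.2f} GB) left for everything else")
```

One caveat: if the limits are JVM heap limits (an assumption, since the report does not say how they were set), they only cap the heap. Metaspace, thread stacks and native allocations come on top, so the real footprint of the two processes may be higher than 1.5GB and the actual headroom correspondingly smaller.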
I've also tested the job both on the GitHub runners and locally. In both cases, the builds finished without problems, which makes me believe that this is not a problem with the build itself, but something specific to the agent.
I've run the build locally and attached a profiler to keep track of the amount of allocated memory. I've also compared it to an earlier version, which I was still able to build on the Jenkins agent. The latest version requires ~400MB more memory, which brings the minimum amount of memory the agent needs to provide to roughly 700-800MB.
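To compare the environments directly, something like the following could be run as an extra build step on each machine. This is only a sketch and assumes a Linux host that exposes `/proc/meminfo`. The helper names `parse_meminfo` and `memory_summary_mb` are hypothetical.

```python
from pathlib import Path


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> numeric value."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Keep only lines of the form "Name:   12345 kB" (or a bare number).
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_summary_mb(text):
    """Return total and available memory in MB, for the fields that are present."""
    info = parse_meminfo(text)
    return {
        key: info[key] // 1024
        for key in ("MemTotal", "MemAvailable")
        if key in info
    }


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        print(memory_summary_mb(meminfo.read_text()))
    else:
        print("/proc/meminfo is not available on this platform")
```

If the agent runs inside a container (which I have not confirmed), `/proc/meminfo` usually reports the host's memory rather than the container's limit, so the numbers would need to be checked against whatever limit the infrastructure team has configured.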
Steps to reproduce
Execute the following build job: https://ci.eclipse.org/windowbuilder/job/build/job/master/
What is the current bug behavior?
The test run crashes after ~10 minutes at an arbitrary point due to a SIGKILL sent by the operating system.
What is the expected correct behavior?
The test finishes after 30-40 minutes, hopefully without any failures.
Relevant logs and/or screenshots
This is the console log produced by the Jenkins build: https://ci.eclipse.org/windowbuilder/job/build/job/master/250/console
And the PR on GitHub, where I further investigated the problem: https://github.com/eclipse-windowbuilder/windowbuilder/pull/750
Priority
- Urgent
- High
- Medium
- Low
Severity
- Blocker
- Major
- Normal
- Low
Impact
We're unable to produce a nightly build, as well as a milestone build which is due in two weeks.