On the Platform JIPP there is already a pod template available with 6GB of RAM; its label is centos-7-6gb.
Let me ask it differently: if we created a "pod template" based on centos-8 and configured it to use 6GB, would that be accepted at all?
If yes, can you please outline the steps needed to do that?
If no, because we hit some restrictions on the infra/project side, could you also outline the steps needed to increase the limits/quotas?
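For reference, a minimal sketch of what such a pod template could look like on the Kubernetes side. All names here (the container name, the image) are hypothetical placeholders; the real template would have to follow whatever conventions the infra project uses:

```yaml
# Hypothetical pod template for a centos-8 based Jenkins agent with 6GB of RAM.
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: jnlp
    image: centos-8-agent:latest   # placeholder image name
    resources:
      requests:
        memory: "6Gi"
        cpu: "2"
      limits:
        memory: "6Gi"
        cpu: "2"
```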
So per resource pack we can actually use 8GB right now. @fgurr can we make the default pod templates use 8GB so they utilize a full resource pack (2 vCPU + 8GB RAM)?
The problem is that this test makes several assumptions and has bad properties (including exhausting ALL available memory of the JVM). That it did not cause a problem before is just luck, as you mention: any change can be the one that uses one byte too much, which at best makes something fail (e.g. Maven) or, worse, gets a process killed with a SIGKILL ...
There is a related warning shortly before the first OutOfMemoryError:
12:48:15.616 [INFO] Building jar: /home/jenkins/agent/workspace/eclipse.platform_PR-764/debug/org.eclipse.ui.externaltools/target/org.eclipse.ui.externaltools-3.6.200-SNAPSHOT-javadoc.jar
[299.932s][warning][gc,alloc] pool-237-thread-2: Retried waiting for GCLocker too often allocating 625002 words
I don't know what it means, but it may be the root cause.
Please note that it seems important to use these quite exhaustive options to see the problem; without running all the API checks, javadoc and the like, memory usage seems lower. I see in the system monitor that even if I limit Maven to 2GB, the JVM takes about 3.7GB while running the build, before the build ends because of the OOM.
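As a side note, ~3.7GB of process memory with a 2GB heap cap is plausible, because -Xmx only limits the Java heap; metaspace, the code cache, GC bookkeeping, thread stacks and native buffers come on top. A minimal sketch (the exact flag values are illustrative, not what the build currently uses) of capping the Maven JVM and surfacing the gc,alloc warnings quoted above:

```shell
# Cap the Maven JVM heap at 2GB and enable warning-level gc+alloc logging,
# the same tag set ("[gc,alloc]") as the GCLocker warnings in the build log.
# Total process RSS will still exceed -Xmx for the reasons given above.
export MAVEN_OPTS="-Xmx2g -Xlog:gc+alloc=warning"
echo "$MAVEN_OPTS"
```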
The JVM then prints out
[193,009s][warning][gc,alloc] pool-193-thread-7: Retried waiting for GCLocker too often allocating 78127 words
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /tmp/dump/java_pid1213.hprof ...
Heap dump file created [2328419420 bytes in 2,884 secs]
[195,894s][warning][gc,alloc] pool-193-thread-9: Retried waiting for GCLocker too often allocating 78127 words
[195,903s][warning][gc,alloc] pool-193-thread-12: Retried waiting for GCLocker too often allocating 78127 words
[195,903s][warning][gc,alloc] pool-193-thread-11: Retried waiting for GCLocker too often allocating 78127 words
In the dump I see two threads (which seems obvious, as I run with the -T2 option), each retaining 850MB of space that seems to originate in the JDT compiler's CharDeduplication (?) part.
The only thing I could think of is calling org.eclipse.jdt.internal.compiler.util.JRTUtil.reset(), as things seem to be retained when running with -Papi-check; without that, everything runs fine. So if you change a lot (like in this PR), API checks are called often, which makes things pile up in that map. I'll debug why we got so many entries there anyway, as I would expect only a few, but I'm seeing 14 items...
Everything results in a new cache entry and a new JrtFileSystem, and since we use classpath isolation we potentially have two of them (or more ...) if we use more threads.
I'm not sure if JDT can do better here (e.g. by sharing some cached state) or if PDE is using that method wrongly...
This would require some JVM parameters, -XX:+HeapDumpOnOutOfMemoryError and -XX:HeapDumpPath, and storing the output as a build artifact.
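A minimal sketch of how that could look for the build (the dump directory matches the /tmp/dump path seen in the log above, but where the file ultimately gets archived is exactly the open question):

```shell
# Ask the JVM to write a heap dump on OutOfMemoryError. The resulting
# java_pid<N>.hprof under /tmp/dump could then be archived as a build artifact.
mkdir -p /tmp/dump
export MAVEN_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/dump"
echo "$MAVEN_OPTS"
```

Note that heap dumps can be multiple GB (the one above was ~2.3GB), which is why archiving them on the Jenkins master may be a concern.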
I know it can be done that way, but it might not be very nice for the Jenkins master, so maybe @fgurr has a better way (e.g. storing it on a dedicated network device).