
Jenkins randomly stops and resumes

Summary

In random cases, the build suddenly pauses and resumes after a while, which causes unpredictable timeouts.

Steps to reproduce

Run a GlassFish job, e.g. this one: https://ci.eclipse.org/glassfish/job/glassfish_build-and-test-using-jenkinsfile/job/PR-25517/

Today most runs failed at random steps:

  • ant -version paused for more than 10 minutes. At this moment it is still stuck: https://ci.eclipse.org/glassfish/job/glassfish_build-and-test-using-jenkinsfile/job/PR-25517/48/execution/node/431/log/

  • java -version took 39 seconds in the same build:

    09:44:19  + java -version
    09:44:58  openjdk version "17.0.15" 2025-04-15
    09:44:58  OpenJDK Runtime Environment Temurin-17.0.15+6 (build 17.0.15+6)
    09:44:58  OpenJDK 64-Bit Server VM Temurin-17.0.15+6 (build 17.0.15+6, mixed mode, sharing)
    09:44:58  + ant -version
  • In another job, domain creation got stuck. I paused and resumed the build; Jenkins reacted, but the step still did not move for more than 2 minutes:

    09:57:20  + /home/jenkins/agent/workspace/_test-using-jenkinsfile_PR-25517/glassfish7/glassfish/bin/asadmin --user anonymous --passwordfile /home/jenkins/agent/workspace/_test-using-jenkinsfile_PR-25517/appserver/tests/appserv-tests/temppwd create-domain --adminport 45707 --domainproperties jms.port=45708:domain.jmxPort=45709:orb.listener.port=45710:http.ssl.port=45711:orb.ssl.port=45714:orb.mutualauth.port=45715 --instanceport 45712 domain1
    Pausing
    Resuming
  • Another step paused for almost 5 minutes when starting Maven:

    09:38:48  + mvn clean package -f /home/jenkins/agent/workspace/_test-using-jenkinsfile_PR-25517/appserver/tests/appserv-tests/lib/pom.xml -Pstaging
    09:43:43  [INFO] Scanning for projects...
  • And again, another pause of just over 3 minutes:

    09:57:20  + /home/jenkins/agent/workspace/_test-using-jenkinsfile_PR-25517/glassfish7/glassfish/bin/asadmin --user anonymous --passwordfile /home/jenkins/agent/workspace/_test-using-jenkinsfile_PR-25517/appserver/tests/appserv-tests/temppwd create-domain --adminport 45707 --domainproperties jms.port=45708:domain.jmxPort=45709:orb.listener.port=45710:http.ssl.port=45711:orb.ssl.port=45714:orb.mutualauth.port=45715 --instanceport 45712 domain1
    10:00:28  Using port 45707 for Admin.
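
The pauses listed above can be found mechanically from the timestamped console log. The following is only a minimal sketch, assuming each log line starts with an `HH:MM:SS` prefix as in the excerpts; the function name `find_pauses` and the 30-second threshold are hypothetical choices, not part of the GlassFish build.

```python
import re
from datetime import datetime

# Matches the "HH:MM:SS  message" prefix seen in the log excerpts above.
STAMP = re.compile(r"^\s*(\d{2}:\d{2}:\d{2})\s+(.*)$")


def find_pauses(lines, threshold_seconds=30):
    """Return (gap_seconds, previous_message, next_message) for each gap
    between consecutive timestamped lines that reaches the threshold."""
    pauses = []
    previous = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue  # lines without a timestamp, e.g. "Pausing" / "Resuming"
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        message = match.group(2)
        if previous is not None:
            gap = (stamp - previous[0]).total_seconds()
            if gap < 0:
                gap += 24 * 3600  # assumption: the log crossed midnight once
            if gap >= threshold_seconds:
                pauses.append((int(gap), previous[1], message))
        previous = (stamp, message)
    return pauses


# Excerpt copied from the java -version step above.
sample = [
    '09:44:19  + java -version',
    '09:44:58  openjdk version "17.0.15" 2025-04-15',
    '09:44:58  + ant -version',
]

for gap, before, after in find_pauses(sample):
    print(f"{gap}s pause after: {before}")
# prints: 39s pause after: + java -version
```

Running this over the full console log of a failed build would give a list of the steps where the agent stalled, which may help correlate the pauses with events on the infrastructure side.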

What is the expected correct behavior?

The build should pass or fail in around 30 minutes and run without pauses and, especially, without timeouts.

Priority

  • Urgent
  • High
  • Medium
  • Low

Severity

  • Blocker
  • Major
  • Normal
  • Low

Impact

We are falling behind schedule with releases and TCK updates, and there is a lot of work to do.

  • Was there some change in GlassFish project resources or sponsoring?
  • Could it be caused by some file system issues?

I have no idea what to do. The first thing that got stuck was keytool, so I at least tried using urandom for generating the self-signed certificate; however, the other pauses cannot be related to urandom.
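
For reference, one common way to point keytool at urandom is the `java.security.egd` system property, passed to the JVM through keytool's `-J` option. The sketch below only builds such a command line; it is an assumption about how the workaround might look, not the change actually made in the build. The helper name, alias, distinguished name, keystore path, and password are all hypothetical.

```python
import shlex


def keytool_selfsigned_command(keystore, storepass,
                               alias="example", dname="CN=localhost"):
    """Build (but do not run) a keytool command that generates a key pair
    with a self-signed certificate and asks the JVM to use /dev/urandom."""
    return [
        "keytool",
        # -J passes the option to the JVM that runs keytool.
        "-J-Djava.security.egd=file:/dev/./urandom",
        "-genkeypair",
        "-alias", alias,          # hypothetical alias
        "-keyalg", "RSA",
        "-keysize", "2048",
        "-validity", "365",
        "-dname", dname,          # hypothetical distinguished name
        "-keystore", keystore,
        "-storepass", storepass,
    ]


# Hypothetical keystore path and password, for illustration only.
print(shlex.join(keytool_selfsigned_command("keystore.p12", "changeit")))
```

As noted above, this can only affect steps that wait for random data; it would not explain the pauses in `java -version`, `ant -version`, or Maven startup.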

Edited by David Matějček