Deploy Single Model as JupyterConnect to HPC Jülich

The goal is to create a file, hpc-solution-jupyter.zip, that contains all files and scripts needed to submit a model together with a JupyterLab node as an sbatch job via UNICORE to the Jülich HPC system, with GPU use and web UI connections.
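
As a rough illustration of the packaging step, the archive could be assembled with Python's standard zipfile module. This is only a sketch: the function name build_solution_zip and the idea of packing one staging directory are assumptions for illustration, not part of the existing tooling.

```python
import zipfile
from pathlib import Path


def build_solution_zip(source_dir, zip_path="hpc-solution-jupyter.zip"):
    """Pack every file below source_dir into zip_path (hypothetical helper)."""
    source = Path(source_dir)
    target = Path(zip_path)
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            # Skip directories and the archive itself if it lives inside source_dir.
            if not path.is_file() or path.resolve() == target.resolve():
                continue
            # Store paths relative to source_dir so extraction is location independent.
            archive.write(path, arcname=path.relative_to(source).as_posix())
    return target
```

The staging directory would hold the model's protobuf file and the submit script; the exact file names are still open.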

Preconditions:

  • the user has an account on the Jülich system and sufficient CPU/GPU hours available
  • the example model is still to be defined; it can be detr-object-detection-model or an LLM that uses the GPU

Flow

  • the user extracts hpc-solution-jupyter.zip into their home folder
  • the zip contains the protobuf file of the model
  • hpc-solution-jupyter.zip contains a script (Python or bash), e.g. "submit-slurm-jupyter-job.py", that the user must execute on the command line
  • the script can ask for user credentials
  • the script creates the necessary sbatch file(s)
  • the script uses UNICORE to submit the job and connect the web UIs (Jupyter/model)
  • the JupyterConnect setup includes a shared folder that is mounted into both Apptainer containers
  • finally, the script prints the Slurm job ID and the connection endpoints (gRPC + HTTP) of the nodes
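
The sbatch generation step in the flow above could look roughly like the following sketch. It only renders the batch file text; submission via UNICORE and endpoint discovery are left out. The function name render_sbatch, all parameter names, the image file names, and the /shared mount point are assumptions for illustration. The #SBATCH directives and the apptainer flags (--nv for GPU access, --bind for the shared folder) are standard, but the right account and partition values depend on the Jülich system in use.

```python
import shlex


def render_sbatch(account, partition, shared_dir, model_image, jupyter_image,
                  gpus=1, time_limit="01:00:00"):
    """Return the text of an sbatch file that starts both containers (sketch)."""
    # Both containers see the same host folder under /shared (assumed mount point).
    bind = shlex.quote(f"{shared_dir}:/shared")
    lines = [
        "#!/bin/bash",
        "#SBATCH --job-name=hpc-solution-jupyter",
        f"#SBATCH --account={account}",
        f"#SBATCH --partition={partition}",
        "#SBATCH --nodes=1",
        f"#SBATCH --gres=gpu:{gpus}",
        f"#SBATCH --time={time_limit}",
        "#SBATCH --output=hpc-solution-jupyter-%j.out",
        "",
        # SLURM_JOB_ID is set by Slurm inside the running job.
        'echo "Slurm job id: $SLURM_JOB_ID on $(hostname)"',
        f"mkdir -p {shlex.quote(shared_dir)}",
        # Start the model and the JupyterLab container in the background.
        f"apptainer run --nv --bind {bind} {shlex.quote(model_image)} &",
        f"apptainer run --nv --bind {bind} {shlex.quote(jupyter_image)} &",
        # Keep the job alive until both containers exit.
        "wait",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values; replace them with the real project and partition.
    print(render_sbatch("myproject", "gpus", "/tmp/shared",
                        "model.sif", "jupyter.sif"))
```

Keeping the rendering separate from the submission makes it possible to inspect the generated file before any compute hours are spent.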

Reference to the existing JupyterConnect script for Kubernetes: https://gitlab.eclipse.org/eclipse/graphene/kubernetes-client/-/blob/main/deploy/private/jupyter-deployment-script.py?ref_type=heads

Edited by Martin Welss