hpc-deployer
This repository contains the logic that lets users deploy their Graphene pipelines onto an HPC system.
Description of the source code
This repository has two branches. The core of the HPC deployer lives in the branch hpc-deployer-pure. It can be used to deploy Graphene pipelines onto an HPC system with minimal effort.
At the current stage of Graphene development, the existing Kubernetes deployer code has been extended so that users can download the core of the HPC deployer and deploy their pipelines to an HPC system.
When a newer version of the HPC deployer becomes available, please update this repository.
Installation
This project uses the current source code of the Kubernetes deployer to let users get the HPC deployer onto their local PC.
It uses Java 11 to build the image containing the logic that lets users download the hpc-deployer. This image runs as a service that can be called from the UI and provides users with:
- HPC Deployer
- Orchestrator_client.py
- Protobuf Files
- dockerinfo.json
- blueprint.json
Adding custom repositories
The Kubernetes deployer depends on some Acumos Maven repositories at compile time. These repositories must first be configured on your local PC.
The settings file for the repository is under:
Linux: ~/.m2/settings.xml
<?xml version="1.0" encoding="UTF-8"?>
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">
  <profiles>
    <profile>
      <id>acumos-dev</id>
      <repositories>
        <repository>
          <id>ai4eu-public-repository</id>
          <name>AI4EU Public Maven Repository Group</name>
          <url>https://cicd.ai4eu-dev.eu:7443/repository/maven-releases/</url>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <servers>
    <server>
      <id>ai4eu-repo</id>
      <username>jenkins</username>
      <password>789birlinghoven</password>
    </server>
  </servers>
  <activeProfiles>
    <activeProfile>acumos-dev</activeProfile>
  </activeProfiles>
</settings>
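Before compiling, it can help to confirm that the settings file is in place and that the profile is both declared and activated. The following is a minimal sketch, not part of the repository: the function name has_active_profile is hypothetical, and it only does a plain text search of the file rather than parsing the XML.

```shell
#!/bin/sh
# Check that a Maven settings file declares and activates a given profile.
# Hypothetical helper; it greps the file and does not parse the XML.
# Usage: has_active_profile <settings-file> <profile-id>
has_active_profile() {
    settings_file=$1
    profile_id=$2
    # Fail early if the settings file does not exist.
    [ -f "$settings_file" ] || return 1
    # The profile must appear as an <id> and as an <activeProfile>.
    grep -q "<id>${profile_id}</id>" "$settings_file" &&
        grep -q "<activeProfile>${profile_id}</activeProfile>" "$settings_file"
}

# Example (not executed here):
#   has_active_profile "$HOME/.m2/settings.xml" acumos-dev && echo "profile is active"
#   mvn help:active-profiles   # asks Maven itself which profiles are active
```

The mvn help:active-profiles goal of the Maven Help Plugin gives the authoritative answer; the grep-based check is only a quick sanity test that works without invoking Maven.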
Adding JDK 11
JDK 11 is needed to compile the Java code in this repository. The following script lets you switch between JDK versions as needed. As written, it selects JDK 11.
Add this script to your ~/.bashrc:
export CLEAN_PATH=$PATH
jdk8 () {
export JAVA_HOME=/lib/jvm/java-8-openjdk-amd64/;export PATH=$JAVA_HOME/bin:$CLEAN_PATH
}
jdk11 () {
export JAVA_HOME=/lib/jvm/java-11-openjdk-amd64/;export PATH=$JAVA_HOME/bin:$CLEAN_PATH
}
jdk17 () {
export JAVA_HOME=/lib/jvm/java-17-openjdk-amd64/;export PATH=$JAVA_HOME/bin:$CLEAN_PATH
}
jdk11
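After sourcing ~/.bashrc, you may want to confirm that JDK 11 is really the active version. The sketch below is one possible check; the function names java_major and current_java_version are hypothetical and not part of this repository.

```shell
#!/bin/sh
# Extract the major Java version from a version string
# such as "11.0.20" or the legacy form "1.8.0_292".
java_major() {
    case $1 in
        1.*) echo "$1" | cut -d. -f2 ;;   # legacy scheme: 1.8.x means Java 8
        *)   echo "$1" | cut -d. -f1 ;;   # modern scheme: 11.x.y means Java 11
    esac
}

# Read the version of the java binary currently on PATH.
# `java -version` prints to stderr, so stderr is redirected to stdout.
current_java_version() {
    java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

# Example (not executed here):
#   [ "$(java_major "$(current_java_version)")" = "11" ] ||
#       echo "JDK 11 is not active; run jdk11" >&2
```

Running mvn -version as well shows which Java installation Maven itself will use, which is what matters for the build.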
Compile the project
- Navigate to the hpc-deployer directory.
- Optionally change the name and version of the docker container in the pom.xml
- Compile the source code with mvn clean install.
After a successful build, a Docker image is generated on your local PC.
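The build steps above can be sketched as a small script. This is an illustration under assumptions: the image name graphene/hpc-deployer is taken from the tag commands in this README and may differ if you changed it in the pom.xml, and the run helper with its DRY_RUN variable is a hypothetical convenience for previewing commands.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build the project, then list the resulting image.
# IMAGE_NAME is an assumption; adjust it if you renamed the image in pom.xml.
build_deployer() {
    image_name=${IMAGE_NAME:-graphene/hpc-deployer}
    run mvn clean install &&
        run docker images "$image_name"
}

# Example (not executed here), run from the project directory:
#   DRY_RUN=1 build_deployer   # preview the commands
#   build_deployer             # build and list the image
```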
Pushing to the registry
- Retag the built image:
docker tag graphene/hpc-deployer:<new_version> cicd.ai4eu-dev.eu:7444/graphene/hpc-deployer:<new_version>
- Push the image to the registry:
docker push cicd.ai4eu-dev.eu:7444/graphene/hpc-deployer:<new_version>
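Since the same version string appears in every command, the retag and push steps can be wrapped in one function. The sketch below assumes the registry host and image name shown above; push_image, run, and DRY_RUN are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Registry and image name as used in this README; override via environment.
REGISTRY=${REGISTRY:-cicd.ai4eu-dev.eu:7444}
IMAGE=${IMAGE:-graphene/hpc-deployer}

# Retag the locally built image for the registry and push it.
# Usage: push_image <version>
push_image() {
    version=$1
    if [ -z "$version" ]; then
        echo "usage: push_image <version>" >&2
        return 2
    fi
    run docker tag "$IMAGE:$version" "$REGISTRY/$IMAGE:$version" &&
        run docker push "$REGISTRY/$IMAGE:$version"
}

# Example (not executed here):
#   DRY_RUN=1 push_image 1.0.1
```

Pushing may require a prior docker login against the registry, depending on how it is configured.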
Restarting Kubernetes
- A deployment is required to allow the pod to start.
- Open the deployment settings with:
kubectl -n graphene edit deployment <deployment_name>
- In the deployment settings, change the image URL to cicd.ai4eu-dev.eu:7444/graphene/hpc-deployer:<new_version>
- Get the list of current pods:
kubectl -n graphene get pod
- Delete the pod so that the deployment creates a new pod from the new image URL:
kubectl delete pod <pod_name> -n graphene
- Check the logs to verify the changes:
kubectl logs <new_pod_name> -n graphene
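For scripted updates, a non-interactive variant of the steps above is sketched here. Note that it swaps the interactive kubectl edit and the manual pod deletion for kubectl set image, which changes the pod template and thereby triggers a rollout on its own. The deployment and container names are placeholders you must supply, and update_deployment, run, and DRY_RUN are hypothetical names.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Point a deployment at a new image version, wait for the rollout, show logs.
# Usage: update_deployment <deployment_name> <container_name> <new_version>
# The container name inside the pod spec is an assumption; look it up with
#   kubectl -n graphene get deployment <deployment_name> -o yaml
update_deployment() {
    deployment=$1
    container=$2
    version=$3
    image="cicd.ai4eu-dev.eu:7444/graphene/hpc-deployer:$version"
    run kubectl -n graphene set image "deployment/$deployment" "$container=$image" &&
        run kubectl -n graphene rollout status "deployment/$deployment" &&
        run kubectl -n graphene logs "deployment/$deployment"
}

# Example (not executed here):
#   DRY_RUN=1 update_deployment hpc-deployer hpc-deployer 1.0.1
```

If the image tag itself does not change between builds, the pod template stays identical and no rollout starts; in that case deleting the pod as described above is still the way to force a restart.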
Using the deployer
Download the file, unzip its content, install its requirements, and start the deployer with python main.py
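The steps above can be sketched as a single function. Several details here are assumptions rather than facts from this repository: the archive name is whatever the UI gave you, the requirements are assumed to be listed in a requirements.txt at the top level of the archive, and start_deployer, run, and DRY_RUN are hypothetical names.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1 (hypothetical helper).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Unpack the downloaded archive, install its requirements, start the deployer.
# Usage: start_deployer <archive.zip> [target-dir]
# The body runs in a subshell so the caller's working directory is unchanged.
start_deployer() (
    archive=${1:?usage: start_deployer <archive.zip> [target-dir]}
    target=${2:-hpc-deployer}
    run unzip "$archive" -d "$target" &&
        run cd "$target" &&
        # Assumption: dependencies are listed in requirements.txt.
        run python -m pip install -r requirements.txt &&
        run python main.py
)

# Example (not executed here):
#   DRY_RUN=1 start_deployer solution.zip
```

Installing into a virtual environment first keeps the deployer's dependencies separate from the system Python packages.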
References
- https://www.unicore.eu/docstore/ucc-6.4.0/ucc-manual.html
- https://unicore-docs.readthedocs.io/en/latest/user-docs/ucc/manual.html#ucc-datamanagement
- https://unicore-docs.readthedocs.io/en/latest/user-docs/rest-api/job-description/index.html#job-description
- https://apps.fz-juelich.de/jsc/hps/juwels/batchsystem.html#job-steps
SLURM Help
Task | Command |
---|---|
Documentation | man sbatch |
Run job | sbatch myscript.sh |
See the project | jutil user projects |
See jobs and their states | sacct --starttime 2023-09-27; sacct -b |
Show job detail | scontrol show job <jobid>; sstat -j <jobid> |
List all current jobs for a user | squeue -u <username> |
List all running jobs for a user | squeue -u <username> -t RUNNING |
List all pending jobs for a user | squeue -u <username> -t PENDING |
To cancel one job | scancel <jobid> |
To cancel all the jobs for a user | scancel -u <username> |
To cancel all the pending jobs for a user | scancel -t PENDING -u <username> |
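For reference, a minimal batch script of the kind passed to sbatch might look like the one generated below. This is a generic sketch, not a script from this repository: the job name, resource values, and the helper write_job_script are made up for illustration, and many sites additionally require --account and --partition options.

```shell
#!/bin/sh
# Write a minimal SLURM batch script to the given path (hypothetical helper).
# Usage: write_job_script <path>
write_job_script() {
    cat > "$1" <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --output=hello-%j.out
# Many sites also require --account=<project> and --partition=<name>.

# Run one task that prints the name of the compute node.
srun hostname
EOF
}

# Example (not executed here):
#   write_job_script myscript.sh
#   sbatch myscript.sh
#   squeue -u <username>
```

The %j in the output file name is replaced by the job ID, so each submission writes to its own file.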