CODEF - MDM deployment issue - PV - solution confirmation
Hi @urb, @lga, when I try to deploy MDM via the post_deploy.sh script in CODEF, several pods never become ready and keep crashing, as in the following:
```
kc get po -n he-codeco-mdm
he-codeco-acm   acm-operator-controller-manager-6874997f74-zpkcj        2/2   Running            0             57m     10.244.2.3    athw7
he-codeco-mdm   freshness-connector-check-connection-6c845dbfd8-qzqtj   1/1   Running            0             3m47s   10.244.0.12   athm4
he-codeco-mdm   k8s-connector-check-connection-684558df6d-hqs4f         1/1   Running            0             3m48s   10.244.0.11   athm4
he-codeco-mdm   kubescape-connector-check-connection-84c84fdf9d-v29gh   1/1   Running            0             3m47s   10.244.2.18   athw7
he-codeco-mdm   mdm-api-0                                               1/1   Running            0             3m48s   10.244.1.12   athw8
he-codeco-mdm   mdm-controller-0                                        0/1   CrashLoopBackOff   3 (37s ago)   3m49s   10.244.2.17   athw7
he-codeco-mdm   mdm-kafka-0                                             0/1   Pending            0             46m
he-codeco-mdm   mdm-neo4j-0                                             0/1   Pending            0             3m49s
he-codeco-mdm   mdm-zookeeper-0                                         0/1   Pending            0             46m
```
I believe the issue in my case is that the PVCs are stuck in Pending and there is no StorageClass in the cluster, as you can see in the following:
```
kc get pvc -A
NAMESPACE         NAME                   STATUS    VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
he-codeco-mdm     data-mdm-kafka-0       Pending                                        standard       <unset>                 59m
he-codeco-mdm     data-mdm-neo4j-0       Pending                                        standard       <unset>                 17m
he-codeco-mdm     data-mdm-zookeeper-0   Pending                                        standard       <unset>                 59m
he-codeco-netma   mysql-pv-claim         Bound     mysql-pv   2Gi        RWO            manual         <unset>                 69m

kc get storageclass -A
No resources found
```

The logs:
```
kubectl get svc mdm-kafka-headless -n he-codeco-mdm
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
mdm-kafka-headless   ClusterIP   None         <none>        9092/TCP,9093/TCP   26h

kubectl get endpoints mdm-kafka-headless -n he-codeco-mdm -o wide
NAME                 ENDPOINTS   AGE
mdm-kafka-headless   <none>      26h

kubectl describe pod mdm-zookeeper-0 -n he-codeco-mdm
Name: mdm-zookeeper-0
Namespace: he-codeco-mdm
Priority: 0
Service Account: mdm-zookeeper
Node: <none>
Labels: app.kubernetes.io/component=zookeeper
app.kubernetes.io/instance=mdm-zookeeper
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=zookeeper
app.kubernetes.io/version=3.9.3
apps.kubernetes.io/pod-index=0
controller-revision-hash=mdm-zookeeper-556cc6ffb8
helm.sh/chart=zookeeper-13.8.2
statefulset.kubernetes.io/pod-name=mdm-zookeeper-0
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: StatefulSet/mdm-zookeeper
Containers:
zookeeper:
Image: docker.io/bitnami/zookeeper:3.9.3-debian-12-r15
Ports: 2181/TCP, 8080/TCP
Host Ports: 0/TCP, 0/TCP
SeccompProfile: RuntimeDefault
Command:
/scripts/setup.sh
Limits:
cpu: 375m
ephemeral-storage: 2Gi
memory: 384Mi
Requests:
cpu: 250m
ephemeral-storage: 50Mi
memory: 256Mi
Liveness: exec [/bin/bash -ec ZOO_HC_TIMEOUT=3 /opt/bitnami/scripts/zookeeper/healthcheck.sh] delay=30s timeout=5s period=10s #success=1 #failure=6
Readiness: exec [/bin/bash -ec ZOO_HC_TIMEOUT=2 /opt/bitnami/scripts/zookeeper/healthcheck.sh] delay=5s timeout=5s period=10s #success=1 #failure=6
Environment:
BITNAMI_DEBUG: false
ZOO_DATA_LOG_DIR:
ZOO_PORT_NUMBER: 2181
ZOO_TICK_TIME: 2000
ZOO_INIT_LIMIT: 10
ZOO_SYNC_LIMIT: 5
ZOO_PRE_ALLOC_SIZE: 65536
ZOO_SNAPCOUNT: 100000
ZOO_MAX_CLIENT_CNXNS: 60
ZOO_4LW_COMMANDS_WHITELIST: srvr, mntr, ruok
ZOO_LISTEN_ALLIPS_ENABLED: no
ZOO_AUTOPURGE_INTERVAL: 1
ZOO_AUTOPURGE_RETAIN_COUNT: 10
ZOO_MAX_SESSION_TIMEOUT: 40000
ZOO_SERVERS: mdm-zookeeper-0.mdm-zookeeper-headless.he-codeco-mdm.svc.cluster.local:2888:3888::1
ZOO_ENABLE_AUTH: no
ZOO_ENABLE_QUORUM_AUTH: no
ZOO_HEAP_SIZE: 1024
ZOO_LOG_LEVEL: ERROR
ALLOW_ANONYMOUS_LOGIN: yes
POD_NAME: mdm-zookeeper-0 (v1:metadata.name)
ZOO_ADMIN_SERVER_PORT_NUMBER: 8080
Mounts:
/bitnami/zookeeper from data (rw)
/opt/bitnami/zookeeper/conf from empty-dir (rw,path="app-conf-dir")
/opt/bitnami/zookeeper/logs from empty-dir (rw,path="app-logs-dir")
/scripts/setup.sh from scripts (rw,path="setup.sh")
/tmp from empty-dir (rw,path="tmp-dir")
Conditions:
Type Status
PodScheduled False
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-mdm-zookeeper-0
ReadOnly: false
empty-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
scripts:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: mdm-zookeeper-scripts
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 4m7s (x15 over 66m) default-scheduler 0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
```
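For completeness, describing one of the pending claims (before applying the fix) should show the same root cause from the storage side:

```
# Events on the claim should mention that the "standard" StorageClass
# cannot be found / no provisioner is available for it.
kubectl describe pvc data-mdm-zookeeper-0 -n he-codeco-mdm
```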
However, I found a solution (below) and I would appreciate it if you could confirm it, or suggest a way to test that everything is fine:
- `kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml`
- `kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'`
- `kubectl apply -f sc-standard.yaml`, where `sc-standard.yaml` is:

```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
provisioner: rancher.io/local-path
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```
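Before re-running post_deploy.sh, a quick way to check that the provisioner works on its own (independently of MDM) is a throwaway PVC plus a pod that mounts it; the names `test-pvc` and `pv-test` and the busybox image below are just placeholders:

```
# Minimal sketch to verify the new "standard" StorageClass provisions volumes.
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
  namespace: default
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: standard
  resources:
    requests:
      storage: 100Mi
---
apiVersion: v1
kind: Pod
metadata:
  name: pv-test
  namespace: default
spec:
  containers:
  - name: shell
    image: busybox:1.36
    command: ["sh", "-c", "echo ok > /data/ok && sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: test-pvc
EOF

# Because volumeBindingMode is WaitForFirstConsumer, the PVC only binds once
# the pod is scheduled; both should end up Bound / Running.
kubectl get pvc test-pvc -n default
kubectl get pod pv-test -n default

# Clean up afterwards.
kubectl delete pod pv-test -n default
kubectl delete pvc test-pvc -n default
```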
After applying the above and waiting a bit, the pods are running and the PVs are bound:
```
kc get po -n he-codeco-mdm -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
freshness-connector-check-connection-6c845dbfd8-qzqtj 1/1 Running 0 37m 10.244.0.12 athm4 <none> <none>
k8s-connector-check-connection-684558df6d-hqs4f 1/1 Running 0 37m 10.244.0.11 athm4 <none> <none>
kubescape-connector-check-connection-84c84fdf9d-v29gh 1/1 Running 0 37m 10.244.2.18 athw7 <none> <none>
mdm-api-0 1/1 Running 0 37m 10.244.1.12 athw8 <none> <none>
mdm-controller-0 1/1 Running 11 (8m35s ago) 37m 10.244.2.17 athw7 <none> <none>
mdm-kafka-0 1/1 Running 1 (5m51s ago) 79m 10.244.2.20 athw7 <none> <none>
mdm-neo4j-0 1/1 Running 0 4m55s 10.244.2.22 athw7 <none> <none>
mdm-zookeeper-0 1/1 Running 0 79m 10.244.0.16 athm4 <none> <none>
kubectl get storageclass
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-path (default) rancher.io/local-path Delete WaitForFirstConsumer false 18m
standard rancher.io/local-path Delete WaitForFirstConsumer false 17m
kc get pv,pvc -A
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS VOLUMEATTRIBUTESCLASS REASON AGE
mysql-pv 2Gi RWO Retain Bound he-codeco-netma/mysql-pv-claim manual <unset> 82m
pvc-6ec5fd83-221d-42a3-a23e-b49155a4c268 2Gi RWO Delete Bound he-codeco-mdm/data-mdm-zookeeper-0 standard <unset> 12s
pvc-c2e94237-2236-4645-9303-0ef1e619e990 10Gi RWO Delete Bound he-codeco-mdm/data-mdm-neo4j-0 standard <unset> 13s
pvc-fcc8abbb-e8d9-482a-bb6b-b4bed6130840 1Gi RWO Delete Bound he-codeco-mdm/data-mdm-kafka-0 standard <unset> 13s
NAMESPACE NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS VOLUMEATTRIBUTESCLASS AGE
he-codeco-mdm data-mdm-kafka-0 Bound pvc-fcc8abbb-e8d9-482a-bb6b-b4bed6130840 1Gi RWO standard <unset> 76m
he-codeco-mdm data-mdm-neo4j-0 Bound pvc-c2e94237-2236-4645-9303-0ef1e619e990 10Gi RWO standard <unset> 34m
he-codeco-mdm data-mdm-zookeeper-0 Bound pvc-6ec5fd83-221d-42a3-a23e-b49155a4c268 2Gi RWO standard <unset> 76m
he-codeco-netma mysql-pv-claim Bound mysql-pv 2Gi RWO manual <unset> 85m
```
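On the question of how to test that everything is fine, my rough self-check (happy to be corrected) was to re-verify the things that were broken before; nothing here is MDM-specific, it only uses what is visible in the outputs above:

```
# The Kafka headless Service had no endpoints while mdm-kafka-0 was Pending;
# it should now list the broker pod's IP.
kubectl get endpoints mdm-kafka-headless -n he-codeco-mdm -o wide

# Run the same health check the ZooKeeper readiness probe uses (see the
# describe output above); it should exit 0.
kubectl exec -n he-codeco-mdm mdm-zookeeper-0 -- \
  /bin/bash -ec 'ZOO_HC_TIMEOUT=2 /opt/bitnami/scripts/zookeeper/healthcheck.sh'

# mdm-controller-0 was in CrashLoopBackOff before the StorageClass fix;
# its recent logs should now be free of connection errors.
kubectl logs -n he-codeco-mdm mdm-controller-0 --tail=50
```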