
CODEF - MDM deployment issue - PV - solution confirmation

Hi @urb, @lga, when I try to deploy MDM via the post_deploy.sh script in CODEF, some of the pods never become ready and keep crashing, as shown in the following:

kc get po -n he-codeco-mdm
NAMESPACE       NAME                                                     READY   STATUS             RESTARTS      AGE     IP            NODE
he-codeco-acm   acm-operator-controller-manager-6874997f74-zpkcj        2/2     Running            0             57m     10.244.2.3    athw7
he-codeco-mdm   freshness-connector-check-connection-6c845dbfd8-qzqtj   1/1     Running            0             3m47s   10.244.0.12   athm4
he-codeco-mdm   k8s-connector-check-connection-684558df6d-hqs4f         1/1     Running            0             3m48s   10.244.0.11   athm4
he-codeco-mdm   kubescape-connector-check-connection-84c84fdf9d-v29gh   1/1     Running            0             3m47s   10.244.2.18   athw7
he-codeco-mdm   mdm-api-0                                                1/1     Running            0             3m48s   10.244.1.12   athw8
he-codeco-mdm   mdm-controller-0                                         0/1     CrashLoopBackOff   3 (37s ago)   3m49s   10.244.2.17   athw7
he-codeco-mdm   mdm-kafka-0                                              0/1     Pending            0             46m
he-codeco-mdm   mdm-neo4j-0                                              0/1     Pending            0             3m49s
he-codeco-mdm   mdm-zookeeper-0                                          0/1     Pending            0             46m
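Side note: the mdm-controller-0 CrashLoopBackOff is presumably just a consequence of Kafka/Neo4j not being up rather than a separate problem; if needed, the previous container's logs should confirm what it fails on (plain kubectl, nothing CODECO-specific):

kubectl logs mdm-controller-0 -n he-codeco-mdm --previous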

I believe the issue in my case is that the PVCs are stuck in the Pending state because there is no StorageClass in the cluster, as you can see below:

kc get pvc -A
NAMESPACE         NAME                   STATUS    VOLUME     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
he-codeco-mdm     data-mdm-kafka-0       Pending                                        standard       <unset>                 59m
he-codeco-mdm     data-mdm-neo4j-0       Pending                                        standard       <unset>                 17m
he-codeco-mdm     data-mdm-zookeeper-0   Pending                                        standard       <unset>                 59m
he-codeco-netma   mysql-pv-claim         Bound     mysql-pv   2Gi        RWO            manual         <unset>                 69m
kc get storageclass -A
No resources found
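The pending claims should tell the same story; describing one of them (plain kubectl) should show an event along the lines of the standard StorageClass not being found:

kubectl describe pvc data-mdm-kafka-0 -n he-codeco-mdm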

Some more details on the Kafka headless service and the pending ZooKeeper pod:

kubectl get svc mdm-kafka-headless -n he-codeco-mdm
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
mdm-kafka-headless   ClusterIP   None         <none>        9092/TCP,9093/TCP   26h
kubectl get endpoints mdm-kafka-headless -n he-codeco-mdm -o wide
NAME                 ENDPOINTS   AGE
mdm-kafka-headless   <none>      26h
kubectl describe pod mdm-zookeeper-0 -n he-codeco-mdm
Name:             mdm-zookeeper-0
Namespace:        he-codeco-mdm
Priority:         0
Service Account:  mdm-zookeeper
Node:             <none>
Labels:           app.kubernetes.io/component=zookeeper
                  app.kubernetes.io/instance=mdm-zookeeper
                  app.kubernetes.io/managed-by=Helm
                  app.kubernetes.io/name=zookeeper
                  app.kubernetes.io/version=3.9.3
                  apps.kubernetes.io/pod-index=0
                  controller-revision-hash=mdm-zookeeper-556cc6ffb8
                  helm.sh/chart=zookeeper-13.8.2
                  statefulset.kubernetes.io/pod-name=mdm-zookeeper-0
Annotations:      <none>
Status:           Pending
IP:
IPs:              <none>
Controlled By:    StatefulSet/mdm-zookeeper
Containers:
  zookeeper:
    Image:           docker.io/bitnami/zookeeper:3.9.3-debian-12-r15
    Ports:           2181/TCP, 8080/TCP
    Host Ports:      0/TCP, 0/TCP
    SeccompProfile:  RuntimeDefault
    Command:
      /scripts/setup.sh
    Limits:
      cpu:                375m
      ephemeral-storage:  2Gi
      memory:             384Mi
    Requests:
      cpu:                250m
      ephemeral-storage:  50Mi
      memory:             256Mi
    Liveness:             exec [/bin/bash -ec ZOO_HC_TIMEOUT=3 /opt/bitnami/scripts/zookeeper/healthcheck.sh] delay=30s timeout=5s period=10s #success=1 #failure=6
    Readiness:            exec [/bin/bash -ec ZOO_HC_TIMEOUT=2 /opt/bitnami/scripts/zookeeper/healthcheck.sh] delay=5s timeout=5s period=10s #success=1 #failure=6
    Environment:
      BITNAMI_DEBUG:                 false
      ZOO_DATA_LOG_DIR:
      ZOO_PORT_NUMBER:               2181
      ZOO_TICK_TIME:                 2000
      ZOO_INIT_LIMIT:                10
      ZOO_SYNC_LIMIT:                5
      ZOO_PRE_ALLOC_SIZE:            65536
      ZOO_SNAPCOUNT:                 100000
      ZOO_MAX_CLIENT_CNXNS:          60
      ZOO_4LW_COMMANDS_WHITELIST:    srvr, mntr, ruok
      ZOO_LISTEN_ALLIPS_ENABLED:     no
      ZOO_AUTOPURGE_INTERVAL:        1
      ZOO_AUTOPURGE_RETAIN_COUNT:    10
      ZOO_MAX_SESSION_TIMEOUT:       40000
      ZOO_SERVERS:                   mdm-zookeeper-0.mdm-zookeeper-headless.he-codeco-mdm.svc.cluster.local:2888:3888::1
      ZOO_ENABLE_AUTH:               no
      ZOO_ENABLE_QUORUM_AUTH:        no
      ZOO_HEAP_SIZE:                 1024
      ZOO_LOG_LEVEL:                 ERROR
      ALLOW_ANONYMOUS_LOGIN:         yes
      POD_NAME:                      mdm-zookeeper-0 (v1:metadata.name)
      ZOO_ADMIN_SERVER_PORT_NUMBER:  8080
    Mounts:
      /bitnami/zookeeper from data (rw)
      /opt/bitnami/zookeeper/conf from empty-dir (rw,path="app-conf-dir")
      /opt/bitnami/zookeeper/logs from empty-dir (rw,path="app-logs-dir")
      /scripts/setup.sh from scripts (rw,path="setup.sh")
      /tmp from empty-dir (rw,path="tmp-dir")
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-mdm-zookeeper-0
    ReadOnly:   false
  empty-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  scripts:
    Type:        ConfigMap (a volume populated by a ConfigMap)
    Name:        mdm-zookeeper-scripts
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                  From               Message
  ----     ------            ----                 ----               -------
  Warning  FailedScheduling  4m7s (x15 over 66m)  default-scheduler  0/3 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling.
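The Kafka and Neo4j pods are presumably stuck on the same FailedScheduling reason; all of the scheduling failures in the namespace can be listed in one place with (generic kubectl):

kubectl get events -n he-codeco-mdm --field-selector reason=FailedScheduling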

However, I found a solution, and I would appreciate it if you could confirm it or suggest a way to verify that everything is fine.

  1. kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
  2. kubectl patch storageclass local-path \
       -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
  3. kubectl apply -f sc-standard.yaml, where sc-standard.yaml contains:

     apiVersion: storage.k8s.io/v1
     kind: StorageClass
     metadata:
       name: standard
     provisioner: rancher.io/local-path
     reclaimPolicy: Delete
     volumeBindingMode: WaitForFirstConsumer
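For context, step 3 is what actually unblocks the MDM claims: they explicitly request a class named standard (see the PVC listing above), so the default-class annotation from step 2 alone would not bind them. Since the new class uses volumeBindingMode: WaitForFirstConsumer, the claims only bind once the scheduler places the consuming pods, which can be watched with:

kubectl get pvc -n he-codeco-mdm -w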

After some time, the pods are running and the PVs are bound:

kc get po -n he-codeco-mdm -o wide
NAME                                                    READY   STATUS    RESTARTS         AGE     IP            NODE    NOMINATED NODE   READINESS GATES
freshness-connector-check-connection-6c845dbfd8-qzqtj   1/1     Running   0                37m     10.244.0.12   athm4   <none>           <none>
k8s-connector-check-connection-684558df6d-hqs4f         1/1     Running   0                37m     10.244.0.11   athm4   <none>           <none>
kubescape-connector-check-connection-84c84fdf9d-v29gh   1/1     Running   0                37m     10.244.2.18   athw7   <none>           <none>
mdm-api-0                                               1/1     Running   0                37m     10.244.1.12   athw8   <none>           <none>
mdm-controller-0                                        1/1     Running   11 (8m35s ago)   37m     10.244.2.17   athw7   <none>           <none>
mdm-kafka-0                                             1/1     Running   1 (5m51s ago)    79m     10.244.2.20   athw7   <none>           <none>
mdm-neo4j-0                                             1/1     Running   0                4m55s   10.244.2.22   athw7   <none>           <none>
mdm-zookeeper-0                                         1/1     Running   0                79m     10.244.0.16   athm4   <none>           <none>
kubectl get storageclass
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  18m
standard               rancher.io/local-path   Delete          WaitForFirstConsumer   false                  17m
kc get pv,pvc -A
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                                STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
mysql-pv                                   2Gi        RWO            Retain           Bound    he-codeco-netma/mysql-pv-claim       manual         <unset>                          82m
pvc-6ec5fd83-221d-42a3-a23e-b49155a4c268   2Gi        RWO            Delete           Bound    he-codeco-mdm/data-mdm-zookeeper-0   standard       <unset>                          12s
pvc-c2e94237-2236-4645-9303-0ef1e619e990   10Gi       RWO            Delete           Bound    he-codeco-mdm/data-mdm-neo4j-0       standard       <unset>                          13s
pvc-fcc8abbb-e8d9-482a-bb6b-b4bed6130840   1Gi        RWO            Delete           Bound    he-codeco-mdm/data-mdm-kafka-0       standard       <unset>                          13s

NAMESPACE         NAME                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
he-codeco-mdm     data-mdm-kafka-0       Bound    pvc-fcc8abbb-e8d9-482a-bb6b-b4bed6130840   1Gi        RWO            standard       <unset>                 76m
he-codeco-mdm     data-mdm-neo4j-0       Bound    pvc-c2e94237-2236-4645-9303-0ef1e619e990   10Gi       RWO            standard       <unset>                 34m
he-codeco-mdm     data-mdm-zookeeper-0   Bound    pvc-6ec5fd83-221d-42a3-a23e-b49155a4c268   2Gi        RWO            standard       <unset>                 76m
he-codeco-netma   mysql-pv-claim         Bound    mysql-pv                                   2Gi        RWO            manual         <unset>                 85m
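As one more way to test that everything is fine independently of MDM, a throwaway claim plus pod should go Bound/Running within a few seconds if the provisioner is healthy. The file and object names below (test-standard.yaml, test-standard-pvc, test-standard-pod) are placeholders I made up for this sketch:

kubectl apply -f test-standard.yaml, where test-standard.yaml contains:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-standard-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: standard
  resources:
    requests:
      storage: 100Mi
---
apiVersion: v1
kind: Pod
metadata:
  name: test-standard-pod
spec:
  containers:
    - name: writer
      image: busybox
      command: ["sh", "-c", "echo ok > /data/ok && sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: test-standard-pvc

Once the pod is scheduled, the claim should flip to Bound (because of WaitForFirstConsumer), and the provisioner pod itself can be checked in the namespace created by the manifest from step 1 (local-path-storage, if I read it correctly):

kubectl get pods -n local-path-storage
kubectl delete -f test-standard.yaml   # cleanup when done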