
Ingress Traffic from CODECO monitoring components to Prometheus: network policies need revision (ACM)

Hello,

We (at FOR) are deploying CODECO on our local bare-metal K3s cluster and ran into an issue with the freshness connector pod in the MDM component: its connection to the Prometheus service was blocked. The default kube-prometheus deployment does not allow ingress traffic (HTTP requests) to its service (prometheus-k8s) from any pod unless that pod is explicitly granted access in the network policy manifest (prometheus-networkPolicy.yaml), presumably for security reasons. (This issue is not reproducible in a KinD environment, which has no network policy isolation.)
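For context, the connector's failing call is an instant query against the Prometheus HTTP API. The sketch below (illustrative names, not the connector's actual code) shows how the query URL in the traceback is built; the encoding matches the `query=CODECO_freshness%7B...%7D` seen in the logs:

```python
from urllib.parse import quote

# Illustrative reconstruction of the request the connector makes.
# PROMETHEUS_URL and QUERY are taken from the pasted config/logs.
PROMETHEUS_URL = "http://prometheus-k8s.monitoring.svc.cluster.local:9090"
QUERY = "CODECO_freshness{kubernetes_pod_name!=''}"

def build_query_url(base: str, promql: str) -> str:
    """Return the Prometheus instant-query URL for a PromQL expression."""
    return f"{base}/api/v1/query?query={quote(promql, safe='')}"

print(build_query_url(PROMETHEUS_URL, QUERY))
# -> http://prometheus-k8s.monitoring.svc.cluster.local:9090/api/v1/query?query=CODECO_freshness%7Bkubernetes_pod_name%21%3D%27%27%7D
```

When the network policy drops the connection, this request fails exactly as in the traceback below (`Errno 111 Connection refused`).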

Logs of the freshness connector pod showing the error:

kubectl logs freshness-connector-check-connection-68fdc47456-p7ch7 -n he-codeco-mdm
{'CAcerts': [], 'crawler': {'k8s': {'api_url': 'https://kubernetes.default.svc'}, 'prometheus': {'entity': 'Freshness_state', 'metric': 'CODECO_freshness', 'server': {'port': 9090, 'url': 'http://prometheus-k8s.monitoring.svc.cluster.local'}}, 'schedule': 300}, 'cronSchedule': '{{ cat (untilStep (mod (randNumeric 3 | atoi) (.Values.frequency | int) | int) 59 (.Values.frequency | int) | join ",") "* * * *" }}', 'frequency': 5, 'image': {'name': 'mdm-connector-prometheus', 'pullPolicy': 'Always', 'repository': 'hecodeco'}, 'k8sauth': {'enabled': False}, 'oidc': {'authServerUrl': 'https://local.oidc.server', 'clientId': 'pf-connector', 'clientSecret': 'xxxxxxxxxxxxxxx', 'enabled': False}, 'pathfinder': {'connector': {'id': 'prometheus-connector-default', 'state': {'access-key': 'XXX', 'bucket-name': 'pathfinder-test-cos-connector-status', 'secret-key': 'YYY', 'service-endpoint': 's3.us-south.cloud-object-storage.appdomain.cloud', 'signing-region': 'us-south', 'type': 'local'}}, 'kubernetesUrl': 'https://kubernetes.default.svc', 'url': 'http://mdm-api.he-codeco-mdm:8090/mdm/api/v1'}, 'podAnnotations': {}, 'podSecurityContext': {}, 'resources': {}, 'securityContext': {}, 'serviceAccount': {'annotations': {}, 'automount': True, 'create': True, 'name': ''}, 'stopMode': 'stop', 'suspended': False}
{'CAcerts': [], 'crawler': {'k8s': {'api_url': 'https://kubernetes.default.svc'}, 'prometheus': {'entity': 'Freshness_state', 'metric': 'CODECO_freshness', 'server': {'port': 9090, 'url': 'http://prometheus-k8s.monitoring.svc.cluster.local'}}, 'schedule': 300}, 'cronSchedule': '{{ cat (untilStep (mod (randNumeric 3 | atoi) (.Values.frequency | int) | int) 59 (.Values.frequency | int) | join ",") "* * * *" }}', 'frequency': 5, 'image': {'name': 'mdm-connector-prometheus', 'pullPolicy': 'Always', 'repository': 'hecodeco'}, 'k8sauth': {'enabled': False}, 'oidc': {'authServerUrl': 'https://local.oidc.server', 'clientId': 'pf-connector', 'clientSecret': 'xxxxxxxxxxxxxxx', 'enabled': False}, 'pathfinder': {'connector': {'id': 'prometheus-connector-default', 'state': {'access-key': 'XXX', 'bucket-name': 'pathfinder-test-cos-connector-status', 'secret-key': 'YYY', 'service-endpoint': 's3.us-south.cloud-object-storage.appdomain.cloud', 'signing-region': 'us-south', 'type': 'local'}}, 'kubernetesUrl': 'https://kubernetes.default.svc', 'url': 'http://mdm-api.he-codeco-mdm:8090/mdm/api/v1'}, 'podAnnotations': {}, 'podSecurityContext': {}, 'resources': {}, 'securityContext': {}, 'serviceAccount': {'annotations': {}, 'automount': True, 'create': True, 'name': ''}, 'stopMode': 'stop', 'suspended': False}
2024-12-04:17:54:46,536 INFO     [eventpublisher.py:114] Load last connector state from json file
2024-12-04:17:54:46,537 INFO     [eventpublisher.py:185] Pf-model-registry url: http://mdm-api.he-codeco-mdm:8090/mdm/api/v1
loop
2024-12-04:17:54:46,539 DEBUG    [connectionpool.py:243] Starting new HTTP connection (1): prometheus-k8s.monitoring.svc.cluster.local:9090
Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/urllib3/connection.py", line 199, in _new_conn
    sock = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/opt/app-root/lib64/python3.11/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/urllib3/connectionpool.py", line 495, in _make_request
    conn.request(
  File "/opt/app-root/lib64/python3.11/site-packages/urllib3/connection.py", line 441, in request
    self.endheaders()
  File "/usr/lib64/python3.11/http/client.py", line 1298, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.11/http/client.py", line 1058, in _send_output
    self.send(msg)
  File "/usr/lib64/python3.11/http/client.py", line 996, in send
    self.connect()
  File "/opt/app-root/lib64/python3.11/site-packages/urllib3/connection.py", line 279, in connect
    self.sock = self._new_conn()
                ^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/urllib3/connection.py", line 214, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x76503feb1950>: Failed to establish a new connection: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.11/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/urllib3/util/retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='prometheus-k8s.monitoring.svc.cluster.local', port=9090): Max retries exceeded with url: /api/v1/query?query=CODECO_freshness%7Bkubernetes_pod_name%21%3D%27%27%7D (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x76503feb1950>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/app-root/src/python/connector.py", line 99, in <module>
    main()
  File "/opt/app-root/src/python/connector.py", line 62, in main
    response =requests.get(config.PROMETHEUS_URL +
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/app-root/lib64/python3.11/site-packages/requests/adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='prometheus-k8s.monitoring.svc.cluster.local', port=9090): Max retries exceeded with url: /api/v1/query?query=CODECO_freshness%7Bkubernetes_pod_name%21%3D%27%27%7D (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x76503feb1950>: Failed to establish a new connection: [Errno 111] Connection refused'))

If this is not how the pod should behave and it should have been able to connect to the Prometheus service, please let us know. Otherwise, here is how we fixed the issue manually:

A solution that worked for us is to label the he-codeco-mdm namespace and then use that label, together with the freshness connector deployment's pod label, in a new network policy rule that allows access (see https://kubernetes.io/docs/concepts/services-networking/network-policies/#allow-all-ingress-traffic):
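One subtlety worth noting: in a NetworkPolicy, a single `from` element containing both a podSelector and a namespaceSelector requires the source pod to match both (AND), whereas two separate `from` elements would match either (OR). A simplified sketch of that matching logic (illustrative only, using exact-match labels):

```python
# Simplified NetworkPolicy peer matching: selectors inside one `from`
# element are ANDed; separate elements in the `from` list are ORed.

def peer_matches(peer: dict, pod_labels: dict, ns_labels: dict) -> bool:
    pod_ok = all(pod_labels.get(k) == v
                 for k, v in peer.get("podSelector", {}).items())
    ns_ok = all(ns_labels.get(k) == v
                for k, v in peer.get("namespaceSelector", {}).items())
    return pod_ok and ns_ok

def ingress_allowed(from_list: list, pod_labels: dict, ns_labels: dict) -> bool:
    return any(peer_matches(p, pod_labels, ns_labels) for p in from_list)

# The rule added below: freshness-connector pods in the labeled namespace.
rule = [{"podSelector": {"app.kubernetes.io/instance": "freshness-connector"},
         "namespaceSelector": {"namespace": "he-codeco-mdm"}}]

print(ingress_allowed(rule,
                      {"app.kubernetes.io/instance": "freshness-connector"},
                      {"namespace": "he-codeco-mdm"}))  # True: both selectors match
print(ingress_allowed(rule,
                      {"app.kubernetes.io/instance": "freshness-connector"},
                      {"namespace": "other"}))          # False: wrong namespace
```

This is why step 1 below labels the namespace first: without the namespace label, the combined selector never matches.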

  1. Label the namespace: kubectl label namespace he-codeco-mdm namespace=he-codeco-mdm
  2. Add the new ingress rule to kube-prometheus/manifests/prometheus-networkPolicy.yaml:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  labels:
    app.kubernetes.io/component: prometheus
    app.kubernetes.io/instance: k8s
    app.kubernetes.io/name: prometheus
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 3.0.0
  name: prometheus-k8s
  namespace: monitoring
spec:
  egress:
  - {}
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: prometheus
    ports:
    - port: 9090
      protocol: TCP
    - port: 8080
      protocol: TCP
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: prometheus-adapter
    ports:
    - port: 9090
      protocol: TCP
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: grafana
    ports:
    - port: 9090
      protocol: TCP
  - from:  # changed: allow the freshness connector in he-codeco-mdm
    - podSelector:
        matchLabels:
          app.kubernetes.io/instance: freshness-connector
      namespaceSelector:
        matchExpressions:
        - key: namespace
          operator: In
          values: ["he-codeco-mdm"]
    ports:
    - port: 9090
      protocol: TCP
    - port: 8080
      protocol: TCP

  podSelector:
    matchLabels:
      app.kubernetes.io/component: prometheus
      app.kubernetes.io/instance: k8s
      app.kubernetes.io/name: prometheus
      app.kubernetes.io/part-of: kube-prometheus
  policyTypes:
  - Egress
  - Ingress
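As an alternative to labeling the namespace by hand in step 1, Kubernetes (v1.22+) automatically sets the immutable label kubernetes.io/metadata.name on every namespace, so the added `from` entry could select it directly. A sketch of the equivalent rule, which we have not tested on our cluster:

```yaml
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/instance: freshness-connector
      namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: he-codeco-mdm
    ports:
    - port: 9090
      protocol: TCP
```

This avoids the extra kubectl label step and cannot drift if the manual label is ever removed.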
 
  3. After deleting the freshness-connector-check-connection pod and redeploying, the pod is running, with the following logs:
kubectl logs freshness-connector-check-connection-68fdc47456-m6bql -n he-codeco-mdm

{'CAcerts': [], 'crawler': {'k8s': {'api_url': 'https://kubernetes.default.svc'}, 'prometheus': {'entity': 'Freshness_state', 'metric': 'CODECO_freshness', 'server': {'port': 9090, 'url': 'http://prometheus-k8s.monitoring.svc.cluster.local'}}, 'schedule': 300}, 'cronSchedule': '{{ cat (untilStep (mod (randNumeric 3 | atoi) (.Values.frequency | int) | int) 59 (.Values.frequency | int) | join ",") "* * * *" }}', 'frequency': 5, 'image': {'name': 'mdm-connector-prometheus', 'pullPolicy': 'Always', 'repository': 'hecodeco'}, 'k8sauth': {'enabled': False}, 'oidc': {'authServerUrl': 'https://local.oidc.server', 'clientId': 'pf-connector', 'clientSecret': 'xxxxxxxxxxxxxxx', 'enabled': False}, 'pathfinder': {'connector': {'id': 'prometheus-connector-default', 'state': {'access-key': 'XXX', 'bucket-name': 'pathfinder-test-cos-connector-status', 'secret-key': 'YYY', 'service-endpoint': 's3.us-south.cloud-object-storage.appdomain.cloud', 'signing-region': 'us-south', 'type': 'local'}}, 'kubernetesUrl': 'https://kubernetes.default.svc', 'url': 'http://mdm-api.he-codeco-mdm:8090/mdm/api/v1'}, 'podAnnotations': {}, 'podSecurityContext': {}, 'resources': {}, 'securityContext': {}, 'serviceAccount': {'annotations': {}, 'automount': True, 'create': True, 'name': ''}, 'stopMode': 'stop', 'suspended': False}
{'CAcerts': [], 'crawler': {'k8s': {'api_url': 'https://kubernetes.default.svc'}, 'prometheus': {'entity': 'Freshness_state', 'metric': 'CODECO_freshness', 'server': {'port': 9090, 'url': 'http://prometheus-k8s.monitoring.svc.cluster.local'}}, 'schedule': 300}, 'cronSchedule': '{{ cat (untilStep (mod (randNumeric 3 | atoi) (.Values.frequency | int) | int) 59 (.Values.frequency | int) | join ",") "* * * *" }}', 'frequency': 5, 'image': {'name': 'mdm-connector-prometheus', 'pullPolicy': 'Always', 'repository': 'hecodeco'}, 'k8sauth': {'enabled': False}, 'oidc': {'authServerUrl': 'https://local.oidc.server', 'clientId': 'pf-connector', 'clientSecret': 'xxxxxxxxxxxxxxx', 'enabled': False}, 'pathfinder': {'connector': {'id': 'prometheus-connector-default', 'state': {'access-key': 'XXX', 'bucket-name': 'pathfinder-test-cos-connector-status', 'secret-key': 'YYY', 'service-endpoint': 's3.us-south.cloud-object-storage.appdomain.cloud', 'signing-region': 'us-south', 'type': 'local'}}, 'kubernetesUrl': 'https://kubernetes.default.svc', 'url': 'http://mdm-api.he-codeco-mdm:8090/mdm/api/v1'}, 'podAnnotations': {}, 'podSecurityContext': {}, 'resources': {}, 'securityContext': {}, 'serviceAccount': {'annotations': {}, 'automount': True, 'create': True, 'name': ''}, 'stopMode': 'stop', 'suspended': False}
2024-12-04:18:11:11,697 INFO     [eventpublisher.py:114] Load last connector state from json file
2024-12-04:18:11:11,697 INFO     [eventpublisher.py:185] Pf-model-registry url: http://mdm-api.he-codeco-mdm:8090/mdm/api/v1
loop
2024-12-04:18:11:11,700 DEBUG    [connectionpool.py:243] Starting new HTTP connection (1): prometheus-k8s.monitoring.svc.cluster.local:9090
2024-12-04:18:11:12,726 DEBUG    [connectionpool.py:546] http://prometheus-k8s.monitoring.svc.cluster.local:9090 "GET /api/v1/query?query=CODECO_freshness%7Bkubernetes_pod_name%21%3D%27%27%7D HTTP/11" 200 93

We hope this can be fixed in the main deployment script so that others do not run into this error.

Edited by dalal ali