L2SM network ping issues & LPM pod RTT errors
To check that networking in our cluster performs well, we pinged between two test pods on two different nodes. This works over the internal cluster IP range (10.244.0.0/16; test-pod1 is 10.244.1.190 and the target is 10.244.2.241).
kubectl exec -it test-pod1 -- ping -c 4 10.244.2.241
PING 10.244.2.241 (10.244.2.241): 56 data bytes
64 bytes from 10.244.2.241: seq=0 ttl=62 time=0.836 ms
64 bytes from 10.244.2.241: seq=1 ttl=62 time=0.386 ms
64 bytes from 10.244.2.241: seq=2 ttl=62 time=0.387 ms
64 bytes from 10.244.2.241: seq=3 ttl=62 time=0.458 ms
--- 10.244.2.241 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.386/0.516/0.836 ms
almende@codeco-almende:~/Documents/CODECO$ kubectl exec -it test-pod1 -- sh
/ # ip address
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0@if290: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue qlen 1000
link/ether 1a:2d:93:7c:1a:1d brd ff:ff:ff:ff:ff:ff
inet 10.244.1.190/24 brd 10.244.1.255 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::182d:93ff:fe7c:1a1d/64 scope link
valid_lft forever preferred_lft forever
Next, from the same pod, we tried to ping nodes in the L2SM network IP range (10.0.0.x), but every ping fails with 100% packet loss.
/ # ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2): 56 data bytes
^C
--- 10.0.0.2 ping statistics ---
20 packets transmitted, 0 packets received, 100% packet loss
/ # ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2): 56 data bytes
^C
--- 10.0.0.2 ping statistics ---
5 packets transmitted, 0 packets received, 100% packet loss
/ # ping 10.0.0.4
PING 10.0.0.4 (10.0.0.4): 56 data bytes
^C
--- 10.0.0.4 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
/ #
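The `ip address` output above lists only `lo` and `eth0` (10.244.1.190/24) for test-pod1, so it may be worth checking whether a pod has any address in the 10.0.0.x range before pinging. The following is a minimal sketch, not a definitive check: the helper name `has_ipv4_prefix` is hypothetical, and the assumption that L2SM addresses start with `10.0.0.` is inferred only from the addresses pinged above.

```shell
#!/bin/sh
# Hypothetical helper: reads `ip address` output on stdin and succeeds
# if any IPv4 ("inet") line has an address starting with the given
# dotted prefix, e.g. "10.0.0." (assumed L2SM range, see lead-in).
has_ipv4_prefix() {
    # Escape the dots so they match literally in the regular expression.
    prefix_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    grep -Eq "^[[:space:]]*inet ${prefix_re}"
}

# Usage (assumes kubectl access and the pod name test-pod1 from above):
#   kubectl exec test-pod1 -- ip address | has_ipv4_prefix 10.0.0. \
#       && echo "address in 10.0.0.x present" \
#       || echo "no address in 10.0.0.x"
```

If the helper reports no matching address, the pod has no interface in that range and the pings would leave through its default route instead.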
Also, from inside an LPM pod, pinging the other Pi (10.244.1.180) works, but pings to the L2SM addresses (10.0.0.2 and 10.0.0.4) do not.
<<K9s-Shell>> Pod: he-codeco-netma/steroidpi-lpm-79fd884dc8-j8gf6 | Container: lpm-container
root@steroidpi-lpm-79fd884dc8-j8gf6:/usr/src/app# ping 10.244.1.180
PING 10.244.1.180 (10.244.1.180) 56(84) bytes of data.
64 bytes from 10.244.1.180: icmp_seq=1 ttl=62 time=0.432 ms
64 bytes from 10.244.1.180: icmp_seq=2 ttl=62 time=0.453 ms
64 bytes from 10.244.1.180: icmp_seq=3 ttl=62 time=0.382 ms
64 bytes from 10.244.1.180: icmp_seq=4 ttl=62 time=0.378 ms
^C
--- 10.244.1.180 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3073ms
rtt min/avg/max/mdev = 0.378/0.411/0.453/0.032 ms
root@steroidpi-lpm-79fd884dc8-j8gf6:/usr/src/app# ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
^C
--- 10.0.0.2 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2048ms
root@steroidpi-lpm-79fd884dc8-j8gf6:/usr/src/app# ping 10.0.0.4
PING 10.0.0.4 (10.0.0.4) 56(84) bytes of data.
^C
--- 10.0.0.4 ping statistics ---
3 packets transmitted, 0 received, 100% packet loss, time 2041ms
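To repeat these checks without reading each summary by hand, the packet-loss figure can be pulled out of the ping output. This is a sketch under assumptions: the helper name `packet_loss` is hypothetical, and it relies only on the `N% packet loss` summary line that both ping variants shown above print.

```shell
#!/bin/sh
# Hypothetical helper: reads ping output on stdin and prints the
# packet-loss percentage from the summary line (without the % sign).
# Prints nothing if no summary line is found.
packet_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Usage (assumes kubectl access and the pod name test-pod1; the target
# list is the set of addresses pinged in this report):
#   for ip in 10.244.2.241 10.0.0.2 10.0.0.4; do
#       loss=$(kubectl exec test-pod1 -- ping -c 4 -W 1 "$ip" | packet_loss)
#       echo "$ip loss=${loss:-unknown}%"
#   done
```

Using `-c` with a fixed count avoids having to interrupt each ping with Ctrl-C.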
Logs of the LPM pod:
time="2025-11-19T09:49:26Z" level=error msg="Could not measure Rtt against ip 10.0.0.5. Ping responds: exit status 1"
time="2025-11-19T09:49:26Z" level=info msg="Couldn't measure net_rtt_ms between node taart5 and node vk-c5f6b0742f2b. Trying again."
time="2025-11-19T09:49:26Z" level=info msg=" net_rtt_ms between node taart5 and node vk-c5f6b0742f2b is 0.000000."
time="2025-11-19T09:49:41Z" level=error msg="Could not measure Rtt against ip 10.0.0.3. Ping responds: exit status 1"
time="2025-11-19T09:49:41Z" level=info msg="Couldn't measure net_rtt_ms between node taart5 and node steroidpi. Trying again."
time="2025-11-19T09:49:58Z" level=error msg="Could not measure Throughput. exit status 1"
time="2025-11-19T09:49:58Z" level=info msg="Couldn't measure net_throughput_kbps between node taart5 and node codeco-almende. Trying again."
time="2025-11-19T09:49:58Z" level=info msg=" net_throughput_kbps between node taart5 and node codeco-almende is 0.000000."
time="2025-11-19T09:50:20Z" level=info msg="Measuring rtt"
time="2025-11-19T09:50:31Z" level=info msg="Measuring rtt"
time="2025-11-19T09:50:39Z" level=error msg="Could not measure Rtt against ip 10.0.0.2. Ping responds: exit status 1"
time="2025-11-19T09:50:39Z" level=info msg="Couldn't measure net_rtt_ms between node taart5 and node codeco-almende. Trying again."
time="2025-11-19T09:50:49Z" level=info msg="Measuring rtt"
time="2025-11-19T09:50:51Z" level=error msg="Could not measure Rtt against ip 10.0.0.3. Ping responds: exit status 1"
time="2025-11-19T09:50:51Z" level=info msg="Couldn't measure net_rtt_ms between node taart5 and node steroidpi. Trying again."
time="2025-11-19T09:51:08Z" level=error msg="Could not measure Rtt against ip 10.0.0.2. Ping responds: exit status 1"
time="2025-11-19T09:51:08Z" level=info msg="Couldn't measure net_rtt_ms between node taart5 and node codeco-almende. Trying again."
time="2025-11-19T09:51:12Z" level=error msg="Could not measure Throughput. exit status 1"
time="2025-11-19T09:51:12Z" level=info msg="Couldn't measure net_throughput_kbps between node taart5 and node vk-c865621e8b24. Trying again."
time="2025-11-19T09:51:13Z" level=info msg="Measuring rtt"
time="2025-11-19T09:51:25Z" level=info msg="Measuring rtt"
time="2025-11-19T09:51:32Z" level=error msg="Could not measure Rtt against ip 10.0.0.3. Ping responds: exit status 1"
time="2025-11-19T09:51:32Z" level=info msg="Couldn't measure net_rtt_ms between node taart5 and node steroidpi. Trying again."
time="2025-11-19T09:51:32Z" level=info msg=" net_rtt_ms between node taart5 and node steroidpi is 0.000000."
time="2025-11-19T09:51:44Z" level=error msg="Could not measure Rtt against ip 10.0.0.2. Ping responds: exit status 1"
time="2025-11-19T09:51:44Z" level=info msg="Couldn't measure net_rtt_ms between node taart5 and node codeco-almende. Trying again."
time="2025-11-19T09:51:44Z" level=info msg=" net_rtt_ms between node taart5 and node codeco-almende is 0.000000."
time="2025-11-19T09:51:55Z" level=info msg="Measuring throughput"
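To see at a glance which target addresses the LPM pod fails to reach, the RTT error lines can be counted per IP. This is a rough sketch, assuming the log message wording stays exactly as in the excerpt above; the helper name `failed_rtt_targets` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads LPM log lines on stdin and prints one line
# per target IP in the form "<ip> <number of failed RTT measurements>".
failed_rtt_targets() {
    # Match only the IPv4 address, not the sentence-ending dot after it.
    grep -oE 'Could not measure Rtt against ip ([0-9]{1,3}\.){3}[0-9]{1,3}' \
        | awk '{ print $NF }' \
        | sort \
        | uniq -c \
        | awk '{ print $2, $1 }'
}

# Usage (assumes kubectl access; namespace, pod and container names are
# the ones shown in the K9s header above):
#   kubectl logs -n he-codeco-netma steroidpi-lpm-79fd884dc8-j8gf6 \
#       -c lpm-container | failed_rtt_targets
```

On the excerpt above this would report 10.0.0.2 and 10.0.0.3 three times each and 10.0.0.5 once, which matches the manual pings: only addresses in the 10.0.0.x range fail.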