In our GKE environment, we run pods as KEDA-scaled jobs. Since September 18, 2025, we've noticed that some pods are unable to make any outgoing network requests. We can still exec into the affected pods and run commands, though.
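All of the output below was captured from inside one of the affected pods, via something like this (pod and namespace names are placeholders):

$ kubectl exec -it POD_NAME -n NAMESPACE -- sh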
GKE Version (Autopilot Cluster): 1.33.4-gke.1134000
This issue doesn't affect all pods: so far we've only observed it in one specific namespace, and we haven't been able to reproduce it consistently. It seems to hit roughly one pod per day within that namespace.
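Since we can't reproduce it on demand, we find the affected pod by sweeping the namespace with a quick probe along these lines (NAMESPACE is a placeholder; this assumes the image ships BusyBox ping, as ours does):

for pod in $(kubectl get pods -n NAMESPACE --field-selector=status.phase=Running -o name); do
  # ping once from inside each pod; list the ones with no egress
  kubectl exec -n NAMESPACE "${pod#pod/}" -- ping -c 1 -W 2 8.8.8.8 >/dev/null 2>&1 \
    || echo "no egress: ${pod#pod/}"
done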
Initially, we suspected a DNS issue, but even pings to plain IP addresses, such as 8.8.8.8 or the IP of another pod, fail, so it isn't just name resolution.
Here’s what we’ve observed:
Ping google.com:
/ # ping -c 4 google.com
ping: bad address 'google.com'
Ping 8.8.8.8:
/ # ping -c 4 8.8.8.8
PING 8.8.8.8 (8.8.8.8): 56 data bytes
--- 8.8.8.8 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
Routing Table:
/ # route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.56.3.1 0.0.0.0 UG 0 0 0 eth0
10.56.3.0 10.56.3.1 255.255.255.192 UG 0 0 0 eth0
10.56.3.1 0.0.0.0 255.255.255.255 UH 0 0 0 eth0
Ping Gateway (10.56.3.1):
/ # ping -c 4 10.56.3.1
PING 10.56.3.1 (10.56.3.1): 56 data bytes
--- 10.56.3.1 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
DNS Configuration:
/ # cat /etc/resolv.conf
search NAMESPACE.svc.cluster.local svc.cluster.local cluster.local c.CLUSTER_NAME.internal google.internal
nameserver NAMESPACE_SERVER_ADDRESS
options ndots:5
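A direct query against the cluster DNS from inside the pod can separate name resolution from general connectivity (this uses BusyBox nslookup; the nameserver is the redacted address from resolv.conf above):

/ # nslookup kubernetes.default.svc.cluster.local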
Network Configuration:
/ # arp -a
? (10.56.3.1) at <incomplete> on eth0
/ # ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
2: eth0@if29: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 8896 qdisc noqueue state UP qlen 1000
link/ether b6:c2:de:40:00:09 brd ff:ff:ff:ff:ff:ff
inet 10.56.3.24/26 brd 10.56.3.63 scope global eth0
valid_lft forever preferred_lft forever
/ # ifconfig
eth0 Link encap:Ethernet HWaddr B6:C2:DE:40:00:09
inet addr:10.56.3.24 Bcast:10.56.3.63 Mask:255.255.255.192
UP BROADCAST RUNNING MULTICAST MTU:8896 Metric:1
RX packets:5 errors:0 dropped:0 overruns:0 frame:0
TX packets:10917 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:446 (446.0 B) TX bytes:458778 (448.0 KiB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:7376 errors:0 dropped:0 overruns:0 frame:0
TX packets:7376 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:967978 (945.2 KiB) TX bytes:967978 (945.2 KiB)
This points to a network connectivity or routing problem specific to this pod: the ARP entry for the gateway stays <incomplete>, and the interface counters show 10,917 packets transmitted but only 5 received, so traffic leaves the pod and almost nothing comes back.
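When it happens again, our next step is to compare the pod's view with the node-side networking components. A sketch of where we'd start (AFFECTED_POD and NODE_NAME are placeholders; as far as we understand, Autopilot clusters use Dataplane V2, so the anetd DaemonSet in kube-system handles pod networking):

$ kubectl get pod AFFECTED_POD -n NAMESPACE -o wide          # note which node it landed on
$ kubectl get pods -n kube-system -o wide --field-selector spec.nodeName=NODE_NAME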
We also noticed that this GKE version was released on September 18, 2025, and we've been hitting the issue since September 19. We've been running these jobs for the past six months and never saw anything like this before. (See the GKE release notes for the Regular channel.)