ESPv2 random 503 error (upstream_reset_before_response_started)

Dear all,

I have configured a pod with ESPv2 as a sidecar to manage requests with Cloud Endpoints. The configuration generally works, but occasionally I receive a 503 error (upstream_reset_before_response_started{connection_failure,immediate_connect_error:_Cannot_assign_requested_address}). My upstream backend service is still running and functioning correctly when this happens. I’m unsure how to address this issue. Do you have any suggestions?

Here is the ESPv2 sidecar configuration:

- name: esp
  imagePullPolicy: Always
  image: gcr.io/endpoints-release/endpoints-runtime:2
  args: [
    "--listener_port=9000",
    "--backend=127.0.0.1:8080",
    "--service=api-prod.endpoints.MY-PROJECT.cloud.goog",
    "--rollout_strategy=managed",
    "--tracing_sample_rate=1.0",
    "--ssl_server_cert_path", "/etc/esp/ssl",
  ]
  volumeMounts:
    - mountPath: /etc/esp/ssl
      name: esp-ssl
      readOnly: true
  resources: {}
  terminationMessagePath: /dev/termination-log
  terminationMessagePolicy: File
  ports:
    - containerPort: 9000
      protocol: TCP

Thank you

Hi

I would recommend following these steps to troubleshoot and directly address the issue:

(1) Check Pod Network

Ensure the Kubernetes pod can communicate with the upstream service. Review Kubernetes network policies and firewall rules.

kubectl describe networkpolicy <policy-name>

(2) Test Connectivity

From within the ESPv2 container, verify connectivity to the upstream service.

kubectl exec -it <pod-name> -c esp -- curl -I http://127.0.0.1:8080
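
A single curl request rarely reproduces an intermittent failure. As a rough sketch, assuming Python is available somewhere that shares the pod's network namespace (the ESPv2 image itself may not ship it), the loop below opens many short-lived TCP connections to the backend and tallies the outcomes by errno name. The function name probe_backend and the default attempt count are illustrative choices, not part of ESPv2 or kubectl.

```python
import errno
import socket
from collections import Counter


def probe_backend(host, port, attempts=200, timeout=1.0):
    """Open `attempts` short-lived TCP connections and tally the outcomes."""
    outcomes = Counter()
    for _ in range(attempts):
        try:
            # The connection is closed again as soon as the with-block exits.
            with socket.create_connection((host, port), timeout=timeout):
                outcomes["ok"] += 1
        except OSError as exc:
            # EADDRNOTAVAIL is the errno behind "Cannot assign requested address".
            # Timeouts carry no errno, so fall back to the exception class name.
            name = errno.errorcode.get(exc.errno, type(exc).__name__)
            outcomes[name] += 1
    return outcomes
```

Calling print(dict(probe_backend("127.0.0.1", 8080))) targets the address from the --backend flag above. If EADDRNOTAVAIL shows up in the tally, that would point at local port exhaustion rather than at the backend itself.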

(3) Inspect Pod Resources

Check for any resource limits on the pod that might impact network connections.

kubectl describe pod <pod-name>

(4) Check System Parameters

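The error text "Cannot assign requested address" usually means the kernel had no free local (ephemeral) port for a new connection from ESPv2 to the backend, so widening the local port range may help. Note that this sysctl is network-namespaced, so for a pod it normally has to be set in the pod spec (securityContext.sysctls, where net.ipv4.ip_local_port_range is one of the sysctls Kubernetes treats as safe) rather than on the node. Run directly, the command looks like this: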

sysctl -w net.ipv4.ip_local_port_range="1024 65535"
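
To judge whether the ephemeral port range is really the bottleneck, it can help to compare the size of the range with the number of sockets lingering in TIME_WAIT. The following is a minimal sketch, assuming a Linux /proc filesystem is readable from inside the pod's network namespace; the helper names are illustrative, and only IPv4 sockets (/proc/net/tcp) are counted.

```python
from pathlib import Path


def parse_port_range(text):
    """Return (low, high) from the contents of ip_local_port_range."""
    low, high = (int(part) for part in text.split())
    return low, high


def count_time_wait(proc_net_tcp_text):
    """Count sockets in TIME_WAIT (state code 06) in /proc/net/tcp content."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        # Columns: sl, local_address, rem_address, st, ...
        if len(fields) > 3 and fields[3] == "06":
            count += 1
    return count


def report():
    """Print the ephemeral port range size next to the TIME_WAIT count."""
    low, high = parse_port_range(
        Path("/proc/sys/net/ipv4/ip_local_port_range").read_text()
    )
    time_wait = count_time_wait(Path("/proc/net/tcp").read_text())
    print(f"ephemeral ports: {high - low + 1} ({low}-{high}), TIME_WAIT: {time_wait}")
```

A TIME_WAIT count that approaches the size of the range during the 503 bursts would be consistent with port exhaustion. A low count suggests looking elsewhere.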

(5) Increase the logging level

args: [
  "--listener_port=9000",
  "--enable_debug", # Add this line; ESPv2's flag for debug-level logs (verify against the startup options for your version)
  # other args...
]

(6) Ensure the ESPv2 image is up-to-date

The sidecar above already uses the :2 tag with imagePullPolicy: Always, so restarting the pods should pull the newest 2.x image:

kubectl rollout restart deployment/<deployment-name>

(7) Configure Application-Level Retries

Implement retry logic in the client application to handle intermittent 503 errors gracefully.

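import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry  # import from urllib3 directly, not requests.packages

# Retry up to 3 times on HTTP 503, and only for idempotent methods.
# allowed_methods replaces method_whitelist, which was removed in urllib3 2.0
# (requires urllib3 >= 1.26).
retry_strategy = Retry(
    total=3,
    status_forcelist=[503],
    allowed_methods=["HEAD", "GET", "OPTIONS"],
)

# Attach the retry policy to both HTTP and HTTPS requests made through this session.
adapter = HTTPAdapter(max_retries=retry_strategy)
http = requests.Session()
http.mount("http://", adapter)
http.mount("https://", adapter)

response = http.get("http://127.0.0.1:9000")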

I hope that helps

Regards

Mahmoud