Google ADK + Next.js SSE Streaming stops working on Cloud Run

The Setup:
I have an Agentic application deployed on GCP Cloud Run:

  1. UI: Next.js (App Router) deployed on Cloud Run.

  2. Backend: Python service using Google ADK on Cloud Run.

  3. Network: Usually sits behind a Global External Load Balancer, but we have tested bypassing it.

Client (Browser) → External HTTPS Load Balancer → Cloud Run (Next.js UI) → Cloud Run (Python Backend)

The Problem:
My ADK Agent streaming response (SSE) is being buffered.

  • The agent takes ~10 seconds to stream the full response.

  • Locally: It works perfectly (text streams token-by-token).

  • On Cloud Run: The browser hangs for 10 seconds (loading state), then receives the entire response all at once.

Crucial Observations:

  1. Fails without Load Balancer: We tested hitting the Next.js Cloud Run URL directly (bypassing the External LB). The response is still buffered. This rules out the External LB and points to something closer to the container (Cloud Run Ingress or Next.js config).

  2. Headers:

    • Python (ADK) Output: Sends Transfer-Encoding: chunked.

    • Next.js Internal Log: Receives Transfer-Encoding: chunked from the ADK backend.

    • Browser (Final Response): Receives Content-Length: 38002 (INCORRECT).

    • Conclusion: Something is waiting for the stream to finish to calculate the content length.

  3. Environment Drift: This setup was working in a client environment behind a Load Balancer, then stopped streaming on its own, with no code changes. Redeploying the “known good” old image does not fix it, which suggests a drift in Cloud Run default configurations.
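
The header comparison in observation 2 can be repeated against each hop (backend URL, Next.js URL, LB URL) with a small probe. What follows is only a sketch, not part of our codebase: `hasFixedLength` and `measureChunks` are hypothetical helper names, and it assumes a runtime with the standard `fetch`/`Response` APIs (for example Node 18+). If a hop streams, chunks should arrive spread over the ~10 seconds; if it buffers, they arrive together at the end and a Content-Length is usually present.

```typescript
// Hypothetical probe helpers (names are illustrative, not from any library).

interface ChunkTiming {
  elapsedMs: number; // time since we started reading the body
  bytes: number;     // size of this chunk
}

// A streamed reply normally carries no Content-Length. If the header is
// present, some hop knew the full size, which means it waited for the end.
function hasFixedLength(headers: Headers): boolean {
  return headers.has("content-length");
}

// Read the body chunk by chunk and record when each chunk arrived.
async function measureChunks(res: Response): Promise<ChunkTiming[]> {
  const timings: ChunkTiming[] = [];
  if (!res.body) return timings;
  const reader = res.body.getReader();
  const start = Date.now();
  for (;;) {
    const result = await reader.read();
    if (result.done) break;
    timings.push({ elapsedMs: Date.now() - start, bytes: result.value.byteLength });
  }
  return timings;
}

// Example use (the URL is a placeholder for whichever hop is being tested):
//   const res = await fetch("https://<hop-under-test>/...");
//   console.log(hasFixedLength(res.headers), await measureChunks(res));
```

Running the same probe against the Python service directly and then against the Next.js service should show which of the two first reports a fixed length.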

What We Have Tried:

  • CPU Allocation: Set to “CPU is always allocated” on both services. (Backend logs confirm the ADK agent is active and printing logs during the 10s wait, so it’s not freezing).

  • Headers: Added X-Accel-Buffering: no, Cache-Control: no-cache, Content-Type: text/event-stream.

  • Code: Verified the exact same Docker image works locally.

My Questions:

  1. Since this started happening after a redeploy (even with old images), did Cloud Run or GCP Load Balancers change default behaviors?
  2. Is there a specific configuration for Google ADK (Python) + Next.js that prevents the middleware from buffering the stream?

Hey,

Hope you’re keeping well.

Cloud Run itself hasn’t changed default behavior to buffer SSE, but streaming can break if any hop in your chain sets a Content-Length or fully reads the body before passing it on. In Next.js (especially with the App Router), some middleware or route handlers may implicitly buffer the response, which prevents chunked transfer from propagating. To preserve SSE, you need to ensure the handler uses new Response(stream, { headers }) without reading the stream, and that Content-Length is never set. Also check that Cloud Run is using HTTP/1.
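
To make the pass-through idea concrete, here is a minimal sketch of what I mean, not a drop-in fix. `passThroughSse` is a hypothetical helper name, and the route path, `BACKEND_URL` variable, and `/run_sse` path in the comment are assumptions about your setup. The point is that the upstream body is handed to `new Response(...)` as a stream and never read with `.text()` or `.json()`, and that no Content-Length is copied over.

```typescript
// Hypothetical helper: wrap an upstream SSE response without consuming its body.
function passThroughSse(upstream: Response): Response {
  if (!upstream.body) {
    return new Response("upstream returned no body", { status: 502 });
  }
  // Build fresh headers instead of copying upstream ones, so a stray
  // Content-Length from any earlier hop is never forwarded.
  const headers = new Headers({
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache, no-transform",
    "X-Accel-Buffering": "no",
  });
  // Hand the ReadableStream straight through. Calling upstream.text() or
  // upstream.json() here would buffer the entire reply first.
  return new Response(upstream.body, { status: upstream.status, headers });
}

// Sketch of use inside an App Router route handler
// (app/api/chat/route.ts, BACKEND_URL and /run_sse are assumed names):
//
//   export const dynamic = "force-dynamic";
//
//   export async function POST(req: Request) {
//     const upstream = await fetch(process.env.BACKEND_URL + "/run_sse", {
//       method: "POST",
//       headers: { "Content-Type": "application/json" },
//       body: await req.text(),
//     });
//     return passThroughSse(upstream);
//   }
```

If your handler already looks like this, then the buffering is more likely happening outside the handler code.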

Thanks and regards,
Taz


Thanks for the insights!

I completely agree that Content-Length indicates something is buffering the stream, but I am fairly certain the issue isn’t in the Next.js handler code itself, and here is why:

This issue appeared immediately after a new deployment to our existing Cloud Run instance. To debug, we rolled back and redeployed the exact same Docker image (SHA) that was working perfectly just a few days ago. Surprisingly, streaming is now broken on that old image as well.

Since the code/container is identical to the version that was previously streaming successfully, I suspect the act of redeploying might have reset a service-level configuration rather than it being a logic issue in the route handler.

Has anyone experienced Cloud Run changing ingress buffering rules upon a fresh revision deployment? That’s the only variable that changed between ‘working’ and ‘not working’ with the same image.
