Vertex AI Fails After First Request in Cloud Run – “Could Not Resolve project_id” Error

Hi everyone,

I’m working on a Flask application deployed on Google Cloud Run that uses Vertex AI through the ChatVertexAI class from langchain_google_vertexai. The application is containerized and works perfectly when I test it in Google Cloud Shell or run it locally with my own credentials.

However, when deployed in Cloud Run, the app only works for the first request after deployment. Any subsequent requests result in errors like:

ValueError: Could not resolve project_id

Here’s a summary of my setup:

:white_check_mark: I’ve set the correct project and location parameters in ChatVertexAI.

:white_check_mark: I’m using the default Cloud Run service account, which has the Vertex AI User role.

:white_check_mark: My generate() function creates a fresh LLM and agent for each request.

:repeat_button: The first request always succeeds, but any follow-up requests fail (unless I wait several minutes, then it works once again).

Logs point to a failure in litellm’s token fetching (“Could not resolve project_id”), even though the project is passed explicitly.

It seems like either:

Token refresh or credential caching is broken in Cloud Run,

The default service account doesn’t work well with Vertex AI in this setup,

Or concurrency/memory limits are causing unexpected failures.

I’ve already tried setting GOOGLE_CLOUD_PROJECT, manually refreshing credentials, and verifying the project ID. None of these fixed the issue.

Has anyone faced this issue with LangChain + Vertex AI + Cloud Run?

What’s the best way to make this stable in production?

Hi @VON1,

Welcome to Google Cloud Community!

Here are some possible causes and suggestions that may help resolve the issue:

  • The error may point to a problem with how credentials are refreshed over the lifetime of a Cloud Run instance. Try refreshing the credentials before each ChatVertexAI request.
  • As you mentioned, you’re using the default Cloud Run service account. Consider switching to a user-managed service account to rule out the default account as the cause.
  • Check for compatibility and version issues between the libraries you’re using, as these can sometimes cause problems.
  • Enable more detailed logging and review Cloud Logging to see how litellm handles credentials.
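
For anyone trying the credential-refresh suggestion, here is a minimal sketch of what it could look like. It assumes google-auth and requests are installed; `needs_refresh` and `fresh_credentials` are hypothetical helper names, not part of any library. Note that it only affects calls made by ChatVertexAI itself. If the request is actually routed through litellm, which does its own credential lookup, passing refreshed credentials this way may not change anything.

```python
import datetime as dt
from typing import Optional


def needs_refresh(
    expiry: Optional[dt.datetime],
    now: dt.datetime,
    skew: dt.timedelta = dt.timedelta(minutes=5),
) -> bool:
    """Return True when the token has no known expiry or expires within `skew`."""
    return expiry is None or expiry - now <= skew


def fresh_credentials():
    """Load Application Default Credentials and refresh them if needed."""
    # Imported lazily so needs_refresh() works without google-auth installed.
    import google.auth
    from google.auth.transport.requests import Request

    creds, adc_project = google.auth.default(
        scopes=["https://www.googleapis.com/auth/cloud-platform"]
    )
    # google-auth stores `expiry` as a naive UTC datetime.
    now = dt.datetime.now(dt.timezone.utc).replace(tzinfo=None)
    if creds.token is None or needs_refresh(creds.expiry, now):
        creds.refresh(Request())
    return creds, adc_project


# Hypothetical usage inside a request handler:
#   creds, _ = fresh_credentials()
#   llm = ChatVertexAI(model="gemini-2.0-flash-001", project=project_id,
#                      location=location, credentials=creds)
```

The five-minute skew is an arbitrary choice for illustration; any margin comfortably longer than a single request should do.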

In addition, you might find it helpful to follow this similar GitHub discussion, which is still open.

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

Hi @marckevin and @VON1,
I’m facing a similar issue.

I’m running CrewAI crews inside the nodes of a LangGraph graph, in a Python FastAPI backend.
The graph is invoked whenever the server’s /invoke endpoint is hit.

from langchain_google_vertexai import ChatVertexAI
I’m using the ChatVertexAI class and either passing the object to the crews or invoking it directly.

llm_gemini = ChatVertexAI(
	model="gemini-2.0-flash-001",
	temperature=0.5,
	top_p=0.2,
	project=project_id,
	location=location,
)

Example of a LangGraph node:

def generate_bio(state: StateIn) -> StateIn:
    """
    Crafts a professional and engaging author biography for each book.

    Args:
        state (StateIn): The current state containing books.

    Returns:
        StateIn: The updated state with author biographies for each book.
    """
    bio_agent = Agent(
        role="Author Bio Architect",
        goal="Craft a compelling, trust-building author bio that resonates deeply with the book's philosophical and professional themes.",
        backstory=(
            "You are an accomplished narrative strategist and editorial consultant. "
            "You specialize in transforming professional backgrounds into relatable and inspiring biographies. "
            "Your bios bridge the gap between expertise and human connection, making the author feel both credible and approachable."
        ),
        llm=llm_gemini,  # the ChatVertexAI instance defined above
    )

    def create_bio_task(book) -> Task:
        return Task(
            description=(
                f"Write an engaging and professional author biography for the book '{book.book_title}'. "
                f"Write the biography in {state['output_lang']} language."
            ),
            expected_output="A biography string in the target language.",
            agent=bio_agent,
        )

    for book in state["books"]:
        bio_task = create_bio_task(book)
        crew = Crew(
            agents=[bio_agent],
            tasks=[bio_task],
        )
        bio_results = crew.kickoff()
        book.bio = str(bio_results)
    return {}

After 1 or 2 LLM calls, it throws a project-not-found error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/litellm/main.py", line 2465, in completion
    model_response = vertex_chat_completion.completion(  # type: ignore
  File "/usr/local/lib/python3.10/site-packages/litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py", line 1410, in completion
    _auth_header, vertex_project = self._ensure_access_token(
  File "/usr/local/lib/python3.10/site-packages/litellm/llms/vertex_ai/vertex_llm_base.py", line 135, in _ensure_access_token
    return self.get_access_token(
  File "/usr/local/lib/python3.10/site-packages/litellm/llms/vertex_ai/vertex_llm_base.py", line 335, in get_access_token
    raise ValueError("Could not resolve project_id")
ValueError: Could not resolve project_id

I am deploying the application on Cloud Run.
I have not set GOOGLE_APPLICATION_CREDENTIALS in the environment variables, since Cloud Run supplies Application Default Credentials (ADC) from the runtime environment itself. Instead I use the default google.auth lookup:
creds, project_id = google.auth.default(scopes=scope)
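
One thing worth checking here: google.auth.default() can return None as the project ID when it cannot determine one. A small fallback chain makes that case explicit instead of letting None travel further down the stack. This is only a sketch; `resolve_project_id` is a hypothetical helper, and the lookup order (explicit value, then GOOGLE_CLOUD_PROJECT, then the ADC result) is an assumption, not something any of the libraries above prescribe.

```python
import os
from typing import Mapping, Optional


def resolve_project_id(
    explicit: Optional[str] = None,
    adc_project: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the first non-empty project ID, or raise if none is available."""
    candidates = (
        explicit,                         # value passed by the caller
        env.get("GOOGLE_CLOUD_PROJECT"),  # environment variable on the service
        adc_project,                      # second item from google.auth.default()
    )
    for candidate in candidates:
        if candidate and candidate.strip():
            return candidate.strip()
    raise ValueError(
        "No project ID found in the argument, GOOGLE_CLOUD_PROJECT, or ADC"
    )


# Hypothetical usage:
#   creds, adc_project = google.auth.default(scopes=scope)
#   project_id = resolve_project_id(adc_project=adc_project)
```

Failing at startup with a clear message is easier to debug than a "Could not resolve project_id" raised deep inside a later call.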

Thanks for confirming — I ran into the same behavior.

I also passed the project and location explicitly in ChatVertexAI and assigned the Vertex AI User role to the Cloud Run service account, but it still failed after the first request.

I first tried upgrading litellm (which CrewAI uses internally for its LLM calls, as the traceback above shows) to >=1.70.2, but the issue persisted, and the upgrade introduced compatibility issues of its own.
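
When chasing version conflicts like this, it helps to log exactly what is installed inside the container rather than what the requirements file asks for. A minimal sketch using only the standard library; `installed_versions` is a hypothetical helper, and the package list is just an example of what is relevant in this thread.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    relevant = ["litellm", "crewai", "langchain-google-vertexai", "google-auth"]
    for name, version in installed_versions(relevant).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this once at container startup puts the versions into Cloud Logging next to the failing requests.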

What finally worked was building a custom LLM integration in CrewAI, following this guide:
:link: Custom LLM Implementation - CrewAI

This bug is being tracked here in the LiteLLM repo:
:link: [Bug]: GCP Service Account Credentials not refreshing appropriately when set via env var · Issue #9863 · BerriAI/litellm · GitHub

I used the official google-genai Python SDK to make the Vertex AI calls directly.

Since switching to this approach, everything has been stable in Cloud Run.
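
For reference, a rough sketch of what a direct call through google-genai can look like. This is not the exact code from my integration: `messages_to_prompt` and `generate_text` are hypothetical names, the role-prefixed flattening is a simplification, and it assumes the google-genai package is installed. Wiring this into CrewAI is described in the custom LLM guide linked above and is not shown here.

```python
from typing import Dict, List, Union

Message = Dict[str, str]


def messages_to_prompt(messages: Union[str, List[Message]]) -> str:
    """Flatten a plain string or a list of role/content dicts into one prompt."""
    if isinstance(messages, str):
        return messages
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)


def generate_text(
    messages: Union[str, List[Message]],
    project: str,
    location: str,
    model: str = "gemini-2.0-flash-001",
) -> str:
    """Send the prompt to Vertex AI via google-genai and return the text reply."""
    # Imported lazily so messages_to_prompt() works without the SDK installed.
    from google import genai

    # vertexai=True makes the client authenticate with Application Default
    # Credentials against the given project and location.
    client = genai.Client(vertexai=True, project=project, location=location)
    response = client.models.generate_content(
        model=model,
        contents=messages_to_prompt(messages),
    )
    return response.text or ""
```

With this approach the request goes straight from the SDK to Vertex AI, so the litellm credential path from the traceback is not involved at all.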