CX Agent Studio - Calling an authenticated Cloud Run Function from an Agent

What is the recommended way to call an authenticated Cloud Run Function from a CES Agent Studio agent?

  • Python tool calls do not pass an auth header with the CES Requests functions, and there does not appear to be one available in the Python environment. Nor does there seem to be a secret manager for the Tools, meaning auth would need to be hand-rolled inside the function.
  • OpenAPI tool calls leverage an LLM, and the token limit on that is tiny: 1000 tokens per minute. A single call to this API maxes out that limit. You can technically call the OpenAPI tool from a Python function, but you can’t bypass the LLM, which means unnecessary token costs for no gain (and then you hit the limit).
  • There is no Cloud Function Integration Connector available.

How do I call an authenticated Cloud Run Function from CES Agent Studio?

Background

I’m attempting to port an existing agent from CX over to Agent Studio. The old agent leverages an OpenAPI tool for various retrievals. This was working in CX, but when I ported it into CES Agent Studio the API token limit immediately maxed out.

I then attempted to set up a Python call to the Cloud Function, but unlike the OpenAPI tool, there is no auth embedded when using the ces_requests helper function.

I reviewed the online documentation, and I was unable to locate a way to call a Cloud Function without hand-rolling auth and embedding a secret in the code. (There does not seem to be a secret manager attached to the tools either, so there doesn’t appear to be a safe way to manage credentials for this use case.)

I could call the OpenAPI tool from the Python tool. This gives me the authenticated connection to the cloud function, however this then hits the LLM token limits, and I don’t need the LLM processing this step.

Ideally, I would have my Python code call the cloud function directly with no LLM in the loop. This allows me to post process the API result into a simplified response for the Agent to consume, as well as cache large object returns for later consumption without loading them into the LLM context.

Short of rewriting the API or spinning up an MCP server, is there any CES Agent Studio native solution for calling a Cloud Function as part of a Python tool call?

Hi,

I’m not a total specialist in the deep internals of the CES Agent Studio yet, but I’ve been doing some heavy digging into this specific transition from Dialogflow CX. While we wait for one of the Google engineers to give a definitive “official” word, I’ve synthesized a few architectural strategies that should help you bypass those 1000 TPM (Tokens Per Minute) limits and the authentication hurdles.

It sounds like you’re caught between the “black box” of OpenAPI tools and the “empty box” of the Python sandbox. Here is how you can bridge that gap.

1. The “Metadata Server Bridge” for Authentication

Since the Python tool sandbox lacks the google-auth libraries and Application Default Credentials (ADC) aren’t automatically injected into ces_requests, you have to go “low-level.”

Even in a restricted sandbox, you can usually reach the Google Cloud Metadata Server. This is the most native way to get an OIDC token without hardcoding anything.

The Workflow:

  1. Grant Access: Ensure the CES Service Agent (usually service-{PROJECT_NUMBER}@gcp-sa-ces.iam.gserviceaccount.com) has the Cloud Run Invoker role for your target function.

  2. Fetch the Token: Use ces_requests to call the internal metadata endpoint.

  3. Inject the Header: Use that token in your actual call to the Cloud Run Function.

Example Logic:

  • Endpoint: http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity?audience=[YOUR_FUNCTION_URL]

  • Header required: Metadata-Flavor: Google
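The two requests in the workflow above can be sketched as plain URL and header pairs. This is a minimal sketch, not a confirmed recipe: the exact call signature of ces_requests is not shown in this thread, so the code only builds what would be passed to it, and it assumes the metadata server is actually reachable from the Python tool sandbox, which is not verified here. The function names and the example URL are hypothetical.

```python
from urllib.parse import urlencode

# Standard GCE/Cloud Run metadata endpoint for OIDC identity tokens.
METADATA_IDENTITY_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/identity"
)


def build_identity_token_request(audience: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for fetching an ID token for `audience`.

    `audience` is the URL of the target Cloud Run Function.
    """
    query = urlencode({"audience": audience})
    headers = {"Metadata-Flavor": "Google"}  # required by the metadata server
    return f"{METADATA_IDENTITY_URL}?{query}", headers


def build_function_request(function_url: str, id_token: str) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for the authenticated call to the function."""
    headers = {"Authorization": f"Bearer {id_token.strip()}"}
    return function_url, headers


# Hypothetical usage; the HTTP calls themselves would go through ces_requests
# (signature assumed, not shown in this thread):
#   url, headers = build_identity_token_request("https://example-fn.a.run.app")
#   token = <GET url with headers, read the response body as text>
#   url, headers = build_function_request("https://example-fn.a.run.app", token)
```

The identity endpoint returns the token as the raw response body, so it can be placed directly into the Authorization header of the second call.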

2. Managing Secrets via REST API

Since there is no “Secrets” tab in the UI, you can treat Secret Manager as just another authenticated API.

  • Grant the CES Service Agent the Secret Manager Secret Accessor role.

  • Use the ces_requests utility to hit the Secret Manager REST API:

    https://secretmanager.googleapis.com/v1/projects/{PROJECT}/secrets/{NAME}/versions/latest:access

  • You’ll need an OAuth2 access token (also retrievable via the Metadata Server) to authorize this call.
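As a rough illustration of the Secret Manager step, the sketch below builds the access request and decodes the response. The Secret Manager REST API returns the secret value base64-encoded under payload.data; how the access token is obtained and how ces_requests sends the request are assumptions left outside the code. The function names are hypothetical.

```python
import base64
import json


def build_secret_access_request(
    project: str, name: str, access_token: str, version: str = "latest"
) -> tuple[str, dict[str, str]]:
    """Return (url, headers) for the Secret Manager `versions.access` call."""
    url = (
        "https://secretmanager.googleapis.com/v1/"
        f"projects/{project}/secrets/{name}/versions/{version}:access"
    )
    headers = {"Authorization": f"Bearer {access_token}"}
    return url, headers


def decode_secret_payload(response_body: str) -> str:
    """Extract and base64-decode the secret value from the JSON response."""
    encoded = json.loads(response_body)["payload"]["data"]
    return base64.b64decode(encoded).decode("utf-8")
```

Pinning a specific version number instead of "latest" makes the tool's behaviour reproducible when the secret is rotated.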

3. Beating the 1000 TPM Limit: “Code-as-Filter”

The reason your OpenAPI tools are crashing is that the LLM is acting as a JSON parser, which is incredibly “token-expensive.” To fix this, your Python tool needs to perform Context Engineering.

Instead of returning the raw JSON to the agent, your Python code should prune, summarize, or filter the data first.

Data Type        | Raw Size (Tokens) | Pruned Size (Tokens) | Token Savings
Product Catalog  | ~2,500            | ~150                 | 94%
User History     | ~1,200            | ~100                 | 91%
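A minimal sketch of what such pruning could look like inside a Python tool, assuming the API returns a list of JSON objects. The field names and the item cap are illustrative only and do not come from any real API in this thread.

```python
def prune_records(records: list[dict], keep_fields: list[str], max_items: int = 5) -> list[dict]:
    """Keep only the listed fields from the first `max_items` records.

    Fields missing from a record are skipped rather than raising an error.
    """
    return [
        {field: record[field] for field in keep_fields if field in record}
        for record in records[:max_items]
    ]


# Hypothetical raw API response with fields the agent does not need.
raw = [
    {"id": 1, "name": "Widget", "price": 9.99, "internal_sku": "X-1", "audit": {"rev": 7}},
    {"id": 2, "name": "Gadget", "price": 19.5, "internal_sku": "X-2", "audit": {"rev": 3}},
]
slim = prune_records(raw, keep_fields=["id", "name", "price"])
```

Only the slim result would be returned from the tool, so the fields dropped here never enter the LLM context.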

The “Pointer” Pattern

If you have a massive dataset (like a 50MB PDF or a huge log), don’t return it to the LLM.

1. Save the data to Firestore or Cloud Storage.

2. Return a “Pointer” and a 2-sentence summary to the Agent (e.g., “Found the error log; it’s 500 lines long. I’ve stored it under ID_123. Would you like a summary of the specific error?”).
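The pointer pattern can be sketched as follows. An in-memory dict stands in for Firestore or Cloud Storage here purely so the example is self-contained; a real tool would need a persistent store, and all names are hypothetical.

```python
import hashlib

# Stand-in for Firestore / Cloud Storage (assumption: replaced by a real store in practice).
_STORE: dict[str, str] = {}


def stash_and_point(payload: str, preview_chars: int = 80) -> dict:
    """Store the full payload and return a small pointer record for the agent."""
    pointer = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    _STORE[pointer] = payload
    return {
        "pointer": pointer,
        "line_count": len(payload.splitlines()),
        "preview": payload[:preview_chars],
    }


def fetch(pointer: str) -> str:
    """Retrieve the full payload later, outside the LLM context."""
    return _STORE[pointer]
```

Deriving the pointer from a content hash means storing the same payload twice yields the same ID, which keeps repeated tool calls idempotent.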

4. The Long-Term Play: Model Context Protocol (MCP)

If your logic is getting too complex for the Python sandbox (e.g., you need specific libraries like pandas or numpy that aren’t available), the recommended enterprise path is MCP.

By hosting an MCP server on Cloud Run:

  • You get a full, unrestricted Python environment.

  • Agent Studio handles the OIDC authentication natively (using the “Service Agent ID Token” config).

  • You decouple the “Reasoning” (the Agent) from the “Execution” (your Tool).

Final Strategy Summary

To optimize for cost and performance, keep this “Token Velocity” formula in mind:

[image: “Token Velocity” formula, not preserved in the text]

By using the Python Tool as a high-pass filter (cleaning the data before the LLM sees it), you drastically reduce the “Tool Overhead” and stay well under that 1000 TPM ceiling.

I hope this helps get your migration back on track! Let’s see if anyone else has a more “out-of-the-box” solution, as I understand the Metadata Server + Pruning approach is usually the most robust way to handle this today.

Quick Checklist for you:

  • [ ] Is the CES Service Agent added as a Run Invoker?

  • [ ] Are you using ces_requests for the Metadata handshake?

  • [ ] Are you stripping out all the “JSON noise” before the return statement?

Good luck with the porting!

@oamne - it looks like you’ve used Generative AI to provide a non-relevant answer.

  1. The Metadata server is not accessible from the Python tool call. Likely sandboxed. No relevant debugging information is provided by CES Requests to resolve this.
  2. Irrelevant given 1.
  3. Hasn’t understood my question. The LLM is baked into the OpenAPI tool calls as described in my post.
  4. MCP - I already noted this as a high effort, not ideal workaround.

You probably thought a generative response might assist. However, this is a very specific technical question requiring practical experience in the code and/or implementation. Generative is out of its depth at this point. All other investigative paths were exhausted prior to raising my question. Please don’t respond with further Generative AI responses.

@aniketagrawal - thoughts? How would you go about calling an authenticated Cloud Function from a Python tool within CES Agent Studio?

(Nutshell Notes: ces_requests is unauthenticated, the OpenAPI tools have a very low token restriction [1000 per minute] which a single call to this API maxes out, and there is no MCP server available. I’m looking for a straightforward, deterministic way to hit a basic, authenticated Cloud Run endpoint.)