Vertex AI agent response using Gemini-1.5-flash truncated to 512 tokens

I am using the Vertex AI Agent Builder to create a playbook with agents and tools. I am getting good results, but the output is truncated to 512 tokens even when I set the output token limit to 1024 tokens. My input token limit is set to 8K and my prompt input is < 8K.

Please find attached a screenshot of my playbook settings on the left and the truncated output on the right.

  • Am I missing any settings?

  • I checked the pricing (https://ai.google.dev/pricing) and don’t see any limits specified even for the free version and I seem to be well within 8K for my input and output.

Any insights would be appreciated.


It sounds like you’re encountering a token truncation issue, which may stem from a few potential causes despite setting your output limit to 1024 tokens. Here are some factors you can investigate:

  1. System-Level Token Cap

  2. Tool-Specific Token Management

  3. Post-Processing Steps

  4. Playbook Agent Setting Overrides
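To illustrate points 1 and 4: if a tool inside the playbook is bound to its own model configuration, the smaller of the playbook-level limit and the tool's own cap is what actually takes effect. A minimal sketch of that behaviour in plain Python (the function and data are hypothetical, just mirroring the truncation described above):

```python
def effective_output_limit(playbook_limit: int, tool_limits: dict) -> dict:
    """For each tool, return the limit it actually enforces: the smaller of
    the playbook-level setting and the tool's own model cap (hypothetical
    logic illustrating the override behaviour discussed above)."""
    return {tool: min(playbook_limit, cap) for tool, cap in tool_limits.items()}

# The playbook requests 1024 output tokens, but one tool's model
# configuration caps output at 512 tokens.
limits = effective_output_limit(
    1024,
    {"search_tool": 8192, "summarize_tool": 512},
)
print(limits)  # {'search_tool': 1024, 'summarize_tool': 512}
```

So even with the playbook set to 1024, any response routed through `summarize_tool` would still be cut at 512 tokens.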


Thanks for your response @sahilnaircool. It was helpful, as it made me realize I had selected different gen AI models for my tools, which likely prevented a single output token limit from applying across them. It is now working.
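For anyone hitting the same issue: a quick sanity check that every tool in the playbook points at the same model with the same output limit can catch this early. A sketch under assumed structure (the real configuration lives in the Agent Builder console; the dict layout here is hypothetical):

```python
def find_mismatched_tools(tools: list) -> list:
    """Return names of tools whose (model, max_output_tokens) pair differs
    from the first tool's. A mismatch like this caused the truncation
    described in this thread (tool structure is hypothetical)."""
    if not tools:
        return []
    reference = (tools[0]["model"], tools[0]["max_output_tokens"])
    return [
        t["name"] for t in tools
        if (t["model"], t["max_output_tokens"]) != reference
    ]

tools = [
    {"name": "lookup", "model": "gemini-1.5-flash", "max_output_tokens": 1024},
    {"name": "answer", "model": "gemini-1.5-pro", "max_output_tokens": 512},
]
print(find_mismatched_tools(tools))  # ['answer']
```

Here `answer` is flagged because it uses a different model and a lower output limit than `lookup`, exactly the kind of inconsistency that produced the 512-token truncation.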


You’re welcome!