Vertex AI url file truncation issue - gemini-2.5pro

Hello everyone,

I am writing to inquire about a potential issue with audio file truncation when using a CDN URL for audio transcription with Vertex AI.

When attempting audio transcription in Vertex AI, I used a CDN URL from a Cloudflare R2 storage bucket as the audio file source. Based on the data returned by the API, I noticed that when using the URL method, the AUDIO token count in promptTokensDetails was 54,075, specifically:

promptTokensDetails: [
  { modality: 'TEXT', tokenCount: 7 },
  { modality: 'AUDIO', tokenCount: 54075 }
]
AUDIO token count: 54075

However, when I passed the exact same file using Base64 encoding, the AUDIO token count was reported as 105,150:

promptTokensDetails: [
  { modality: 'TEXT', tokenCount: 7 },
  { modality: 'AUDIO', tokenCount: 105150 }
]
AUDIO token count: 105150

Crucially, the Base64 method yielded the complete transcription result, while the URL method only transcribed the initial portion of the audio file.

This suggests that Vertex AI might not be reading the entire file when accessed via the CDN URL.

Are there any known limitations or configuration requirements for using external CDN URLs (especially Cloudflare R2) for large file transcription? Alternatively, could this issue be related to a difference in how Base64 encoding versus URL access handles large audio files?

I appreciate any guidance or assistance you can provide to help us successfully use the URL method for complete audio transcription.

Thank you,
Jiang Feng

Hey,

Hope you’re keeping well.

It’s likely that the URL-based method is hitting a size or streaming limitation when Vertex AI fetches external audio over HTTP. If the CDN (Cloudflare R2) is serving the file with range requests disabled, partial content responses, or aggressive timeouts, the model may only process the initial segment. For large audio files, Vertex AI generally works best when the source is in Google Cloud Storage, as GCS supports resumable streaming and is fully optimized for Vertex AI ingestion.

To isolate the issue, try uploading the same file to GCS and using the gs:// URL in your request.

Thanks and regards,
Taz

1 Like

Thank you very much, actually I saw on the Vertex website that there is a 15M limit for external URLs