Insane Cloud Text-to-Speech Output Token Usage

Almutasem_Albaadani · November 13, 2025, 2:46pm

I have been testing the new gemini-2.5-flash-tts streaming feature in Google Cloud Text-to-Speech. I used about 47k input tokens and saw a charge for 48M output tokens, which is crazy (about 544 hours of audio from 47k tokens). Is this even possible, or is the billing wrong? We were considering using it in production, but not anymore.

AnRan · November 20, 2025, 1:19am

I have the same issue. Specifically, on Nov 13 I see an unrealistic 9.7M output tokens generated by gemini-2.5-flash-tts, which is about 6600+ minutes of audio. This is on a test account that is not accessible to anyone else, and judging by the service logs for the calls I made, my entire usage could not have been more than 100-200k tokens. I double-checked the token usage metadata reported for the test ‘text’ I was using: it was about 2.5k output tokens. I ran the test no more than 40-50 times, so about 100-125k would be realistic. 9.7M is roughly 100x higher! I tried contacting billing support, but since the account is in its trial period, I could only get access to an AI assistant, which was not helpful at all.
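For anyone who wants to sanity-check the numbers in the two posts above, here is a rough Python sketch of the arithmetic. It uses only the figures quoted in this thread. The helper names are made up for illustration, and the tokens-per-second rate is inferred from the posts rather than taken from any official pricing page, so treat it as an estimate.

```python
def implied_tokens_per_second(output_tokens: float, audio_seconds: float) -> float:
    """Rate implied by a billed token count and the audio duration quoted for it."""
    return output_tokens / audio_seconds


def overbilling_ratio(billed_tokens: float, expected_tokens: float) -> float:
    """How many times larger the billed amount is than the expected amount."""
    return billed_tokens / expected_tokens


# First post: 48M output tokens, described as about 544 hours of audio.
rate_first = implied_tokens_per_second(48_000_000, 544 * 3600)

# Second post: 9.7M output tokens, described as about 6600 minutes of audio.
rate_second = implied_tokens_per_second(9_700_000, 6600 * 60)

# Second post: about 2.5k output tokens per run, 40 to 50 runs.
expected_low = 2_500 * 40    # 100,000 tokens
expected_high = 2_500 * 50   # 125,000 tokens

# Billed 9.7M versus the expected range.
ratio_vs_high = overbilling_ratio(9_700_000, expected_high)
ratio_vs_low = overbilling_ratio(9_700_000, expected_low)

print(f"Implied rate (first post):  {rate_first:.1f} tokens/s")
print(f"Implied rate (second post): {rate_second:.1f} tokens/s")
print(f"Expected usage: {expected_low:,} to {expected_high:,} tokens")
print(f"Billed vs expected: {ratio_vs_high:.1f}x to {ratio_vs_low:.1f}x")
```

Both posts work out to roughly the same tokens-per-second rate, so the audio-duration estimates are consistent with each other. The discrepancy is in the billed token count itself, which is about 78x to 97x the usage the second poster's own metadata would predict.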
