Hello,
I am looking for support to have my project allowlisted for Gemini native audio output (response_modalities: ["AUDIO"]) and the Multimodal Live API.
I first posted this request on the Google AI for Developers forum, but was told to post here instead for assistance.
I am migrating my architecture from a traditional STT → LLM → TTS pipeline to a native audio-to-audio multimodal flow using gemini-2.5-flash and gemini-2.5-pro. This shift is critical for my use case: it reduces latency and natively handles multi-language switching (e.g., English, Vietnamese, Thai, French, Spanish) during live, conversational evaluations, without relying on transcription steps.
I would be grateful for advice on the next steps, or on any additional information required to allowlist my project for native audio output.
Thank you,
Ali