Telephony STT in production

2010b9 · April 1, 2026, 8:58am

Is anyone using, or has anyone used, the telephony STT model for voice AI use cases where a user speaks to an AI agent by voice in real time?

I’ve read in the Cloud Speech-to-Text V2 docs that the telephony model should be used “for audio that originates from an audio phone call, typically recorded at an 8 kHz sampling rate. Ideal for customer service, teleconferencing, and automated kiosk applications”.

I’ve tested the model: it seems to work well in noisy environments, and it is faster than the chirp_2 and chirp_3 models. However, I haven’t found any reports online of people using this model. If you have used it, or are using it now, could you please share your experience?

Thanks in advance!
