Diarization Support for V2 Model

bjulliana · August 13, 2025, 11:16pm

Hello,

Has anyone successfully implemented the diarization feature with Text-to-Speech V2?
From what I can tell, none of the available models or languages seem to support it. Even the medical_conversation model, which is supposed to support it, doesn’t work. Does that mean I have to downgrade to V1 to use diarization?

Thank you

marckevin · August 14, 2025, 1:11pm

Hi bjulliana,

Speaker diarization is a feature of Cloud Speech-to-Text, not Text-to-Speech. If you are referring to Cloud Speech-to-Text V2, the latest documentation lists speaker diarization as a preview feature, available only for en-US with the medical_conversation model and only in a limited set of regions. Because it is in preview, it may not yet offer the expected quality and might have limited support.
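For reference, here is a minimal sketch of what a V2 recognize request body with diarization enabled might look like. The JSON field names (`autoDecodingConfig`, `languageCodes`, `features.diarizationConfig`, `minSpeakerCount`, `maxSpeakerCount`) reflect my understanding of the V2 REST API and should be checked against the current reference. The helper name `build_diarization_request` and the speaker counts are hypothetical. The sketch only builds the payload; it does not call the API.

```python
import json


def build_diarization_request(model="medical_conversation",
                              language_code="en-US",
                              min_speakers=2,
                              max_speakers=2):
    """Return a dict shaped like a V2 recognize request body.

    Field names are assumptions based on the V2 REST API; verify them
    against the official reference before relying on this.
    """
    return {
        "config": {
            # Let the service detect the audio encoding (assumed field).
            "autoDecodingConfig": {},
            "languageCodes": [language_code],
            "model": model,
            "features": {
                # Diarization is requested through the recognition features.
                "diarizationConfig": {
                    "minSpeakerCount": min_speakers,
                    "maxSpeakerCount": max_speakers,
                },
            },
        },
    }


body = build_diarization_request()
print(json.dumps(body, indent=2))
```

If the service rejects this combination of model, language, and region, that usually points to the availability limits described above rather than to a malformed request.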

Alternatively, you can explore the Chirp 3 model, which is only available in Speech-to-Text API V2 and offers new key features such as diarization. However, please note the language availability for diarization and regional availability of the Chirp 3 model. Currently, this model is in private preview, meaning you need to be added to the allowlist to use it. To proceed, I suggest contacting Google Cloud Support, as they have better visibility into the underlying system and can assist you with specific issues.

For complete details on Chirp 3’s features, support, and limitations, please refer to the documentation.
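Once diarization works for your model and region, each recognized word should carry a speaker label. Below is a rough sketch of how you could group those words into speaker turns. It assumes each word is a dict with `word` and `speakerLabel` keys, which is how I understand the V2 JSON response to be shaped. The function name `group_speaker_turns` and the sample words are made up for illustration and are not real API output.

```python
def group_speaker_turns(words):
    """Merge consecutive words with the same speaker label into turns.

    `words` is a list of dicts with "word" and "speakerLabel" keys
    (assumed response shape; check the API reference).
    """
    turns = []
    for w in words:
        label = w.get("speakerLabel", "")
        if turns and turns[-1][0] == label:
            # Same speaker as the previous word: extend the current turn.
            turns[-1][1].append(w["word"])
        else:
            # Speaker changed: start a new turn.
            turns.append((label, [w["word"]]))
    return [(label, " ".join(tokens)) for label, tokens in turns]


# Made-up sample data, not real API output.
sample_words = [
    {"word": "Good", "speakerLabel": "1"},
    {"word": "morning", "speakerLabel": "1"},
    {"word": "Hello", "speakerLabel": "2"},
    {"word": "doctor", "speakerLabel": "2"},
    {"word": "How", "speakerLabel": "1"},
]

for speaker, text in group_speaker_turns(sample_words):
    print(f"Speaker {speaker}: {text}")
```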

For future updates regarding broader diarization support on Cloud Speech-to-Text V2, please keep an eye on the release notes for any new features or changes.
