Hello team,
I am testing the Gemma-3n-e4b-it model for automatic speech recognition (ASR) tasks, where I provide an audio file as input and expect the spoken text transcription as output.
While the model performs well in general, I am consistently observing unexpected language-mixing behavior for some languages, such as Punjabi.
Issue details:
- I provide audio samples that contain only Punjabi speech.
- My prompt explicitly specifies Punjabi as the target transcription language, for example:
  "Transcribe this audio to Punjabi. Output only the transcription."
  "Transcribe ONLY the spoken Punjabi words exactly as heard. Stop immediately when the audio ends."
- However, for some segments, the model outputs text partially or entirely in Hindi (Devanagari script) instead of Punjabi (Gurmukhi script).
Examples:
🎧 File: Punjabi_audio_chunks/chunk_0005.wav
💬 Output: [Punjabi text in Gurmukhi script] ✅ (Correct, Punjabi - Gurmukhi)
🎧 File: Punjabi_audio_chunks/chunk_0007.wav
💬 Output: [Hindi text in Devanagari script] ❌ (Incorrect, Hindi - Devanagari)
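To quantify how often this happens, I flag outputs mechanically by checking which Unicode blocks the characters fall in (Gurmukhi is U+0A00-U+0A7F, Devanagari is U+0900-U+097F; these are standard Unicode block ranges, and the helper functions below are my own, not part of any Gemma API):

```python
# Standard Unicode block ranges for the two scripts.
GURMUKHI = range(0x0A00, 0x0A80)    # Punjabi script
DEVANAGARI = range(0x0900, 0x0980)  # Hindi script

def script_mix(text):
    """Return (gurmukhi_count, devanagari_count) for the codepoints in text."""
    g = sum(1 for ch in text if ord(ch) in GURMUKHI)
    d = sum(1 for ch in text if ord(ch) in DEVANAGARI)
    return g, d

def is_pure_gurmukhi(text):
    """True if a transcription contains Gurmukhi and no Devanagari at all."""
    g, d = script_mix(text)
    return g > 0 and d == 0
```

Running every chunk's transcription through `is_pure_gurmukhi` makes it easy to count the failure rate across the whole audio set instead of spot-checking by eye.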
Is there a way to force or lock the output language/script in Gemma-3n-e4b-it (for example, through language tokens or prompt parameters)? Please review this issue and help me resolve it.
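In the meantime, one workaround I am experimenting with is suppressing Devanagari-bearing tokens at decode time via the standard `bad_words_ids` argument of `model.generate` in Hugging Face transformers. This is a sketch under my own assumptions, not a confirmed Gemma feature; the toy vocabulary below is purely illustrative, and the real token list would come from `tokenizer.get_vocab()`:

```python
DEVANAGARI = range(0x0900, 0x0980)  # standard Unicode block for Devanagari

def devanagari_token_ids(vocab):
    """Collect the ids of all tokens containing any Devanagari codepoint.

    `vocab` maps token string -> id, in the shape returned by
    tokenizer.get_vocab() in Hugging Face transformers.
    """
    return sorted(
        tid for tok, tid in vocab.items()
        if any(ord(ch) in DEVANAGARI for ch in tok)
    )

# Toy vocabulary for illustration only; real ids come from the tokenizer.
toy_vocab = {"ਪੰਜਾਬੀ": 10, "अच्छा": 11, "hello": 12, "दो": 13}
bad_ids = devanagari_token_ids(toy_vocab)

# The resulting list would then be passed at generation time, e.g.:
# model.generate(..., bad_words_ids=[[i] for i in bad_ids])
```

This only blocks tokens whose surface form contains Devanagari characters, so it would not help if the tokenizer builds Devanagari text from byte-level pieces; a built-in language lock, if one exists, would be much cleaner.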