Hello, I’m trying to transcribe audio with the “chirp” model using the Node.js Speech-to-Text V2 client library (@google-cloud/speech).
I need to use the “chirp” model because I need transcription with punctuation in Polish, Hebrew, Malay, and some other languages, and “chirp” is the only supported model.
When I sent a short audio file in Hebrew, GCP returned an empty transcription. The behavior is inconsistent: for some audio files it returns the correct transcription, and for others it returns an empty result.
Moreover, I sent the same request several times (using the same audio file and the same configuration), and on the 3rd or 4th time, I received the transcription.
Please note that this happens not only with Hebrew but with other languages as well. I have seen it happen with Malay and Hungarian.
Audio file example: I’m pretty sure that if you submit a request with this recording 3-4 times, you will get a transcript.
Request configuration:
import { v2 as speechV2 } from '@google-cloud/speech'

const client = new speechV2.SpeechClient({
  apiEndpoint: 'us-central1-speech.googleapis.com',
})

const recognizeRequestConfig = {
  recognizer: 'projects/<PROJECT_ID>/locations/us-central1/recognizers/_',
  config: {
    features: { enableWordTimeOffsets: true, enableAutomaticPunctuation: true },
    autoDecodingConfig: {},
    languageCodes: [ 'iw-IL' ],
    model: 'chirp'
  },
  uri: 'gs://<URL>/audio/<ID>.webm'
}

const [recognizeResponse] = await client.recognize(recognizeRequestConfig)
Response:
{
  recognizeResponse: {
    results: [
      {
        alternatives: [ { words: [], transcript: '', confidence: 0 } ],
        channelTag: 0,
        resultEndOffset: { seconds: '1', nanos: 0 },
        languageCode: 'iw-IL'
      }
    ],
    metadata: { totalBilledDuration: { seconds: '2', nanos: 0 } }
  }
}
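
For reference, the manual "send it again until it works" behavior described above could be sketched roughly as follows. This is only a stopgap illustration, not a fix for the underlying problem. The helper names isEmptyTranscript and recognizeWithRetry, the default of 4 attempts, and the 500 ms delay are all assumptions made for this sketch and are not part of the library. Note that a genuinely silent recording would also look "empty" to this check.

```javascript
// Hypothetical helper (not part of @google-cloud/speech):
// returns true when no alternative in the response carries transcript text.
function isEmptyTranscript(response) {
  const results = (response && response.results) || [];
  return results.every((result) =>
    (result.alternatives || []).every(
      (alt) => !alt.transcript || alt.transcript.trim() === ''
    )
  );
}

// Hypothetical helper: calls `recognize(request)` up to `maxAttempts` times
// and stops at the first response that contains a non-empty transcript.
// `recognize` is expected to resolve to an array whose first element is the
// response, which is how the request above destructures client.recognize().
async function recognizeWithRetry(recognize, request, maxAttempts = 4, delayMs = 500) {
  let response;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    [response] = await recognize(request);
    if (!isEmptyTranscript(response)) {
      return { response, attempts: attempt };
    }
    if (attempt < maxAttempts) {
      // Short pause before sending the identical request again.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  // Every attempt came back empty; return the last response as-is.
  return { response, attempts: maxAttempts };
}
```

With the client and request from above, it would be called as recognizeWithRetry((req) => client.recognize(req), recognizeRequestConfig). Each attempt is a separate billed request, so the attempt limit is kept small on purpose.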