Hi everyone,
I came across an error with Google Text-to-Speech long audio synthesis that I can't solve. I hope you can help.
The code is written in NestJS and uses the client library
@google-cloud/text-to-speech
import textToSpeechCaller from '@google-cloud/text-to-speech';
const text = './//Really long text with a length of 907033 characters';
const texttospeechClient =
  new textToSpeechCaller.TextToSpeechLongAudioSynthesizeClient({
    credentials: cred,
  });
const parent = `projects/${projectNumber}/locations/global`;
const outputGcsUri = `gs://${bucketName}/output20.wav`;
const [operation] = await texttospeechClient.synthesizeLongAudio(
  {
    parent: parent,
    // select the type of audio encoding
    audioConfig: { audioEncoding: 'LINEAR16' },
    input: { text: text },
    // select the language and the voice name
    voice: { languageCode: 'en-US', name: 'en-US-Standard-F' },
    outputGcsUri: outputGcsUri,
  },
  // call timeout, added after the first error described below
  { timeout: 5000000 },
);
return operation.promise();
To give you some context: the text is 907,033 characters long. After running the code, I get a bunch of files in my Google Cloud Storage bucket (image below), but then I get this error:
ERROR [ExceptionsHandler] Failed to write to GCS URI: gs://${bucketName}/output20.wav
So I thought Google Cloud had timed out. I added the timeout shown in the code above and tried again, and then I got this error:
ERROR [ExceptionsHandler] INTERNAL: [type.googleapis.com/util.ErrorSpacePayload='SpeechErrorSpace::TTS_BACKEND_REQUEST_RPC_ERROR']
Error: INTERNAL: [type.googleapis.com/util.ErrorSpacePayload='SpeechErrorSpace::TTS_BACKEND_REQUEST_RPC_ERROR']
    at Operation._unpackResponse (C:\Users\me\Documents\projects\backend\node_modules\@google-cloud\text-to-speech\node_modules\google-gax\build\src\longRunningCalls\longrunning.js:143:31)
    at C:\Users\me\Documents\projects\backend\node_modules\@google-cloud\text-to-speech\node_modules\google-gax\build\src\longRunningCalls\longrunning.js:123:18
I then tried reducing the number of characters, and after some trial and error I found that everything works as long as the text is below 600K characters. With anything longer I keep getting the error. So I thought it might be the input limit specified in the docs: “Long Audio Synthesis asynchronously synthesizes up to 1 million bytes on input”. But when I checked the bucket, the audio files had all been generated, meaning the full 907,033 characters of text had been converted to speech. What fails is generating the single file that holds it all, output20.wav.
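One thing worth ruling out is that the documented limit is in bytes, while the numbers above are character counts. Below is a minimal sketch for measuring the text in bytes, assuming the service counts UTF-8 bytes (the quoted docs do not state the encoding). `utf8ByteLength`, `fitsLongAudioLimit` and `LIMIT_BYTES` are hypothetical names used for illustration, not part of the client library.

```typescript
// Hypothetical helper, not part of @google-cloud/text-to-speech.
// Counts how many bytes `text` occupies when encoded as UTF-8.
// In Node.js, Buffer.byteLength(text, 'utf8') returns the same number.
function utf8ByteLength(text: string): number {
  let bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code < 0x80) {
      bytes += 1; // ASCII
    } else if (code < 0x800) {
      bytes += 2; // e.g. accented Latin letters
    } else if (
      code >= 0xd800 && code <= 0xdbff &&
      i + 1 < text.length &&
      text.charCodeAt(i + 1) >= 0xdc00 && text.charCodeAt(i + 1) <= 0xdfff
    ) {
      bytes += 4; // surrogate pair, e.g. emoji
      i++; // skip the low surrogate
    } else {
      bytes += 3; // rest of the Basic Multilingual Plane, e.g. curly quotes
    }
  }
  return bytes;
}

// Assumed limit, taken from the "1 million bytes" wording in the docs.
const LIMIT_BYTES = 1000000;

// True when the text is within the assumed byte limit.
function fitsLongAudioLimit(text: string): boolean {
  return utf8ByteLength(text) <= LIMIT_BYTES;
}
```

If `fitsLongAudioLimit(text)` returns false for the 907,033-character text, the byte limit could be what the request is hitting even though the character count is under one million. If it returns true, the limit is probably not the cause.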
I am currently stuck here and don't know what to do next.
