videointelligence works for short videos, times out for longer ones

I'm a newbie, and I followed a video tutorial to use videointelligence to characterize a video. My Python code (similar to the tutorial's, except that I log as well as print) works on short videos, but on a 25-minute video I get

google.api_core.exceptions.RetryError: Timeout of 600.0s exceeded, last exception: 504 Deadline Exceeded

even though I have increased the timeout to 12000 seconds. Here is my code, which, I emphasize, runs on short (< 7-minute) videos but fails on longer (25-minute) ones.

import io
from dotenv import load_dotenv
from google.cloud import videointelligence

maxtime = 12000
filename = '1998_01_South_Padre.mp4'
#filename = 'kayak.mp4'
logfile = 'log.txt' #tagerrorlog lists problem tags that have to be fixed by TagThatPhoto or by editing the pix.cvs file
log = open(logfile, 'a')

def main():
    video_client = videointelligence.VideoIntelligenceServiceClient()

    features = [videointelligence.Feature.LABEL_DETECTION]

    with open(filename, 'rb') as media_file:
        input_content = media_file.read()

    # annotate_video accepts input_uri or input_content, not both,
    # so the tutorial's gs:// example URI is commented out here.
    operation = video_client.annotate_video(
        request={
            'features': features,
            #'input_uri': 'gs://resource-tutorial/cat.mp4',
            'input_content': input_content,
        }
    )

    print('\nProcessing video for label annotations: ' + filename)
    log.write('\n\nProcessing video for label annotations: ' + filename + '\n')
    result = operation.result(timeout=maxtime)
    print('\nFinished processing.')

    segment_labels = result.annotation_results[0].segment_label_annotations

    for segment_label in segment_labels:
        print('Video label description: {}'.format(segment_label.entity.description))
        log.write('Video label description: {}'.format(segment_label.entity.description))

        for category_entity in segment_label.category_entities:
            print('\tLabel category: {}'.format(category_entity.description))
            log.write('\tLabel category: {}'.format(category_entity.description))

        for i, segment in enumerate(segment_label.segments):
            start_time = (
                segment.segment.start_time_offset.seconds
                + segment.segment.start_time_offset.microseconds / 1e6
            )
            end_time = (
                segment.segment.end_time_offset.seconds
                + segment.segment.end_time_offset.microseconds / 1e6
            )
            positions = '{}s to {}s'.format(start_time, end_time)
            confidence = segment.confidence
            print('\tSegment {}: {}'.format(i, positions))
            print('\tConfidence: {}'.format(confidence))
            log.write('\tSegment {}: {}'.format(i, positions))
            log.write('\tConfidence: {}'.format(confidence))
            log.write('\n')
            print('\n')
    log.close()

if __name__ == '__main__':
    load_dotenv()
    main()

Hi @prestonmcafee,

Welcome to Google Cloud Community!

It appears that you’re experiencing issues with the Google Cloud Video Intelligence API when processing videos of longer duration. The error message “google.api_core.exceptions.RetryError: Timeout of 600.0s exceeded, last exception: 504 Deadline Exceeded” indicates that the processing time for these videos is exceeding the allocated timeout limit, or that the server is unable to complete the operation within the expected timeframe.

Here are workarounds that might help you resolve the error:

  1. Split Video into Segments - Try to split your 25-minute video into smaller segments. Process each segment individually, then combine the results. This approach allows the API to handle each segment within its time limits.
  2. Simplify Content - If possible, remove unnecessary content like long pauses or irrelevant scenes. This can significantly reduce the processing load.
  3. Check API Call Limits and Timeout - It’s possible that Google Cloud Video Intelligence API has an internal processing time limit for certain types of analysis, even if you’ve increased the timeout.
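One related point worth checking: annotate_video takes either input_uri (a Cloud Storage object) or input_content (inline bytes), but not both in the same request. A minimal sketch of building a valid request either way — the build_request helper and the gs://my-bucket URI are illustrative assumptions, not part of the client library:

```python
def build_request(features, gcs_uri=None, content=None):
    """Build an annotate_video request dict with exactly one video source.

    The Video Intelligence API rejects requests that set both input_uri
    and input_content, so exactly one of gcs_uri/content must be given.
    """
    if (gcs_uri is None) == (content is None):
        raise ValueError("provide exactly one of gcs_uri or content")
    request = {"features": features}
    if gcs_uri is not None:
        request["input_uri"] = gcs_uri      # video previously uploaded to GCS
    else:
        request["input_content"] = content  # raw bytes, subject to size limits
    return request

# Example: point the API at Cloud Storage rather than sending bytes inline.
request = build_request(["LABEL_DETECTION"], gcs_uri="gs://my-bucket/1998_01_South_Padre.mp4")
# operation = video_client.annotate_video(request=request)
```

Reading from Cloud Storage also avoids the inline-content size limits that longer videos can hit.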

For more information about Cloud Video Intelligence API (LABEL_DETECTION), you can read this documentation.

If the issue persists, I suggest contacting Google Cloud Support as they can provide more insights to see if the behavior you’ve encountered is a known issue or specific to your project.

I hope the above information is helpful.

Thanks, I came to much the same conclusion, and have found that splitting video into 180-second segments solves timeouts and 'file too large' errors in both annotation and transcript extraction.

In case others come to this question, I use ffmpeg to take a sequence of 3 minute clips via:

ffmpeg_extract_subclip(filename, clipstart, clipend, targetname="clip.mp4")

where my python code iterates over clipstart (0, 180, 360, 540, …) and clipend = clipstart + 180. Then I call videointelligence on each clip and append the outcome to a text file.
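The clip-boundary arithmetic on its own can be sketched like this (a minimal sketch; clip_bounds is a hypothetical helper, not from the API):

```python
def clip_bounds(duration, cliplength=180):
    """Return (clipstart, clipend) pairs covering the whole video,
    stepping by cliplength and clamping the final clip to the duration."""
    bounds = []
    clipstart = 0
    while clipstart < duration:
        bounds.append((clipstart, min(clipstart + cliplength, duration)))
        clipstart += cliplength
    return bounds

print(clip_bounds(25 * 60))  # a 25-minute video yields 9 clips, the last being (1440, 1500)
```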

Here is working python code for reference:

import io, os, sys, time, datetime
from dotenv import load_dotenv
from google.cloud import videointelligence
from glob import glob
cliplength = 180
minconf = 0.2

from moviepy.editor import VideoFileClip

#fps            = vid.fps
#width, height  = vid.size

load_dotenv()
video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.SPEECH_TRANSCRIPTION]
config = videointelligence.SpeechTranscriptionConfig(language_code="en-US", enable_automatic_punctuation=True)
video_context = videointelligence.VideoContext(speech_transcription_config=config)
from moviepy.video.io.ffmpeg_tools import ffmpeg_extract_subclip

files = glob("MP4/*.mp4")

logfile = 'transcripts.txt'
log = open(logfile, 'a')

for filename in files:
    log.write('\n\n'+filename)
    vid = VideoFileClip(filename)
    duration = vid.duration
    print(filename+' duration= '+str(duration))
    n = int(duration // cliplength) + 1
    print('The number of segments is '+str(n))

    for iter in range(n):
        clipstart = iter*cliplength
        clipend = clipstart + cliplength
        if clipend > duration:
            clipend = duration
        if clipstart >= duration:  # skip a zero-length final clip
            continue

        ffmpeg_extract_subclip(filename, clipstart, clipend, targetname="clip.mp4")
        print('\n\n'+filename+' from '+str(clipstart)+' to '+str(clipend))

        with open('clip.mp4','rb') as media_file:
            input_content = media_file.read()

        operation = video_client.annotate_video(request={'features': features, 'input_content': input_content, "video_context": video_context,})

        print("\nProcessing video for speech transcription.")

        result = operation.result(timeout=600)

        # There is only one annotation_result since only one video is processed.
        annotation_results = result.annotation_results[0]
        for speech_transcription in annotation_results.speech_transcriptions:
            # The number of alternatives for each transcription is limited by
            # SpeechTranscriptionConfig.max_alternatives.
            # Each alternative is a different possible transcription
            # and has its own confidence score.
            for alternative in speech_transcription.alternatives:
                if((alternative.confidence < minconf) or (not(alternative.words))):
                    continue
                print("Transcript: {}".format(alternative.transcript))
                conf = round(100*alternative.confidence,2)
                print("Confidence: {}%".format(conf))
                seconds = round(clipstart + alternative.words[0].start_time.seconds + alternative.words[0].start_time.microseconds * 1e-6,2)
                stime = str(round(seconds//60))+':'+str(round(seconds%60)).zfill(2)
                print("Start time: {}\n".format(stime))
                log.write("\n{}: ".format(stime))
                log.write("{} ".format(alternative.transcript))
                log.write("({}%)".format(conf))

log.close()
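One caveat on the m:ss formatting above: str(round(seconds % 60)) can produce 60 (e.g. 119.6 seconds formats as '1:60'). Rounding the total once and splitting with divmod avoids that; mmss here is a hypothetical helper, not from the thread:

```python
def mmss(seconds):
    """Format a second count as m:ss, rounding once to avoid a ':60' field."""
    m, s = divmod(round(seconds), 60)
    return '{}:{:02d}'.format(m, s)

print(mmss(119.6))  # 2:00 -- rounding minutes and seconds separately would give 1:60
```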
  • Use input_uri by uploading the video to Google Cloud Storage for more efficient processing.
  • Split the video into smaller chunks (under 7 minutes) using (URL Removed by Staff) or FFmpeg.
  • Increase the timeout to 12000 seconds and configure retries with Retry(deadline=12000).
  • Use the streaming annotate-video method for dynamic, real-time processing.