Vertex AI Search for Media - Recommend API latency spike after data requirement issue

Hi everyone,

I’m running two media recommendation apps using Vertex AI Search for Media (Discovery Engine), and I’m experiencing a persistent latency issue with one of them. I’d appreciate any insights from the community.

Background

Both apps use the RecommendationService.Recommend API (v1beta). One app uses a CVR15 model with a generic context event type, and the other uses a CVR20 model with a homepage context event type.

What happened

Around May 23, I noticed that the Recommend API latency for the CVR20 app had spiked significantly:

  • Average latency: ~0.1s → ~0.2s
  • P99 latency: ~0.5s → ~1.9s
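
To cross-check figures like these from the client side, a minimal sketch along the following lines could time repeated calls and summarize them. The function names are hypothetical, `call` stands in for whatever wraps the actual Recommend request, and the nearest-rank percentile is an assumption that may not match how Cloud Monitoring aggregates latency.

```python
import math
import statistics
import time
from typing import Callable, Dict, List


def percentile(samples: List[float], pct: float) -> float:
    """Nearest-rank percentile of samples, with pct in (0, 100]."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # 1-based rank of the smallest value covering pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def time_calls(call: Callable[[], object], n: int) -> List[float]:
    """Invoke `call` n times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        call()  # hypothetical placeholder for one Recommend request
        durations.append(time.perf_counter() - start)
    return durations


def summarize(samples: List[float]) -> Dict[str, float]:
    """Return the average and P99 of a list of latency samples."""
    return {"avg": statistics.fmean(samples), "p99": percentile(samples, 99)}
```

With only a few hundred samples the P99 is driven by a handful of slow calls, so a larger sample is needed before comparing it with the dashboard value.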

At around the same time, I discovered that the CVR20 app had fallen below its data requirements: it no longer had enough days of view-home-page events.

What I’ve tried

  1. Manually ingested dummy view-home-page events to cover the missing 30 days, which brought the data status back to OK.
  2. Disabled the CVR20 app from serving and routed all traffic to the CVR15 app only.
  3. Confirmed that direct API calls from GCP Cloud Shell return normal responses.
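
For anyone wanting to reproduce step 1, the backfill could be sketched roughly as below. This is an illustration, not the exact script used: the function names and the user ID are hypothetical, and the field names (`eventType`, `userPseudoId`, `eventTime`) assume the Discovery Engine UserEvent JSON schema, so they should be checked against the current documentation before importing anything.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import List


def build_home_page_events(end: datetime, days: int, user_pseudo_id: str) -> List[dict]:
    """Build one view-home-page event per day for the `days` days ending at `end`.

    `end` must be timezone-aware; timestamps are emitted in UTC.
    """
    events = []
    for offset in range(days):
        ts = (end - timedelta(days=offset)).astimezone(timezone.utc)
        events.append({
            "eventType": "view-home-page",      # assumed UserEvent field name
            "userPseudoId": user_pseudo_id,     # assumed UserEvent field name
            "eventTime": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        })
    return events


def to_json_lines(events: List[dict]) -> str:
    """Serialize events as newline-delimited JSON, one event per line."""
    return "\n".join(json.dumps(event) for event in events)
```

One event per day only covers the day-count requirement; if there is also a minimum event volume, each day would need more than a single event.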

Despite these actions, the P99 latency remains elevated at around 1.9s and has not returned to its original level of around 0.5s. Responses with 499 and 504 status codes also continue to occur.

Questions

  • Could unmet data requirements cause a lasting impact on latency even after the requirements are met again?
  • Is there any known issue with Vertex AI Search for Media that could explain sustained latency degradation?
  • Are there any other possible causes I should investigate?

Any advice would be greatly appreciated. Thank you!