Vertex AI Search: Document indexing stuck for hours with "Resource temporarily exhausted" status

Hi all,

We’re building a multi-tenant RAG chatbot service using Vertex AI Search (Discovery Engine) and have been experiencing severe document indexing delays. We’d appreciate guidance on
whether this is expected behavior, a quota issue, or something we should report.

Our Setup

  • Product: Vertex AI Search (Discovery Engine), discoveryengine_v1 Python SDK
  • Architecture: One unstructured data store per tenant (1:1 mapping with a dedicated search engine)
  • Data store config:
    • content_config: CONTENT_REQUIRED
    • industry_vertical: GENERIC
    • solution_types: [SOLUTION_TYPE_SEARCH]
    • default_parsing_config: ocr_parsing_config with use_native_text=True
  • Engine config:
    • search_tier: SEARCH_TIER_ENTERPRISE
    • search_add_ons: [SEARCH_ADD_ON_LLM]
  • Region: global
  • Project: A single GCP project hosts multiple tenants, each with its own data store
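
For concreteness, the setup above maps onto the v1 SDK roughly like this when we provision a tenant (a minimal sketch; project_id, tenant_id, and the display names are placeholders rather than our real values):

  from google.cloud import discoveryengine_v1 as de

  project_id = "my-project"  # placeholder
  tenant_id = "tenant-123"   # placeholder
  parent = f"projects/{project_id}/locations/global/collections/default_collection"

  # Per-tenant unstructured data store, OCR parsing with use_native_text=True.
  data_store = de.DataStore(
      display_name=f"{tenant_id}-store",
      industry_vertical=de.IndustryVertical.GENERIC,
      solution_types=[de.SolutionType.SOLUTION_TYPE_SEARCH],
      content_config=de.DataStore.ContentConfig.CONTENT_REQUIRED,
      document_processing_config=de.DocumentProcessingConfig(
          default_parsing_config=de.DocumentProcessingConfig.ParsingConfig(
              ocr_parsing_config=de.DocumentProcessingConfig.ParsingConfig.OcrParsingConfig(
                  use_native_text=True,
              ),
          ),
      ),
  )
  de.DataStoreServiceClient().create_data_store(
      request=de.CreateDataStoreRequest(
          parent=parent,
          data_store=data_store,
          data_store_id=f"{tenant_id}-store",
      )
  ).result()

  # Dedicated engine per data store (1:1), Enterprise tier plus the LLM add-on.
  engine = de.Engine(
      display_name=f"{tenant_id}-engine",
      solution_type=de.SolutionType.SOLUTION_TYPE_SEARCH,
      data_store_ids=[f"{tenant_id}-store"],
      search_engine_config=de.Engine.SearchEngineConfig(
          search_tier=de.SearchTier.SEARCH_TIER_ENTERPRISE,
          search_add_ons=[de.SearchAddOn.SEARCH_ADD_ON_LLM],
      ),
  )
  de.EngineServiceClient().create_engine(
      request=de.CreateEngineRequest(
          parent=parent, engine=engine, engine_id=f"{tenant_id}-engine"
      )
  ).result()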

Ingestion Flow

  1. User uploads a file via our backend API
  2. We upload the file to a GCS bucket: gs://<bucket>/tenants/<tenant_id>/…/<filename>
  3. We call DocumentServiceClient.import_documents() with:
    ImportDocumentsRequest(
        parent=f"{data_store_name}/branches/default_branch",
        gcs_source=GcsSource(input_uris=[gcs_uri], data_schema="content"),
        reconciliation_mode=ReconciliationMode.INCREMENTAL,
    )
  4. The call returns a long-running operation, which we don’t block on (async indexing); see the sketch just below
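
A minimal version of steps 3–4, with a watcher loop for the returned operation (a sketch: data_store_name and gcs_uri come from the steps above, and error handling is elided):

  import time

  from google.cloud import discoveryengine_v1 as de

  client = de.DocumentServiceClient()
  operation = client.import_documents(
      request=de.ImportDocumentsRequest(
          parent=f"{data_store_name}/branches/default_branch",
          gcs_source=de.GcsSource(input_uris=[gcs_uri], data_schema="content"),
          reconciliation_mode=de.ImportDocumentsRequest.ReconciliationMode.INCREMENTAL,
      )
  )

  # Poll rather than block on operation.result(); each done() call
  # refreshes the LRO state from the server.
  while not operation.done():
      md = operation.metadata  # ImportDocumentsMetadata, once populated
      if md:
          # Likely what the console renders as "0/1 completed".
          print(f"success={md.success_count} failure={md.failure_count}")
      time.sleep(30)

  # ImportDocumentsResponse.error_samples lists per-document failures, if any.
  response = operation.result()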

The Problem

Small files (a few MB or less) take hours to index, sometimes never completing.

Specific examples:

  • A 3 MB PDF has been “import in progress, 0/1 completed” for 40+ minutes in one data store
  • A 94 KB plain text file (.txt) never finished indexing after multiple attempts (tried for over an hour, then re-imported with fresh GCS URIs — still stuck)
  • One document that did get through segmentation shows the following status in the console “Documents” tab:

▎ Document segmentation has finished. Document indexing is working in progress. Resource is temporarily exhausted. Processing will be retried later when resource is available.

  • This same document remains in this state indefinitely

Meanwhile, in the same data store, other documents (e.g., a different PDF of similar size) can index successfully in under 10 seconds. There’s no obvious pattern to which files succeed
and which get stuck.

What We’ve Tried

  1. Re-importing with a new GCS URI (different filename, same content) — still stuck
  2. Cancelling pending operations via operations_client.cancel_operation() — returns success, but operation remains PENDING indefinitely
  3. Deleting pending operations via operations_client.delete_operation() — returns 501 Method not found
  4. Deleting the unindexed document via delete_document() — succeeds, but new imports for the same source still fail with: “Document was previously imported but failed to process. The
    error is: Document with name … does not exist.”
  5. Switching between discoveryengine_v1 and discoveryengine_v1beta — same behavior
  6. Creating a brand new data store — same behavior on the new store
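
One partial workaround for visibility (related to items 2–4 above): the Discovery Engine clients expose the standard long-running-operations methods as a mixin, so the stuck imports can at least be enumerated and inspected directly. A sketch, assuming the data store resource name scopes ListOperations (stuck_operation_name is a placeholder):

  from google.cloud import discoveryengine_v1 as de
  from google.longrunning import operations_pb2

  client = de.DocumentServiceClient()

  # Enumerate LROs under one data store and flag the unfinished ones.
  resp = client.list_operations(
      request=operations_pb2.ListOperationsRequest(name=data_store_name)
  )
  for op in resp.operations:
      if not op.done:
          print(op.name)

  # get_operation exposes the raw metadata (create/update times, counters).
  op = client.get_operation(
      request=operations_pb2.GetOperationRequest(name=stuck_operation_name)
  )
  print(op.metadata)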

Questions

  1. Is the “Resource is temporarily exhausted” status a known soft quota / fair-share throttling that we should expect during normal usage? If so, what’s the retry SLA?
  2. Is there a per-project or per-data-store quota for concurrent import operations or indexing throughput that we might be hitting? We can’t find it documented under Discovery Engine
    quotas.
  3. Why do some imports in the same data store complete in 8 seconds while others (smaller files) take hours or never finish?
  4. Is there a way to inspect why a specific document is stuck in the segmentation/indexing pipeline beyond what the console shows?
  5. Are pending import-documents operations supposed to be cancellable? cancel_operation returns success but doesn’t actually cancel, and delete_operation returns 501 Method not found.
  6. Is there any best practice for production multi-tenant RAG ingestion at scale that we should be following instead of the per-tenant data store + import_documents from GCS pattern?

We’re trying to assess whether Vertex AI Search is the right backend for our production RAG chatbot. Any pointers, similar experiences, or workarounds would be hugely appreciated.

Thanks!

4 Likes

I have the same problem.

2 Likes

I am experiencing the same issue. It appears there is a problem with the Global region, as the data import works successfully in the US region.
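
In case it helps others test this: targeting the US multi-region is just the regional endpoint plus location-qualified resource names (a sketch; the data store has to be created under locations/us as well, and my-project/my-store are placeholders):

  from google.api_core.client_options import ClientOptions
  from google.cloud import discoveryengine_v1 as de

  # Regional endpoint; the global location uses discoveryengine.googleapis.com.
  client = de.DocumentServiceClient(
      client_options=ClientOptions(api_endpoint="us-discoveryengine.googleapis.com")
  )
  parent = (
      "projects/my-project/locations/us/collections/default_collection"
      "/dataStores/my-store/branches/default_branch"
  )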

3 Likes

I am experiencing a similar issue as well.