Summary:
- Documents show “Indexed” status in the datastore but are not searchable/retrievable using metadata struct_field filters.
- I have verified that the datastore schema has searchable, indexable, and retrievable all set.
- The issue occurs intermittently (e.g. 1/10 upload attempts).
- Currently, I’m using a “test” document (a small text document with struct fields), nothing complex.
- I’m using a “test” datastore with very few documents ingested (e.g. < 100).
Expectation:
- My expectation is that once the document shows “Indexed status: Indexed”, I can reliably retrieve and query the document.
Background:
- I am using Vertex AI Search with custom documents, which I upload to a datastore using the Python SDK (google.cloud.discoveryengine_v1), using the latest version of the discovery engine package (google-cloud-discoveryengine v0.16.0).
- I’m using discoveryengine.CreateDocumentRequest to stream the request (rather than batch ingestion).
- This works great: the document is uploaded to the datastore, and the datastore shows “Index Status: Indexed 1/22/2026, 12:07:03 AM, America/Vancouver”.
- After upload, I verify whether the document is retrievable using a metadata struct_field. For example, I have “ingestion_task_id type=string Array=no Searchable=true Indexable=True Retrievable=True” in my schema.
- I then verify searchability using the following code:
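from google.cloud import discoveryengine_v1 as discoveryengine

# serving_config, page_size and content_search_spec are defined elsewhere.
request = discoveryengine.SearchRequest(
    serving_config=serving_config,
    query="",
    filter="ingestion_task_id: Any()",
    page_size=page_size,
    content_search_spec=content_search_spec,
)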
- Frustratingly, this works about 90% of the time but fails intermittently.
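To make the check target one specific upload, the filter string can be built from the task id. This is only a minimal sketch: `build_any_filter` is a hypothetical helper (not part of the SDK), it assumes the `field: ANY("value")` filter form for string fields, and it rejects values containing double quotes rather than guessing at an escaping rule. The task id in the example is made up.

```python
def build_any_filter(field: str, values: list[str]) -> str:
    """Build a filter such as: ingestion_task_id: ANY("task-1", "task-2")."""
    if not values:
        raise ValueError("at least one value is required")
    for value in values:
        if '"' in value:
            # No escaping rule is assumed here, so such values are rejected.
            raise ValueError(f"value contains a double quote: {value!r}")
    quoted = ", ".join(f'"{value}"' for value in values)
    return f"{field}: ANY({quoted})"


# Hypothetical task id, for illustration only.
print(build_any_filter("ingestion_task_id", ["task-123"]))
```

The resulting string would be passed as the `filter` argument of the SearchRequest.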
Observations:
- Sometimes the documents do become retrievable, but only after a long time (e.g. hours), well after the Indexed status shows “Indexed”.
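To put a number on that delay, something like the polling helper below could be used. It is a rough sketch and not anything from the SDK: `wait_until_searchable` and its parameters are hypothetical names, and `is_searchable` stands for any callable that runs the search and returns True once the document comes back. The example at the bottom uses a stand-in that succeeds on the third attempt instead of a real search call.

```python
import time
from typing import Callable, Optional


def wait_until_searchable(
    is_searchable: Callable[[], bool],
    timeout_s: float = 600.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[float]:
    """Poll is_searchable() with exponential backoff.

    Returns the seconds elapsed until the first successful check,
    or None if timeout_s passes without one.
    """
    start = clock()
    delay = initial_delay_s
    while True:
        if is_searchable():
            return clock() - start
        remaining = timeout_s - (clock() - start)
        if remaining <= 0:
            return None
        # Never sleep past the deadline; double the delay up to the cap.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)


# Stand-in for a real search call: succeeds on the third attempt.
attempts = iter([False, False, True])
elapsed = wait_until_searchable(lambda: next(attempts), initial_delay_s=0.01)
print(f"searchable after {elapsed:.2f}s")
```

Logging the returned value per upload would show how far the "Indexed" timestamp and actual searchability drift apart, and a None result marks the uploads that never became searchable within the timeout.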