Hi everyone,
I’m designing a CX Agent Studio assistant that needs to handle two distinct types of knowledge: unstructured policy documents and highly structured, temporal data (like checking if a specific request form is currently active before sharing the link).
I can see three main architectural paths for the structured data component. I would love to hear your pros, cons, and real-world experiences. I really like how simple the Vertex AI Data Store is to set up, but I want to avoid architectural dead-ends, especially since data store imports from Cloud Storage are often one-time operations or require serverless or batch sync.
Here are the three approaches I’m weighing:
1. Pure Data Store Approach
Relying heavily on Vertex AI Search Data Stores for both documents and structured FAQ/tabular data.
-
Pros: Extremely simple to configure and provides native semantic search.
-
Cons: RAG inherently struggles with deterministic logic (like comparing today’s date against a form’s expiration date), and syncing updates isn’t completely automated.
2. Data Store + Cloud SQL / Firestore Integration Connectors
Using the Data Store for unstructured text, but using CX Agent Studio Integration Connectors to talk directly to a Cloud SQL or Firestore database for the structured workflow data.
-
Pros: Allows direct database CRUD operations without needing to build custom middleware .
-
Cons: It tightly couples the agent to the database schema. If a column changes, the tool might break. It also forces the agent to handle the raw data rather than having an API handle the business logic.
3. OpenAPI or Model Context Protocol (MCP) Wrappers
Using a Data Store for unstructured documents, but deploying a middleware API (via OpenAPI) or an MCP server in front of the structured database.
-
Pros: Maximum control. The middleware can execute complex business logic (like date math) securely away from the LLM, returning a strict, validated payload. Furthermore, utilizing an MCP server allows the agent to dynamically discover available tools and schemas.
-
Cons: Requires deploying, authenticating, and maintaining separate middleware infrastructure (e.g., deploying the API on Cloud Run).
Which approach have you found most reliable in production? Does utilizing OpenAPI or MCP tools provide enough stability and control to justify the extra infrastructure overhead compared to native database connectors?
Thanks in advance for any insights!

