Hi @hardikitis,
Welcome to the Google Cloud Community!
I see you have a detailed set of questions regarding your document search implementation using Vertex AI Search and related services. This includes vector search, data stores, the Grounding API, and Pub/Sub integration. I understand you’ve also included details on cost, multi-user access, and conversational aspects. Let’s go through each of your questions for possible solutions.
1. What’s the purpose of using corpus?
When using Vertex AI Search data stores with the RAG pipeline, direct filtering based on rag_file_ids is not supported. The corpus is the underlying storage and indexing mechanism. Even if you create a data store, the corpus is still there. Data stores provide an abstraction layer to interact with the corpus, not a method to bypass it.
2. Automatic Schema
- Schema Usage - The schema, visible in the console logs, helps you understand how your documents are indexed and structured. You can define which metadata fields should be indexed.
- Metadata for Filtering - Since rag_file_ids isn’t supported, you can achieve a similar result by adding a file ID to the document’s metadata. Vertex AI Search uses this metadata to filter documents. By including a file ID as part of the metadata, you can imitate the functionality of rag_file_ids effectively.
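To make the idea concrete, here is a minimal sketch of what a document payload with a file ID in its structured metadata could look like at ingestion time. The field names (`file_id`, `source_uri`) and the exact payload shape are illustrative assumptions — check the Vertex AI Search document schema for your data store before relying on them.

```python
import json

def build_document(doc_id: str, file_id: str, content_uri: str) -> dict:
    # Structured metadata travels with the document and can later be
    # referenced in filter expressions at query time.
    # NOTE: field names here ("file_id", "source_uri") are illustrative.
    return {
        "id": doc_id,
        "structData": {
            "file_id": file_id,       # stands in for the unsupported rag_file_ids
            "source_uri": content_uri,
        },
        "content": {
            "mimeType": "application/pdf",
            "uri": content_uri,
        },
    }

doc = build_document("doc-001", "file-abc123", "gs://my-bucket/report.pdf")
print(json.dumps(doc, indent=2))
```

Once the document is indexed with this metadata, the `file_id` value becomes something you can target in a search filter.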
3. Corpus Costs with Data Store
In your scenario, when you attach your data store to the corpus and index your documents, you are not bypassing costs. The underlying corpus is what stores and processes your documents. The cost depends on the amount of indexed data, the storage used, and the number of queries you run.
4. Vertex AI Search x Grounding API and rag_file_ids
While you can’t use rag_file_ids directly, you can achieve the same outcome by using the alternative approach we discussed:
- Add file_id as Metadata - Make sure that when you ingest a document, you include its file ID as a metadata field, for example metadata.file_id.
- Filter on file_id Metadata During Search - When you formulate your search request using the Grounding API, include a filter clause that targets the file_id metadata field to include only the desired documents based on the ID(s).
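The filter clause in the second step can be built as a simple string. This sketch assumes the `field: ANY("value", ...)` membership syntax described in the Vertex AI Search filter documentation and a metadata field named `file_id` (both assumptions — adjust to your actual schema):

```python
def build_file_id_filter(file_ids: list[str]) -> str:
    # Builds a filter expression like: file_id: ANY("id1", "id2")
    # The ANY(...) membership syntax is from the Vertex AI Search
    # filter expression language; verify against the current docs.
    quoted = ", ".join(f'"{fid}"' for fid in file_ids)
    return f"file_id: ANY({quoted})"
```

You would then pass the resulting string as the `filter` on your search request so only documents with matching IDs are retrieved for grounding.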
5. Pub/Sub for Data Store Import
Create a Pub/Sub topic to publish updates about new documents. Then, set up a Cloud Function subscriber that listens for these messages. When a new document is uploaded or updated, the Cloud Function is triggered, and it uses the Vertex AI Search API to ingest, update, or delete the document in your data store.
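A rough sketch of the Cloud Function handler's core logic is below. The message schema (an `action` and a `uri` field) is a hypothetical convention for your publisher, not anything Pub/Sub mandates; the actual ingest/purge calls to the Vertex AI Search API are left as comments since their exact shape depends on your client library and data store configuration.

```python
import base64
import json

def on_document_event(event: dict) -> dict:
    """Decode a Pub/Sub message and decide which data store operation to run.

    Assumes a hypothetical publisher-side message schema:
      {"action": "upsert" | "delete", "uri": "gs://..."}
    Pub/Sub delivers the payload base64-encoded in event["data"].
    """
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    action = payload.get("action", "upsert")
    doc_uri = payload["uri"]
    if action == "delete":
        # Here: call the Vertex AI Search API to remove the document.
        return {"op": "delete", "uri": doc_uri}
    # Here: call the Vertex AI Search import API for new/updated documents.
    return {"op": "import", "uri": doc_uri}

# Simulated message as delivered by Pub/Sub (data is base64-encoded JSON)
msg = {"data": base64.b64encode(json.dumps(
    {"action": "upsert", "uri": "gs://my-bucket/new-doc.pdf"}).encode())}
print(on_document_event(msg))
```

In a real deployment you would register this handler as a Pub/Sub-triggered Cloud Function so it fires automatically on each publish.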
6. Conversational Style (Multi-Turn)
Store previous prompts and responses for each user/thread in a database or cache. Then, include this history in your next prompt. Be mindful of context window limits.
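A minimal in-memory sketch of that pattern (in production you would back this with a database or cache such as Firestore or Memorystore; the class and its turn-capping policy are illustrative):

```python
from collections import defaultdict

class ConversationStore:
    """In-memory stand-in for a per-user/per-thread history store."""

    def __init__(self, max_turns: int = 5):
        # Crude guard against blowing the model's context window:
        # keep only the most recent N turns.
        self.max_turns = max_turns
        self._history = defaultdict(list)

    def add_turn(self, thread_id: str, prompt: str, response: str) -> None:
        self._history[thread_id].append((prompt, response))
        self._history[thread_id] = self._history[thread_id][-self.max_turns:]

    def build_prompt(self, thread_id: str, new_prompt: str) -> str:
        # Prepend stored turns so the model sees the conversation so far.
        lines = []
        for p, r in self._history[thread_id]:
            lines.append(f"User: {p}")
            lines.append(f"Assistant: {r}")
        lines.append(f"User: {new_prompt}")
        return "\n".join(lines)
```

A token- or character-budget cutoff would be a more precise guard than a fixed turn count, but the eviction idea is the same.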
7. Cost Calculation (Vertex AI Search x Grounding API)
- Input prompt - Cost of processing your input prompt for grounding, which includes the grounding facts. This is based on character count.
- Output - Cost of the output generated by the model. This is also based on character count.
- Grounded Generation for grounding on your own retrieved data - This is the cost for using the grounding functionality to generate grounded answers.
- Data Retrieval: Vertex AI Search (Enterprise edition) - This is the query cost for the retrieval of documents by Vertex AI Search.
These costs are usually billed per 1,000 requests. The input prompt cost covers processing the prompt for the model, while Data Retrieval covers the query sent to Vertex AI Search.
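To illustrate how these components combine, here is a toy cost estimator. All rates are placeholder parameters, not real prices — always take the current figures from the Vertex AI pricing page.

```python
def estimate_cost(input_chars: int, output_chars: int, queries: int,
                  in_rate: float, out_rate: float,
                  gen_rate: float, search_rate: float) -> float:
    # in_rate / out_rate:       placeholder price per 1,000 characters
    # gen_rate / search_rate:   placeholder price per 1,000 requests
    # (grounded generation and Vertex AI Search retrieval, respectively)
    return (input_chars / 1000 * in_rate
            + output_chars / 1000 * out_rate
            + queries / 1000 * (gen_rate + search_rate))
```

For example, 2,000 input characters, 1,000 output characters, and 1,000 queries at rates of 0.1, 0.2, 1.0, and 4.0 would total 0.2 + 0.2 + 5.0 = 5.4 in whatever currency the rates are quoted in.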
8. File Personas (Metadata)
Yes, you can create personas by adding metadata fields such as “file_category,” “author,” or “user_group.” During querying, you can apply filters on these metadata fields to tailor results to specific use cases.
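Persona-style conditions can be combined in a single filter expression. This sketch reuses the `field: ANY(...)` membership syntax from the Vertex AI Search filter language and joins clauses with AND; the field names (`user_group`, `file_category`) are assumptions matching the examples above:

```python
from typing import Optional

def persona_filter(user_group: str, category: Optional[str] = None) -> str:
    # Combine metadata conditions with AND, e.g.:
    #   user_group: ANY("legal") AND file_category: ANY("contracts")
    # Field names and the ANY(...) syntax should be verified against
    # your data store schema and the current filter documentation.
    clauses = [f'user_group: ANY("{user_group}")']
    if category:
        clauses.append(f'file_category: ANY("{category}")')
    return " AND ".join(clauses)
```

The resulting string is passed as the search request's filter, so each persona only retrieves (and grounds on) its own slice of the corpus.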
9. Conversational Agents and Generative Fallback
Conversational Agents could be useful if you intend to expand the application beyond document Q&A. For this particular use case it would require significant customization, so the grounding approach is a better fit.
Generative Fallback - Useful when search results have low relevance; the model can still generate an answer on its own.
10. Grounding Multi-Turn and Cost
- Multi-Turn - The Grounding API uses the history of your conversation as the context for the next request.
- Cost - The same per-request pricing applies, but it is computed over the entire input prompt (including the conversation history), the output length, and the number of documents retrieved.
11. Grounded Generation API vs. Grounding API
What you are using is the Grounding API; it is not called the Grounded Generation API. It uses the underlying search engine to fetch data relevant to your prompt and grounds the response on it.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.