Vertex AI RAG Engine Pricing [retrieveContexts]

Hi Team,

I'd like to ask about the pricing for Vertex AI RAG Engine without grounding (the Retrieve Contexts endpoint):

POST /v1/projects/{project}/locations/{location}/ragCorpora:retrieveContexts

Since retrieveContexts doesn’t invoke an LLM, I assume I will pay for:

  • Embeddings (at ingest time)

  • Embeddings (the query)

  • Vector DB storage

Is this correct?

And for generation with an LLM + RAG Engine + grounding, is the cost $2.5 / 1,000 requests (grounding with your data) plus the costs listed above?

https://cloud.google.com/vertex-ai/generative-ai/pricing

https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/rag-api

Hi @tester88 ,

You’re correct. When using retrieveContexts without grounding or LLM generation, costs apply to:

  • Embeddings at ingest time

  • Embeddings at query time

  • Vector DB storage

There’s no LLM cost for retrieveContexts alone.

For generation with RAG + grounding, the $2.5 / 1,000 requests charge applies in addition to the embedding and storage costs.
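
To make the breakdown above concrete, here is a rough Python sketch of the arithmetic. Only the $2.5 / 1,000 grounded requests figure comes from this thread. The names `RagUsage` and `estimate_cost`, the per-1,000-character embedding unit, and the per-GB storage unit are assumptions made for illustration, so take the real unit prices from the pricing page. Any model token charges for the generation call itself are left out.

```python
from dataclasses import dataclass


@dataclass
class RagUsage:
    """Hypothetical usage figures for one billing period."""
    ingest_chars: int            # characters embedded at ingest time
    query_chars: int             # characters embedded across all queries
    stored_gb: float             # vector DB storage, in GB (assumed unit)
    grounded_requests: int = 0   # 0 when only retrieveContexts is used


def estimate_cost(usage: RagUsage,
                  embed_price_per_1k_chars: float,
                  storage_price_per_gb: float,
                  grounding_price_per_1k_requests: float = 2.5) -> dict:
    """Sum the cost components discussed in the thread.

    The embedding and storage prices are placeholders supplied by the
    caller; they are NOT real Vertex AI prices.
    """
    # Ingest-time and query-time embeddings are billed the same way here.
    embedding = ((usage.ingest_chars + usage.query_chars) / 1000
                 * embed_price_per_1k_chars)
    storage = usage.stored_gb * storage_price_per_gb
    # Grounding fee applies only to generate-with-grounding requests.
    grounding = (usage.grounded_requests / 1000
                 * grounding_price_per_1k_requests)
    return {
        "embedding": embedding,
        "storage": storage,
        "grounding": grounding,
        "total": embedding + storage + grounding,
    }


if __name__ == "__main__":
    # Placeholder prices, for illustration only.
    retrieval_only = RagUsage(ingest_chars=2_000_000,
                              query_chars=500_000,
                              stored_gb=10.0)
    print(estimate_cost(retrieval_only,
                        embed_price_per_1k_chars=0.001,
                        storage_price_per_gb=0.5))
```

With `grounded_requests` left at 0 the grounding term drops out, which matches the retrieveContexts-only case. Setting it to the number of grounded generate calls adds the per-request fee on top of the same embedding and storage terms.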