Let's talk about Vertex AI Memory Bank. Here's a quick FAQ on the top 5 questions we've seen

Hello there,

Since Memory Bank launched in Preview a few weeks ago, I’ve been getting a lot of questions in my DMs and on different threads (thanks for all the engagement!).

Rather than answering questions one by one, the Vertex AI Memory team and I created this FAQ to ensure everyone has the same information.

1. “Why not just build this myself?”

Memory Bank handles the messy state management you really don’t want to build yourself.

Sure, you can spin up your own memory system, but the real headache is writing the code that has to:

  • Summarize conversation history on the fly.
  • Figure out what’s a key fact vs. just noise.
  • Know that when a user says their favorite color is “red” now, it replaces the “blue” from last week.
  • De-dupe memories so you don’t end up with five copies of the same fact.

Memory Bank handles these memory-management problems for you, so you can focus your effort on building the agent itself.
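
To give a feel for the conflict and de-dupe cases in the list above, here is a deliberately naive sketch in plain Python. It is not the Memory Bank API; the `ToyMemoryStore` class, its `remember` method, and the topic keys are all made up for illustration. It assumes the topic has already been extracted from the conversation, which is the hard part a real service uses a model for.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMemoryStore:
    """Hypothetical in-memory store for illustration; not the Memory Bank API."""

    # Maps (user_id, topic) -> the single current fact for that topic.
    facts: dict = field(default_factory=dict)

    def remember(self, user_id: str, topic: str, value: str) -> str:
        key = (user_id, topic)
        normalized = value.strip().lower()
        previous = self.facts.get(key)
        if previous == normalized:
            # Same fact seen again: keep one copy instead of storing a duplicate.
            return "duplicate"
        # A new value for the same topic replaces the old one.
        self.facts[key] = normalized
        return "updated" if previous is not None else "created"


store = ToyMemoryStore()
print(store.remember("user-1", "favorite_color", "blue"))   # created
print(store.remember("user-1", "favorite_color", "Blue "))  # duplicate
print(store.remember("user-1", "favorite_color", "red"))    # updated
```

Even this toy version only works because the topic is handed to it. Deciding from free-form chat that “I’m more into red these days” belongs to the same topic as last week’s “blue” is where hand-rolled systems tend to get complicated.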

2. “So what’s actually under the hood?”

Two components do the work:

  • Memory Generation: We use Gemini to do the “thinking”: understanding the conversation, extracting the facts, and continuously updating them.
  • Memory Retrieval: We use Google’s embedding models for fast semantic search, so the right memory is found when you need it.
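
As a rough illustration of the retrieval idea (not Memory Bank’s actual implementation), semantic search boils down to ranking stored memories by how close their embedding vectors are to the query’s vector. The sketch below uses cosine similarity over tiny hand-written vectors; in a real system the vectors would come from an embedding model, and the `memories` dictionary and `retrieve` function are hypothetical names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings", written by hand for illustration only.
memories = {
    "Favorite color is red": [0.9, 0.1, 0.0],
    "Lives in Madrid": [0.0, 0.9, 0.2],
    "Is allergic to peanuts": [0.1, 0.0, 0.95],
}


def retrieve(query_vector, top_k=1):
    """Return the top_k memory texts most similar to the query vector."""
    ranked = sorted(
        memories,
        key=lambda text: cosine_similarity(query_vector, memories[text]),
        reverse=True,
    )
    return ranked[:top_k]


# A query vector that points mostly along the first ("color") dimension.
print(retrieve([0.8, 0.2, 0.1]))  # ['Favorite color is red']
```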

If you really want to go deeper into the method, I recommend reading the research paper that influenced the design.

3. “What about important features like TTL (Time-to-Live) or de-duping memories?”

Yes, these are critical for any memory management service.

  • Consolidation/De-duping: This is automatic. The generate_memories method is built to handle conflicts and update facts as new info rolls in.
  • Data Aging (TTL): This isn’t in the initial Preview, but it’s on the roadmap. We know you need it to manage costs and keep memories relevant.
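
For anyone unfamiliar with the term, a TTL simply means each memory carries an expiry window and is dropped once that window has passed. The sketch below shows the general idea in plain Python; it says nothing about how Memory Bank will expose this, and `TimedMemory` and `prune` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class TimedMemory:
    """Hypothetical memory record with a time-to-live; for illustration only."""

    fact: str
    created_at: float   # seconds since some reference point
    ttl_seconds: float  # how long the memory stays valid

    def is_expired(self, now: float) -> bool:
        return now - self.created_at >= self.ttl_seconds


def prune(memories, now):
    """Keep only the memories whose TTL has not elapsed at time `now`."""
    return [m for m in memories if not m.is_expired(now)]


memories = [
    TimedMemory("Is planning a trip to Lisbon", created_at=0.0, ttl_seconds=3600.0),
    TimedMemory("Prefers vegetarian food", created_at=0.0, ttl_seconds=86400.0),
]

# Two hours in, the one-hour memory has expired and the one-day memory remains.
print([m.fact for m in prune(memories, now=7200.0)])  # ['Prefers vegetarian food']
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the example deterministic and easy to test.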

4. “Can I use this with Gemma, Llama, Mistral, or other open-source models?”

No, not at the moment. Memory Bank is built to work tightly with the Gemini family. That close integration is how it automates all the memory management features reliably. That said, Memory Bank is still in Preview, so there is still time for this to change.

5. “Let’s talk about cost. We’re a little worried about pricing…”

Right now, it’s in Preview and provided at no cost. We’ll announce pricing publicly at a later date.

When it goes GA, the price will be on the main Vertex AI pricing page, and we hear your feedback about needing clear pricing information.


We hope you find these answers helpful. This is also the perfect time to test Vertex AI Memory Bank and share your feedback with us.

If you want to get started, all the resources you need are below.

What other questions do you have? Share what you are building in the comments!

Can we also alter/remove memories from Memory Bank?
How do I control what goes into memory vs. what doesn’t?

Excellent catch, @ishank47, thank you for pointing this out. This is a gap in our documentation, and we’re working to update it now. I’ll update the FAQ and post a link here as soon as it’s live.

Is there going to be a dashboard in GCP that shows information about the memories and general statistics?

Hello @ilnardo92, were you able to make this change? Curious if we can update/delete/alter memories or whether we can manually control them. Cheers!

This seems great, but I’m very reluctant to use it; it’s free now but since I have no idea how much the pricing will be I’m afraid of being stung. Is there any way you can indicate what Google is considering vis-a-vis pricing?

Any plans to integrate it with Gemini CLI as well?

Hi everyone,

Thank you for your great feedback on Vertex AI Memory Bank! We listened, and I’m happy to share that the new release integrates many of your suggestions.

This update is all about giving you more control. Ever wished you could:

  • Make memories automatically expire with a set TTL?
  • Teach your agent custom topics tailored to your use case?
  • Show it exactly how to extract information using few-shot examples?
  • Choose the underlying models to optimize for performance or cost?

You can now do all of this and more. I’ve put together a comprehensive post with all the details and code snippets.

Find the full announcement here: Announcing customization features for Vertex AI Memory Bank

Thanks again for helping us shape the product!

We can currently define a custom topic in Vertex AI Memory Bank, which then creates facts for that topic over time.

What if we don’t want to predefine topics, but instead let the memory system automatically analyze conversations, infer the topics, and generate evolving summaries as the dialogue grows?

For example, if I talk about my wife multiple times, a “wife” memory should be created automatically and continue to grow whenever I bring it up. Each time I mention it, the entity’s summary should be updated with proper timestamps.

Also, I’ve noticed that the agent’s own responses are not being included in memory. Ideally, I’d like to see a consolidate
