Vertex AI Prompt Optimizer: New algorithms & SDK for general availability

This blog has been co-authored with George Lee, Product Manager, Cloud AI Research.

TL;DR

Crafting the perfect prompt has always been more of an art than a science, and a time-consuming one at that. The Vertex AI Prompt Optimizer is now Generally Available (GA) and is set to automate this process. This guide will walk you through its two modes: Zero-Shot for rapid refinement and Data-Driven for deep, performance-based tuning, all using a new SDK.

From the manual grind of prompt engineering…

If you’ve worked with Large Language Models (LLMs), you know the routine. You have a great idea and a powerful model, but bridging the gap between them means a winding road of manual prompt-tweaking, a process that is often tedious and fraught with friction.

The process often looks like this:

  • Endless Iteration: Manually rewriting prompts to see what sticks.
  • Model Drift: A prompt perfected for one model version can break with the next update, forcing you back to the drawing board.
  • Subjective “Good”: It’s hard to prove a new prompt is actually better without a systematic way to measure its performance.

This cycle slows down prototyping and keeps innovative applications from reaching production.

…to automated, performance-driven prompt tuning!

The Vertex AI Prompt Optimizer directly addresses this pain point. First launched in preview a year ago, it’s now officially Generally Available (GA), and this release packs two major upgrades:

  1. Powerful New Algorithms: We’ve upgraded the core engines for both the Zero-Shot and Data-Driven optimizers to deliver smarter, more effective prompt enhancements.
  2. A Dedicated SDK Experience: We’ve made programmatic interaction easier than ever. The new SDK offers a dedicated and intuitive interface, allowing you to integrate prompt tuning directly into your development workflows.

This tool handles the heavy lifting of prompt optimization for you, replacing manual guesswork with a repeatable, programmatic way to enhance your prompts.

This guide provides a complete walkthrough of both methods, based on our new notebook tutorial.

Part 1: Quick wins with the Zero-Shot Optimizer

The fastest way to get started is with the zero-shot approach. You can either generate a prompt from scratch by describing your goal or refine an existing one. The optimizer rewrites your prompt for clarity, structure, and effectiveness based on established best practices.

Using the Vertex AI SDK, you can improve a prompt with a single method call. The service uses a sophisticated meta-prompt to analyze and rewrite your input. Let’s say you want to generate a well-structured prompt for a Q&A assistant. You can start with a simple description of the task.

# Make sure you've authenticated and initialized the client
from IPython.display import Markdown, display
import vertexai

client = vertexai.Client(project="your-project", location="your-location")

prompt = "Generate system instructions for a question-answering assistant"
response = client.prompt_optimizer.optimize_prompt(prompt=prompt)

display(Markdown(response.suggested_prompt))

# You are a Prompt Engineering Assistant. Your goal is to help users craft a comprehensive set of system instructions for a question-answering AI assistant....

The response.suggested_prompt attribute contains a complete, well-structured prompt ready for use, generated in seconds with no dataset required. The interactive Gradio application below shows the complete output:

A gif of the custom VAPO Results Viewer Gradio interface shows a platform for optimizing prompts using Vertex AI's zero-shot optimization, with fields to input an original prompt and view the suggested optimized prompt.

Notice how the Zero-Shot Optimizer also returns detailed output you can review via the applicable_guidelines attribute. This output not only shows how your prompt was improved but also offers insights to help you write better prompts yourself.
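For example, you can iterate over the returned guidelines to see which best practices were applied. This is a minimal sketch; the exact structure of each entry may vary, so printing is the safest first step:

# Review the prompt-engineering guidelines the optimizer applied
for guideline in response.applicable_guidelines:
    print(guideline)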

Part 2: Deep prompt tuning with the Data-Driven Optimizer

When you need the absolute best performance for a specific task, the Data-Driven Optimizer is your tool of choice. It runs a batch job that systematically evaluates and rewrites your prompt against your own dataset and metrics.

Let’s walk through the three main steps from the tutorial.

Step 1: Prepare your Data and configuration

First, you need a dataset. For our example, we use a JSONL file where each line contains a question, context, and a ground-truth target answer, which is crucial for evaluation.

import pandas as pd

input_data_path = "gs://github-repo/prompts/prompt_optimizer/rag_qa_dataset.jsonl"
prompt_optimization_df = pd.read_json(input_data_path, lines=True)
prompt_optimization_df.head()
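For reference, each line of the dataset looks roughly like the record below. The field names match the placeholders in the prompt template you’ll define next ({question}, {ctx}, {target}); the values here are made up for illustration:

{"question": "Who wrote On the Origin of Species?", "ctx": "On the Origin of Species was published by Charles Darwin in 1859.", "target": "Charles Darwin"}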

Next, you define the optimization job using the OptimizationConfig class from the notebook. This Pydantic model ensures your configuration is structured and validated correctly. You’ll specify the initial prompt, the target model (e.g., gemini-2.5-flash), evaluation metrics (question_answering_correctness, fluency), and your data paths.

output_path = f"{BUCKET_URI}/optimization_results/"

vapo_data_settings = {
    "system_instruction": "You are an helpful assistant. Given a question with context, provide the correct answer to the question.",
    "prompt_template":  "Some examples of correct answer to a question are:\\nQuestion: {question}\\nContext: {ctx}\\nAnswer: {target}",
    "target_model": "gemini-2.5-flash",
    "optimization_mode": "instruction",
    "eval_metrics_types": ["question_answering_correctness", "fluency"],
    "eval_metrics_weights": [0.8, 0.2],
    "aggregation_type": "weighted_sum",
    "input_data_path": input_data_path,
    "output_path": output_path,
    "project": PROJECT_ID,
}

vapo_data_config = OptimizationConfig(**vapo_data_settings)
vapo_data_config_json = vapo_data_config.model_dump()
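The optimization job reads its configuration from GCS, so the last part of this step is writing the serialized config to your bucket. Here is a minimal sketch using the google-cloud-storage client (the config.json object name is a placeholder, and we assume BUCKET_URI has the form gs://your-bucket):

import json
from google.cloud import storage

# Upload the validated configuration so the optimization job can read it
storage_client = storage.Client(project=PROJECT_ID)
bucket = storage_client.bucket(BUCKET_URI.removeprefix("gs://"))
blob = bucket.blob("config.json")
blob.upload_from_string(
    json.dumps(vapo_data_config_json), content_type="application/json"
)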

Step 2: Run the prompt optimization job

With the configuration saved to a file in a Google Cloud Storage (GCS) bucket, you launch the optimization job as shown below. We set wait_for_completion=True to block until the results are ready.

vapo_data_run_config = {
    "config_path": "gs://your-bucket/config.json",
    "wait_for_completion": True,
    "service_account": "your-service-account"
}

result = client.prompt_optimizer.optimize(method="vapo", config=vapo_data_run_config)

The notebook uses the Vertex AI client to start the process, which runs as a custom job on Vertex AI. You can monitor the job in the Custom jobs UI under Training in the Vertex AI console.
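If you’d rather check on the job without leaving your notebook, you can also list recent custom jobs programmatically. A minimal sketch with the google-cloud-aiplatform package (PROJECT_ID and LOCATION are placeholders for your own values):

from google.cloud import aiplatform

# List recent custom jobs and their states to find the optimization run
aiplatform.init(project=PROJECT_ID, location=LOCATION)
for job in aiplatform.CustomJob.list(order_by="create_time desc"):
    print(job.display_name, job.state)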

The image displays a Vertex AI interface with a left navigation pane and a main content area showing a table of "Custom jobs" with details such as job name, ID, status, type, duration, last updated time, and creation date.

Step 3: Analyze the results

The job output includes detailed logs, evaluation results for each tested prompt, and, most importantly, the best-performing prompt. The tutorial notebook includes helper functions to programmatically retrieve the top prompt from the output files in GCS.

best_instruction, _ = get_best_vapo_results(output_path)
print("The optimized instruction is:" , best_instruction)

To make analysis even easier, you can use the same interactive Gradio application, which lets you visually explore and compare all the generated prompts and their evaluation scores, along with confidence levels and explanations. In this straightforward scenario, with a basic dataset, the Data-Driven Optimizer converged on the optimal instruction quickly, requiring only a few runs before stopping.

A gif of the custom VAPO Results Viewer Gradio interface showing Data-Driven Optimization results.

Some considerations for optimizing your prompts with Vertex AI Prompt Optimizer

Here are a few things to keep in mind to get the most out of the Vertex AI Prompt Optimizer.

When to use which optimizer

Here’s a quick guide on when to use each optimizer:

  • Use the Zero-Shot Optimizer when you:

    • Need a quick improvement or want to generate a prompt from a simple description.
    • Are adapting existing prompts to a newer model version.
    • Don’t have a labeled dataset for your task.
    • Want a model-independent approach; it can optimize prompts for any model.
  • Use the Data-Driven Optimizer when you:

    • Need to maximize performance for a specific, critical task.
    • Have a dataset of at least 5-10 examples (50-100 is recommended for best results) with ground-truth data.
    • Want to optimize against specific evaluation metrics, including custom ones, and an expected outcome.
    • Are targeting generally available Gemini models, the only models currently supported.

Permissions, regions & pricing

The Data-Driven Optimizer runs a job whose service account needs specific IAM roles (Vertex AI User, Storage Object Admin, etc.); make sure these are granted before starting. Also, this optimizer cannot be used with models in preview, as they are often only available in the global region, which the underlying job system doesn’t support. For other models, use region-specific locations like us-central1 to meet data residency requirements (rules about where data must be physically stored).

Regarding pricing, the Vertex AI Prompt Optimizer operates on a pay-as-you-go basis:

  • Zero-Shot Optimizer: This real-time call is priced as part of the standard cost of calls to the Gemini API.
  • Data-Driven Optimizer: This feature runs a custom job. You are billed for the underlying compute resources used (such as the n1-standard-4 virtual machine specified in the notebook) and for the API calls made during the optimization process.

Conclusion

The Vertex AI Prompt Optimizer offers a solution to one of the most time-consuming parts of the Gen AI lifecycle. With its dual-mode approach, you can either perform rapid, data-free refinements with the Zero-Shot Optimizer or conduct deep, metric-driven tuning with the Data-Driven Optimizer. Finally, this SDK-driven tool provides a practical way to improve model outputs, saving valuable development time. More is coming on the UI front, so stay tuned!

What’s next?

To get started on your prompt optimization journey with Vertex AI Prompt Optimizer, check out the official documentation and the notebook tutorial this guide is based on.

We’d love to hear from you! Share your feedback and connect on LinkedIn, X/Twitter.

Happy building!
