Deploying Agents with Inline Source on Vertex AI Agent Engine

Authors: Xuanzhi Lin, Shawn Yang, Ivan Nardini

TLDR: We’re introducing Inline Source Deployment for Vertex AI Agent Engine. Now, you can deploy agents directly from your source files—bypassing the need to provision and manage Cloud Storage buckets entirely. This method aligns your agents with standard engineering requirements, enabling support for Git version control, CI/CD pipelines, and security scanning. It’s the definitive shift from “works in my notebook” to “deploy my agent”.

Introduction

You’ve built an AI agent. It works perfectly in your development environment. But now comes the hard part: deploying it to production in a way that fits your team’s existing workflows. If you’re like most development teams, you have:

  • Version control requirements for auditing and rollback
  • CI/CD pipelines that deploy from Git repositories
  • Infrastructure as Code (Terraform) managing your cloud resources
  • Security policies requiring source code scanning before deployment

While deploying an agent by serializing and pickling an in-memory Python object to Cloud Storage is an option on Vertex AI Agent Engine, this approach presents significant challenges. Pickle files, for instance, lack version control, hinder CI/CD integration, and complicate security reviews, rendering them unsuitable for production workflows.

We introduced Inline Source Deployment to enable direct deployment of AI agents from source files. This method eliminates the need for Cloud Storage buckets and serialization, allowing code to flow seamlessly through your existing deployment infrastructure.


This guide will walk you through inline source deployment, explaining its critical role in production workflows and providing a step-by-step implementation guide.

What is Inline Source Deployment?

Inline source deployment packages your source code files into a compressed archive and sends them directly through the Vertex AI API to deploy agents to Vertex AI Agent Engine. The following diagram illustrates this process.

Instead of creating an in-memory agent object for SDK serialization, you specify directories and files on disk. The SDK packages these into a tarball, base64-encodes it, and includes it inline within the API request body. This process defines the “inline source” approach.

Why Inline Source Deployment Matters

Inline Source Deployment offers several advantages over previous agent deployment methods on Vertex AI Agent Engine:

No Cloud Storage Bucket Required

Agent object deployment uploads your serialized agent to a GCS bucket you must provision and manage. With inline source deployment, the code travels directly through the API request—one less piece of infrastructure to manage. The following diagram visually compares the ‘Before’ (Agent Object) and ‘After’ (Inline Source) deployment flows, highlighting the reduced complexity.

Built for CI/CD Pipelines

Modern deployment workflows deploy from Git repositories. Inline source deployment fits this model perfectly:

# Example GitHub Actions workflow
- name: Deploy Agent
  run: |
    python deploy_agent.py \
      --source-packages agent_package \
      --entrypoint deployment.agent \
      --project ${{ secrets.GCP_PROJECT }}

Your agent code resides in version control alongside your application code. When you merge to main, your CI pipeline automatically deploys the latest version.
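The deploy_agent.py script in the workflow above is not part of the SDK; it is a wrapper you would write yourself. A minimal sketch of such a wrapper, using only standard-library argument parsing (flag names match the workflow step, and the real version would forward these values to the SDK's create call shown later in this guide), might look like:

```python
import argparse

# Hypothetical CLI wrapper for the CI workflow step above
parser = argparse.ArgumentParser(description="Deploy an agent from source")
parser.add_argument("--source-packages", nargs="+", required=True)
parser.add_argument("--entrypoint", required=True,
                    help="module.object path, e.g. deployment.agent")
parser.add_argument("--project", required=True)

# Simulate the CI invocation from the workflow step above
args = parser.parse_args([
    "--source-packages", "agent_package",
    "--entrypoint", "deployment.agent",
    "--project", "demo-project",
])

# Split "module.object" into the two fields the deployment config expects
module, _, obj = args.entrypoint.rpartition(".")
print(module, obj)
```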

Auditable and Reversible

Every deployment is tied to a specific Git commit. To roll back, deploy from a previous commit. To audit changes, run git diff. This visibility is lost with pickled objects.

Security Scanning Compatible

Security teams can scan your agent’s source code before deployment using standard tools (Snyk, SonarQube, etc.). You can’t scan a pickle file.

Deploying an Academic Research Agent

This section provides a complete example of deploying an agent using inline source deployment. The example demonstrates building an academic research agent that can answer questions using Google Search via ADK, with the following package structure:

my_agent/
├── core/
│   └── agent.py             # Your agent definition
├── deployment/
│   ├── agent_app.py         # Wraps the agent in AdkApp
│   └── deploy.py            # Deployment script
└── requirements.txt         # Python dependencies

Notice that the same process and concepts can be applied to any agent built with your favorite framework. Let’s get started!

First, you define your agent in core/agent.py:

from google.adk.agents import LlmAgent
from google.adk.tools import google_search

# Define the agent
root_agent = LlmAgent(
    name="academic_research_agent",
    model="gemini-2.5-flash",
    description="An AI agent that helps with academic research",
    instruction="""You are an expert academic research assistant.
    Use Google Search to find recent papers, articles, and scholarly resources.
    Always cite your sources with URLs.""",
    tools=[google_search],
)

Next, prepare your deployment/agent_app.py script. This script uses the AdkApp template to enable tracing on the Vertex AI Agent Engine platform.

from vertexai import agent_engines
from core.agent import root_agent  # core/ is packaged at the top level

# Wrap the agent for Agent Engine deployment
adk_app = agent_engines.AdkApp(
    agent=root_agent,
    enable_tracing=True,
)

You also declare the dependencies required to deploy the agent in requirements.txt.

google-cloud-aiplatform[adk,agent_engines]>=1.70.0

Finally, you create the deployment/deploy.py script using the inline source configuration.

import vertexai

# Initialize the Vertex AI client
client = vertexai.Client(
    project="your-project-id",
    location="us-central1",
)

# Define the API methods your agent will expose
class_methods = [
    {
        "name": "async_stream_query",
        "api_mode": "async_stream",
        "description": "Stream responses from the agent",
        "parameters": {},
    },
    {
        "name": "async_create_session",
        "api_mode": "async",
        "description": "Create a new session for conversation",
        "parameters": {},
    },
]

# Deploy using inline source
print("🚀 Deploying agent with inline source...")

agent = client.agent_engines.create(
    config={
        "display_name": "Academic Research Agent",
        "description": "AI agent for academic research assistance",
        "labels": {"team": "research", "env": "production"},

        # Inline source configuration
        "source_packages": [
            "core",
            "deployment",
            "requirements.txt"
        ],
        "entrypoint_module": "deployment.agent_app",
        "entrypoint_object": "adk_app",
        "class_methods": class_methods,
    }
)

print("✅ Deployment complete!")
print(f"   Resource: {agent.api_resource.name}")

Inline source deployment requires several parameters that might seem complex at first. Let’s break them down:

  • entrypoint_module: The Python module path (dot notation) that Agent Engine should import. In this example, entrypoint_module = “deployment.agent_app”, the module that defines the AdkApp instance.
  • entrypoint_object: The name of the variable inside the entrypoint module that holds your agent or AdkApp instance. In this example, entrypoint_object = “adk_app”.
  • class_methods: API method schemas that define how clients can interact with your agent. Unlike agent object deployment, which auto-introspects your agent’s methods, inline source deployment requires you to declare the available API methods explicitly.
  • requirements_file: Path to a requirements.txt file within your source_packages. This file is installed during the container build; if omitted, only base dependencies are installed.
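As a concrete illustration, the hypothetical config fragment below focuses on the packaging-related fields only (field names are taken from the parameter breakdown above). The key constraint it demonstrates: the requirements file ships inside the tarball, so it must itself be one of the uploaded paths.

```python
# Hypothetical config fragment: only the packaging-related fields
config = {
    "source_packages": ["core", "deployment", "requirements.txt"],
    "requirements_file": "requirements.txt",
}

# The requirements file must be listed in source_packages --
# otherwise the container build has nothing to install from
assert config["requirements_file"] in config["source_packages"]
print("requirements_file is packaged")
```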

To deploy your agent, first authenticate, then initiate the deployment process. This process typically takes 5-10 minutes.

# Authenticate with Google Cloud
gcloud auth application-default login

# Deploy the agent
python deployment/deploy.py

When deploying an agent using inline source, the SDK validates that all files included in the source_packages exist and are within your current working directory.

# From the Vertex AI SDK
real_file_path = os.path.realpath(file)
if not real_file_path.startswith(project_dir):
    raise ValueError(f"File path '{file}' is outside the project directory")
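You can reproduce this guard locally. The helper below is a sketch of the same check, not the SDK's actual function: it resolves symlinks and ".." segments with os.path.realpath, then requires the result to sit under the project directory.

```python
import os

def is_within_project(file_path, project_dir="."):
    # Resolve symlinks and relative segments, then do a prefix check
    # on the real paths, mirroring the SDK guard shown above
    real_file = os.path.realpath(file_path)
    real_project = os.path.realpath(project_dir)
    return real_file.startswith(real_project + os.sep)

inside = is_within_project("core/agent.py")    # relative path under CWD
outside = is_within_project("../escape.py")    # reaches outside the project
print(inside, outside)
```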

Your source files are compressed into a .tar.gz archive entirely in memory, then base64-encoded for inclusion in the JSON API request.

import base64
import io
import tarfile

def _create_base64_encoded_tarball(source_packages):
    # Build the gzip'd tarball entirely in memory
    tar_fileobj = io.BytesIO()
    with tarfile.open(fileobj=tar_fileobj, mode="w|gz") as tar:
        for file in source_packages:
            tar.add(file)
    tarball_bytes = tar_fileobj.getvalue()
    # Base64-encode so the archive can travel in the JSON request body
    return base64.b64encode(tarball_bytes).decode("utf-8")
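To see the whole journey, the sketch below round-trips the packaging step: it tars and base64-encodes a file, then performs the decode-and-extract that the service side would do on receipt. The function names are illustrative, not the SDK's.

```python
import base64
import io
import os
import tarfile
import tempfile

def encode_sources(paths):
    # Client side: tar + gzip in memory, then base64 for the JSON body
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w|gz") as tar:
        for path in paths:
            tar.add(path)
    return base64.b64encode(buf.getvalue()).decode("utf-8")

def decode_sources(encoded, dest_dir):
    # Service side: base64-decode, then unpack the tarball
    data = base64.b64decode(encoded)
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        tar.extractall(dest_dir)

# Round trip: package a file, "send" it, unpack it, verify the contents
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    with open(os.path.join(src, "agent.py"), "w") as f:
        f.write("root_agent = 'demo'\n")
    cwd = os.getcwd()
    os.chdir(src)  # tar.add() records paths relative to the CWD
    try:
        encoded = encode_sources(["agent.py"])
    finally:
        os.chdir(cwd)
    decode_sources(encoded, dst)
    with open(os.path.join(dst, "agent.py")) as f:
        recovered = f.read()

print(recovered, end="")
```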

Once Vertex AI receives your request, it executes a series of steps to deploy your agent, from decoding the tarball and loading the entrypoint object to exposing the agent’s API methods. Important: all paths must be relative to your current working directory and cannot reference files outside the project.

You’ll see output like:

🚀 Deploying agent with inline source...
✅ Deployment complete!
   Resource: projects/123.../locations/us-central1/reasoningEngines/456...

Once deployed, you can test the agent as shown below.

import vertexai

# Connect to your deployed agent
client = vertexai.Client(project="your-project-id", location="us-central1")
agent = client.agent_engines.get(name="projects/.../reasoningEngines/...")

# Create a session
session = await agent.async_create_session(user_id="researcher_001")

# Query the agent
print("Query: What are the latest breakthroughs in quantum computing?")
print("-" * 70)

async for event in agent.async_stream_query(
    user_id="researcher_001",
    session_id=session["id"],
    message="What are the latest breakthroughs in quantum computing?",
):
    if event.get("content", {}).get("parts"):
        for part in event["content"]["parts"]:
            if "text" in part:
                print(part["text"], end="", flush=True)

print("\n" + "-" * 70)

# The latest breakthroughs in quantum computing ...

Conclusion

Inline source deployment is an additional method Vertex AI Agent Engine offers for deploying your agents in production. By deploying from source files instead of serialized objects, you gain:

  • Version control - Every deployment tied to a Git commit
  • CI/CD integration - Automated deployments from your pipeline
  • Auditability - Clear history of what changed and when
  • Security - Source code scanning before deployment
  • Reproducibility - Deterministic builds from source

Below is a comparison of the two main deployment approaches offered by Vertex AI Agent Engine.

| Aspect                    | Serialized Object                | Inline Source            |
|---------------------------|----------------------------------|--------------------------|
| Primary use case          | Interactive notebook development | CI/CD pipelines          |
| Authentication            | Service account (ADC)            | Service account (ADC)    |
| GCS bucket required?      | Yes (for staging)                | No                       |
| How code is packaged      | Cloudpickle serialization        | Tarball + base64         |
| Version control friendly? | Difficult (binary pickle)        | Excellent (source files) |
| Reproducibility           | Medium (pickle fragility)        | Excellent                |
| Best for                  | Notebook experimentation         | Moving to production     |

For teams building production AI agents, inline source deployment is more than a convenience; it’s a required feature. It enables you to apply the same engineering rigor to AI agents as you would to the rest of your systems.

What’s Next

You now understand how inline source deployment works and why it’s the optimal choice for agent deployments. Here’s what to explore next:

  1. Try it yourself: Convert one of your existing agents to inline source deployment
  2. Set up a pipeline: Integrate agent deployment into your CI/CD workflow
  3. Monitor your agents: Use Cloud Logging and Cloud Trace to observe behavior in production

If you want to learn more:

Questions or feedback? Connect with me on LinkedIn or X/Twitter.
Did you find this guide helpful? Consider sharing it with your team to foster best practices in AI agent deployment on Vertex AI.


Great thanks to Ivan’s break-down.

Feel free to consult @Marcus_Lin and me (@Shawn_Yang) if you encounter any issues during development.

Thanks @ilnardo92 for sharing this. I will try this approach.

Do you have any insights on when the Terraform resource “google_vertex_ai_reasoning_engine” will start supporting inline source deployment? The current resource expects a pickle_object_gcs_uri path.

The version 7.13.0 of the provider, published two days ago, supports inline source deployment to Agent Engine.

Check out the source_code_spec argument of google_vertex_ai_reasoning_engine! :wink: