This blog was co-authored with Shaoxiong Z., Software Engineer, Code Execution, Vertex AI Agent Engine.
TL;DR: Vertex AI Agent Engine now includes Code Execution in preview, a managed service that provides a secure sandbox for running AI-generated code. This lets you build more powerful applications that can perform calculations, analyze data, and create visualizations, while eliminating the security risks and operational overhead of building your own execution environment. You can call it directly via the SDK, or integrate it as a tool with any LLM and with agent frameworks like the Agent Development Kit (ADK).
When building AI applications like agents, the ability to execute code generated by a Large Language Model (LLM) is a powerful capability. This enables complex calculations, data analysis, and programmatic system interaction. However, implementing this safely presents a significant engineering challenge, requiring a secure sandbox or isolated runtime to manage dependencies and prevent misuse of untrusted code.
To solve this, we’re introducing Code Execution on Vertex AI Agent Engine as a managed service in preview. This new feature provides a fully managed API for a secure, isolated, and stateful sandbox environment, ideal for a wide range of AI applications. Instead of building your own execution infrastructure, you can now make a simple API call to safely run Python or JavaScript code, turning your agents into functional problem-solvers.
Why a managed Code Execution sandbox?
Building your own secure code execution environment is complex. You have to worry about containerization, resource management, state persistence, and, most importantly, preventing malicious code from impacting your systems. Agent Engine’s Code Execution handles this for you.
- Managed & simple: No infrastructure to maintain. You get a simple API to create, execute, and manage runtimes, letting you focus on your application logic.
- Flexible & agnostic: It’s designed to be a fundamental building block. You can use it directly via the API, with any LLM, and integrate it into any agent framework.
- Security first: Code runs in a hardened, isolated sandbox. It has no access to your host system’s files, network, or metadata service, mitigating risks from untrusted, LLM-generated code.
- Stateful & scalable: Sandboxes are persistent for up to 14 days, enabling multi-turn, stateful interactions similar to a local Jupyter notebook. The service is built for production workloads with sub-second latency for both sandbox creation and execution.
Using Code Execution: From direct calls to full agent integration
You can leverage this new capability in several ways, depending on your application’s needs. You can find the full tutorial here.
Use case 1: Direct API for programmatic execution
At its core, Code Execution is a straightforward API. If you need a secure environment to run code programmatically—without an LLM—you can call it directly.
First, you create an AgentEngine instance, which acts as the entry point. Then, you create a SandboxEnvironment associated with it.
import vertexai
from vertexai import types

# Initialize the Vertex AI client
client = vertexai.Client(project="your-gcp-project-id", location="us-central1")

# Create a top-level Agent Engine
agent_engine = client.agent_engines.create()

# Create a secure sandbox for code execution
sandbox_operation = client.agent_engines.sandboxes.create(
    spec={"code_execution_environment": {}},
    name=agent_engine.api_resource.name,
    config=types.CreateAgentEngineSandboxConfig(
        display_name="my-first-sandbox"
    ),
)
sandbox_resource_name = sandbox_operation.response.name
print(f"Sandbox created: {sandbox_resource_name}")
With the sandbox ready, you can send it any Python or JavaScript code (depending on the code execution configuration) as a string. The service executes the code and returns the standard output. Below is a code execution example using Python.
import json

# Execute the code in the sandbox
exec_response = client.agent_engines.sandboxes.execute_code(
    name=sandbox_resource_name,
    input_data={"code": "import math\nprint(math.sqrt(15376))"},
)

# Parse and display the result
result = json.loads(exec_response.outputs[0].data.decode('utf-8'))
print(f"Execution result: {result.get('msg_out')}")
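Because sandboxes are stateful, variables defined in one execute_code call remain available in later calls against the same sandbox, much like cells in a Jupyter notebook. Below is a minimal sketch of a two-turn flow; the parse_sandbox_stdout and run_in_sandbox helpers are illustrative names, not part of the SDK, and they assume the client and sandbox_resource_name created above.

```python
import json

def parse_sandbox_stdout(raw: bytes) -> str:
    """Decode the JSON payload returned by execute_code and extract stdout."""
    return json.loads(raw.decode("utf-8")).get("msg_out", "")

def run_in_sandbox(client, sandbox_name: str, code: str) -> str:
    """Run one code snippet in an existing sandbox and return its stdout."""
    response = client.agent_engines.sandboxes.execute_code(
        name=sandbox_name,
        input_data={"code": code},
    )
    return parse_sandbox_stdout(response.outputs[0].data)

# Two turns against the same sandbox: state set by the first call
# is still in scope for the second one.
# run_in_sandbox(client, sandbox_resource_name, "total = 40")
# run_in_sandbox(client, sandbox_resource_name, "total += 2\nprint(total)")
```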
Use case 2: Powering LLM-driven applications
The real magic happens when you connect the sandbox to a Large Language Model, allowing the LLM to generate code on the fly to solve problems.
A simple pattern involves first generating code from an LLM and then passing it to the sandbox for execution. This works with any model, including Gemini and Claude. For more complex tasks, the sandbox can even handle libraries like matplotlib and return generated files, such as charts and graphs.
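The generate-then-execute pattern can be sketched in a few lines. The helper names below (extract_python_code, generate_and_execute) are illustrative, and the sketch assumes the model wraps its answer in a markdown code fence, which is common but not guaranteed; model and client are the objects from the surrounding examples.

```python
import json
import re

def extract_python_code(llm_text: str) -> str:
    """Pull the first fenced code block out of an LLM reply, or fall
    back to the raw text if the model answered with bare code."""
    match = re.search(r"```(?:python)?\n(.*?)```", llm_text, re.DOTALL)
    return match.group(1).strip() if match else llm_text.strip()

def generate_and_execute(model, client, sandbox_name: str, task: str) -> str:
    """Ask the model for code, then run that code in the sandbox."""
    reply = model.generate_content(
        f"Write Python code to solve this task. Reply with code only.\n{task}"
    )
    code = extract_python_code(reply.text)
    response = client.agent_engines.sandboxes.execute_code(
        name=sandbox_name,
        input_data={"code": code},
    )
    return json.loads(response.outputs[0].data.decode("utf-8")).get("msg_out")
```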
A more advanced pattern is to expose the sandbox to the LLM as a tool. The LLM can then decide when and how to use it. This is the foundation of building autonomous agents.
from vertexai.generative_models import GenerativeModel, Tool, FunctionDeclaration

# Define a Python function that calls the sandbox
def execute_python_code(code: str) -> str:
    """Executes Python code in a secure sandbox."""
    # ... (implementation from the tutorial) ...
    response = client.agent_engines.sandboxes.execute_code(...)
    # ...
    return result.get("msg_out")

# Create a tool from the function
code_tool = Tool(
    function_declarations=[FunctionDeclaration.from_func(execute_python_code)]
)

# Give the tool to Gemini and ask a question
model = GenerativeModel("gemini-2.5-flash", tools=[code_tool])
response = model.generate_content("What is the compound interest for $1000 at 5% for 10 years?")
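Declaring the tool is only half the loop: when Gemini decides to call it, your code must run the function and send the result back so the model can phrase a final answer. Below is a minimal sketch of that round trip using the chat interface, handling a single tool call with no error handling; it assumes the execute_python_code function defined above.

```python
def answer_with_code_tool(model, question: str) -> str:
    """One round of function calling: let the model request the
    execute_python_code tool, run it, and return the final answer."""
    # Deferred import so this sketch only needs the SDK at call time.
    from vertexai.generative_models import Part

    chat = model.start_chat()
    response = chat.send_message(question)
    part = response.candidates[0].content.parts[0]
    if part.function_call and part.function_call.name == "execute_python_code":
        # Run the sandbox-backed tool with the arguments Gemini chose.
        args = dict(part.function_call.args)
        tool_output = execute_python_code(**args)
        # Hand the tool result back so the model can compose its answer.
        response = chat.send_message(
            Part.from_function_response(
                name="execute_python_code",
                response={"result": tool_output},
            )
        )
    return response.text
```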
Use case 3: Building structured agents with the Agent Development Kit (ADK)
When building production-grade, multi-tool agents, a framework provides essential structure. Code Execution integrates seamlessly with the Agent Development Kit (ADK).
You can wrap the sandbox as an ADK FunctionTool, making it a reusable component for any of your agents.
from google.adk.agents import LlmAgent
from google.adk.tools import FunctionTool, ToolContext

def execute_python_tool_for_adk(code: str, tool_context: ToolContext) -> dict:
    """Executes code and uses ToolContext to interact with the agent session."""
    # ... implementation details ...
    return response_dict

# Create an ADK FunctionTool
custom_code_executor = FunctionTool(execute_python_tool_for_adk)

# Create an ADK Agent that uses the tool
data_analyst_agent = LlmAgent(
    model='gemini-2.5-flash',
    name="data_analyst",
    instruction="You are an expert data analyst. Use Python and pandas to analyze data.",
    tools=[custom_code_executor],
)
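The tool body elided above needs the same execute_code round trip as the earlier examples. One possible implementation is sketched below; the tutorial's version may differ, and this sketch assumes the client and sandbox_resource_name created earlier. ADK function tools return a plain dict, so errors are reported as data the model can reason about rather than raised.

```python
import json

def execute_python_tool_for_adk(code: str, tool_context=None) -> dict:
    """Run `code` in the Agent Engine sandbox and report the result.

    Assumes `client` and `sandbox_resource_name` exist as in the
    earlier examples; returns a dict, as ADK expects from tools.
    """
    try:
        response = client.agent_engines.sandboxes.execute_code(
            name=sandbox_resource_name,
            input_data={"code": code},
        )
        output = json.loads(response.outputs[0].data.decode("utf-8"))
        return {"status": "success", "stdout": output.get("msg_out", "")}
    except Exception as exc:  # surface sandbox errors to the model
        return {"status": "error", "message": str(exc)}
```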
This approach allows you to build complex, stateful agents that can intelligently orchestrate tool use, manage conversational memory, and solve multi-step problems.
Managing your resources
Lifecycle management is simple. The SDK provides methods to list, get details for, and delete your sandboxes and the parent Agent Engine.
# List all sandboxes in the agent engine
sandboxes = client.agent_engines.sandboxes.list(name=agent_engine.api_resource.name)
for sandbox in sandboxes:
    print(f"- {sandbox.display_name}: {sandbox.name}")
# Delete a specific sandbox
client.agent_engines.sandboxes.delete(name=sandbox_resource_name)
# Delete the parent Agent Engine and all its resources
agent_engine.delete(force=True)
What’s next?
Code Execution on Agent Engine is a foundational capability for building more powerful agentic applications. By providing a secure, managed runtime, we enable you to create more reliable systems that can reason, plan, and compute.
Ready to get started?
- Explore the full tutorial notebook: Get started with Code Execution on Vertex AI Agent Engine
- Refer to the Vertex AI Agent Engine documentation: learn more about how to deploy and scale your agents with Vertex AI Agent Engine.
Thank you for exploring this overview. We welcome your feedback and encourage you to connect with the Google Cloud Community for further discussion and resources. You can also connect with me on LinkedIn or X/Twitter for feedback and to suggest future content.