This blog was co-authored with Joyce Liu, Software Engineer, and Rajesh Velicheti, Software Engineering Manager at Google Cloud.
TL;DR: This post demonstrates how to build and deploy AI agents using the new integration between the Agent2Agent (A2A) protocol and Vertex AI Agent Engine, now in preview. Our step-by-step guide shows you how to create, test, and deploy fully-managed, interoperable AI agents, preparing you for advanced collaborative multi-agent applications.
When building multi-agent systems for production, you often encounter significant challenges. While highly specialized AI agents excel at individual tasks, orchestrating their collaboration becomes a massive engineering hurdle. Each agent frequently possesses a unique API, compelling developers to construct custom, one-off integrations for every connection. This approach is not scalable, leading to complex, difficult-to-maintain code and hindering innovation in AI agent deployment.
This is where the Agent2Agent (A2A) protocol comes in. A2A is an open standard that acts as a universal API for AI agents, ensuring they can communicate effectively. When you combine this standard with Vertex AI Agent Engine—a fully-managed, serverless platform—you get a scalable solution for deploying interoperable AI agents in production.
Why A2A on Vertex AI Agent Engine?
Combining the A2A protocol with Agent Engine provides several key advantages for developers building multi-agent systems.
First and foremost, this new integration unlocks a simpler deployment pattern. Previously, using A2A with Agent Engine often meant deploying an A2A client on the platform while hosting the agent itself on a separate runtime such as Cloud Run. The new pattern eliminates the significant “glue code” and architectural complexity of managing two services, allowing you to deploy the entire A2A agent directly as a single Agent class, just like agents built with other frameworks, all on a single platform.
This, in turn, streamlines the deployment process, enabling you to package your agent and scale it on a secure, enterprise-grade endpoint with only a few lines of code. Finally, by adopting A2A on Agent Engine, you’re creating a service with a clean, well-defined, and reusable API that makes it easier for other applications to interact with it.
Getting started with A2A on Agent Engine
This step-by-step guide will walk you through building and deploying your first A2A agent, based on our sample notebook. In this scenario, we’ll construct a straightforward system where Agent A requires Agent B’s capabilities. This will serve as an introduction to the core components of A2A and Agent Engine integration.
Step 1: Set up your environment
First, you’ll need to install the necessary packages. a2a-sdk is the foundational open-source SDK for building A2A-compliant agents, and google-cloud-aiplatform is the Vertex AI SDK, which contains the new Agent Engine template we’ll use for deployment.
%pip install --upgrade --quiet "a2a-sdk >= 0.3.4"
%pip install --upgrade --quiet "google-cloud-aiplatform[agent_engines, adk]>=1.112.0"
Next, configure your Google Cloud project information and initialize the Vertex AI client. The http_options parameter is used here to access the new, pre-release features in the v1beta1 version of the API.
import vertexai
from google.genai import types
PROJECT_ID = "[your-project-id]"
LOCATION = "us-central1"
BUCKET_URI = f"gs://{PROJECT_ID}-a2a-bucket"
# Initialize Vertex AI session
vertexai.init(project=PROJECT_ID, location=LOCATION, staging_bucket=BUCKET_URI)
# Initialize the GenAI client
client = vertexai.Client(
    project=PROJECT_ID,
    location=LOCATION,
    http_options=types.HttpOptions(
        api_version="v1beta1",
        base_url=f"https://{LOCATION}-aiplatform.googleapis.com/",
    ),
)
Step 2: Define the ADK Agent
Before building an A2A agent, define the core agent logic that will power it. The Agent Development Kit (ADK) is used for this purpose. This ADK agent will handle the actual agent components (reasoning loop, LLM calls, and tool use), which will then be wrapped with the A2A protocol layer to ensure interoperability.
from google.adk.agents import LlmAgent
from google.adk.tools import google_search_tool
# Create your agent
qna_agent = LlmAgent(
    model='gemini-2.5-flash',
    name='qa_assistant',
    description='I answer questions using web search.',
    instruction="""You are a helpful Q&A assistant.
When asked a question:
1. Use Google Search to find current, accurate information
2. Synthesize the search results into a clear answer
3. Cite your sources when possible
4. If you can't find a good answer, say so honestly
Always aim for accuracy over speculation.""",
    tools=[google_search_tool.google_search],
)
Step 3: Define the AgentCard and AgentExecutor for A2A Communication
The next crucial step is to define the network interface for our ADK agent to facilitate external communication. The A2A protocol necessitates two key components for this: an AgentCard and an AgentExecutor.
The AgentCard serves as a digital “business card” for your agent—a structured JSON document that outlines its capabilities and interaction protocols for other agents.
Below you can see how to create the agent card with the new create_agent_card helper function.
from a2a.types import AgentSkill
from vertexai.preview.reasoning_engines.templates.a2a import create_agent_card
# Define a skill for your agent
qna_agent_skill = AgentSkill(
    id='web_qa',
    name='Web Q&A',
    description='Answer questions using current web search results',
    tags=['question-answering', 'search', 'research'],
    examples=[
        'What is the current weather in Tokyo?',
        'Who won the latest Nobel Prize in Physics?',
        'What are the symptoms of the flu?',
        'How do I make sourdough bread?',
    ],
)

# Create the Agent Card
qna_agent_card = create_agent_card(
    agent_name='Q&A Agent',
    description='A helpful assistant agent that can answer questions.',
    skills=[qna_agent_skill],
)
Note the first level of integration here: the create_agent_card helper function we’re using comes directly from the vertexai.preview.reasoning_engines package and builds an agent card that is compatible with Vertex AI Agent Engine deployment. You can inspect its structure using the model_dump method and see key fields such as name, description, skills, and url.
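Here is a minimal sketch of that inspection (depending on your a2a-sdk version, you may need by_alias=True to see the camelCase keys shown in the output below):
from pprint import pprint

# Dump the AgentCard to a dictionary; by_alias keeps the wire-format (camelCase) field names.
pprint(qna_agent_card.model_dump(by_alias=True, exclude_none=True))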
{...
'name': 'Q&A Agent',
'preferredTransport': 'HTTP+JSON',
'protocolVersion': '0.3.0',
...
'skills': [{'description': 'Answer questions using current web search results',
'examples': ['What is the current weather in Tokyo?',
'Who won the latest Nobel Prize in Physics?',
'What are the symptoms of the flu?',
'How do I make sourdough bread?'],
'id': 'web_qa',
'inputModes': ['text/plain'],
'name': 'Web Q&A',
'outputModes': ['text/plain'],
'tags': ['question-answering', 'search', 'research']}],
'supportsAuthenticatedExtendedCard': True,
'url': 'http://localhost:9999/',
'version': '1.0.0'}
Currently, the URL points to localhost, which is suitable for local testing. When we deploy to Agent Engine, this URL will be automatically updated to point to the managed endpoint.
With the ADK agent and its “business card” (the AgentCard) defined, the AgentExecutor is the final essential component. This class acts as the bridge between the standardized A2A protocol and your agent’s custom internal logic. Its responsibilities include:
- Receive requests from the A2A server.
- Translate the user’s query into a format the ADK agent can understand.
- Manage the lifecycle of the task (e.g., submitted -> working -> completed).
- Package the agent’s final answer into a standard A2A Artifact (an object containing the agent output) to send back to the client.
Here is the complete implementation for our QnAAgentExecutor.
# Imports for the Agent Executor
from a2a.server.agent_execution import AgentExecutor, RequestContext
from a2a.server.events import EventQueue
from a2a.server.tasks import TaskUpdater
from a2a.types import TaskState, TextPart, UnsupportedOperationError
from a2a.utils import new_agent_text_message
from a2a.utils.errors import ServerError
from google.adk import Runner
from google.adk.artifacts import InMemoryArtifactService
from google.adk.memory.in_memory_memory_service import InMemoryMemoryService
from google.adk.sessions import InMemorySessionService
from google.genai import types


# Assumes 'qna_agent' (the ADK LlmAgent) is already defined
class QnAAgentExecutor(AgentExecutor):
    """
    Agent Executor that bridges the A2A protocol with our ADK agent.
    """

    def __init__(self):
        """Initializes the executor with a lazy-loading pattern for efficiency."""
        self.agent = None
        self.runner = None

    def _init_agent(self):
        """
        Lazy initialization of the ADK Runner. This avoids loading the agent
        and its resources until the first request is received.
        """
        if self.agent is None:
            self.agent = qna_agent
            # The Runner orchestrates the agent's execution, managing LLM calls,
            # tool execution, and state.
            self.runner = Runner(
                app_name=self.agent.name,
                agent=self.agent,
                # For this tutorial, we use in-memory services. In production,
                # you might use persistent storage like Vertex AI Memory Bank.
                artifact_service=InMemoryArtifactService(),
                session_service=InMemorySessionService(),
                memory_service=InMemoryMemoryService(),
            )

    async def execute(
        self,
        context: RequestContext,
        event_queue: EventQueue,
    ) -> None:
        """
        Processes a user query by running the ADK agent and returning the answer.
        """
        self._init_agent()
        query = context.get_user_input()
        updater = TaskUpdater(event_queue, context.task_id, context.context_id)

        # Update task status through its lifecycle
        await updater.submit()
        await updater.start_work()

        try:
            session = await self.runner.session_service.get_or_create_session(
                app_name=self.runner.app_name,
                user_id='user',  # In a real app, use an actual user ID
                session_id=context.context_id,
            )
            content = types.Content(role='user', parts=[types.Part(text=query)])

            # Run the agent asynchronously. This may involve multiple LLM calls and tool uses.
            async for event in self.runner.run_async(
                session_id=session.id,
                user_id='user',
                new_message=content,
            ):
                # We listen for the final response from the agent's run.
                if event.is_final_response():
                    answer_text = " ".join(
                        [part.text for part in event.content.parts if part.text]
                    )
                    # Add the answer as an A2A Artifact. This is the official "result" of the task.
                    await updater.add_artifact(
                        [TextPart(text=answer_text)],
                        name='answer',
                    )
                    # Mark the task as successfully completed.
                    await updater.complete()
                    break  # Exit the loop once we have the final answer.
        except Exception as e:
            # Inform the client if anything goes wrong.
            await updater.update_status(
                TaskState.failed,
                message=new_agent_text_message(f"An error occurred: {str(e)}"),
            )
            raise

    async def cancel(self, context: RequestContext, event_queue: EventQueue):
        """Handles task cancellation requests."""
        # This simple agent doesn't support cancellation, so we raise the
        # standard error to inform the client.
        raise ServerError(error=UnsupportedOperationError())
With this final component in place, all necessary elements are available to assemble, test, and deploy the A2A agent.
Step 4: Test the Agent locally
Before deploying, you can test the A2A agent locally. The A2aAgent class from the Vertex AI SDK wraps your AgentCard and AgentExecutor together and allows you to simulate calls to the agent as if it were deployed.
from vertexai.preview.reasoning_engines import A2aAgent
a2a_agent = A2aAgent(agent_card=qna_agent_card, agent_executor_builder=QnAAgentExecutor)
a2a_agent.set_up()
This marks the second level of integration: A2A agents are now incorporated into Vertex AI Agent Engine’s existing Agent templates. These templates provide the most streamlined path for agent deployment, automatically managing critical development aspects like object serialization and abstracting initialization logic from prompt response handling. Specifically, they instantiate necessary A2A components—such as the RESTHandler for request routing and a DefaultRequestHandler that interfaces with your custom AgentExecutor—and map them directly to corresponding A2A API methods (e.g., on_message_send, on_get_task). This crucial enhancement eliminates the boilerplate code previously required, significantly simplifying development.
Once set up, you can simulate a client discovering the agent by requesting its “business card.” The response will be the AgentCard JSON, confirming the local test server is configured correctly.
from pprint import pprint

# build_get_request is a helper (from the sample notebook) that creates a mock HTTP GET request for the test.
request = build_get_request(None)

# We call the method on our local agent instance to fetch its card.
response = await a2a_agent.handle_authenticated_agent_card(request=request, context=None)

# The output is the agent's full card.
pprint(response)
# {'name': 'Q&A Agent',
#  'url': 'https://your-region-aiplatform.googleapis.com/v1beta1/projects/your-project/locations/your-region/reasoningEngines/test-agent-engine/a2a',
#  ...}
Then, you can call the on_message_send method, which is the standard A2A endpoint for starting a new task. The system immediately acknowledges the request and returns a Task object in the TASK_STATE_SUBMITTED state, along with a unique task_id we can use to track it.
# Import the os module (build_post_request is the mock POST helper from the sample notebook)
import os

# This dictionary is the A2A-formatted payload containing the user's query.
message_data = {
    "message": {
        "messageId": f"msg-{os.urandom(8).hex()}",
        "content": [{"text": "What is the capital of France?"}],
        "role": "ROLE_USER",
    },
}

# build_post_request helper to create a mock HTTP POST request with our payload.
request = build_post_request(message_data)

# This call initiates the task on the local agent.
response = await a2a_agent.on_message_send(request=request, context=None)

# We parse the JSON response to get the unique ID for the task we just created.
task_id = response['task']['id']
print(f"\nThe Task ID is: {task_id}")
# The Task ID is: 0b6a3b02-5904-44ba-ac20-add5d20a3891
With the task_id in hand, the result can now be polled. The on_get_task method is called to retrieve the task’s current status. As the agent’s task is simple, it should already be completed. The response will show the TASK_STATE_COMPLETED status and include the final answer within the artifacts field.
# We create a simple payload with the ID of the task we want to query.
task_data = {"id": task_id}
request = build_get_request(task_data)

# This call fetches the task's current state and, if complete, its final result.
response = await a2a_agent.on_get_task(request=request, context=None)

for artifact in response['artifacts']:
    # Each artifact part carries a piece of the agent's output.
    if artifact['parts'] and 'text' in artifact['parts'][0]:
        print(f"**Answer**:\n {artifact['parts'][0]['text']}")
    else:
        print("Could not extract text from artifact parts.")
# Answer: Paris is the capital and largest city of France....
Step 5: Deploy to Agent Engine
The next step is to deploy the agent to Vertex AI Agent Engine, a fully-managed, scalable platform.
With a single create() call, the Vertex AI SDK takes your local a2a_agent object, packages it, and provisions a secure, scalable, and fully-managed serverless endpoint on Agent Engine to host it.
# Assuming client and a2a_agent have been initialized
remote_a2a_agent = client.agent_engines.create(
    agent=a2a_agent,
    config={
        "display_name": a2a_agent.agent_card.name,
        "description": a2a_agent.agent_card.description,
        "requirements": [
            "google-cloud-aiplatform[agent_engines,adk]>=1.112.0",
            "a2a-sdk >= 0.3.4",
        ],
        "http_options": {
            "base_url": f"https://{LOCATION}-aiplatform.googleapis.com",
            "api_version": "v1beta1",
        },
        "staging_bucket": BUCKET_URI,
    },
)
Once the deployment completes successfully, your agent appears in the Vertex AI Agent Engine console.
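If you prefer to verify from code, here is a small sketch that assumes the preview client.agent_engines surface also exposes a list() method:
# List the Agent Engine deployments in the configured project and location
# (assumes client.agent_engines.list() is available in this preview SDK).
for engine in client.agent_engines.list():
    print(engine)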
Step 6: Interact with the Deployed Agent
Once deployed, you can interact with your agent through its standardized API using various methods, including the Vertex AI SDK for Python, an A2A Client, or direct HTTP requests, ensuring flexibility for diverse developer needs. The simplest is using the Vertex AI SDK, which provides a proxy object that makes remote calls feel just like local ones.
import os

# Create a message
message_data = {
    "messageId": f"msg-{os.urandom(8).hex()}",
    "role": "user",
    "parts": [{"kind": "text", "text": "What is the capital of Italy?"}],
}

# Invoke the agent
response = await remote_a2a_agent.on_message_send(**message_data)

# The response contains a Task object with status and ID
task_object = None
async for chunk in response:
    # Assuming the first chunk contains the task object
    if isinstance(chunk, tuple) and len(chunk) > 0 and hasattr(chunk[0], 'id'):
        task_object = chunk[0]
        break

# Get the task id
task_id = task_object.id

# Get the task result
task_data = {
    "id": task_id,
    "historyLength": 1,  # Include conversation history
}
result = await remote_a2a_agent.on_get_task(**task_data)

# Artifacts contain the actual results
for artifact in result.artifacts:
    # Access the text through the 'root' attribute of the Part object
    if artifact.parts and hasattr(artifact.parts[0], 'root') and hasattr(artifact.parts[0].root, 'text'):
        print(f"**Answer**:\n {artifact.parts[0].root.text}")
    else:
        print("Could not extract text from artifact parts.")
# Answer: Rome is the capital of Italy...
As shown above, you use the task_id returned by on_message_send to poll for the result with the on_get_task method.
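For longer-running tasks, a single fetch may return before the agent has finished. Below is a small polling sketch built on the same on_get_task call; the state strings follow the TASK_STATE_* values shown earlier, but check the exact representation your SDK version returns:
import asyncio

async def wait_for_completion(agent, task_id: str, timeout_s: float = 60.0, interval_s: float = 2.0):
    """Poll on_get_task until the task reaches a terminal state (sketch)."""
    deadline = asyncio.get_running_loop().time() + timeout_s
    while asyncio.get_running_loop().time() < deadline:
        result = await agent.on_get_task(id=task_id)
        # The blog shows states such as TASK_STATE_SUBMITTED and TASK_STATE_COMPLETED.
        state = str(result.status.state)
        if "SUBMITTED" not in state and "WORKING" not in state:
            return result
        await asyncio.sleep(interval_s)
    raise TimeoutError(f"Task {task_id} did not complete within {timeout_s}s")

# Example usage (inside async code):
# final = await wait_for_completion(remote_a2a_agent, task_id)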
What’s next
You now have a solid foundation for building and integrating your own A2A agents on Vertex AI Agent Engine. To further your AI agent deployment journey, here are some resources to start building:
- Explore the Documentation: Read the complete A2A Protocol Documentation site for a comprehensive overview of building interoperable AI agents.
- Review GitHub samples: Check out the examples in the A2A and Agent Engine GitHub repository for more complex multi-agent system integrations.
- Build Your Own Agent: Try creating a new A2A agent using your favorite Python agent framework and implementing the a2a.server.agent_execution.AgentExecutor interface to bridge your agent’s logic with the A2A protocol (see the sketch after this list). Consider exploring the Vertex AI Agent Engine documentation for more insights.
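To make that last suggestion concrete, here is a minimal, hypothetical skeleton (run_my_framework_agent is a placeholder for whatever your framework exposes) that follows the same execute/cancel pattern as the QnAAgentExecutor above:
from a2a.server.agent_execution import AgentExecutor, RequestContext
from a2a.server.events import EventQueue
from a2a.server.tasks import TaskUpdater
from a2a.types import TextPart, UnsupportedOperationError
from a2a.utils.errors import ServerError

async def run_my_framework_agent(query: str) -> str:
    """Placeholder for your framework's agent call; replace with real logic."""
    return f"Echo: {query}"

class MyFrameworkExecutor(AgentExecutor):
    """Hypothetical executor bridging a custom framework agent to the A2A protocol."""

    async def execute(self, context: RequestContext, event_queue: EventQueue) -> None:
        updater = TaskUpdater(event_queue, context.task_id, context.context_id)
        await updater.submit()
        await updater.start_work()
        # Delegate the user's query to your framework, then return the answer as an A2A Artifact.
        answer = await run_my_framework_agent(context.get_user_input())
        await updater.add_artifact([TextPart(text=answer)], name='answer')
        await updater.complete()

    async def cancel(self, context: RequestContext, event_queue: EventQueue) -> None:
        # Cancellation isn't supported in this sketch.
        raise ServerError(error=UnsupportedOperationError())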
Thank you for exploring this overview. We welcome your feedback and encourage you to connect with the Google Cloud Community for further discussion and resources. We're also happy to connect on LinkedIn or X/Twitter for feedback and suggestions on content you would like to see next.





