Building Agentic GraphRAG on Vertex AI: Part 2 - Deployment, visualization & production

In Part 1, we successfully defined our Cybersecurity Threat Agent, hydrated our Neo4j Knowledge Graph, and integrated them using the LangChain GraphCypherQAChain.

Now, we face the most challenging phase: Deployment. Deploying a stateful, tool-using Python agent to a managed cloud runtime involves navigating complex serialization (pickling) rules and context management.

This guide provides a transparent look at the engineering solutions required to deploy this agent to Vertex AI Reasoning Engine and visualize its reasoning trace.


:cloud: 6. Deploying to Vertex AI Reasoning Engine

The Reasoning Engine is a managed runtime that hosts your agent code. It abstracts away the infrastructure scaling but requires your code to be “picklable” and stateless during the upload process [^1].

:warning: Critical error analysis: Deployment failures

During the build process, we encountered three distinct roadblocks. Understanding these is key to successful deployment.

| Error Type | Symptom | Root Cause | Engineering Fix |
|---|---|---|---|
| `PicklingError` | Serialization failure during upload. | Local app objects often hold active thread locks or socket connections (to Neo4j) that cannot be pickled. | Create a **fresh instance** of the agent specifically for the deployment payload. |
| `FailedPrecondition` (400) | "Reasoning Engine Execution failed." | Remote containers do not inherit the local notebook's auth context, so the SDK fails to initialize. | Re-run `vertexai.init()` explicitly inside the remote execution flow. |
| Async Mismatch | `WARNING: Failed to register API methods.` | ADK methods are asynchronous; the Engine expects synchronous generators for streaming. | Implement a **synchronous wrapper** class. |

:hammer_and_wrench: The Solution: The AgentWrapper pattern

This wrapper acts as a bridge between the Cloud Runtime and our ADK Agent. It handles re-authentication inside the remote container and exposes a clean, synchronous stream_query interface.

Code block: The deployment wrapper

from vertexai.preview import reasoning_engines
import vertexai

# Define staging bucket for artifacts
STAGING_BUCKET = f"gs://{PROJECT_ID}-vertex-staging"

# --- Wrapper to fix API registration and Context issues ---
class AgentWrapper:
    """
    Wrapper to expose stream_query and pass through all arguments.
    It re-establishes the Vertex AI context inside the remote worker.
    """
    def __init__(self, app, project_id: str, location: str):
        self.app = app
        self.project_id = project_id
        self.location = location

    def stream_query(self, **kwargs):
        # 1. Re-initialize Vertex AI context inside the remote execution
        # This fixes the "FailedPrecondition" context loss error
        import vertexai
        vertexai.init(project=self.project_id, location=self.location)

        # 2. Delegate to the app with all arguments (e.g. message, user_id)
        return self.app.stream_query(**kwargs)
# ----------------------------------------------

# Deployment Logic
# Note: We create a FRESH app instance to avoid pickling/threading errors
try:
    print("🚀 Starting Deployment to Vertex AI...")
    
    # 1. Create a FRESH, lightweight app instance
    clean_app = reasoning_engines.AdkApp(agent=cyber_agent, enable_tracing=False)

    # 2. Wrap the clean app with explicit project/location context
    wrapped_app = AgentWrapper(clean_app, project_id=PROJECT_ID, location=REGION)

    # 3. Create the Remote Engine
    remote_app = reasoning_engines.ReasoningEngine.create(
        wrapped_app,
        display_name="Cyber-Threat-Graph-Agent",
        description="Agentic GraphRAG for Cybersecurity",
        requirements=[
            "google-adk>=1.0.0",
            "langchain-community",
            "langchain-google-vertexai",
            "langchain",
            "neo4j",
            "google-cloud-aiplatform>=1.38.0"
        ],
    )
    print(f"✅ Deployed! Resource Name: {remote_app.resource_name}")
except Exception as e:
    print(f"ℹ️ Deployment skipped/failed: {e}")
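Before uploading, it can help to sanity-check that the wrapper really forwards all arguments to the underlying app. A minimal stub-based sketch (no cloud calls; `StubApp` is a hypothetical stand-in for `AdkApp`):

```python
# Stub standing in for reasoning_engines.AdkApp: echoes what it receives,
# yielding events like the real stream_query does.
class StubApp:
    def stream_query(self, **kwargs):
        yield {"received": kwargs}

class AgentWrapper:
    def __init__(self, app, project_id: str, location: str):
        self.app = app
        self.project_id = project_id
        self.location = location

    def stream_query(self, **kwargs):
        # The real wrapper re-runs vertexai.init() here; skipped for the stub test
        return self.app.stream_query(**kwargs)

wrapped = AgentWrapper(StubApp(), project_id="my-project", location="us-central1")
events = list(wrapped.stream_query(message="ping", user_id="analyst-1"))
print(events[0]["received"]["message"])  # → ping
```

If the echo comes back intact, the wrapper is safe to hand to `ReasoningEngine.create`.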

:test_tube: 7. Verification & results

Once deployed, the agent is accessible via the Vertex AI API. We verify it by asking questions that require graph traversal.

Query 1: “What CVEs are associated with WellMess?”

Agent Execution Log:

  1. Reasoning: “User is asking about a specific Malware (WellMess). I should query the graph.”
  2. Tool Call: query_threat_graph({'question': 'What CVEs are associated with WellMess?'})
  3. Graph Execution: MATCH (m:Malware {name:'WellMess'})-[:EXPLOITS]->(v:Vulnerability) RETURN v.cve
  4. Final Answer: “CVE-2023-1234 is associated with WellMess.”

Complex query test (Multi-Hop)

Query 2: “Which threat actors target the Pharmaceuticals sector?”

The agent correctly traverses the graph: (Actor) -> [TARGETS] -> (Sector) and returns: “APT29 targets the Pharmaceuticals sector.”
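The multi-hop pattern can be mirrored with a toy in-memory edge list to make the traversal logic concrete. The edge data below is illustrative, not pulled from the real graph:

```python
# Toy mirror of the Neo4j pattern (Actor)-[:TARGETS]->(Sector).
# Edges are (source, relationship, destination) triples; data is illustrative.
edges = [
    ("APT29", "TARGETS", "Pharmaceuticals"),
    ("APT29", "USES", "WellMess"),
    ("FIN7", "TARGETS", "Retail"),
]

def actors_targeting(sector: str) -> list[str]:
    """Return all actors with a TARGETS edge into the given sector."""
    return [src for src, rel, dst in edges if rel == "TARGETS" and dst == sector]

print(actors_targeting("Pharmaceuticals"))  # → ['APT29']
```

In Cypher, the same question is a single pattern match, which is exactly why the agent can answer multi-hop questions with one tool call instead of several.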


:eye: 8. Interactive visualization (PyVis)

Textual answers are useful, but for threat intelligence, analysts need to see the connections. We integrate PyVis to render the graph structure directly in the notebook [^2].

Code block: Visualization logic

from pyvis.network import Network
from IPython.display import HTML
from neo4j import GraphDatabase

def visualize_attack_graph(root_node_name):
    # Connect to Neo4j (read-only queries only)
    driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USER, NEO4J_PASSWORD))

    # Initialize the PyVis network
    net = Network(notebook=True, cdn_resources='in_line', height="500px",
                  width="100%", bgcolor="#222222", font_color="white")

    # Cypher to fetch neighbors of the entity.
    # Use query parameters ($name) instead of f-string interpolation
    # to avoid Cypher injection via the entity name.
    cypher = """
    MATCH (n)-[r]-(m)
    WHERE n.name = $name OR n.alias = $name
    RETURN n, r, m
    LIMIT 50
    """

    with driver.session() as session:
        result = session.run(cypher, name=root_node_name)
        for record in result:
            src = record['n']
            dst = record['m']
            rel = record['r']
            
            # Add Nodes (Color-coded by type if logic added)
            net.add_node(src.element_id, label=src.get('name') or src.get('cve'), color='#ff4b4b') 
            net.add_node(dst.element_id, label=dst.get('name') or dst.get('sector'), color='#4b94ff')
            
            # Add Edge
            net.add_edge(src.element_id, dst.element_id, title=rel.type, label=rel.type)
    
    driver.close()
    net.show('threat_graph.html')
    # HTML() treats a bare string as raw markup, so load the file explicitly
    return HTML(filename='threat_graph.html')

# Visualize the Actor we just queried
visualize_attack_graph("APT29")
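The code above hard-codes red for source nodes and blue for destinations. One way to add the per-type color-coding hinted at in the comment is a small label-to-color map; the label names below are assumptions based on the schema from Part 1:

```python
# Map Neo4j node labels to display colors (label names are assumed from Part 1's schema).
LABEL_COLORS = {
    "ThreatActor": "#ff4b4b",
    "Malware": "#ffa64b",
    "Vulnerability": "#4bff7a",
    "Sector": "#4b94ff",
}

def node_color(node_labels, default="#cccccc"):
    """Return the color for the first recognized label on a node."""
    for label in node_labels:
        if label in LABEL_COLORS:
            return LABEL_COLORS[label]
    return default

print(node_color(["Malware"]))        # → #ffa64b
print(node_color(["UnknownLabel"]))   # → #cccccc
```

Inside `visualize_attack_graph`, the hard-coded colors could then become `color=node_color(src.labels)` and `color=node_color(dst.labels)`, since the Neo4j driver exposes each node's labels as a frozenset.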

:broom: 9. Cleanup & next steps

This project demonstrated that while the logic for Agentic GraphRAG is straightforward, the infrastructure required to host it involves nuanced dependency management and context handling.

Cleanup code

To avoid incurring costs for the deployed Reasoning Engine and Neo4j instance:

# Uncomment to delete the agent and stop incurring charges
# remote_app.delete()
# print("🗑️ Agent deleted")

Integration into production

You can now connect to this deployed agent from any external application (web app, dashboard, script) using its resource name:

# Client-side code to query your deployed agent
from vertexai.preview import reasoning_engines

agent_id = "projects/123.../locations/us-central1/reasoningEngines/456..."
agent = reasoning_engines.ReasoningEngine(agent_id)

# The wrapper registered stream_query (not query), so iterate over the events
for event in agent.stream_query(
    message="Who exploits CVE-2023-1234?", user_id="external-client"
):
    print(event)

Ready to hunt threats? :shield:

References:
[^1]: Google Cloud. “Vertex AI Reasoning Engine: Deployment Guide.”
[^2]: PyVis. “Interactive Network Visualization Documentation.”
