Building a Transcript Summarization Agent with Google ADK and Vertex AI for Call Centers

Hi Team,

Please publish this article in the Google community.

A structured framework to build AI agents as easily as web apps with Spring Boot or Django.

I recently experimented with Google’s Agent Development Kit (ADK) to build a simple AI agent for a customer service call center scenario. Instead of manually reviewing long chat transcripts, the agent automatically summarizes conversations and highlights key insights. With ADK, I could connect LLMs, tools, and memory seamlessly, without heavy custom glue code.

What is it?

An open-source framework to build, test, and deploy AI agents and multi-agent systems.

Why ADK?

Google’s Agent Development Kit (ADK) exists because building AI agents used to be fragmented, custom-coded, and hard to standardize. With ADK, Google wants to give developers:

  • A framework instead of everyone reinventing agents from scratch.

  • Built-in tools (search, code execution, web requests, memory, etc.) so developers don’t need to build these repeatedly.

  • A consistent orchestration model (sequential, parallel, loop workflows).

  • Evaluation, observability, and state management for debugging and scaling.

  • Deployment-agnostic runtime so you can run locally, in the cloud, or hybrid.

Basically, ADK is like giving developers a starter kit + runtime + orchestration engine for agents.

How people were building agents before ADK

Before Google ADK and AWS Agent Core, developers built agents in ad-hoc ways:

  • Using frameworks like LangChain, LlamaIndex, or AutoGen to chain prompts, tools, and memory

  • Writing custom orchestration code to decide how agents call APIs or other agents.

  • Extending chatbot platforms (Dialogflow, Rasa, Lex) that weren’t designed for multi-agent or LLM workflows.

  • Stitching together LLM APIs + AWS Lambda/Cloud Functions + custom logic, which was brittle and hard to scale.

ADK standardizes all this by providing a ready-made framework.

Core components of ADK

  • Agent: The core entity that follows instructions, calls tools, manages memory, and executes workflows.

  • Workflows: The orchestration engine supporting sequential, parallel, and looped task execution.

  • Tools: Tools give agents abilities beyond conversation, letting them interact with external APIs, search for information, run code, or call other services.

  • Session Services: Session services handle the context of a single conversation (Session), including its history (Events) and the agent’s working memory for that conversation (State).

  • Memory & State: Mechanisms to maintain context across turns and persist agent state.

  • Callbacks: Custom code snippets you provide to run at specific points in the agent’s process, allowing for checks, logging, or behavior modifications.

  • Artifact Management: Artifacts allow agents to save, load, and manage files or binary data (like images or PDFs) associated with a session or user.

  • Runner: The engine that manages the execution flow, orchestrates agent interactions based on Events, and coordinates with backend services.
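To make the Session/State/Events idea concrete, here is a toy plain-Python illustration. This is not ADK's actual API, just a sketch of the concepts: a session holds the conversation history (events) and a per-conversation working memory (state).

```python
from dataclasses import dataclass, field

@dataclass
class ToySession:
    """Toy illustration of the Session concept: one conversation's
    history (events) plus the agent's working memory (state)."""
    session_id: str
    events: list = field(default_factory=list)   # conversation history
    state: dict = field(default_factory=dict)    # per-conversation memory

    def add_event(self, author: str, text: str) -> None:
        self.events.append({"author": author, "text": text})

s = ToySession(session_id="call-123")
s.add_event("user", "My bill is wrong.")
s.state["open_issue"] = "billing"
print(len(s.events), s.state["open_issue"])  # → 1 billing
```

In real ADK the session service manages these objects for you; the sketch only shows how history and state are kept separate per conversation.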

Use Case: Customer Service Call Center

I experimented with ADK for a customer service call center to automatically summarize and analyze chat transcripts in real time or post-conversation. Instead of a human agent or supervisor reading through long transcripts, the system provides an instant, AI-generated summary.

What It Does

This code builds an AI agent using the Google ADK that takes a chat transcript as input and outputs a concise summary. The core functionality is driven by the agent’s ability to reason and use tools.

When a transcript is received, the agent’s internal model first identifies that it needs to summarize the text and, as per its instruction, use the extract_keywords tool. It calls this tool to get the top keywords from the transcript, then integrates those keywords into a final, comprehensive summary. This allows the system to not only summarize the conversation but also highlight the most important topics, like “billing,” “refund,” or “delay,” which is critical for quick triage and analysis.

Install ADK

python3 -m pip install google-adk

Agent

The core entity that ties LLM + tools + workflows + memory together.

transcript_agent = Agent(
    name="transcript_summarization_agent",
    description="Summarizes chat transcripts and highlights key issues and keywords.",
    model="gemini-1.5-flash",  # Vertex AI Gemini model
    tools=[keyword_tool],
    instruction="""
    You are an AI assistant specialized in summarizing customer chat transcripts. 
    Your task is to provide a clear and concise summary of the conversation. 
    **Important:** You must use the 'extract_keywords' tool to identify and list the top 5 keywords from the transcript as part of your summary.
    """
)

Memory & State

Keeps track of past interactions.

from google.cloud.adk import Agent, Session
from google.cloud.adk.memory import ConversationMemory
memory = ConversationMemory()
# Session ensures continuity across runs
session = Session(transcript_agent, memory=memory)

LLM (Core Engine)

The “brain” of the agent.

model="gemini-1.5-flash"

Tools

Custom integrations an agent can call.

from collections import Counter

def extract_keywords(transcript: str) -> list[str]:
    """Extracts top keywords from a given transcript."""
    words = transcript.lower().split()
    freq = Counter(words)
    return [w for w, _ in freq.most_common(5)]

keyword_tool = FunctionTool(
    name="extract_keywords",
    fn=extract_keywords,
    description="Extracts the top 5 most common keywords from a text transcript."
)
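For a quick sanity check, the tool function can be exercised on its own (the sample transcript here is made up). Note that the naive whitespace tokenizer also counts stopwords like "the", so real transcripts may benefit from a stopword filter.

```python
from collections import Counter

def extract_keywords(transcript: str) -> list[str]:
    """Extracts top keywords from a given transcript (naive word frequency)."""
    words = transcript.lower().split()
    freq = Counter(words)
    return [w for w, _ in freq.most_common(5)]

# Made-up sample; most_common orders by frequency, ties by first appearance.
sample = "billing billing billing refund refund delay"
print(extract_keywords(sample))  # → ['billing', 'refund', 'delay']
```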

Full version of code

# main.py
import os
import logging
from dotenv import load_dotenv
from flask import Flask, request, jsonify
import google.cloud.logging
from google.cloud.logging.handlers import CloudLoggingHandler
import vertexai
from vertexai.preview.agents import FunctionTool
from google.cloud.adk import Agent, Session
from google.cloud.adk.memory import ConversationMemory
# 0️⃣ Setup (env + logging)
load_dotenv()
PROJECT_ID = os.getenv("GCP_PROJECT")
LOCATION = os.getenv("GCP_REGION", "us-central1")
# Initialize Vertex AI
vertexai.init(project=PROJECT_ID, location=LOCATION)
# Cloud Logging setup
client = google.cloud.logging.Client()
handler = CloudLoggingHandler(client)
logging.getLogger().setLevel(logging.INFO)
logging.getLogger().addHandler(handler)
logging.info("Transcript Summarization Agent starting...")
# 1️⃣ Tool (ADK Component: Tools)
from collections import Counter

def extract_keywords(transcript: str) -> list[str]:
    """Extracts top keywords from a given transcript."""
    words = transcript.lower().split()
    freq = Counter(words)
    return [w for w, _ in freq.most_common(5)]
keyword_tool = FunctionTool(
    name="extract_keywords",
    fn=extract_keywords,
    description="Extracts the top 5 most common keywords from a text transcript."
)
# 2️⃣ Memory (ADK Component: Memory & State)
memory = ConversationMemory()
# 3️⃣ Agent (ADK Component: Agent)
transcript_agent = Agent(
    name="transcript_summarization_agent",
    description="Summarizes chat transcripts and highlights key issues and keywords.",
    model="gemini-1.5-flash",  # Vertex AI Gemini model
    tools=[keyword_tool],
    instruction="""
    You are an AI assistant specialized in summarizing customer chat transcripts. 
    Your task is to provide a clear and concise summary of the conversation. 
    **Important:** You must use the 'extract_keywords' tool to identify and list the top 5 keywords from the transcript as part of your summary.
    """
)
# 4️⃣ Runtime & Session (ADK Component: Runtime & Deployment)
session = Session(transcript_agent, memory=memory)
# 5️⃣ Flask App (exposed via Cloud Run or App Engine)
app = Flask(__name__)
@app.route("/summarize", methods=["POST"])
def summarize():
    data = request.get_json(silent=True) or {}
    transcript = data.get("transcript", "")
    logging.info(f"Received transcript: {transcript}")
    # **Correct usage:** Let the agent handle the full task, including tool use.
    # The agent's instruction guides its behavior.
    prompt = f"Summarize the following customer transcript: {transcript}"
    
    response = session.run(prompt)
    logging.info(f"Agent response: {response.output}")
    return jsonify({"summary": response.output})
if __name__ == "__main__":
    app.run(host="0.0.0.0", port=int(os.getenv("PORT", 8080)))

Local Testing

Testing locally is the first step to ensure your code works before deploying.

  • Run the Flask app: Execute python main.py to start a local web server.

  • Send a request: Use a tool like curl or a REST client (e.g., Postman, Insomnia) to send a POST request to http://localhost:8080/summarize with a JSON payload containing the transcript.
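If you prefer Python over curl, the same POST request can be built with the standard library. The URL and payload shape below simply mirror the Flask route defined earlier; the server must be running locally for the commented-out call to succeed.

```python
import json
import urllib.request

# Local endpoint exposed by the Flask app above.
URL = "http://localhost:8080/summarize"

def build_request(transcript: str) -> urllib.request.Request:
    """Builds the JSON POST request that the /summarize endpoint expects."""
    payload = json.dumps({"transcript": transcript}).encode("utf-8")
    return urllib.request.Request(
        URL, data=payload, headers={"Content-Type": "application/json"}
    )

# To exercise the running server:
#   with urllib.request.urlopen(build_request("Customer: my bill is wrong...")) as resp:
#       print(json.loads(resp.read())["summary"])
```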

Agentic AI solution

This summarization agent can easily be extended into a broader Agentic AI solution by adding specialized tools. For example, a sentiment_analyzer can gauge customer mood, an entity_extractor can capture details like product names or order IDs, and an issue_classifier can automatically categorize cases such as billing disputes or technical issues. Together, these tools form a coordinated system that goes beyond summarization to deliver end-to-end customer support.

  • sentiment_analyzer: A tool (e.g., calling a natural language API) to determine the overall sentiment (positive, negative, neutral) of the customer and/or agent.

  • entity_extractor: Identifies specific entities like product names, order numbers, customer IDs.

  • issue_classifier: A tool that can categorize the customer’s problem (e.g., “billing dispute,” “technical support,” “product inquiry”).

  • CRM_lookup: A tool to retrieve customer history or details from a CRM system based on an identified customer ID.
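As a sketch of how such a tool might look, here is a hypothetical rule-based issue_classifier. The category names come from the list above; the keyword lists are invented for illustration, and a production version would more likely call an LLM or a trained classifier.

```python
# Invented keyword lists; a real tool would use an LLM or trained model.
CATEGORIES = {
    "billing dispute": ["bill", "charge", "invoice", "refund"],
    "technical support": ["error", "crash", "login", "bug"],
    "product inquiry": ["price", "availability", "feature"],
}

def issue_classifier(transcript: str) -> str:
    """Returns the category whose keywords appear most often in the transcript."""
    text = transcript.lower()
    scores = {
        cat: sum(text.count(kw) for kw in kws)
        for cat, kws in CATEGORIES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "uncategorized"

print(issue_classifier("I was charged twice and want a refund"))  # → billing dispute
```

Like extract_keywords, this would be wrapped as a FunctionTool and added to the agent's tools list.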

Think of a single AI Agent as a monolithic application. It can perform multiple tasks — like summarizing transcripts, detecting sentiment, and extracting key entities — but all of this logic is contained in one system. While this works for simpler use cases, it can become complex to maintain, scale, or update as your needs grow.

Agentic AI, on the other hand, is like a microservices architecture for AI. Instead of one agent handling everything, multiple specialized agents each focus on a single task: one summarizes transcripts, another analyzes sentiment, another classifies issues, and another suggests next steps. These agents are orchestrated to work together seamlessly, allowing the system to handle complex, multi-step workflows autonomously.

The key difference: a single AI agent multitasks inside a monolith, whereas Agentic AI distributes responsibilities across coordinated agents, providing flexibility, scalability, and faster, more reliable outcomes. In a call center scenario, this means supervisors get actionable insights from conversations automatically, without reading every transcript or manually combining multiple outputs.
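To make the microservices analogy concrete, here is a plain-Python sketch of the coordination idea. Each "agent" is just a stub function here; in ADK these would be separate Agent instances coordinated by a workflow (e.g., sequential execution), and the real summarizer and sentiment analyzer would call an LLM.

```python
def summarizer(transcript: str) -> str:
    # Stub: take the first sentence as the "summary".
    return transcript.split(".")[0] + "."

def sentiment(transcript: str) -> str:
    # Stub: crude keyword-based sentiment.
    return "negative" if "wrong" in transcript.lower() else "neutral"

def pipeline(transcript: str) -> dict:
    """Runs each specialized 'agent' in turn and merges their outputs."""
    return {
        "summary": summarizer(transcript),
        "sentiment": sentiment(transcript),
    }

result = pipeline("My bill is wrong. Please fix it.")
print(result)  # → {'summary': 'My bill is wrong.', 'sentiment': 'negative'}
```

Swapping a stub for a smarter implementation changes nothing else in the pipeline, which is exactly the flexibility the microservices comparison is pointing at.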

Thanks

Sudha


@Sudhaast Can you guide me on how to integrate with the CCAI Platform? I use Agent Assist to summarize a call transcript. Can you tell me what the differences are in terms of cost, code flexibility, and integration perspective?