Is there a configuration to prevent ADK from sending previous agent output in chat history?

The Question

Is there any configuration parameter that prevents ADK from automatically sending the previous agent’s output in chat history?

I’m working with ADK’s SequentialAgent and discovered that when using template variables like {summary} to selectively inject specific data into the system instruction, ADK also automatically adds the ENTIRE output from the previous agent to the conversation contents as [previous_agent] said: {...}.

I tried include_contents='none' on the second agent, but this only removes the initial user message - the agent transition messages still appear, resulting in data duplication.

The Problem

ADK’s SequentialAgent uses TWO data-passing mechanisms simultaneously:

  1. Template Injection (explicit in examples):

    • {summary} replaced with data from context.state['summary']
    • Injected into system instruction
  2. Automatic Chat History (implicit behavior):

    • ADK automatically adds: [summarizer_agent] said: {entire output dictionary}
    • Added to conversation contents/chat history
    • Passes the FULL agent output, not just selected fields
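The duplication can be reproduced without ADK. The following is a minimal plain-Python sketch, not ADK's actual implementation: `render_instruction` and `transition_message` are hypothetical helpers that only imitate the two mechanisms listed above, using the `[agent_name] said:` format seen in the observed request.

```python
import json


def render_instruction(template: str, state: dict) -> str:
    """Imitates template injection: replace each {key} with state[key]."""
    rendered = template
    for key, value in state.items():
        rendered = rendered.replace("{" + key + "}", str(value))
    return rendered


def transition_message(agent_name: str, output: dict) -> dict:
    """Imitates the automatic history entry for the previous agent."""
    return {"role": "user", "text": f"[{agent_name}] said: {json.dumps(output)}"}


# Illustrative state, as if summarizer_agent had saved its output under "summary".
state = {"summary": {"title": "Example", "key_points": ["a", "b"]}}

system_instruction = render_instruction(
    "ARTICLE SUMMARY:\n{summary}\nTranslate this to Spanish.", state
)
contents = [transition_message("summarizer_agent", state["summary"])]

# The same payload now appears in both places of a single request.
appears_in_instruction = "key_points" in system_instruction
appears_in_contents = "key_points" in contents[0]["text"]
```

Both flags end up true, which is the duplication described in this post: one copy arrives through the template, the other through the chat history.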

Example

# Agent 1
summarizer_agent = Agent(
    output_key="summary",  # Saves to context.state['summary']
    ...
)

# Agent 2
translator_agent = Agent(
    # {summary} is a template variable for SPECIFIC data injection
    instruction="""
    ARTICLE SUMMARY:
    {summary}

    Translate this to Spanish...
    """,
    include_contents='none',  # Tried this - doesn't eliminate agent transitions
)

Observed Behavior:

{
  "system_instruction": "...ARTICLE SUMMARY:\n{'title': '...', 'key_points': [...]}...",
  "contents": [
    {
      "text": "[summarizer_agent] said: {\n  'title': '...',\n  'key_points': [...]\n}",
      "role": "user"
    }
  ]
}

Impact

For pipelines with large structured data payloads, this duplication causes:

  • Increased token usage (10-30% overhead observed)
  • Non-ideal prompts with redundant information
  • Potential LLM confusion from seeing the same data in different formats

Environment:

  • ADK Version: 1.16.0
  • Python
  • SequentialAgent with output_key + template variable pattern

Related Discussion:
Also posted to GitHub Discussions - no resolution yet.

Would appreciate any guidance on configuration options or best practices!

I’ve found the root cause and successfully implemented a workaround for this issue.

Root Cause

This behavior is tracked in Issue #2207, which was closed as “NOT_PLANNED”. The ADK team considers the automatic injection of agent transition messages ([agent_name] said:) intentional.

This means there’s no built-in configuration to disable it, but we can work around it using callbacks.

Solution: before_model_callback

For template-only pipelines where agents receive all data through template variables (e.g., {summary}), you can use before_model_callback to clear conversation history before the request is sent to the LLM:

from google.adk.agents import Agent, SequentialAgent


def clear_all_history(callback_context=None, llm_request=None):
    """Remove ALL conversation history from LLM requests.

    Use this when agents receive all necessary data through template
    variables in the instruction. No conversation history is needed.
    """
    if llm_request is not None:
        llm_request.contents = []  # Clear all history
    return None


# Agent 1: Creates summary
summarizer_agent = Agent(
    name="summarizer_agent",
    output_key="summary",
    instruction="Summarize this article: {article_text}",
)

# Agent 2: Receives data ONLY through {summary} template
translator_agent = Agent(
    name="translator_agent",
    instruction="""ARTICLE SUMMARY:
{summary}

Translate this summary to Spanish…
""",
    before_model_callback=clear_all_history,  # ← Key solution
)

pipeline = SequentialAgent(
    name="pipeline",  # ADK agents need a name; this one is illustrative
    sub_agents=[summarizer_agent, translator_agent],
)

How It Works

  1. ADK builds the LLM request with full history (including [agent_name] said: messages)
  2. Your callback runs and sets llm_request.contents = []. The callback receives the same request object that ADK goes on to send, so the change is visible to ADK.
  3. The modified request is sent to the LLM with only the system instruction
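The three steps can be checked in isolation. This is a sketch with stand-ins: `FakeLlmRequest` and `send_to_model` are hypothetical and only mimic the `contents` attribute and the order of operations; they are not ADK classes.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLlmRequest:
    """Hypothetical stand-in exposing only what the callback touches."""
    system_instruction: str = ""
    contents: list = field(default_factory=list)


def clear_all_history(callback_context=None, llm_request=None):
    """Same callback as in the solution: empty the history on the request."""
    if llm_request is not None:
        llm_request.contents = []
    return None  # returning None lets the (modified) request proceed


def send_to_model(request, before_model_callback):
    """Hypothetical driver imitating steps 2 and 3."""
    # Step 2: the callback gets the very object that was built in step 1.
    override = before_model_callback(callback_context=None, llm_request=request)
    if override is not None:
        return override
    # Step 3: this is what would be sent to the LLM.
    return {
        "system_instruction": request.system_instruction,
        "contents": request.contents,
    }


# Step 1: the request is built with the transition message already in history.
request = FakeLlmRequest(
    system_instruction="ARTICLE SUMMARY: ...",
    contents=["[summarizer_agent] said: {...}"],
)
sent = send_to_model(request, clear_all_history)
```

After the call, `sent["contents"]` is an empty list while the system instruction is unchanged, because the callback reassigned the attribute on the shared request object rather than on a copy.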

Results

What the LLM receives:

  • System instruction with populated template variables ({summary})
  • Empty conversation history (contents = [])

Because the summary now appears only in the system instruction, the token duplication is eliminated.

Reference

For alternative approaches (selective filtering, history trimming for conversational agents), see the discussion in Issue #2207.
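As one illustration of selective filtering, here is a hedged sketch; it is not code taken from Issue #2207. It assumes each entry in `llm_request.contents` has a `parts` list whose items expose a `text` attribute, and that transition messages carry the `[agent_name] said:` prefix shown in the observed request. `drop_agent_transitions` is a hypothetical name, and the demo uses `SimpleNamespace` stand-ins rather than ADK objects.

```python
import re
from types import SimpleNamespace

# Prefix seen in the observed request, e.g. "[summarizer_agent] said: ..."
TRANSITION_PREFIX = re.compile(r"^\[[^\]]+\] said:")


def _is_transition(content) -> bool:
    """True if any text part of this entry starts with the transition prefix."""
    for part in getattr(content, "parts", None) or []:
        text = getattr(part, "text", None)
        if text and TRANSITION_PREFIX.match(text):
            return True
    return False


def drop_agent_transitions(callback_context=None, llm_request=None):
    """Remove only agent transition entries; keep the rest of the history."""
    if llm_request is not None:
        llm_request.contents = [
            c for c in llm_request.contents if not _is_transition(c)
        ]
    return None


def _content(role, text):
    """Build a stand-in history entry with a single text part (demo only)."""
    return SimpleNamespace(role=role, parts=[SimpleNamespace(text=text)])


demo_request = SimpleNamespace(contents=[
    _content("user", "Please translate the article."),
    _content("user", "[summarizer_agent] said: {'title': '...'}"),
])
drop_agent_transitions(llm_request=demo_request)
remaining = [c.parts[0].text for c in demo_request.contents]
```

It would be attached the same way as clear_all_history, via before_model_callback=drop_agent_transitions. Note the trade-off: a genuine user message that happens to start with the same bracketed prefix would also be dropped.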