Hey,
I am creating an image refinement agent with ADK that takes an image as input and then refines it based on the user's text input.
However, when I send a prompt with an attached image via “adk web”, I only get the text prompt. Does anyone know how to get the image as well?
What I've tried:
- Checked tool_context.user_content, but it only contains the user's text.
- Set save_input_blobs_as_artifacts to true, but when I printed tool_context.list_artifacts(), it didn't show anything.
Hi @Siddhesh,
thanks for asking!
This is a common problem for media agents that rely on Cloud Storage. Here is what I did.
I created a custom tool that not only downloads the file from GCS but also explicitly registers it as an ADK artifact. This ensures the image shows up in the Agent Development Kit (ADK) Web UI, where you can inspect it.
from pathlib import Path

from google.adk.tools import ToolContext
from google.cloud import storage
from google.genai import types

# Assumption: ORCHESTRATOR_DIR was not defined in the original snippet;
# here it points at the folder that contains this module.
ORCHESTRATOR_DIR = Path(__file__).parent


# Visualization tools
async def visualize_image(gcs_uri: str, tool_context: ToolContext) -> dict:
    """Download an image from GCS and save it as an ADK artifact for visualization.

    Args:
        gcs_uri: The GCS URI of the image (e.g., gs://bucket-name/path/to/image.png)
        tool_context: The ADK tool context

    Returns:
        Dictionary with status and artifact key
    """
    try:
        # Parse GCS URI
        if not gcs_uri.startswith("gs://"):
            return {"status": "error", "message": "URI must start with gs://"}

        # Extract bucket and blob path
        path_parts = gcs_uri[5:].split("/", 1)
        if len(path_parts) != 2:
            return {"status": "error", "message": "Invalid GCS URI format"}
        bucket_name, blob_path = path_parts

        # Download from GCS
        storage_client = storage.Client()
        bucket = storage_client.bucket(bucket_name)
        blob = bucket.blob(blob_path)
        image_bytes = blob.download_as_bytes()

        # Get filename for artifact
        filename = Path(blob_path).name

        # Save to orchestrator folder (create it, and any parents, if missing)
        artifact_dir = ORCHESTRATOR_DIR / "artifacts"
        artifact_dir.mkdir(parents=True, exist_ok=True)
        local_path = artifact_dir / filename
        with open(local_path, "wb") as f:
            f.write(image_bytes)

        # Save as ADK artifact
        # Note: the MIME type is hard-coded to PNG; adjust it for other formats.
        await tool_context.save_artifact(
            filename=filename,
            artifact=types.Part.from_bytes(
                data=image_bytes,
                mime_type="image/png",
            ),
        )

        return {
            "status": "success",
            "artifact_key": filename,
            "local_path": str(local_path),
            "message": f"✅ Image saved as artifact: {filename}\n💾 Local copy: {local_path}",
        }
    except Exception as e:
        return {"status": "error", "message": f"Failed to visualize image: {str(e)}"}
This pattern is reusable for managing any non-textual data (videos, for example) within your agent’s reasoning flow.
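If you want to reuse the tool for other media types, the URI parsing and the MIME type are the two parts that need to be generalized. Here is a minimal sketch using only the standard library. The helper names parse_gcs_uri and guess_mime_type are hypothetical (they are not part of ADK or the GCS client), and guessing the type from the file extension is an assumption that only holds when your blobs are named with meaningful extensions.

```python
import mimetypes


def parse_gcs_uri(gcs_uri: str) -> tuple[str, str]:
    """Split 'gs://bucket/path/to/file' into (bucket_name, blob_path)."""
    prefix = "gs://"
    if not gcs_uri.startswith(prefix):
        raise ValueError("URI must start with gs://")
    # partition() splits on the first "/" only, so nested blob paths stay intact
    bucket_name, sep, blob_path = gcs_uri[len(prefix):].partition("/")
    if not bucket_name or not sep or not blob_path:
        raise ValueError("Invalid GCS URI format")
    return bucket_name, blob_path


def guess_mime_type(blob_path: str, default: str = "application/octet-stream") -> str:
    """Guess the MIME type from the file extension, falling back to a default."""
    mime_type, _ = mimetypes.guess_type(blob_path)
    return mime_type or default


if __name__ == "__main__":
    bucket, path = parse_gcs_uri("gs://bucket-name/path/to/clip.mp4")
    print(bucket, path, guess_mime_type(path))
```

In the tool, you could then pass guess_mime_type(blob_path) as the mime_type argument of types.Part.from_bytes instead of a fixed value.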
Hope it helps.
Happy building!