Built a project that needed real-time voice conversation (bidiGenerateContent) and canvas text editing (generateContent) running simultaneously with shared session state. Wanted to share a few things I hit that aren’t in the docs.
You can’t mix bidi and unary calls in one ADK runner. It just 404s; no helpful error message. We spent way too long on this. Fix was two separate runners: voice_runner (WebSocket, run_live) and text_runner (SSE, run_async) sharing session and artifact services, each with its own agent tree built by the same factory function.
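For anyone who wants the shape of the two-runner setup, here is a minimal sketch of the wiring only. It deliberately does not import ADK: SharedServices, Agent, Runner, build_agent and the model names are all hypothetical stand-ins, and in a real project the runner and service classes would come from ADK itself. The point it shows is one set of services, one factory, two separate agent trees.

```python
from dataclasses import dataclass, field


@dataclass
class SharedServices:
    """Stand-in for the session and artifact services both runners share."""
    sessions: dict = field(default_factory=dict)
    artifacts: dict = field(default_factory=dict)


@dataclass
class Agent:
    """Stand-in for an agent tree; the real one would be built with ADK."""
    name: str
    model: str


@dataclass
class Runner:
    """Stand-in for a runner bound to one agent tree and one call style."""
    agent: Agent
    services: SharedServices
    mode: str  # "live" for bidi streaming, "async" for unary calls


def build_agent(model: str) -> Agent:
    # One factory for both runners: identical trees, but separate objects.
    return Agent(name="root_agent", model=model)


def build_runners(live_model: str, text_model: str):
    services = SharedServices()  # created once, handed to both runners
    voice_runner = Runner(build_agent(live_model), services, mode="live")
    text_runner = Runner(build_agent(text_model), services, mode="async")
    return voice_runner, text_runner


# Model names below are placeholders, not real model IDs.
voice_runner, text_runner = build_runners("live-model-placeholder", "text-model-placeholder")

# State written through one runner's services is visible through the other.
voice_runner.services.sessions["session-1"] = {"canvas": "draft"}
print(text_runner.services.sessions["session-1"])
```

The design choice that mattered for us is that the services are shared while the agent trees are not, so neither runner ever sees the other call style.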
ADK’s SkillToolset appends all skills as XML to the system instruction. We had 151 scientific skills, 62K characters total. The native audio model rejects this with error 1008 and the session dies. Had to subclass SkillToolset into a compact version that emits one-line summaries instead.
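To make the one-line-summary idea concrete, here is a rough sketch of the compaction step on its own. It is not the actual subclass and does not touch SkillToolset internals; one_line_summary, compact_skill_index, the 80-character cap and the sample skills are all made up for illustration. It assumes each skill boils down to a name plus a free-text description.

```python
def one_line_summary(description: str, max_chars: int = 80) -> str:
    """Collapse whitespace, keep the first sentence, and cap the length."""
    text = " ".join(description.split())
    first_sentence = text.split(". ")[0].rstrip(".")
    if len(first_sentence) > max_chars:
        first_sentence = first_sentence[: max_chars - 3].rstrip() + "..."
    return first_sentence


def compact_skill_index(skills: dict, max_chars: int = 80) -> str:
    """Render one short line per skill instead of a full XML block per skill."""
    lines = [
        f"- {name}: {one_line_summary(description, max_chars)}"
        for name, description in sorted(skills.items())
    ]
    return "\n".join(lines)


# Hypothetical skills, only to show the size difference.
skills = {
    "ode_solver": "Integrate ordinary differential equations.\nSupports stiff systems.",
    "fft": "Compute a fast Fourier transform of a signal. Accepts real or complex input.",
}
index = compact_skill_index(skills)
print(index)
print(len(index), "characters")
```

The full skill body can then be loaded only when the model actually picks a skill, which keeps the system instruction small.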
For anyone doing long sessions on Cloud Run: context window compression (trigger at 100K tokens, slide down to 80K) combined with Firestore dual-writes for state persistence got us past the 15-minute session limit.
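A rough sketch of one way a dual-write can look, not the project's actual code. Every update goes to the in-process copy and to a durable store, so a fresh instance can pick the session back up. FakeDurableStore is an in-memory stand-in where a Firestore-backed class would go, and DualWriteState and its method names are hypothetical.

```python
import copy
from typing import Any, Dict, Optional


class FakeDurableStore:
    """In-memory stand-in for a Firestore-backed store (assumption)."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}

    def save(self, session_id: str, state: Dict[str, Any]) -> None:
        # Deep copy so later in-process mutations do not leak into the store.
        self._docs[session_id] = copy.deepcopy(state)

    def load(self, session_id: str) -> Optional[Dict[str, Any]]:
        doc = self._docs.get(session_id)
        return copy.deepcopy(doc) if doc is not None else None


class DualWriteState:
    """Keeps a fast local copy and mirrors every update to the durable store."""

    def __init__(self, durable: FakeDurableStore) -> None:
        self._durable = durable
        self._live: Dict[str, Dict[str, Any]] = {}

    def update(self, session_id: str, **changes: Any) -> Dict[str, Any]:
        state = self._live.setdefault(session_id, {})
        state.update(changes)                   # write 1: in-process copy
        self._durable.save(session_id, state)   # write 2: durable copy
        return state

    def get(self, session_id: str) -> Dict[str, Any]:
        # On a fresh instance the local copy is empty, so fall back to the store.
        if session_id not in self._live:
            self._live[session_id] = self._durable.load(session_id) or {}
        return self._live[session_id]


store = FakeDurableStore()
before_restart = DualWriteState(store)
before_restart.update("s1", canvas="v2", turn=7)

# A new DualWriteState simulates a new instance after a reconnect or restart.
after_restart = DualWriteState(store)
restored = after_restart.get("s1")
print(restored)
```

In a real deployment the second write would be a network call, so it is worth deciding up front whether a failed durable write should fail the whole update or just be retried.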
Project is open source on GitHub: samartho4/ice (Inform, Compute and Evolve with Gemini Live Agent)
Demo: ice
Happy to go deeper on any of this.