


I wanted to share some insights and design lessons from my recent work on Shopper’s Concierge, an AI agent demo built with ADK and Vector Search.
When I first started building the product search demo, my instinct was to use a single agent. My plan was for this one agent to handle everything: generating the search query from user input, running the search, and ranking the results.
However, I quickly ran into several frustrating problems:
- That Awful Latency: Doing everything in one pass dragged out the processing time, and the result was a really "sluggish" UX. With voice interaction especially, waiting for a response killed the intuitive feel.
- Massive, Tangled Prompts: I ended up needing these huge, complex prompts that mixed the UI dialogue logic with the backend search logic. It made the whole design really unwieldy.
- Debugging Headaches: With everything crammed into one agent, pinpointing the source of errors was incredibly difficult.
This wasn’t just my struggle, either. It seems to be a common issue. I saw a GitHub blog post from 2024 mentioning how packing complex logic into one LLM agent often leads to debugging nightmares and poor maintainability. That definitely resonated with my experience.
To tackle these problems, I decided to redesign the system using a multi-agent approach. I split the work between two distinct agents:
- UI Agent: This agent focuses solely on interacting with the user, handling the voice dialogue. I designed it to be lightweight and fast, prioritizing that crucial real-time responsiveness.
- Search Agent: This one acts more like a specialized tool. It handles the “heavy lifting” – the actual product search and ranking. Crucially, it operates asynchronously.
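To make the division of labor concrete, here is a minimal, hypothetical sketch of the two roles in plain Python. The class names, prompts, and helper methods are illustrative, not the actual ADK classes or the demo's real code; the point is that each agent owns one focused prompt and one responsibility.

```python
class SearchAgent:
    """Specialized 'tool' agent: owns query generation, search, and ranking."""

    PROMPT = "Given a shopping request, produce a product search query."

    def run(self, user_input: str) -> list[str]:
        query = self.build_query(user_input)
        hits = self.vector_search(query)  # the heavy lifting
        return self.rank(hits)

    def build_query(self, user_input: str) -> str:
        return user_input.lower().strip()

    def vector_search(self, query: str) -> list[str]:
        # Stand-in for a real Vector Search call.
        return [f"product matching '{query}'"]

    def rank(self, hits: list[str]) -> list[str]:
        return sorted(hits)


class UIAgent:
    """Lightweight dialogue agent: talks to the user, delegates the search."""

    PROMPT = "Keep the conversation flowing; call the search tool when needed."

    def __init__(self, search_tool: SearchAgent):
        self.search_tool = search_tool

    def respond(self, user_input: str) -> str:
        hits = self.search_tool.run(user_input)
        return f"Here is what I found: {hits[0]}"
```

Because each class carries only its own prompt and logic, a bad search result points you straight at `SearchAgent`, and a clunky reply points at `UIAgent`, which is exactly the debugging win described above.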
The real game-changer here was asynchronous processing. By treating the Search Agent as an “agent as a tool” running in the background, the UI Agent could respond to the user almost instantly. The user gets a smooth experience without really noticing the search happening behind the scenes.
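The "agent as a tool" pattern above can be sketched with plain `asyncio` (again, illustrative names rather than the ADK API): the UI agent kicks the heavy search off as a background task and acknowledges the user right away, gathering the results only once they arrive.

```python
import asyncio


async def search_tool(user_input: str) -> list[str]:
    """Stand-in for the Search Agent: slow search + ranking work."""
    await asyncio.sleep(0.5)  # simulates Vector Search latency
    return [f"top product for '{user_input}'"]


async def ui_agent(user_input: str) -> tuple[str, list[str]]:
    # Offload the heavy work to a background task...
    search = asyncio.create_task(search_tool(user_input))
    # ...and produce the user-facing acknowledgement immediately,
    # before the search has finished.
    ack = f"On it! Searching for '{user_input}'..."
    results = await search  # collect the results once they are ready
    return ack, results


ack, results = asyncio.run(ui_agent("wireless earbuds"))
```

The acknowledgement is available the moment the task is scheduled, so a voice UI can speak it while the search is still in flight; the perceived latency is the acknowledgement, not the search.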
It was great to see my approach echoed in discussions by E. Nakai (@enakai00) on X. He talked about limiting the LLM agent's role to task flow control and offloading specific jobs to dedicated tools, which is exactly how my Search Agent ended up functioning!
My experience seems to fit into the wider trends we’re seeing in AI system design. Since around 2024, multi-agent systems have become a go-to solution for complex tasks. Here’s how I see my work connecting:
- Efficiency through Division (DeepLearning.AI): A 2024 DeepLearning.AI article confirmed that splitting tasks across agents is often more efficient than a single-agent approach.
- Async for Real-Time (arXiv): I also saw a 2025 arXiv paper stressing asynchronous processing for reducing latency in real-time interactions, which directly supports the key technique I relied on.
Looking back, here are the three main takeaways from this project for me:
- Asynchronous Processing is King for Real-Time: If you need responsiveness, offload heavy tasks to run asynchronously. It’s fundamental to minimizing user-perceived lag.
- Divide and Conquer Reduces Complexity: Using multiple agents, each with a clear role (like separating UI from search logic), dramatically simplified my prompts and made debugging manageable.
- Agents Can Be Specialized Tools: Not every agent needs to be a master planner. Designing some agents as focused “tools” (like my Search Agent) made the whole system more efficient.
Overall, building this product search demo really sold me on the power of multi-agent systems for improving real-time UX. By using asynchronous processing and clear role separation, I could overcome the bottlenecks of a single agent and create a much better user experience. I expect we’ll see this approach used much more widely.
Multi-agent systems open up exciting possibilities for AI. I hope sharing my experience is helpful, and I encourage you to explore this approach in your own projects!
Source Credit: https://medium.com/google-cloud/improving-real-time-ux-with-a-multi-agent-architecture-lessons-from-shoppers-concierge-demo-51c466a11662?source=rss—-e52cf94d98af—4