
Imagine that you’re a full-stack engineer at a popular candy shop called Crabby’s Taffy. Let’s say that the entire customer service workflow requires a human in the loop. For instance, customers send refund requests via email, and one of two customer service reps handles that backlog of emails. As a result, sometimes it takes several days for a customer to hear back about their refund.
You suspect that the refund process could use an automation overhaul, and that an AI agent might be a good fit: the refund system takes in conversational user input, and should be able to reason through the orders database and refund eligibility requirements. This agent could also converse with the user in real-time, and issue refunds immediately, boosting customer satisfaction. By building out an automated refund system, you can free up the two busy customer service reps to handle more complex requests, like customized taffy flavors for an upcoming wedding.
You’re ready to get started building this new refund agent using Python and Agent Development Kit (ADK) …
But where to start?
On one end, you could build the refund system with a simple LLM agent: create one super-prompt, plug in your model and tools, and let the LLM reason its way through to the end-goal. On the other end, you could implement a fixed sequence that doesn’t use an LLM at all for its central reasoning. Then there’s a gray area in between. For instance, you might add some custom fixed control-logic, but rely on LLMs to get the “yes” or “no” output that takes you to different logical branches.
The key thing to remember: when you rely on an LLM for reasoning to take advantage of flexibility, nuance, and speedier development, you are accepting a tradeoff of control. And the inverse is true: when you hardcode logic into your agent, you’re accepting that you might not hit all the edge-cases and interesting tool-use patterns that an LLM might be able to provide. You are losing flexibility.
You also might consider whether to build one simple agent, or a larger multi-agent system. As I covered in my last post, multi-agent systems have several advantages, from a modular and re-usable architecture, to higher-quality responses through a system of experts — but that comes with increased complexity.
In short, how you build your agent matters, and each implementation will have its own tradeoffs. To illustrate this, let’s walk through the process of building the Crabby’s Taffy refund agent five ways, using Agent Development Kit (ADK) and Python.
Let's start with a simple, single-agent architecture. Here, the agent's model, Gemini 2.5 Flash, processes the customer's request and decides which tools to invoke to determine whether the customer is eligible for a refund. The model then generates the final response to the user in natural language.
Here are the three tools we’ll implement for this agent:
1 — get_purchase_history: Fetches the customer's latest taffy purchases from a mock database.
import logging
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


def get_purchase_history(purchaser: str) -> List[Dict[str, Any]]:
    """Fetches the customer's latest taffy purchases from a mock database."""
    history_data = {
        "Alexis": [
            {
                "purchase_id": "JD001-20250415",
                "purchased_date": "2025-04-15",
                "items": [
                    {"product_name": "Assorted Taffy 1lb Box", "quantity": 1, "price": 15.00},
                    {"product_name": "Watermelon Taffy 0.5lb Bag", "quantity": 1, "price": 8.00},
                ],
                "shipping_method": "STANDARD",
                "total_amount": 23.00,
            }
        ],
        # ... additional mock customers elided ...
    }
    # Normalize the name so lookups are case- and whitespace-insensitive
    purchaser = purchaser.strip().title()
    if purchaser not in history_data:
        logger.warning(f"No purchase history found for: {purchaser}")
        return []
    return history_data[purchaser]
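As a quick local sanity check (not part of the original post), you can call the tool directly; the .strip().title() normalization means stray whitespace and casing in the name don't matter:

    print(get_purchase_history("  alexis "))  # returns Alexis's mock order list
    # Names not present in the mock data return [] and log a warning.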
2 — check_refund_eligibility: Checks the Crabby's Taffy refund policies against the user's order (shipping method) and reason for refund, in order to issue a yes-or-no decision.
ELIGIBLE_SHIPPING_METHODS = ["INSURED"]
ELIGIBLE_REASONS = ["DAMAGED", "NEVER_ARRIVED"]


def check_refund_eligibility(reason: str, shipping_method: str) -> bool:
    """Checks the refund reason and shipping method against Crabby's Taffy policy."""
    reason_upper = reason.strip().upper()
    shipping_upper = shipping_method.strip().upper()
    is_eligible = (
        shipping_upper in ELIGIBLE_SHIPPING_METHODS and reason_upper in ELIGIBLE_REASONS
    )
    logger.info(f"Refund eligibility result: {is_eligible}")
    return is_eligible
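A couple of throwaway calls (not from the original post) show how the policy constants drive the yes-or-no result:

    print(check_refund_eligibility("damaged", "insured"))            # True: both values are on the eligible lists
    print(check_refund_eligibility("It tasted gross.", "standard"))  # False: neither value qualifies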
3 — process_refund: Issues a full cash refund.
def process_refund(amount: float, order_id: str) -> str:
    # For now, we'll simulate a successful refund
    refund_id = f"REF-{order_id}-{int(amount*100)}"
    return f"✅ Refund {refund_id} successful! We will credit ${amount:.2f} to your account within 2 business days."
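Calling it directly with the order ID and total from the mock purchase history above (again, just a local check) shows the refund ID format:

    print(process_refund(23.00, "JD001-20250415"))
    # ✅ Refund REF-JD001-20250415-2300 successful! We will credit $23.00 to your account within 2 business days.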
Lastly, we use ADK's default Agent implementation, LlmAgent, to spec out the agent itself…
root_agent = Agent(
    model=GEMINI_MODEL,
    name="RefundSingleAgent",
    description="Customer refund single-agent for Crabby's Taffy company",
    instruction=top_level_prompt,
    tools=[get_purchase_history, check_refund_eligibility, process_refund],
)
…Using this top_level_prompt:
You are a friendly and helpful customer refund agent for the Crabby's Taffy candy company.
Your role is to process refund requests efficiently while maintaining excellent customer service.

When a customer asks for a refund, first start by gathering the necessary information. We need TWO THINGS:
1. The customer's first name.
2. The reason for the refund request.
Prompt the user for this info until you have it.

Once you have the name and refund reason, then you need to do this, in this order:
1. Get Purchase History. Verify that this user has a recent order on file, and gather the order ID, total purchase amount, and shipping method.
2. Check Refund Eligibility. Use the order ID, total purchase amount, and shipping method to check if the refund is eligible. To do this:
   - Extract the shipping method from the purchase history above
   - Convert the customer's stated refund reason to one of these codes:
     - DAMAGED: Package arrived damaged, melted, or opened.
     - LOST: Package never arrived or went missing in transit.
     - LATE: Package arrived late.
     - OTHER: Any other reason, e.g. "It tasted gross."
Do not respond to the user while you're checking these items behind the scenes. Try to do both IN ONE GO. Don't stop.

Then,
- If the user IS ELIGIBLE for a refund, call the process refund function or sub-agent to issue the refund. Don't skip this step!
- If the user is NOT eligible for a refund, politely say that you're unable to accommodate the request.

When you're done with this whole process, be sure to thank the user for being a Crabby's Taffy customer, and send a few cute emojis, like 🦀 or 🍬 or similar.
Then we can test our single agent using the ADK Web UI. Here’s what it looks like in action:
Hooray! We’ve built a simple agent that calls three different tools.
This single-agent refund system works pretty well for our use case. It was simple to implement, relatively easy to reason about and debug, and it has fairly low latency because all the orchestration happens inside a single agent invocation. As shown in ADK’s trace view, the entire agent run took six seconds.
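If you'd rather drive the agent from a script than the Web UI, here's a minimal sketch using ADK's Runner and in-memory session service. The app, user, and session names are placeholders, and the exact Runner and session APIs may differ slightly across ADK versions:

    import asyncio

    from google.adk.runners import Runner
    from google.adk.sessions import InMemorySessionService
    from google.genai import types


    async def main() -> None:
        session_service = InMemorySessionService()
        # "refund_app" / "alexis" / "session-1" are illustrative identifiers
        await session_service.create_session(
            app_name="refund_app", user_id="alexis", session_id="session-1"
        )
        runner = Runner(agent=root_agent, app_name="refund_app", session_service=session_service)

        message = types.Content(
            role="user",
            parts=[types.Part(text="Hi, I'm Alexis. My taffy arrived melted -- can I get a refund?")],
        )
        async for event in runner.run_async(
            user_id="alexis", session_id="session-1", new_message=message
        ):
            if event.is_final_response() and event.content and event.content.parts:
                print(event.content.parts[0].text)


    asyncio.run(main())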
Imagine that we go into production with this single-agent architecture, and things are working well. But as time passes, and we keep adding more tools, more logic, and more bits of system prompt, this architecture becomes unwieldy and harder to reason about. It’s a big monolith! This is when we might want to consider a multi-agent architecture…
In this next pattern, we implement the refund system as an LLM-based multi-agent. Here, we implement one "coordinator" agent who "dispatches" requests to one of three sub-agents. For simplicity, we've built one sub-agent around each of our three existing tools: PurchaseVerifier, RefundEligibility, and RefundProcessor.
Each of these sub-agents gets its own system prompt. For instance, here's the RefundProcessorAgent prompt:
process_refund_subagent_prompt = """
You are the SendRefund Agent for Crabby's Taffy.
You handle the final step of the refund process.

First, verify if the user is eligible for a refund based on the response of a prior agent.
Eligibility Status: {is_refund_eligible}

If the eligibility status is true, call the process_refund tool to process the refund. Send back the returned value from the process refund tool as the final output to the user.

If the user is not eligible for a refund, say you're unable to accommodate the request, and exit.
"""
Notice how our prompt has a little {is_refund_eligible} placeholder. What is that? That's actually pulling from ADK's session state. Every interaction that a user has with this agent system gets a session. That session gets state — like a short-term memory. In an ADK multi-agent, all the sub-agents share state. This way, when the RefundEligibilityAgent runs, it writes its response to an output_key called is_refund_eligible. Then, when the refund processor agent runs, it can pull the true/false value from that state variable, to inform whether it sends a full refund to the user. Cool stuff!
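Here's a rough sketch of how that handoff is wired up: the eligibility sub-agent declares an output_key, and a coordinator LlmAgent lists all three sub-agents. The instructions are abbreviated, and the exact wiring in the original repo may differ:

    from google.adk.agents import LlmAgent

    refund_eligibility_agent = LlmAgent(
        name="RefundEligibilityAgent",
        model=GEMINI_MODEL,
        instruction="Check whether the customer's refund request is eligible ...",  # abbreviated
        tools=[check_refund_eligibility],
        output_key="is_refund_eligible",  # the agent's result lands in session state under this key
    )

    # purchase_verifier_agent is built the same way around get_purchase_history
    root_agent = LlmAgent(
        name="RefundCoordinator",
        model=GEMINI_MODEL,
        instruction="Coordinate the refund workflow by delegating to your sub-agents ...",  # abbreviated
        sub_agents=[purchase_verifier_agent, refund_eligibility_agent, refund_processor_agent],
    )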
So what are the tradeoffs for this multi-LLM-agent pattern? First, the pros: we get the benefits of a multi-agent system (a decoupled architecture that’s easier to scale and re-use), while also getting the flexibility of LLM reasoning. The Coordinator agent and all three sub-agents rely on the LLM to reason about when to invoke the sub-agents, and when to call tools. We end up with a conversational agent that looks like this:
But there are some drawbacks. First, you’ll most likely end up with more model calls when you have four agents instead of one. While this can result in higher-quality reasoning and responses, it might also mean more token throughput (costs) and higher latency. The traces for our multi-turn agent show that the total refund process took over 12 seconds, double the time of our single-agent system. This extra time could frustrate users.
And by completely relying on the LLM for reasoning in a multi-agent system, we are increasing the risk of routing errors (coordinator dispatches to the wrong sub-agent), and infinite loops (the same sub-agent is called over and over). With this setup, agent evaluation is extra important, so that you catch some of these problems early.
Let's say that we decide an LLM-based multi-agent didn't provide enough value over the single-agent. But what if another part of the Crabby's Taffy company — the site recommendation dev team — wants to re-use a bit of our code, specifically the PurchaseHistoryAgent, to help inform recommendations on what type of taffy a returning customer might like?
To do this, let’s keep our multi-agent architecture, but refactor it to reduce the “randomness” in the way the sub-agents are invoked.
Another way to implement agent systems is by "hardcoding" logic, reducing the role that the LLM plays in reasoning, especially when it comes to delegating to sub-agents. A simple way to implement a workflow like this is by using a SequentialAgent. This type of agent works by — you guessed it — invoking sub-agents in a pre-defined, ordered sequence:
This architecture has a lot of similarities with the LLM-driven multi-agent: same sub-agents, same model, same tools, same prompts. The key difference is that rather than relying on the LLM to route to the right sub-agent in the right order, we are pre-defining the execution order:
root_agent = SequentialAgent(
    name="SequentialRefundProcessor",
    description="Processes customer refunds in a fixed sequential workflow",
    sub_agents=[
        purchase_verifier_agent,
        refund_eligibility_agent,
        refund_processor_agent,
    ],
)
At runtime, the sequential agent looks like this:
So what are the pros, here? First, a “hardcoded” agent sequence is much easier to test, debug, and reason about than an LLM-driven parent agent. It’s more predictable in its behavior.
At the same time, you still get the benefits of a multi-agent architecture. For instance, the site recommendation dev team could grab the PurchaseHistoryAgent and use that as part of their own multi-agent, as sketched below. And remember that the sub-agents are LLM-driven, in how they reason through their sub-task and call their tools. Only the parent agent is "hardcoded."
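To make the reuse concrete, here's a purely hypothetical sketch of the recommendation team dropping the same sub-agent into their own coordinator. FlavorRecommender, its instruction, and the coordinator are invented for illustration:

    from google.adk.agents import LlmAgent

    # Hypothetical agent owned by the site recommendation team
    flavor_recommender_agent = LlmAgent(
        name="FlavorRecommender",
        model=GEMINI_MODEL,
        instruction="Suggest taffy flavors based on the customer's purchase history in state.",
    )

    recommendation_agent = LlmAgent(
        name="RecommendationCoordinator",
        model=GEMINI_MODEL,
        instruction="Look up the customer's purchase history, then recommend flavors.",
        sub_agents=[purchase_verifier_agent, flavor_recommender_agent],  # the refund system's PurchaseHistoryAgent, reused as-is
    )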
Third, you likely will get lower latency because there are fewer calls to the model. The total running time of the SequentialAgent invocation is speedier: about eight seconds, closer to what we had with our single-agent architecture. With fewer calls to the model come lower costs.
But sequential agents have two major drawbacks. The first is that they are inflexible. They can’t skip steps in the way an LLM-based multi-agent could — even if running one of the sub-agents is unnecessary. They can’t adapt to nuance or edge-cases, either. The second drawback is cumulative latency. Because all the sub-agents must run one after the other, the total invocation time can stack up. This brings us to our next architecture….
What if instead of running our three sub-agents in a sequence, we could parallelize some of it, to speed up the end-user experience?
Here, we add an ADK ParallelAgent within our SequentialAgent architecture. Rather than first running PurchaseVerifier, then RefundEligibility, we run them both at the same time. The output of both agents then goes into the RefundProcessor, which makes the final call on whether to send a full refund.
verifier_agent = ParallelAgent(
    name="VerifierAgent",
    description="Checks purchase history and refund eligibility in parallel",
    sub_agents=[purchase_verifier_agent, refund_eligibility_agent],
)

root_agent = SequentialAgent(
    name="SequentialRefundProcessor",
    description="Processes customer refunds in a fixed sequential workflow",
    sub_agents=[
        verifier_agent,  # ParallelAgent is used as a sub-agent in our sequence!
        refund_processor_agent,
    ],
)
This architecture is similar to the SequentialAgent version, but the parallel step reduces the total running time of the agent invocation. The reduction here is only 1–2 seconds compared to the sequential agent, but in a full production agent that uses a lot of tools and model calls, you could see much steeper reductions in latency.
By adding a ParallelAgent, you still guarantee that both sub-agents run as part of a fixed workflow, just like with the SequentialAgent.
But there are a few cons. The most significant is that a parallel setup might add complexity, especially around state management. For instance, if our RefundEligibilityAgent relied on output from PurchaseVerifierAgent (like the shipping method used in the customer's order), we can't guarantee that PurchaseVerifier will have its output ready by the time RefundEligibility kicks off. If we keep the dependency in, we could end up with a tricky race condition. This setup is just generally harder to test and debug than a standard sequential agent.
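If that dependency were real, one simple fix is to give up the parallelism for that pair and keep them ordered. This sketch assumes eligibility genuinely needs the purchase history first; it is not part of the post's code:

    # Hypothetical fix if RefundEligibility depended on PurchaseVerifier's output:
    # run the dependent pair in order instead of in parallel.
    verifier_agent = SequentialAgent(
        name="VerifierAgent",
        description="Checks purchase history, then refund eligibility",
        sub_agents=[purchase_verifier_agent, refund_eligibility_agent],
    )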
Another drawback of a parallel system is potential fighting over compute resources — make sure your agent runtime is configured to have both of your sub-agents running at the same time, especially if they are resource-intensive.
Overall, consider a parallel agent architecture when you have sub-agents that truly can run in parallel without relying too heavily on each other.
We’ve made it to our last agent implementation — phew! Imagine that you’re having some success with the structured, sequential agent, but you want to go beyond the binary outcome of “send refund” vs. “no refund.” For instance, what if customers that are ineligible for a cash refund, could instead receive some store credit for their next order?
This is where custom agents enter the picture. In ADK, you can build a Custom Agent by coding up your own implementation of the ADK BaseAgent.
class CustomerRefundAgent(BaseAgent):
    refund_eligibility_checker: LlmAgent
    get_purchase_history: LlmAgent
    process_full_refund: LlmAgent
    offer_store_credit: LlmAgent
    process_store_credit_response: LlmAgent
    parallel_agent: ParallelAgent
    ...
Then, you can add your own logic, like if/else conditions. To implement the store credit alternative, that's exactly what we'll do.
Here, we keep the parallel agent from the last step, while changing the way we process refund eligibility. We'll use the is_eligible variable within the session's state — set by the RefundEligibilityAgent — to determine whether to run our standard RefundProcessorAgent, or to branch to a new set of sub-agents: StoreCreditAgent and ProcessCreditDecision:
is_eligible = ctx.session.state.get("is_eligible")
...
if purchase_history and not is_eligible:
    logger.info(
        f"[{self.name}] Customer not eligible for refund. Offering store credit."
    )
    async for event in self.offer_store_credit.run_async(ctx):
        logger.info(
            f"[{self.name}] Event from OfferStoreCredit: {event.model_dump_json(indent=2, exclude_none=True)}"
        )
        yield event
    async for event in self.process_store_credit_response.run_async(ctx):
        logger.info(
            f"[{self.name}] Event from ProcessStoreCreditResponse: {event.model_dump_json(indent=2, exclude_none=True)}"
        )
        yield event

final_message = ctx.session.state.get("final_response", "")
This setup uses ADK events to signal changes in the agent's control-flow and state between the ADK runner and other parts of the ADK framework, allowing the agent to progress through the "branches" towards the FinalResponseAgent.
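For orientation, the branching code above lives inside the custom agent's _run_async_impl override, which is how ADK's BaseAgent lets you define custom orchestration. Here's a trimmed skeleton; the agent attributes match the class fields shown earlier, but the structure is a sketch rather than the post's exact code:

    from typing import AsyncGenerator

    from google.adk.agents import BaseAgent
    from google.adk.agents.invocation_context import InvocationContext
    from google.adk.events import Event


    class CustomerRefundAgent(BaseAgent):
        # ... field declarations as shown earlier ...

        async def _run_async_impl(self, ctx: InvocationContext) -> AsyncGenerator[Event, None]:
            # 1. Run the parallel verification step and forward its events.
            async for event in self.parallel_agent.run_async(ctx):
                yield event

            # 2. Branch on shared session state written by the sub-agents.
            is_eligible = ctx.session.state.get("is_eligible")
            if is_eligible:
                async for event in self.process_full_refund.run_async(ctx):
                    yield event
            else:
                # Store-credit branch (see the snippet above).
                async for event in self.offer_store_credit.run_async(ctx):
                    yield event
                async for event in self.process_store_credit_response.run_async(ctx):
                    yield event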
From there, the store credit sub-agent offers the user a 50% store credit for their next purchase, then waits for the customer’s response to see if they’ve accepted that alternative:
offer_store_credit = LlmAgent(
    name="OfferStoreCredit",
    model=GEMINI_MODEL,
    instruction="""
    The customer is not eligible for a refund but has a valid purchase history. Politely explain this and offer them a 50% store credit for their next purchase as an alternative. Be empathetic and professional.
    Ask if they would like to accept this offer.
    """,
    input_schema=None,
    output_key="store_credit_offer",
)

process_store_credit_response = LlmAgent(
    name="ProcessStoreCreditResponse",
    model=GEMINI_MODEL,
    instruction="""
    You are processing the customer's response to the store credit offer.
    Based on the customer's response in session state with key 'customer_response':
    - If they accept: Output "I'll send this to your account. Thanks!"
    - If they decline: Output "I apologize that we couldn't accommodate your request. Thank you for your understanding."
    Store the appropriate message based on their response.
    """,
    input_schema=None,
    output_key="final_response",
)
Lastly, the FinalResponseAgent processes the output, no matter what it is (full refund, accepted store credit, declined store credit) and wraps up the conversation with the user.
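The post doesn't show the FinalResponseAgent's code, but based on that description it might look something like this sketch; the instruction wording and the reliance on the final_response state key are assumptions:

    final_response_agent = LlmAgent(
        name="FinalResponseAgent",
        model=GEMINI_MODEL,
        instruction="""
        Wrap up the refund conversation for the customer.
        The outcome so far is in session state: {final_response}
        Summarize the outcome warmly, thank them for being a Crabby's Taffy customer,
        and sign off with a couple of cute emojis like 🦀 or 🍬.
        """,
    )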
Here’s what our custom agent looks like in action:
Now, let's talk through the pros and cons of a Custom Agent implementation in ADK. The key benefit is max flexibility — we've got predefined logic, we've got a ParallelAgent, we have LlmAgents using the model to reason… it's a hodgepodge of "best tool for the job!" You can implement a Custom Agent to do nearly anything, while having precise control over the execution. Choose a Custom Agent for your gnarliest workflows, where you want the best of both worlds between ADK's built-in patterns and your own custom logic.
The drawback? Complexity. Building a CustomAgent will probably take you the most time, effort, and lines of code. It requires a deeper knowledge of ADK, and might take longer to test and debug. It will probably take longer for a teammate to understand how it all works, and components of the code might be harder to re-use.
Overall, I'd recommend not jumping to a Custom Agent as your first agent implementation. Start simple with a single LlmAgent, and test it out on your evaluation dataset. If it isn't performing well, start by tweaking your system prompts and making sure your tools are well-defined and understandable by the model. Then, once you're familiar with the nuts and bolts of ADK — from state and memory, to events — the sky's the limit! That's what I like about ADK: it's easy to get started, but also has lots of features to build customized agents.
There are some agent patterns we didn’t even have a chance to cover in this post, from loop agents that can reflect and iterate, to hierarchical agents that can decompose a problem. You can even add a human in the loop, if you have a critical task that requires human oversight.
The key takeaway of this post is that agents are flexible. You might solve one problem with several valid implementations, each with their own pros and cons. Don’t be afraid to start small, test out new patterns, and mix and match.
Source Credit: https://medium.com/google-cloud/agent-patterns-with-adk-1-agent-5-ways-58bff801c2d6?source=rss—-e52cf94d98af—4