The landscape of AI applications has evolved dramatically over the past few years. What started as simple summarization and classification tasks has grown into sophisticated agentic systems capable of autonomous decision-making. In this post, I'll share three core principles for building effective agents based on practical experience: don't build agents for everything, keep it simple, and think like your agents.
The Evolution of AI Applications
Most of us started our AI journey with simple features—summarization, classification, extraction. These felt like magic two to three years ago but have now become table stakes. As products matured, we began orchestrating multiple model calls in predefined control flows, creating what we call workflows. This gave us a way to trade off cost and latency for better performance.
Today, we're seeing the emergence of true agentic systems. Unlike workflows, agents can decide their own trajectory and operate almost independently based on environment feedback. As we give these systems more agency, they become more useful and capable—but the cost, latency, and consequences of errors also increase proportionally.
Don't Build Agents for Everything
Agents are powerful tools for scaling complex and valuable tasks, but they shouldn't be a drop-in upgrade for every use case. Here's a practical checklist to determine when agents are appropriate:
The Agent Decision Checklist
- Task Complexity: Agents thrive in ambiguous problem spaces. If you can map out the entire decision tree easily, build that explicitly and optimize every node. It's more cost-effective and gives you better control.
- Task Value: Agent exploration costs tokens—lots of them. If your budget per task is around 10 cents (affording only 30-50k tokens), use a workflow to solve the most common scenarios instead. On the flip side, if your reaction is "I don't care how many tokens I spend, just get it done"—agents might be perfect for you.
- Critical Capabilities: Ensure there aren't significant bottlenecks in the agent's trajectory. For coding agents, verify they can write good code, debug effectively, and recover from errors. Bottlenecks multiply cost and latency.
- Cost of Errors: If errors are high-stakes and hard to discover, it's difficult to trust agents with autonomy. You can mitigate this with read-only access or human-in-the-loop systems, but this limits scalability.
Why Coding Makes a Great Agent Use Case
Coding exemplifies an ideal agent use case:
- Going from design doc to PR is highly ambiguous and complex
- Good code has significant value
- Models like Claude are already proficient at many coding workflows
- Output is easily verifiable through unit tests and CI
This explains why we're seeing so many creative and successful coding agents in production today.
Keep It Simple
Once you've identified a good use case, the second principle is radical simplicity. Here's what agents look like at their core:
The Three Components of Agents
Agents are essentially models using tools in a loop. Three components define them:
- Environment: The system the agent operates in
- Tools: Interfaces for the agent to take action and get feedback
- System Prompt: Goals, constraints, and ideal behaviors
That's it. Keep this simple because any upfront complexity kills iteration speed. Iterating on just these three basic components gives you the highest ROI—optimizations can come later.
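To make this concrete, here's a minimal sketch of that loop using the Anthropic Python SDK. The model name, the `run_tests` tool, and the `execute_tool` helper are illustrative assumptions, not a prescribed setup:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

SYSTEM_PROMPT = "You are a coding agent. Goals, constraints, and ideal behaviors go here."

TOOLS = [{
    "name": "run_tests",  # illustrative; your environment defines the real set
    "description": "Run the project's test suite and return the full output.",
    "input_schema": {"type": "object", "properties": {}, "required": []},
}]

def run_agent(task: str, max_steps: int = 20) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        response = client.messages.create(
            model="claude-3-5-sonnet-latest",  # placeholder; use a current model
            max_tokens=2048,
            system=SYSTEM_PROMPT,   # the system prompt
            tools=TOOLS,            # the tools
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            # The model decided it's done; return its final text.
            return "".join(b.text for b in response.content if b.type == "text")
        # Execute every requested tool call against the environment and
        # feed the results back into the loop as the next turn.
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                # execute_tool: hypothetical dispatch into your environment
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": execute_tool(block.name, block.input),
                })
        messages.append({"role": "assistant", "content": response.content})
        messages.append({"role": "user", "content": tool_results})
    return "step limit reached"
```

Environment, tools, system prompt; everything else in that function is plumbing.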
Practical Examples
Whether building coding agents, search agents, or computer-use agents, they all share this same backbone. The environment depends on your use case, so your only real design decisions are:
- What tools to offer the agent
- What prompt to use to instruct the agent (both are sketched below)
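In practice, that design surface is small enough to fit on one screen. Here's a hypothetical example for a search agent; the prompt wording, tool name, and schema are assumptions to iterate on, not a recommended configuration:

```python
# The agent's entire "design surface": one prompt and one tool list.
SYSTEM_PROMPT = """You are a search agent.
Goal: answer the user's question with cited sources.
Constraints: at most 5 searches; say you don't know rather than guess.
Ideal behavior: search broadly first, then narrow down."""

TOOLS = [{
    "name": "web_search",
    # Tool descriptions are prompt engineering too: the model only knows
    # what a tool does from what you write here.
    "description": ("Search the web and return the top 5 results as "
                    "title/URL/snippet triples. Works best with short "
                    "keyword queries rather than full sentences."),
    "input_schema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}]
```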
Once you have these basics working, then optimize:
- Cache trajectories to reduce cost
- Parallelize tool calls to reduce latency (see the sketch after this list)
- Present progress to gain user trust
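To illustrate the second optimization: when the model requests several independent tool calls, run them concurrently rather than one after another. A minimal `asyncio` sketch, with a simulated one-second tool standing in for real I/O:

```python
import asyncio
import time

async def execute_tool(name: str, args: dict) -> str:
    # Stand-in for a real tool call; pretend each one costs ~1 second of I/O.
    await asyncio.sleep(1.0)
    return f"{name}({args['query']!r}) -> results"

async def run_parallel(tool_calls: list[dict]) -> list[str]:
    # Independent calls run concurrently: total latency is roughly the
    # slowest single call, not the sum of all of them.
    return await asyncio.gather(
        *(execute_tool(c["name"], c["args"]) for c in tool_calls)
    )

calls = [
    {"name": "web_search", "args": {"query": "agent frameworks"}},
    {"name": "web_search", "args": {"query": "tool-use latency"}},
    {"name": "web_search", "args": {"query": "prompt caching"}},
]
start = time.perf_counter()
results = asyncio.run(run_parallel(calls))
print(f"3 calls in {time.perf_counter() - start:.1f}s")  # ~1.0s, not ~3.0s
```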
But always start simple and optimize only after you have the behaviors down.
Think Like Your Agents
The third principle is perhaps the most counterintuitive: put yourself in your agent's context window. Agents can exhibit sophisticated behavior, but at each step they're just running inference over a limited context window, typically 10-20k tokens. Everything the model knows about the world at that moment is contained in that window.
The Computer-Use Agent Experience
Imagine being a computer-use agent:
- You receive a static screenshot and a poorly written description
- You can think and reason, but only tool actions affect the environment
- When you click, it's like closing your eyes for 3-5 seconds
- You reopen them to see a new screenshot—your action might have worked or shut down the computer
This exercise reveals what agents actually need (see the sketch after this list):
- Clear screen resolution information for accurate clicking
- Recommended actions and limitations to avoid unnecessary exploration
- Coherent context about the current state
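In practice, that means assembling this information explicitly at every step rather than hoping the model infers it. A small sketch, with entirely hypothetical field names:

```python
def build_step_context(state: dict) -> str:
    # Everything the agent will "know" this step must be written down here.
    return "\n".join([
        # Without the resolution, pixel coordinates for clicks are guesswork.
        f"Screen resolution: {state['width']}x{state['height']}",
        # Say where the agent already is so it doesn't waste steps re-exploring.
        f"Current app: {state['app']}; open dialogs: {state['dialogs']}",
        # Constrain the action space and set expectations up front.
        "Available actions: click(x, y), type(text), wait(seconds).",
        "Actions take 3-5 seconds to show on screen; wait before retrying.",
    ])
```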
Leveraging Natural Language
Fortunately, we're building systems that speak our language. You can:
- Ask Claude if instructions are ambiguous
- Verify if tool descriptions make sense
- Analyze entire trajectories to understand decision-making
- Request suggestions for better context
This shouldn't replace your own understanding, but it helps you see things from the agent's perspective.
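For instance, you can hand a tool description straight to the model and ask what's unclear. A sketch with the Anthropic Python SDK (the model name is a placeholder for whatever is current):

```python
import anthropic

client = anthropic.Anthropic()

tool_description = "Search the web. Returns the top 5 results."

response = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder; use a current model
    max_tokens=500,
    messages=[{
        "role": "user",
        "content": "This tool description will be given to an agent:\n\n"
                   f"{tool_description}\n\n"
                   "Is anything ambiguous? What would you want clarified "
                   "before relying on this tool?",
    }],
)
print(response.content[0].text)
```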
Personal Musings on the Future
Looking ahead, three areas occupy my thoughts:
Budget-Aware Agents
Unlike with workflows, we have little control over an agent's cost and latency. Solving this will unlock many more production use cases. The open question is how best to define and enforce budgets, whether in time, money, or tokens.
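One plausible shape for a token-denominated budget, sketched below: wrap the loop and force a wrap-up once cumulative usage crosses the cap. `step_agent`, `final_text`, `apply_tools`, and `finalize` are hypothetical helpers standing in for the pieces of the loop shown earlier:

```python
TOKEN_BUDGET = 50_000  # assumed cap; could equally be denominated in dollars or seconds

def run_with_budget(task: str) -> str:
    spent = 0
    messages = [{"role": "user", "content": task}]
    while spent < TOKEN_BUDGET:
        response = step_agent(messages)  # hypothetical: one model call
        spent += response.usage.input_tokens + response.usage.output_tokens
        if response.stop_reason != "tool_use":
            return final_text(response)  # hypothetical: extract the answer
        messages = apply_tools(messages, response)  # hypothetical: run tools, append results
    # Out of budget: ask for the best answer from what's been gathered so far.
    return finalize(messages)  # hypothetical wrap-up call
```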
Self-Evolving Tools
We're already using models to iterate on tool descriptions. This should generalize into meta-tools where agents design and improve their own tool ergonomics, making them more general-purpose.
Multi-Agent Collaboration
By the end of this year, we'll likely see more multi-agent collaborations in production. They offer:
- Better parallelization
- Nice separation of concerns
- Context window protection through sub-agents
The big question: how should these agents communicate? Our current synchronous user-assistant paradigm needs expansion for asynchronous communication and new roles.
Key Takeaways
If you remember nothing else:
- Don't build agents for everything - Use the checklist to determine if agents are appropriate for your use case
- Keep it simple - Focus on environment, tools, and prompts before optimizing
- Think like your agent - Understand their perspective to help them succeed
The journey from simple AI features to sophisticated agents has been remarkable. As AI engineers, our focus on practicality and making AI useful to the world drives this evolution forward. Let's keep building—thoughtfully and effectively.