
AI Agent Design Patterns and Best Practices

Guide to AI Agent principles, design patterns, orchestration, guardrails, and evaluation methods


Principles

First-Principles

AlphaGo's Move 37 against Lee Sedol illustrates two first-principles stances for agent design:

  • Replica agents: Use biomimicry when workflows require human review, when agents serve as copilots, or when integrating with legacy UI-only tools
  • Alien agents: Use first-principles design when the goal is pure result efficiency

Asymmetry of Verification and Verifiers

Verifying a solution is often far easier than producing it. This asymmetry of verification yields the verifier's law:

All problems that are both solvable and easy to verify will be solved by AI.

Patterns

Agent design patterns:

  • Give agents a computer (CLI and files)
  • Progressive disclosure
  • Offload context
  • Cache context
  • Isolate context
  • Evolve context

Agent-native Architecture

Agent-native apps should provide:

  • Parity: Any task users can complete via the UI, agents can complete via tools
  • Granularity: Tools should be atomic primitives
  • Composability: Given parity and granularity, new features are just new prompts
  • Emergent capability: Composed primitives enable behaviors that were never explicitly designed
  • Files as universal interface: Files for legibility, databases for structure
  • Improvement over time:
    • Accumulated context: State persists across sessions
    • Developer-level refinement: System prompts
    • User-level customization: User prompts

Recursive Language Models

Recursive Language Models (RLMs) achieve multi-hop reasoning over long inputs through divide-and-conquer recursion, mitigating the Context Rot problem caused by long contexts.
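
A minimal sketch of the recursive idea, assuming a generic llm(prompt) completion callable rather than any specific API; the chunk threshold and token heuristic are arbitrary choices:

```python
CHUNK_TOKENS = 4000  # keep each call comfortably under the context limit

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return len(text) // 4

def rlm_answer(question: str, text: str, llm) -> str:
    """Answer a question over text too long for one context window."""
    if estimate_tokens(text) <= CHUNK_TOKENS:
        # Base case: the chunk fits, so ask directly.
        return llm(f"Context:\n{text}\n\nQuestion: {question}")
    # Recursive case: split, solve each half, then merge the partial answers.
    mid = len(text) // 2
    left = rlm_answer(question, text[:mid], llm)
    right = rlm_answer(question, text[mid:], llm)
    return llm(
        f"Combine these partial answers to the question: {question}\n"
        f"Answer A: {left}\nAnswer B: {right}"
    )
```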

Instructions

  • Use existing documents: Use existing operating procedures, support scripts, or policy documents to create LLM-friendly routines
  • Prompt agents to break down tasks: Providing smaller, clearer steps helps minimize ambiguity and helps models better follow instructions
  • Define clear actions: Ensure each step in the routine corresponds to a specific action or output
  • Capture edge cases: Real interactions produce decision points; a robust routine anticipates common variations and handles them with conditional steps or branches, e.g., alternative steps when required information is missing

How to write a great AGENTS.md: lessons from over 2,500 repositories (an illustrative example follows the list):

  1. States a clear role: Defines who the agent is (expert technical writer), what skills it has (Markdown, TypeScript), and what it does (read code, write docs)
  2. Executable commands: Gives AI tools it can run (npm run docs:build and npx markdownlint docs/). Commands come first
  3. Project knowledge: Specifies tech stack with versions (React 18, TypeScript, Vite, Tailwind CSS) and exact file locations
  4. Real examples: Shows what good output looks like with actual code. No abstract descriptions
  5. Three-tier boundaries: Set clear rules using "always do", "ask first", "never do". Prevents destructive mistakes
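
An illustrative AGENTS.md applying the five lessons; the commands and stack come from the examples above, everything else is hypothetical:

```markdown
# AGENTS.md

You are an expert technical writer who reads TypeScript code and writes Markdown docs.

## Commands
- Build docs: `npm run docs:build`
- Lint docs: `npx markdownlint docs/`

## Project
- Stack: React 18, TypeScript, Vite, Tailwind CSS
- Docs live in `docs/`, components in `src/components/`

## Example
A good doc entry:

    ### useDebounce(value, delay)
    Returns `value` after `delay` ms of inactivity.

## Boundaries
- Always: run the linter before finishing
- Ask first: adding new dependencies
- Never: edit files under `src/generated/`
```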

Vibe Coding

  1. Spec the work (an example brief follows this list):
    • Goal: Picking the next highest-leverage goal
    • Breakdown: Breaking work into small, verifiable slices (one pull request each)
    • Criteria: Writing acceptance criteria, e.g., inputs, outputs, edge cases, UX constraints
    • Risk: Calling out risks up front, e.g., performance hot-spots, security boundaries, migration concerns
  2. Give agents context:
    • Repository: Repository conventions
    • Components: Component system, design tokens and patterns
    • Constraints: Defining constraints: what not to touch, what must stay backward compatible
  3. Direct agents what, not how:
    • Tools: Assigning right tools
    • Files: Pointing relevant files and components
    • Constraints: Stating explicit guardrails, e.g., don't change API shape, keep this behavior, no new deps
  4. Verification and code review:
    • Correctness: Edge cases, race conditions, error handling
    • Performance: N+1 queries, unnecessary re-renders, overfetching
    • Security: Auth boundaries, injection, secrets, SSRF
    • Tests: Coverage for changed behaviors
  5. Integrate and ship:
    • Break big work into tasks agents can complete reliably
    • Resolve merge conflicts
    • Verify CI passes
    • Stage roll-outs
    • Monitor for regressions
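
An illustrative task brief covering steps 1-3; every project detail here is hypothetical:

```markdown
## Task: add rate limiting to /api/search

Goal: prevent abuse without affecting signed-in users.
Slice: one PR, middleware only; no schema changes.

Acceptance criteria:
- 20 req/min per IP for anonymous users; authenticated users exempt
- 429 response with a Retry-After header
- Edge case: health checks from /internal/* are excluded

Context: middleware lives in src/middleware/; follow existing conventions there.
Constraints: don't change the API response shape; no new dependencies.
```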

System

OpenAI Codex system prompts:

  • Instructions
  • Git instructions
  • AGENTS.md spec
  • Citations instructions

Coding

Writing good AGENTS.md:

  • AGENTS.md should define your project's WHY, WHAT, and HOW
  • Less is more: Include as few instructions as reasonably possible in the file
  • Keep the contents of your AGENTS.md concise and universally applicable
  • Use Progressive Disclosure: Don't front-load everything the agent might need; tell it when it needs information and how to find and use it
  • The agent is not a linter: Use linters and code formatters for mechanical checks, along with features like Hooks and Slash Commands
  • AGENTS.md is the highest-leverage point of the harness, so avoid auto-generating it; craft its contents carefully for best results

Pull Request

Using GitHub Copilot to debug issues faster.

Testing

Research

AI agents powered by tricky LLM prompting.

Tool

Tool execution spans three tiers (sketch after the list):

  1. Tool calling: Atomic toolkit
  2. Bash: Composable static scripts
  3. Codegen: Dynamic programs
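
A sketch of the three tiers in Python; the tools and the generated program are hypothetical, and tier 3 should run in a sandbox in practice:

```python
import subprocess

# Tier 1: tool calling, an atomic schema-friendly primitive.
def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

# Tier 2: bash, composing existing programs into a static pipeline.
def count_todos(repo: str) -> str:
    result = subprocess.run(
        ["bash", "-c", f"grep -rn TODO {repo} | wc -l"],
        capture_output=True, text=True,
    )
    return result.stdout.strip()

# Tier 3: codegen, where the agent writes a one-off program for the task,
# which is then executed (sandbox this in practice).
generated = "print(sum(x * x for x in range(10)))"
exec(generated)
```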

Context

Dynamic Discovery

Dynamic context discovery (sketch after the list):

  • Tool response → File
  • Terminal session → File
  • Reference conversation history when compressing context
  • Load on demand
  • Progressive disclosure
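
A minimal sketch of offloading a large tool response to a file so it can be loaded on demand; the preview size and file naming are arbitrary choices:

```python
import json
import tempfile

PREVIEW_CHARS = 500  # inline preview size; tune per model and task

def offload_tool_response(name: str, response: str) -> str:
    """Persist a large tool response and return a short pointer.

    The agent keeps only a preview in context and reads the full file
    later if needed (load on demand, progressive disclosure).
    """
    if len(response) <= PREVIEW_CHARS:
        return response  # small enough to keep in context as-is
    tmp = tempfile.NamedTemporaryFile(
        mode="w", suffix=f"_{name}.txt", delete=False
    )
    tmp.write(response)
    tmp.close()
    return json.dumps({
        "preview": response[:PREVIEW_CHARS],
        "full_output": tmp.name,
        "note": "Read full_output if the preview is insufficient.",
    })
```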

Personalization

Metaprompting for memory extraction.

Context Engineering

LLMs do not use their context uniformly: accuracy and reliability decline as the number of input tokens grows, a phenomenon called Context Rot.

Merely having relevant information in the model's context is therefore insufficient: how that information is presented significantly impacts performance. This is the case for context engineering: maximize relevant information and minimize irrelevant context for reliable performance, e.g., via a custom Gemini CLI command (sketch below).
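
A sketch of a custom command that packages a reusable, context-lean prompt, assuming Gemini CLI's TOML custom-command format (the file path and command name are illustrative; verify against the current docs):

```toml
# .gemini/commands/summarize.toml defines a custom /summarize command.
description = "Summarize a file into five bullet points"
prompt = """
Read the file at {{args}} and summarize it in at most five bullet points.
Quote exact identifiers; do not paraphrase code symbols.
"""
```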

Workflow

Plan Mode

Claude Code's EnterPlanMode system prompt.

Debug Mode

Cursor debug mode (instrumentation sketch after the list):

  1. Assume: Generate multiple hypotheses
  2. Log: Add logging points
  3. Collect: Collect runtime data (log, trace, profile)
  4. Locate: Reproduce bug, analyze actual behavior, precisely locate root cause
  5. Fix: Based on evidence, make targeted fixes
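
A sketch of steps 2-3, adding logging points so runtime data can confirm or eliminate each hypothesis; the function and hypotheses are hypothetical:

```python
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("debug-mode")

def apply_discount(price: float, rate: float) -> float:
    # Hypothesis 1: rate arrives as a percentage (15) instead of a fraction (0.15).
    log.debug("apply_discount inputs: price=%r rate=%r", price, rate)
    result = price * (1 - rate)
    # Hypothesis 2: result goes negative when rate > 1.
    log.debug("apply_discount result: %r", result)
    return result
```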

TDD

Test-driven development (example tests after the list):

  1. Write tests: Have the agent write tests from expected input/output pairs. State clearly that you're doing TDD, so the agent doesn't write mock implementations for features that don't exist yet
  2. Run tests: Have the agent run the tests and confirm they actually fail. State clearly not to write implementation code at this stage
  3. Commit tests
  4. Write code: Have the agent write code to pass tests, and instruct it not to modify tests. Tell it to iterate until all tests pass
  5. Submit code
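
For step 1, the agent's tests should pin down expected input/output pairs before any implementation exists; a sketch with a hypothetical slugify() under test:

```python
# Tests written first: slugify() does not exist yet, so these must fail.
from myapp.text import slugify  # hypothetical module under test

def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_strips_punctuation():
    assert slugify("Rock & Roll!") == "rock-roll"

def test_empty_input():
    assert slugify("") == ""
```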

Orchestration

Single-agent Systems

Multi-agent Systems: Manager Pattern

Other agents act as tools, called by the central agent; a minimal sketch follows.
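
A manager-pattern sketch with plain callables standing in for sub-agents; llm() is a hypothetical stand-in for a model call, and the routing protocol is illustrative:

```python
def llm(prompt: str) -> str:
    """Stand-in for a real model call; wire up your provider here."""
    raise NotImplementedError

def research_agent(task: str) -> str:
    return llm(f"You are a research agent. {task}")

def writer_agent(task: str) -> str:
    return llm(f"You are a writing agent. {task}")

SUB_AGENTS = {"research": research_agent, "write": writer_agent}

def manager(goal: str) -> str:
    # The manager picks a sub-agent and delegates; the routing decision
    # itself comes from the model.
    plan = llm(
        f"Goal: {goal}\nAvailable tools: {list(SUB_AGENTS)}\n"
        "Reply exactly as: <tool> | <task for that tool>"
    )
    tool, task = (part.strip() for part in plan.split("|", 1))
    result = SUB_AGENTS[tool](task)
    return llm(f"Goal: {goal}\nTool result: {result}\nWrite the final answer.")
```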

Multi-agent Systems: Decentralized Pattern

Multiple agents run as peers and hand work off to one another; a minimal sketch follows.
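
A sketch of the handoff variant under the same llm() assumption; control transfers to a peer rather than returning to a manager:

```python
def llm(prompt: str) -> str:  # same stand-in model call as in the manager sketch
    raise NotImplementedError

def billing_agent(message: str) -> str:
    return llm(f"You are a billing agent. Resolve: {message}")

def support_agent(message: str) -> str:
    return llm(f"You are a support agent. Resolve: {message}")

PEERS = {"billing": billing_agent, "support": support_agent}

def triage_agent(message: str) -> str:
    # Decide who should own the conversation, then hand off completely:
    # the peer produces the final answer; control never returns here.
    route = llm(f"Route to 'billing' or 'support': {message}").strip()
    return PEERS[route](message)
```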

Guardrails

Building Guardrails

  • Relevance classifier: Ensures agent responses stay within expected scope by flagging off-topic queries
  • Safety classifier: Detects unsafe inputs attempting to exploit the system (jailbreaks or prompt injection)
  • PII filter: Prevents unnecessary exposure of personally identifiable information by reviewing model output for potential PII
  • Content moderation: Flags harmful or inappropriate inputs (hate speech, harassment, violence) to maintain safe, respectful interactions
  • Tool safety measures: Evaluate the risk of each tool available to your agent, assigning low, medium, or high ratings based on factors like read-only vs. write access, reversibility, required account permissions, and financial impact. Use these risk ratings to trigger automated actions, like pausing for guardrail checks before executing high-risk tools or escalating to human intervention when needed
  • Rule-based protection: Simple deterministic measures (blacklists, input length limits, regex filters) to prevent known threats like prohibited terms or SQL injection
  • Output validation: Ensure responses align with brand values through prompt engineering and content checks, preventing outputs that could damage brand integrity

Triggering a human intervention plan when failure thresholds are exceeded or before high-risk operations is a critical safety measure.
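
A sketch of risk-rated tool gating with human escalation; the ratings, tools, and checks are illustrative:

```python
# Risk-rated tool gating: low-risk tools run freely, medium-risk tools pass a
# guardrail check, and high-risk tools also pause for human approval.
RISK = {"read_file": "low", "send_email": "medium", "delete_records": "high"}

def guardrail_check(tool: str, args: dict) -> bool:
    # Placeholder for the rule-based and classifier checks listed above.
    return "DROP TABLE" not in str(args)

def run_tool(tool: str, args: dict, execute, ask_human) -> str:
    risk = RISK.get(tool, "high")  # unknown tools default to the highest risk
    if risk in ("medium", "high") and not guardrail_check(tool, args):
        return "blocked: guardrail check failed"
    if risk == "high" and not ask_human(tool, args):
        return "blocked: human reviewer declined"
    return execute(tool, args)
```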

Evaluation

Agent evals:

  1. Start early
  2. Source realistic tasks from failures
  3. Define unambiguous, robust success criteria
  4. Design graders thoughtfully and combine multiple types: code-based, model-based, human (sketch after this list)
  5. Make sure the problems are hard enough for the model
  6. Iterate on evaluations to improve the signal-to-noise ratio
  7. Read transcripts
  8. Pick a framework: promptfoo, Harbor
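
A sketch combining a code-based grader with a model-based one; llm() is a hypothetical model call and the 50/50 weighting is an arbitrary choice:

```python
def code_grader(output: str, expected_substring: str) -> float:
    # Deterministic check: cheap and precise, but narrow.
    return 1.0 if expected_substring in output else 0.0

def model_grader(output: str, rubric: str, llm) -> float:
    # LLM-as-judge: flexible but noisier; keep the rubric unambiguous.
    verdict = llm(f"Rubric: {rubric}\nOutput: {output}\nScore 0-10, digits only:")
    return min(int(verdict.strip()), 10) / 10

def grade(output: str, expected: str, rubric: str, llm) -> float:
    # Blend grader types; weights should reflect how much you trust each signal.
    return 0.5 * code_grader(output, expected) + 0.5 * model_grader(output, rubric, llm)
```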

When building agents, the trace is the source of truth:

  • Debugging becomes trace analysis
  • Testing becomes eval-driven
  • You can't set breakpoints in reasoning
  • Performance optimization targets new metrics: task success rate, reasoning quality, tool-usage efficiency

Benchmarks

Benchmarks:

  • Aggregate: Don't obsess over a 1-2% lead on a single benchmark; focus on specific, comprehensive domain coverage
  • Relative: Compare within the same model family or lab: how did the score change from v1 to v2?
  • Verify: The only benchmark that matters in the end is your own workload

Libraries

Instruction

  • AGENTS.md: Open format for guiding coding agents
  • llms.txt: Helping language models use websites
  • System: System prompts for AI agents

RAG

  • RAGFlow: Superior context layer for AI agents

Project

  • VibeKanban: Run coding agents in parallel without conflicts, and perform code review

Documentation

Slide

  • Banana: AI-native PPT generator based on nano banana pro
