Modern Agent Engineering

How to Build Modern Web App Agents

A deep comparison of four approaches to building AI agents — OpenAI with raw fetch, Vercel AI SDK, Claude Agent SDK, and Agno with FastAPI — and which one you should pick.

Updated Mar 19, 2026

Every web app is becoming agentic. Search bars that answer questions. Dashboards that investigate anomalies on their own. Support interfaces that look up orders, check policies, and draft responses — all before a human touches anything.

The core pattern behind all of this is the same: an LLM that can call tools in a loop. But the tooling landscape for building these agents is fragmented. You can go raw with fetch, use a TypeScript SDK that abstracts the loop away, use Claude Code as an embedded agent engine, or reach for a Python framework that gives you sessions, memory, and a production API out of the box.

I explored four real approaches to building the same kind of agent. Here’s what I learned.

What Makes an Agent an Agent

Before diving into code, let’s get the mental model right. A chatbot takes a message and returns a response. An agent takes a goal and works toward it by deciding what actions to take, executing them, observing the results, and repeating until done.

The loop looks like this:

  1. Plan — the model reads the current context (goal, conversation history, available tools) and decides what to do next
  2. Act — it calls a tool (fetch data, run a query, send an email)
  3. Observe — the tool result is added to the conversation
  4. Repeat — the model decides whether to call another tool or respond to the user

Tool calling is the critical I/O layer. Without it, the model is just guessing. With it, the model can reach into your systems and do real work.

The four approaches below all implement this loop. They differ in how much of it you write yourself versus how much the framework handles for you.
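To make the loop concrete before any real API shows up, here is a minimal, provider-agnostic sketch. Everything in it is illustrative: the ModelTurn type, the runLoop function, and the scripted model that stands in for an LLM are names I made up for this example, and the tools are synchronous to keep it short.

```typescript
// Provider-agnostic sketch of plan / act / observe / repeat.
// All names here (ModelTurn, runLoop, scriptedModel) are illustrative.
type ToolFn = (args: Record<string, unknown>) => unknown;
type ModelTurn =
  | { kind: "tool_call"; name: string; args: Record<string, unknown> }
  | { kind: "final"; text: string };
type Observation = { tool: string; result: unknown };
type Model = (goal: string, history: Observation[]) => ModelTurn;

function runLoop(model: Model, tools: Map<string, ToolFn>, goal: string, maxSteps = 5): string {
  const history: Observation[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const turn = model(goal, history); // 1. Plan
    if (turn.kind === "final") return turn.text; // 4. Repeat ends: respond to the user
    const tool = tools.get(turn.name);
    const result = tool ? tool(turn.args) : { error: `Unknown tool: ${turn.name}` }; // 2. Act
    history.push({ tool: turn.name, result }); // 3. Observe
  }
  return "Stopped: step limit reached";
}

// Scripted stand-in for an LLM: call one tool, then answer from what it observed.
const scriptedModel: Model = (_goal, history) =>
  history.length === 0
    ? { kind: "tool_call", name: "get_weather", args: { location: "Paris" } }
    : { kind: "final", text: `Observed: ${history.map((h) => JSON.stringify(h.result)).join(", ")}` };

const demoTools = new Map<string, ToolFn>([
  ["get_weather", (args) => ({ location: args.location, tempC: 18 })],
]);

const answer = runLoop(scriptedModel, demoTools, "What's the weather in Paris?");
console.log(answer);
```

Swap the scripted model for a real API call and the tool map for real functions, and you have the skeleton that every approach in this article fills in differently.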

Approach 1: OpenAI with Simple Fetch

This is the most fundamental approach. You call the OpenAI Chat Completions API using fetch, define your tools as JSON schemas, and write the agent loop yourself.

Defining tools

Tools are JSON objects that describe what functions the model can call. Each tool has a name, a description, and a parameter schema:

const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get the current weather for a city",
      parameters: {
        type: "object",
        properties: {
          location: {
            type: "string",
            description: "City name, e.g. 'San Francisco, CA'",
          },
          units: {
            type: "string",
            enum: ["celsius", "fahrenheit"],
            description: "Temperature units",
          },
        },
        required: ["location"],
      },
    },
  },
  {
    type: "function",
    function: {
      name: "search_knowledge_base",
      description: "Search internal documentation for relevant articles",
      parameters: {
        type: "object",
        properties: {
          query: { type: "string", description: "Search query" },
        },
        required: ["query"],
      },
    },
  },
];

The agent loop

The loop sends messages to the API, checks if the model wants to call tools, executes them, appends the results, and calls the API again until the model is done:

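// Minimal message shape for this example; the API accepts more fields.
type Message = {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  tool_calls?: { id: string; function: { name: string; arguments: string } }[];
  tool_call_id?: string;
};

// Read the key from wherever your runtime exposes secrets (process.env in Node).
const API_KEY = process.env.OPENAI_API_KEY;
const MAX_STEPS = 10; // hard cap so a confused model cannot loop forever

async function runAgent(userMessage: string): Promise<string> {
  const messages: Message[] = [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: userMessage },
  ];

  for (let step = 0; step < MAX_STEPS; step++) {
    const response = await fetch("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${API_KEY}`,
      },
      body: JSON.stringify({
        model: "gpt-4o",
        messages,
        tools,
        tool_choice: "auto",
      }),
    });
    if (!response.ok) {
      throw new Error(`OpenAI API error ${response.status}: ${await response.text()}`);
    }

    const data = await response.json();
    const choice = data.choices[0];
    messages.push(choice.message);

    // No tool calls means the model has produced its final answer
    // (or was cut off, e.g. finish_reason "length"); either way, stop here.
    if (!choice.message.tool_calls?.length) {
      return choice.message.content ?? "";
    }

    for (const toolCall of choice.message.tool_calls) {
      let result: unknown;
      try {
        const args = JSON.parse(toolCall.function.arguments);
        result = await executeTool(toolCall.function.name, args);
      } catch (err) {
        // Report failures back to the model so it can correct itself.
        result = { error: String(err) };
      }
      messages.push({
        role: "tool",
        tool_call_id: toolCall.id,
        content: JSON.stringify(result),
      });
    }
  }

  throw new Error(`Agent exceeded ${MAX_STEPS} steps`);
}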

Tool execution

You map tool names to actual functions. This is where your business logic lives:

async function executeTool(name: string, args: Record<string, unknown>) {
  switch (name) {
    case "get_weather":
      // units is optional in the schema, so fall back to a default
      return await fetchWeatherAPI(args.location as string, (args.units as string | undefined) ?? "celsius");
    case "search_knowledge_base":
      return await searchDocs(args.query as string);
    default:
      return { error: `Unknown tool: ${name}` };
  }
}
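The switch statement works, but it assumes the arguments were already parsed and that nothing throws. As a hedged alternative, here is a sketch of a registry-based dispatcher that takes the raw argument string from the model and always returns a structured outcome. The registry, the echo and fail tools, and the ToolOutcome shape are hypothetical, and handlers are synchronous only to keep the example short.

```typescript
type ToolHandler = (args: Record<string, unknown>) => unknown;
type ToolOutcome = { ok: true; data: unknown } | { ok: false; error: string };

// Hypothetical registry; real handlers would usually be async and call your services.
const registry = new Map<string, ToolHandler>([
  ["echo", (args) => ({ echoed: args.text })],
  ["fail", () => { throw new Error("upstream timeout"); }],
]);

function dispatch(name: string, rawArgs: string): ToolOutcome {
  const handler = registry.get(name);
  if (!handler) return { ok: false, error: `Unknown tool: ${name}` };

  let parsed: unknown;
  try {
    parsed = JSON.parse(rawArgs); // the model's arguments arrive as a JSON string
  } catch {
    return { ok: false, error: "Arguments are not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, error: "Arguments must be a JSON object" };
  }

  try {
    return { ok: true, data: handler(parsed as Record<string, unknown>) };
  } catch (err) {
    // Convert thrown errors into data the model can read and react to.
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}

console.log(dispatch("echo", '{"text":"hi"}'));
console.log(dispatch("echo", "{not json"));
```

Using a Map rather than a plain object also means a tool name like "toString" cannot accidentally resolve to something inherited from Object.prototype.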

What you get

Full control. You own every line of the loop. You decide how to handle errors, how many iterations to allow, when to inject system messages, and how to manage context length. There are zero framework dependencies, so this works in Node, Deno, Bun, Cloudflare Workers, or even the browser (though calling the API directly from a browser exposes your API key, so route those calls through a backend).

What you don’t get. You have to handle everything yourself: argument validation, streaming, retries, token counting, provider switching, and conversation state. It’s the most code, but it’s also the most transparent.
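Conversation state is a good example of what "yourself" means in practice. The sketch below trims history to a token budget by keeping system messages plus the newest messages that fit. The names (ChatMessage, trimToBudget) are mine, and the four-characters-per-token estimate is a rough assumption, not a real tokenizer.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant" | "tool"; content: string };

// Assumption: roughly 4 characters per token. Swap in a real tokenizer for accuracy.
const estimateTokens = (message: ChatMessage): number => Math.ceil(message.content.length / 4);

// Keep every system message, then as many of the newest messages as fit the budget.
function trimToBudget(messages: ChatMessage[], maxTokens: number): ChatMessage[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let used = system.reduce((sum, m) => sum + estimateTokens(m), 0);

  const kept: ChatMessage[] = [];
  for (const message of [...rest].reverse()) {
    const cost = estimateTokens(message);
    if (used + cost > maxTokens) break; // older messages are dropped from here on
    used += cost;
    kept.unshift(message);
  }
  return [...system, ...kept];
}

const sampleHistory: ChatMessage[] = [
  { role: "system", content: "You are helpful." },
  { role: "user", content: "a".repeat(40) },
  { role: "assistant", content: "b".repeat(40) },
  { role: "user", content: "c".repeat(8) },
];
console.log(trimToBudget(sampleHistory, 16).map((m) => m.role));
```

A production version has one more job this sketch skips: a tool message must stay together with the assistant message that requested it, or the API will reject the conversation.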

Approach 2: Vercel AI SDK

The Vercel AI SDK abstracts the agent loop into a single function call. You define tools with Zod schemas, and the SDK handles multi-step execution, argument validation, and streaming automatically.

Defining tools

Tools use the tool() helper with Zod for type-safe parameter schemas:

import { generateText, tool, stepCountIs } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

const weatherTool = tool({
  description: "Get the current weather for a city",
  // AI SDK 5 (the version with stopWhen/stepCountIs) calls this inputSchema;
  // earlier versions used "parameters".
  inputSchema: z.object({
    location: z.string().describe("City name, e.g. 'San Francisco, CA'"),
    units: z.enum(["celsius", "fahrenheit"]).optional(),
  }),
  execute: async ({ location, units }) => {
    return await fetchWeatherAPI(location, units ?? "celsius");
  },
});

const searchTool = tool({
  description: "Search internal documentation for relevant articles",
  inputSchema: z.object({
    query: z.string().describe("Search query"),
  }),
  execute: async ({ query }) => {
    return await searchDocs(query);
  },
});

The agent loop

There is no manual loop. The SDK runs it for you:

const { text, steps } = await generateText({
  model: openai("gpt-4o"),
  system: "You are a helpful assistant.",
  prompt: userMessage,
  tools: { get_weather: weatherTool, search_knowledge_base: searchTool },
  stopWhen: stepCountIs(10),
});

That’s it. The SDK calls the model, executes any tool calls, feeds results back, and repeats — up to 10 steps. The steps array contains every intermediate call for debugging.
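If stopWhen feels like magic, it helps to think of a stop condition as a predicate over the steps taken so far. The sketch below is a conceptual model only: it is not the AI SDK's code, Step is a simplified stand-in for the SDK's step objects, and stepLimit and calledTool are hypothetical helper names.

```typescript
// Conceptual model only: this is not the AI SDK's code, and Step is a simplified stand-in.
type Step = { toolNames: string[] };
type StopCondition = (steps: Step[]) => boolean;

const stepLimit = (max: number): StopCondition => (steps) => steps.length >= max;
const calledTool = (toolName: string): StopCondition => (steps) =>
  steps.some((step) => step.toolNames.includes(toolName));

// The loop would check this after every step and stop as soon as any condition holds.
function shouldStop(steps: Step[], conditions: StopCondition[]): boolean {
  return conditions.some((condition) => condition(steps));
}

const stepsSoFar: Step[] = [{ toolNames: ["get_weather"] }, { toolNames: [] }];
console.log(shouldStop(stepsSoFar, [stepLimit(10)])); // false
console.log(shouldStop(stepsSoFar, [stepLimit(10), calledTool("get_weather")])); // true
```

The step limit is a safety cap, not the normal exit: a run also ends as soon as the model answers without requesting another tool call.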

Streaming

For real-time UIs, swap generateText for streamText:

import { streamText } from "ai";

const result = streamText({
  model: openai("gpt-4o"),
  system: "You are a helpful assistant.",
  prompt: userMessage,
  tools: { get_weather: weatherTool, search_knowledge_base: searchTool },
  stopWhen: stepCountIs(10),
  onStepFinish: ({ toolCalls, toolResults }) => {
    console.log("Step completed:", { toolCalls, toolResults });
  },
});

// Inside a route handler (for example a Next.js POST function):
return result.toTextStreamResponse();

Switching providers

One of the SDK’s strongest features: swap providers without changing your tool definitions or loop logic.

import { anthropic } from "@ai-sdk/anthropic";

const { text } = await generateText({
  model: anthropic("claude-sonnet-4-5"),
  tools: { get_weather: weatherTool, search_knowledge_base: searchTool },
  stopWhen: stepCountIs(10),
  prompt: userMessage,
});

Same tools, same stopping condition, different model. Your agent code doesn’t change.

What you get

Speed of development. The SDK eliminates the loop boilerplate and gives you Zod validation, multi-step execution, streaming, lifecycle hooks (onStepFinish), and provider abstraction out of the box. It’s the fastest path from zero to working agent in TypeScript.

What you trade. You give up some low-level control. Custom retry strategies, dynamic tool injection mid-loop, and unconventional message threading require working around the SDK’s abstractions rather than with them.

Approach 3: Claude Agent SDK

The Claude Agent SDK (@anthropic-ai/claude-agent-sdk) takes a fundamentally different approach from the previous two. Instead of you defining tools and wiring up execution, the SDK ships with built-in tools — Read, Write, Edit, Bash, Glob, Grep, WebSearch, WebFetch — that Claude already knows how to use. You give it a prompt and permissions. It handles the rest.

This is Claude Code as a library. The same engine that powers the CLI, packaged as an npm package you can embed in your own applications.

Installation

npm install @anthropic-ai/claude-agent-sdk

A minimal agent

The core function is query(). It takes a prompt and returns an async generator that streams messages as the agent works:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "What files are in this directory?",
  options: { allowedTools: ["Bash", "Glob"] },
})) {
  if ("result" in message) console.log(message.result);
}

No tool schemas. No execution handlers. No loop. Claude reads the directory using its built-in tools and streams back the result.

Permission control

You control exactly which tools the agent can access. A read-only code reviewer looks like this:

for await (const message of query({
  prompt: "Review this codebase for security issues",
  options: {
    allowedTools: ["Read", "Glob", "Grep"],
  },
})) {
  if ("result" in message) console.log(message.result);
}

For agents that modify files, set permissionMode to "acceptEdits" so file edits are approved automatically instead of prompting for each one:

for await (const message of query({
  prompt: "Refactor the authentication module to use JWT",
  options: {
    permissionMode: "acceptEdits",
    allowedTools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"],
  },
})) {
  if ("result" in message) console.log(message.result);
}

Subagents

The SDK supports spawning specialized subagents for focused subtasks. Each subagent runs in its own context with its own instructions and tool permissions:

for await (const message of query({
  prompt: "Use the code-reviewer agent to review this codebase",
  options: {
    allowedTools: ["Read", "Glob", "Grep", "Agent"],
    agents: {
      "code-reviewer": {
        description: "Expert code reviewer for quality and security.",
        prompt: "Analyze code quality and suggest improvements.",
        tools: ["Read", "Glob", "Grep"],
      },
    },
  },
})) {
  if ("result" in message) console.log(message.result);
}

Hooks for observability

Run custom code at key points in the agent lifecycle — before a tool runs, after a tool completes, when the agent stops. This is how you add logging, audit trails, or guardrails:

import { query, type HookCallback } from "@anthropic-ai/claude-agent-sdk";
import { appendFile } from "fs/promises";

const logFileChange: HookCallback = async (input) => {
  const filePath = (input as any).tool_input?.file_path ?? "unknown";
  await appendFile(
    "./audit.log",
    `${new Date().toISOString()}: modified ${filePath}\n`
  );
  return {};
};

for await (const message of query({
  prompt: "Refactor utils.py to improve readability",
  options: {
    permissionMode: "acceptEdits",
    hooks: {
      PostToolUse: [{ matcher: "Edit|Write", hooks: [logFileChange] }],
    },
  },
})) {
  if ("result" in message) console.log(message.result);
}
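Logging is the gentle use of hooks. The guardrail use is deciding, before a tool runs, whether it should run at all. The sketch below covers only that decision logic as a plain function you could call from a pre-tool hook; the function name, the Verdict type, and the deny-list patterns are all assumptions of mine, and how you return the verdict to the SDK is something to check against its hook documentation.

```typescript
type Verdict = { allow: true } | { allow: false; reason: string };

// Illustrative deny-list; the patterns are assumptions, so tune them for your environment.
const blockedPatterns: { pattern: RegExp; reason: string }[] = [
  { pattern: /\brm\s+-\w*r\w*f|\brm\s+-\w*f\w*r/i, reason: "recursive force delete" },
  { pattern: /\bgit\s+push\b.*--force/i, reason: "force push" },
  { pattern: /\bcurl\b.*\|\s*(sh|bash)\b/i, reason: "piping a download into a shell" },
];

// Returns the first matching reason to block, or allows the command.
function checkBashCommand(command: string): Verdict {
  for (const { pattern, reason } of blockedPatterns) {
    if (pattern.test(command)) return { allow: false, reason };
  }
  return { allow: true };
}

console.log(checkBashCommand("rm -rf /tmp/build"));
console.log(checkBashCommand("ls -la"));
```

Deny-lists like this are easy to bypass, so treat them as a backstop. Restricting allowedTools, as in the read-only reviewer above, is the stronger control.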

Sessions

Maintain context across multiple exchanges. The agent remembers files it read, analysis it performed, and conversation history:

let sessionId: string | undefined;

for await (const message of query({
  prompt: "Read the authentication module",
  options: { allowedTools: ["Read", "Glob"] },
})) {
  if (message.type === "system" && message.subtype === "init") {
    sessionId = message.session_id;
  }
}

for await (const message of query({
  prompt: "Now find all places that call it",
  options: { resume: sessionId },
})) {
  if ("result" in message) console.log(message.result);
}

What you get

The most powerful out-of-the-box agent. You don’t define tools, implement execution, or write a loop. The SDK ships with battle-tested tools for file I/O, shell commands, web search, and code editing. Subagents let you decompose complex tasks. Hooks give you lifecycle control. Sessions give you multi-turn memory. MCP support lets you connect external systems.

What you trade. This is Claude-only, with no provider switching. The built-in tools are designed for code and file manipulation tasks, not arbitrary business logic. If your agent needs to call your own APIs (query a database, send a notification, charge a payment), you expose them as MCP tools, which the SDK can host in-process, or wrap them in Bash scripts; either way it is more ceremony than the inline tool definitions of the previous two approaches. And because Claude handles the loop autonomously, you have less control over individual steps than with the raw fetch or AI SDK approaches.

Approach 4: Agno with FastAPI (Python)

Agno is a Python framework that treats agents as first-class deployable services. You define an agent, give it tools and memory, and Agno wraps it in a FastAPI application with sessions, streaming, and tracing built in.

Defining tools

Tools are plain Python functions:

import httpx

def get_weather(location: str, units: str = "celsius") -> dict:
    """Get the current weather for a city."""
    response = httpx.get(
        "https://api.weather.example/v1/current",
        params={"q": location, "units": units},
        timeout=10.0,
    )
    # Raise on 4xx/5xx instead of handing an error body to the model as data
    response.raise_for_status()
    return response.json()

def search_knowledge_base(query: str) -> list[dict]:
    """Search internal documentation for relevant articles."""
    response = httpx.post(
        "https://api.internal/search",
        json={"query": query, "limit": 5},
        timeout=10.0,
    )
    response.raise_for_status()
    return response.json()["results"]

Defining the agent

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.db.sqlite import SqliteDb

agent = Agent(
    name="Web Assistant",
    model=OpenAIChat(id="gpt-4o"),
    tools=[get_weather, search_knowledge_base],
    instructions="You are a helpful assistant. Use tools when needed.",
    db=SqliteDb(db_file="agent_sessions.db"),
    add_history_to_context=True,
    num_history_runs=5,
    markdown=True,
)

Deploying as a FastAPI app

from agno.os import AgentOS

agent_os = AgentOS(
    agents=[agent],
    tracing=True,
)

app = agent_os.get_app()

# Run: fastapi dev main.py
# API available at http://localhost:8000
# Docs at http://localhost:8000/docs

That gives you a production-ready REST API with automatic OpenAPI documentation, per-session conversation history stored in SQLite, streaming responses, and built-in tracing.

Adding MCP tools

Agno supports Model Context Protocol servers as tool sources:

from agno.tools.mcp import MCPTools

agent = Agent(
    name="Docs Assistant",
    model=OpenAIChat(id="gpt-4o"),
    tools=[MCPTools(url="https://docs.example.com/mcp")],
    db=SqliteDb(db_file="sessions.db"),
    add_history_to_context=True,
    num_history_runs=3,
)

What you get

Batteries included. Sessions, memory, tracing, streaming, and API deployment are built in. You don’t wire up any of that yourself. If you’re building a Python backend that serves agents, Agno gets you to production faster than the other three approaches here.

What you trade. You’re locked into Python and Agno’s agent loop design. If you need fine-grained control over how the model iterates, how tools are selected, or how the conversation is managed, you’re working within Agno’s opinions rather than your own.

Side-by-Side Comparison

| Aspect | OpenAI Fetch | Vercel AI SDK | Claude Agent SDK | Agno + FastAPI |
| --- | --- | --- | --- | --- |
| Language | Any (JS, Python, etc.) | TypeScript / JavaScript | TypeScript / Python | Python |
| Agent loop | You write it | Built-in (stopWhen) | Fully autonomous | Built-in |
| Tool schemas | JSON Schema (manual) | Zod (type-safe) | Built-in (Read, Edit, Bash, etc.) | Python functions (docstrings) |
| Custom tools | You define and execute | You define, SDK executes | Via MCP servers or Bash | You define, framework executes |
| Argument validation | Manual | Zod-powered | SDK-handled | Framework-handled |
| Streaming | Manual SSE parsing | streamText() + helpers | Async generator | Built-in |
| Provider switching | Rewrite API calls | Change one line | Claude only | Change model class |
| Session / memory | Build it yourself | Build it yourself | Built-in (session resume) | Built-in (SQLite, etc.) |
| Subagents | Build it yourself | Build it yourself | Native (Agent tool) | Multi-agent support |
| Observability | Add your own tracing | Lifecycle hooks | Hooks (Pre/PostToolUse) | Built-in tracing |
| Deployment | Any runtime | Node / Edge / Serverless | Node / CI/CD pipelines | FastAPI (uvicorn) |
| Dependencies | None | ai + provider packages | @anthropic-ai/claude-agent-sdk | agno + fastapi |
| Boilerplate | Highest | Low | Lowest | Very low |
| Control | Total | High | Low (autonomous) | Medium |

The Verdict

There is no universally best option. But there is a best option for your situation.

Pick OpenAI Fetch when you need maximum control

Choose this approach when:

  • You’re running in an environment where SDK dependencies are a problem — edge runtimes, browser extensions, lightweight serverless functions
  • You need a custom agent loop with non-standard behavior — dynamic tool injection, custom retry strategies, conditional branching between model calls
  • You want to understand exactly what’s happening at the protocol level
  • You’re building a framework or abstraction layer yourself

The raw fetch approach forces you to think about every decision the agent makes. That’s a liability for simple use cases but a superpower for complex ones. If you’re building something that doesn’t fit neatly into a framework’s model of how agents should work, start here.

Pick the Vercel AI SDK for TypeScript web apps with custom tools

Choose this approach when:

  • You’re building with Next.js, React, or any Node/Edge TypeScript stack
  • You want to switch between OpenAI, Anthropic, Google, and other providers without rewriting agent logic
  • You value type safety and want Zod validation on every tool call
  • You need streaming responses for a chat UI
  • Your agent calls your own business logic — database queries, internal APIs, payment processing
  • You want a clean abstraction without losing the ability to customize via hooks

The AI SDK hits the sweet spot between convenience and control for TypeScript applications. The stopWhen API, lifecycle hooks, and provider abstraction save you hundreds of lines of boilerplate while keeping the important decisions in your hands.

Pick the Claude Agent SDK for code-centric and automation agents

Choose this approach when:

  • Your agent works with files, code, shell commands, or web content — the built-in tools already cover these
  • You’re building CI/CD automation, code review bots, documentation generators, or dev tooling
  • You want subagents that can decompose complex tasks into focused subtasks
  • You need sessions that maintain context across multiple exchanges
  • You want the absolute least boilerplate — a working agent in five lines of code

The Claude Agent SDK is the most powerful option when your use case aligns with its built-in tools. A code review agent, a bug-fixing pipeline, a research assistant that reads files and searches the web — these are trivial to build because you’re not implementing any tool execution. The trade-off is clear: you’re locked to Claude, and custom business logic tools require MCP servers rather than simple function definitions.

Pick Agno for Python backends

Choose this approach when:

  • You’re already on Python and FastAPI
  • You need built-in session management, conversation memory, and persistence
  • You want a production API with OpenAPI docs, tracing, and streaming from day one
  • You’re integrating MCP servers or building multi-agent systems
  • You want the shortest path from agent definition to deployed service

Agno is the most opinionated of the four, and that’s its strength. If your use case fits its model, you’ll ship faster than with any other approach. If it doesn’t, you’ll fight the framework.

Regardless of what you pick

The framework is not the hard part. Agent engineering is. These practices matter more than your choice of SDK:

Treat tools as strict API contracts. Every tool should have typed inputs, structured outputs, and explicit error handling. A tool that returns unstructured text or silently swallows errors will poison the agent’s reasoning.

Set step limits and cost budgets. An agent without limits can loop indefinitely and drain your API budget. Set a maximum step count, track token usage per run, and kill runs that exceed your budget. This is non-negotiable for production.
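A budget check can be as small as a class you call after each model response. This is a sketch under stated assumptions: RunBudget is a name I invented, and the per-token prices are placeholders, so substitute your provider's actual rates.

```typescript
type Usage = { inputTokens: number; outputTokens: number };
type Limits = { maxSteps: number; maxCostUsd: number };

// Hypothetical prices in USD per million tokens; substitute your provider's real rates.
const PRICE_PER_MILLION = { input: 2.5, output: 10 };

class RunBudget {
  private steps = 0;
  private costUsd = 0;
  private readonly limits: Limits;

  constructor(limits: Limits) {
    this.limits = limits;
  }

  // Call once per model response. Throws when the run should be killed.
  record(usage: Usage): void {
    this.steps += 1;
    this.costUsd +=
      (usage.inputTokens * PRICE_PER_MILLION.input +
        usage.outputTokens * PRICE_PER_MILLION.output) / 1e6;
    if (this.steps > this.limits.maxSteps) {
      throw new Error(`Step limit exceeded (${this.limits.maxSteps})`);
    }
    if (this.costUsd > this.limits.maxCostUsd) {
      throw new Error(`Cost budget exceeded (${this.limits.maxCostUsd} USD)`);
    }
  }

  get spentUsd(): number {
    return this.costUsd;
  }
}

const budget = new RunBudget({ maxSteps: 10, maxCostUsd: 0.5 });
budget.record({ inputTokens: 1000, outputTokens: 1000 });
console.log(budget.spentUsd);
```

Throwing keeps the sketch simple. In a real loop you would catch the error, log the partial run, and return a clear "stopped early" message to the user.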

Make destructive tools require approval. Any tool that sends an email, deletes data, charges money, or modifies production state should go through a human approval step. The model will call it confidently whether or not it should.
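One way to enforce this is to flag tools as destructive and route those calls through an approver before they execute. The sketch below uses hypothetical names throughout (GatedTool, callWithApproval, the send_email tool), and the approver is a synchronous callback only for brevity; a real one would wait on a UI prompt or a review queue.

```typescript
type Args = Record<string, unknown>;
type GatedTool = { name: string; destructive: boolean; run: (args: Args) => unknown };
type Approver = (toolName: string, args: Args) => boolean;
type GateResult =
  | { status: "executed"; result: unknown }
  | { status: "rejected"; message: string };

// Destructive tools run only if the approver says yes; everything else runs directly.
function callWithApproval(tool: GatedTool, args: Args, approve: Approver): GateResult {
  if (tool.destructive && !approve(tool.name, args)) {
    // Tell the model why nothing happened so it can explain or choose another path.
    return { status: "rejected", message: `Human approval denied for ${tool.name}` };
  }
  return { status: "executed", result: tool.run(args) };
}

// Hypothetical tool for illustration; it only records what it would have sent.
const outbox: string[] = [];
const sendEmail: GatedTool = {
  name: "send_email",
  destructive: true,
  run: (args) => {
    outbox.push(String(args.to));
    return { sent: true };
  },
};

const denied = callWithApproval(sendEmail, { to: "a@example.com" }, () => false);
const approved = callWithApproval(sendEmail, { to: "b@example.com" }, () => true);
console.log(denied.status, approved.status, outbox);
```

The rejection is returned as data rather than thrown, so the model sees that the action did not happen and can say so instead of claiming success.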

Add observability from the start. Log every tool call, every model response, every step. Without traces, you can’t debug why the agent chose the wrong tool, made an extra call, or hallucinated an argument. Shipping without them is one of the most common mistakes in agent development.
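If your framework has no tracing, a wrapper around each tool gets you most of the way. Here is a minimal sketch, assuming synchronous tools and an in-memory array as the trace sink; ToolTrace and traced are names I chose for the example, and an async version would await the tool before recording.

```typescript
type ToolTrace = {
  step: number;
  tool: string;
  args: unknown;
  ok: boolean;
  durationMs: number;
  output: unknown;
};

// Wraps a tool so every call, successful or not, lands in the trace.
function traced<A, R>(trace: ToolTrace[], tool: string, fn: (args: A) => R): (args: A) => R {
  return (args: A): R => {
    const start = Date.now();
    const base = { step: trace.length + 1, tool, args };
    try {
      const output = fn(args);
      trace.push({ ...base, ok: true, durationMs: Date.now() - start, output });
      return output;
    } catch (err) {
      trace.push({ ...base, ok: false, durationMs: Date.now() - start, output: String(err) });
      throw err; // the caller still decides how to report the failure to the model
    }
  };
}

const runTrace: ToolTrace[] = [];
const add = traced(runTrace, "add", (input: { a: number; b: number }) => input.a + input.b);
add({ a: 2, b: 3 });
console.log(JSON.stringify(runTrace));
```

In production you would send these records to your logging or tracing backend rather than an array, keyed by a run ID so you can replay a single conversation.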

Validate tool arguments before execution. Don’t trust the model to produce valid JSON every time. Parse and validate arguments before executing the tool. Return machine-readable errors when validation fails so the model can self-correct.
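To show what "machine-readable errors" can look like, here is a deliberately tiny validator for flat schemas like the get_weather one from Approach 1. It is a sketch, not a JSON Schema implementation: ParamSchema, ArgError, and validateArgs are hypothetical names, and it only handles primitive types, required fields, and string enums. In real code a library such as Zod does this for you.

```typescript
type PropSchema = { type: "string" | "number" | "boolean"; enum?: string[] };
type ParamSchema = { properties: Record<string, PropSchema>; required: string[] };
type ArgError = {
  field: string;
  code: "missing" | "unknown_field" | "wrong_type" | "not_in_enum";
  expected?: string;
};

// Own-property check so names like "constructor" never match inherited members.
const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

function validateArgs(schema: ParamSchema, args: Record<string, unknown>): ArgError[] {
  const errors: ArgError[] = [];
  for (const field of schema.required) {
    if (!hasOwn(args, field)) errors.push({ field, code: "missing" });
  }
  for (const [field, value] of Object.entries(args)) {
    const prop = hasOwn(schema.properties, field) ? schema.properties[field] : undefined;
    if (!prop) {
      errors.push({ field, code: "unknown_field" });
    } else if (typeof value !== prop.type) {
      errors.push({ field, code: "wrong_type", expected: prop.type });
    } else if (prop.enum && !prop.enum.includes(String(value))) {
      errors.push({ field, code: "not_in_enum", expected: prop.enum.join(" | ") });
    }
  }
  return errors;
}

const weatherSchema: ParamSchema = {
  properties: {
    location: { type: "string" },
    units: { type: "string", enum: ["celsius", "fahrenheit"] },
  },
  required: ["location"],
};
console.log(JSON.stringify(validateArgs(weatherSchema, { units: "kelvin" })));
```

Returning the error list as the tool result, instead of throwing, gives the model the field name and the expected values it needs to retry with corrected arguments.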

Keep tool counts low. Models perform best with fewer, well-described tools. If you have more than 15-20 tools, use dynamic tool selection (tool RAG) to surface only the relevant ones per request.
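Dynamic tool selection usually means embedding the tool descriptions and retrieving the nearest ones for each request. The sketch below swaps that for plain keyword overlap so it runs without a model or vector store; treat it as a stand-in that shows the shape of the idea, with selectTools and the sample catalog being hypothetical.

```typescript
type ToolInfo = { name: string; description: string };

// Lowercase words longer than two characters, split on anything non-alphanumeric.
const tokenize = (text: string): string[] =>
  text.toLowerCase().split(/[^a-z0-9]+/).filter((word) => word.length > 2);

// Score each tool by how many distinct query words appear in its name or description.
function selectTools(query: string, tools: ToolInfo[], topK: number): ToolInfo[] {
  const queryWords = tokenize(query).filter((word, i, all) => all.indexOf(word) === i);
  return tools
    .map((tool) => {
      const toolWords = tokenize(`${tool.name} ${tool.description}`);
      const score = queryWords.filter((word) => toolWords.includes(word)).length;
      return { tool, score };
    })
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((entry) => entry.tool);
}

const catalog: ToolInfo[] = [
  { name: "get_weather", description: "Get the current weather for a city" },
  { name: "search_knowledge_base", description: "Search internal documentation for relevant articles" },
  { name: "send_email", description: "Send an email to a customer" },
];

const picked = selectTools("What is the weather in the city of Berlin?", catalog, 2);
console.log(picked.map((tool) => tool.name));
```

Keyword overlap is crude: common words like "the" count as matches, and synonyms are missed entirely. Embedding similarity fixes both, but the interface stays the same: a query goes in, and a short list of tools comes out and is passed to the model for that request only.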

The Bottom Line

The framework you pick matters less than how you use it. A well-engineered agent on raw fetch will outperform a sloppy one on the fanciest SDK. Pick the approach that fits your stack, apply the engineering discipline, and ship the agent.

Start simple. Add tools one at a time. Trace everything. Set limits. And don’t let the model send that email without asking first.