Most teams do not need “multi-agent systems” in the abstract.
They need something much more specific:
- one agent to research
- another to write or implement
- a third to review, approve, or route the result
That is the real subagent question.
Not “can the framework spawn helpers?”
But:
- how do those helpers share context?
- where does state live between steps?
- how do you run them in the cloud without the whole system turning into orchestration sludge?
I audited the official docs for Vercel AI SDK, LangChain/LangGraph, Agno, and AgentOS on March 30, 2026 for this piece. The goal is not to pick a winner in the abstract. It is to help engineers and PMs choose the right subagent design for real hosted systems.
TL;DR
- AI SDK is the cleanest fit when you want app-controlled subagents inside a TypeScript product. Its subagents are usually invoked as tools, run in isolated context, and can return a compressed summary back to the parent. This is excellent for web apps and Vercel-style cloud deployments, but you own persistence and orchestration discipline.
- LangChain/LangGraph is the strongest fit when subagents need durable execution, checkpoints, thread state, and resumability. If the workflow might pause, retry, or continue across long-running cloud jobs, LangGraph is the most natural architecture. The tradeoff is more orchestration complexity.
- Agno + AgentOS is the most complete fit when you want agent teams and workflows served as production APIs with sessions, traces, control plane visibility, and scheduling already modeled. It is especially good for Python-first internal platforms and self-hosted multi-agent services.
- For most product teams, the framework choice comes down to where the “brain” should live:
  - AI SDK: in your app layer
  - LangChain/LangGraph: in the workflow graph and thread runtime
  - Agno/AgentOS: in a dedicated agent runtime and control plane
What You Will Learn Here
- How AI SDK, LangChain, and Agno actually model subagents.
- Which framework fits three common cloud scenarios better than the others.
- How to think about context, retries, and ownership when subagents run in production.
- Practical code patterns for a parent agent delegating to specialists.
- A decision framework you can use with both engineers and PMs when a team says, “we probably need multiple agents.”
The Research Audit: What the Official Docs Clearly Support
Here is the cleanest reading of the primary sources.
1. AI SDK treats subagents as explicit delegated tools
The AI SDK subagents docs are refreshingly direct:
- a parent agent invokes a subagent through a tool
- the subagent runs with its own context window
- you can decide what the parent model actually sees by shaping the returned output with toModelOutput
That last point is the most important one.
AI SDK’s subagent design is not just about decomposition. It is about context compression.
The user can still see the full subagent execution in the UI, but the parent model can receive only a concise summary. This is a strong fit for product applications where you want rich execution traces for the user or developer, but you do not want the main orchestrator polluted by every intermediate token.
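The compression idea can be sketched without any framework at all. Everything here is hypothetical (the `SubagentResult` shape and `compressForParent` are invented names, not AI SDK types); it only illustrates the boundary the docs describe:

```typescript
// Hypothetical shapes, not AI SDK types: a subagent run produces a full
// transcript (useful for the UI) plus a short final summary.
interface SubagentResult {
  transcript: string[]; // every intermediate step, renderable to the user
  summary: string;      // the only part the parent model should see
}

// Shape what the parent model sees: forward the summary, never the transcript.
function compressForParent(result: SubagentResult): string {
  return result.summary.trim();
}

const run: SubagentResult = {
  transcript: [
    "Calling getAccount...",
    "Calling getTickets...",
    "Found 3 prior tickets, 1 open refund request.",
  ],
  summary: "Account active. 3 prior tickets; 1 open refund request.",
};

// The UI can render run.transcript in full, while the parent context
// only grows by the length of the compressed summary.
const parentSees = compressForParent(run);
```

In AI SDK terms, `toModelOutput` is where this boundary is enforced; the sketch above is just the shape of the decision.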
My inference: AI SDK gives you the most explicit app-layer control over subagent boundaries, but it expects you to build the surrounding production discipline yourself.
2. LangChain/LangGraph treats subagents as context-engineered nodes in a durable workflow
LangChain’s multi-agent docs emphasize that context engineering is the core design problem. The docs explicitly frame multi-agent patterns around deciding:
- which parts of the conversation each agent sees
- which tools each agent gets
- how agent outputs are included or omitted from the next step
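That slicing decision can be sketched in miniature. The message shape and role names below are invented for illustration (LangChain's actual message types differ):

```typescript
// Hypothetical message shape; LangChain's real message types are richer.
interface Msg {
  from: "user" | "researcher" | "policy" | "writer";
  text: string;
}

// Each agent gets only the slice of the conversation it needs: the writer
// sees user intent and research findings, but not the policy agent's
// internal deliberation.
function contextFor(agent: "writer" | "policy", history: Msg[]): Msg[] {
  if (agent === "writer") {
    return history.filter((m) => m.from === "user" || m.from === "researcher");
  }
  // The policy agent only needs the raw user request.
  return history.filter((m) => m.from === "user");
}

const history: Msg[] = [
  { from: "user", text: "Can I get a refund?" },
  { from: "researcher", text: "Order is within the 30-day window." },
  { from: "policy", text: "Refund rule R4 applies." },
];

const writerContext = contextFor("writer", history); // excludes policy internals
```

The point is that "which agent sees what" is an explicit function you write, not an emergent property of the framework.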
LangGraph goes further by giving you thread persistence, checkpoints, and durable execution. That matters because long-running cloud workflows rarely succeed in one uninterrupted shot. They pause, retry, resume, and sometimes wait for human approval or external events.
The subgraph and remote-graph patterns are especially important here. The docs show that you can isolate state inside subgraphs and call deployed graphs remotely. That is much closer to a real cloud architecture than a simple in-process “call helper function” story.
My inference: LangChain/LangGraph is strongest when subagents are part of a durable execution graph, not just a convenience abstraction.
3. Agno treats teams and workflows as first-class production services
Agno’s docs take a more runtime-centric approach.
Instead of just asking how one agent calls another, Agno models:
- teams of agents
- workflows with structured steps
- AgentOS as the runtime, control plane, and operational surface
The docs are explicit that AgentOS supports agents, teams, workflows, approvals, tracing, session tracking, memories, and schedules. The control plane also makes delegation visible, which matters in real cloud systems because the problem is often not “did the model answer?” but “which agent did what, when, and why did it stall?”
Agno also supports remote execution patterns, including remote teams and remote workflows. That moves the design from “local library orchestration” into “servable multi-agent runtime.”
My inference: Agno/AgentOS is strongest when you want subagents to exist as part of an operational platform, not just inside one app route.
First, the Practical Mental Model
When people say “subagents,” they usually mean one of three very different things.
1. In-request delegation
One parent agent delegates a narrow subtask inside a single request-response cycle.
Examples:
- summarize five documents before writing the final answer
- ask a code-search specialist to locate relevant files
- use a reviewer agent to critique a generated draft before sending it back
This is where AI SDK feels most natural.
2. Durable workflow delegation
A task spans multiple steps, runs longer than one request, and may need retries, checkpointing, or human review.
Examples:
- a compliance workflow that researches, drafts, validates, and waits for approval
- a cloud automation that resumes after a downstream service recovers
- a research pipeline that fans out work, then aggregates results later
This is where LangChain/LangGraph tends to fit best.
3. Platform-level agent teams
You are not just building one product feature. You are building a reusable multi-agent service with sessions, traces, admin visibility, and scheduling.
Examples:
- an internal operations platform with multiple agent teams
- a content or research factory with daily scheduled workflows
- a Python backend serving several agent endpoints to other services
This is where Agno + AgentOS is especially compelling.
Real Scenario 1: Product Copilot Inside a Web App
Let us say you are building a customer-support copilot in a SaaS product.
You want:
- a parent agent that talks to the user
- a research subagent that gathers account context and prior tickets
- a policy subagent that checks refund or escalation rules
- the final response streamed back into the app UI
This is a very common “subagents in the cloud” use case.
Why AI SDK is usually the best fit here
The AI SDK is strong when:
- your main product is already TypeScript and React
- the parent agent lives close to the UI
- subagents are just one part of the product request path
- you want explicit control over what the parent model sees
The AI SDK docs explicitly support this pattern through subagents-as-tools and toModelOutput.
AI SDK example
import { tool, ToolLoopAgent } from "ai";
import { z } from "zod";

// accountLookupTool, ticketHistoryTool, and refundPolicyTool are
// app-defined tools, omitted here for brevity.

// Subagent: runs with its own context window and its own tool set.
const accountResearcher = new ToolLoopAgent({
  model: "openai/gpt-5.1",
  instructions: `You are an account research subagent.
Return a short final summary with:
- account status
- relevant prior tickets
- any refund risk
- unresolved uncertainty`,
  tools: {
    getAccount: accountLookupTool,
    getTickets: ticketHistoryTool,
  },
});

// The subagent is exposed to the parent as an ordinary tool.
const researchAccount = tool({
  description: "Research the customer's account before responding.",
  inputSchema: z.object({
    userId: z.string(),
  }),
  execute: async ({ userId }) => {
    return await accountResearcher.generate({
      prompt: `Research user ${userId} for support response planning.`,
    });
  },
  // Context compression: the parent model receives only the final summary,
  // not the subagent's full transcript.
  toModelOutput: ({ output }) => {
    const lastText = output?.text ?? "No summary produced.";
    return { type: "text", value: lastText };
  },
});

export const supportAgent = new ToolLoopAgent({
  model: "anthropic/claude-sonnet-4.5",
  instructions:
    "Help the user clearly. Use researchAccount before giving any account-specific answer.",
  tools: { researchAccount, checkPolicy: refundPolicyTool },
});
Cloud read
This pattern works well when deployed as part of your application backend on Vercel or a similar hosted environment:
- the parent agent stays close to your UI and auth layer
- subagents remain implementation details of the request
- persistence stays in your app database, not hidden inside the framework
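"Persistence stays in your app database" concretely means the app owns a store like the following. The `ConversationStore` shape is hypothetical; in a real system it would wrap your own database rather than an in-memory array:

```typescript
// Hypothetical repository; a real implementation wraps your app's database.
interface MessageRow {
  conversationId: string;
  role: "user" | "assistant";
  content: string;
}

class ConversationStore {
  private rows: MessageRow[] = [];

  append(row: MessageRow): void {
    this.rows.push(row);
  }

  // The app, not the framework, decides what history the next request loads.
  history(conversationId: string): MessageRow[] {
    return this.rows.filter((r) => r.conversationId === conversationId);
  }
}

const store = new ConversationStore();
store.append({ conversationId: "c1", role: "user", content: "Refund status?" });
store.append({ conversationId: "c1", role: "assistant", content: "Checking your account..." });
```

Each request loads history from this store, runs the agent, and writes the new turns back; the framework never becomes the system of record.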
Where teams get into trouble
- They try to make AI SDK behave like a durable workflow engine.
- They let every subagent return its full raw transcript into the parent context.
- They forget that long-lived memory and retry semantics are still application responsibilities.
If the workflow is mostly “respond to a user inside the product,” AI SDK is usually the cleanest answer.
Real Scenario 2: Long-Running Research or Compliance Pipeline
Now imagine a more operational workflow.
You want:
- a planner agent to break down a compliance review
- a retrieval specialist to gather documents
- a policy analyst to compare those documents against control requirements
- a report writer to draft the final report
- the whole run to survive retries, pauses, and human approval steps
This is no longer just a “web app request” problem.
It is a workflow problem.
Why LangChain/LangGraph is usually the best fit here
LangChain and LangGraph are strongest when:
- the task needs checkpointing and resumability
- you want clear thread semantics
- different agents should see different slices of state
- the workflow may run remotely or across deployed graph boundaries
The docs’ emphasis on context engineering, subgraphs, persistence, and durable execution all line up with this scenario.
LangGraph-style example
import { Annotation, StateGraph } from "@langchain/langgraph";
import { PostgresSaver } from "@langchain/langgraph-checkpoint-postgres";

// Shared workflow state: each node reads what it needs and returns a partial update.
const State = Annotation.Root({
  request: Annotation<string>(),
  evidence: Annotation<string[]>(),
  draft: Annotation<string>(),
});

// retrievePoliciesAndDocs and draftComplianceReport are app-specific helpers
// (they would wrap your retrieval stack and a drafting agent).
const gatherEvidence = async (state: typeof State.State) => {
  return {
    evidence: await retrievePoliciesAndDocs(state.request),
  };
};

const writeDraft = async (state: typeof State.State) => {
  return {
    draft: await draftComplianceReport(state.evidence),
  };
};

const graph = new StateGraph(State)
  .addNode("gatherEvidence", gatherEvidence)
  .addNode("writeDraft", writeDraft)
  .addEdge("__start__", "gatherEvidence")
  .addEdge("gatherEvidence", "writeDraft")
  .addEdge("writeDraft", "__end__");

// Postgres-backed checkpointer: state survives restarts and resumes per thread.
const checkpointer = PostgresSaver.fromConnString(process.env.DATABASE_URL!);

export const complianceWorkflow = graph.compile({ checkpointer });
You can make each node call a specialized agent or subgraph. The important design point is that the workflow state survives across steps and runs.
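The durability guarantee the checkpointer provides can be illustrated with a toy in-memory version. This is only the idea, not LangGraph's implementation; the real savers persist per-thread state to a database:

```typescript
// Toy checkpoint store keyed by thread id; LangGraph's Postgres saver plays
// this role with a real database behind it.
type WorkflowState = { request: string; evidence: string[]; draft: string };

const checkpoints = new Map<string, { step: number; state: WorkflowState }>();

// Two "nodes" standing in for gather-evidence and write-draft.
const steps: Array<(s: WorkflowState) => WorkflowState> = [
  (s) => ({ ...s, evidence: [`policy docs for: ${s.request}`] }),
  (s) => ({ ...s, draft: `Report based on ${s.evidence.length} source(s)` }),
];

// Resume from the last checkpoint; if a step throws, earlier progress survives
// and the next invocation with the same thread id picks up where it stopped.
function runThread(threadId: string, initial: WorkflowState): WorkflowState {
  const saved = checkpoints.get(threadId) ?? { step: 0, state: initial };
  let { step, state } = saved;
  while (step < steps.length) {
    state = steps[step](state);
    step += 1;
    checkpoints.set(threadId, { step, state }); // checkpoint after every node
  }
  return state;
}

const final = runThread("thread-42", { request: "SOC 2 review", evidence: [], draft: "" });
```

This checkpoint-after-every-node discipline is exactly what makes "resume this run tomorrow" a cheap operation rather than a rewrite.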
ASCII flow
User request
      |
      v
Planner / Router
      |
      +--> Retrieval subgraph ----+
      |                           |
      +--> Policy analysis -------+--> Report writer --> Human approval --> Final artifact
      |
checkpointed state
Cloud read
This is the pattern I would trust for hosted systems that need:
- durable execution
- resume-after-failure behavior
- long-running multi-step orchestration
- remotely deployed graph components
Where teams get into trouble
- They over-engineer simple request/response features into graphs.
- They treat every helper as an agent when a normal tool call would do.
- They underestimate how much schema and state design matters once the workflow becomes durable.
If the phrase “resume this run tomorrow” matters to your product or operations team, LangGraph deserves very serious consideration.
Real Scenario 3: Internal Agent Platform or Scheduled Multi-Agent Service
Now take a different case.
You are building an internal platform for your company:
- research teams run analyst agents
- operations teams run workflow automations
- managers want traces, approvals, sessions, and schedules
- multiple teams need to inspect and operate the system in production
This is not just a coding abstraction anymore.
It is an operational product.
Why Agno + AgentOS is usually the best fit here
Agno and AgentOS stand out when:
- you are Python-first
- you want teams and workflows as first-class concepts
- you want an actual control plane for observing runs
- scheduling, approvals, and session visibility matter from day one
The AgentOS docs are unusually explicit here. They describe a control plane for agents, teams, workflows, approvals, traces, sessions, memories, and schedules. That is much closer to a multi-agent runtime platform than a lightweight SDK abstraction.
Agno-style example
from agno.agent import Agent
from agno.team import Team
from agno.models.openai import OpenAIChat

# search_docs_tool is an app-defined tool, omitted here for brevity.
researcher = Agent(
    name="Researcher",
    model=OpenAIChat(id="gpt-5.1"),
    instructions="Gather evidence and return only source-backed findings.",
    tools=[search_docs_tool],
)

writer = Agent(
    name="Writer",
    model=OpenAIChat(id="gpt-5.1"),
    instructions="Write a concise executive draft from the research findings.",
)

editor = Agent(
    name="Editor",
    model=OpenAIChat(id="gpt-5.1"),
    instructions="Review the draft for accuracy, clarity, and missing risks.",
)

# The team leader delegates to members and coordinates their outputs.
content_team = Team(
    name="Content Team",
    mode="coordinate",
    members=[researcher, writer, editor],
    instructions="Delegate clearly and keep outputs short between steps.",
)
Then serve it through AgentOS:
from agno.os import AgentOS

# Named agent_os rather than os to avoid shadowing Python's os module.
agent_os = AgentOS(teams=[content_team])
app = agent_os.get_app()

if __name__ == "__main__":
    agent_os.serve(app="main:app", port=8000, reload=False)
Cloud read
This model is especially attractive when you want:
- agent services exposed over APIs
- persistent sessions and traces
- a browser control plane for operations teams
- cron-like scheduled workflows
- one runtime serving multiple agent teams
Where teams get into trouble
- They adopt the full runtime before confirming they actually need a platform.
- They assume every team needs a multi-agent topology when a single agent plus tools would be simpler.
- They do not put enough effort into team instructions and role boundaries, which makes the nice runtime features less useful.
If your architecture is trending toward “we need an internal agent service, not just an app feature,” Agno becomes much more attractive.
Side-by-Side: What Each Framework Is Really Optimizing For
| Framework | Best subagent pattern | Strongest cloud scenario | Main tradeoff |
|---|---|---|---|
| AI SDK | Parent agent calling subagents as tools with controlled summaries | Web product agents and copilots close to the UI | You own persistence, retries, and most orchestration discipline |
| LangChain/LangGraph | Subagents inside durable graphs and checkpointed thread state | Long-running workflows, resumable jobs, approval-heavy orchestration | More moving parts and more state design overhead |
| Agno + AgentOS | Teams and workflows as servable runtime objects | Internal platforms, scheduled multi-agent services, Python-first APIs | Heavier runtime commitment and more platform opinionation |
What I Would Actually Recommend
If I were advising a team today, I would use this rule set.
Pick AI SDK when:
- your product is already TypeScript-first
- subagents are part of a request path, not an operations platform
- you want the parent app to control auth, persistence, UI, and cloud deployment
- you need one or two specialists, not a whole runtime ecosystem
Pick LangChain/LangGraph when:
- the workflow is long-running or interruption-prone
- resumability is a product requirement
- agent steps need persisted thread state and checkpointing
- you are comfortable modeling workflows explicitly instead of hiding them in app code
Pick Agno + AgentOS when:
- you want a dedicated Python runtime for agents, teams, and workflows
- operations visibility matters early
- you expect scheduling, approvals, and session management to be first-class requirements
- your system feels more like an internal platform than a single app feature
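The three rule sets above can be compressed into a first-pass heuristic. This is obviously a simplification (real decisions weigh team skills, existing infrastructure, and cost), but it captures the precedence: platform needs outrank durability needs, which outrank language preference:

```typescript
interface Requirements {
  typescriptFirst: boolean;
  needsResumability: boolean;  // pause, retry, resume across runs
  needsControlPlane: boolean;  // schedules, approvals, ops visibility
}

// First-pass mapping of the article's three rule sets to a recommendation.
function pickFramework(r: Requirements): string {
  if (r.needsControlPlane) return "Agno + AgentOS";
  if (r.needsResumability) return "LangChain/LangGraph";
  if (r.typescriptFirst) return "AI SDK";
  return "AI SDK"; // simplest default for request-path copilots
}

const choice = pickFramework({
  typescriptFirst: true,
  needsResumability: false,
  needsControlPlane: false,
});
```

Use it as a conversation starter with engineers and PMs, not as the final answer.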
Final Take
The practical subagent question is not “which framework supports multiple agents?”
All three do, in different ways.
The better question is:
Where should the coordination live in your cloud system?
AI SDK says: let the app own it.
LangGraph says: make the workflow graph own it.
Agno/AgentOS says: run it inside a dedicated agent runtime.
That is the decision that will shape your architecture far more than whether the docs use the word “subagent,” “team,” or “workflow.”
Sources
- Vercel AI SDK: Subagents
- Vercel AI SDK: Memory
- Vercel Knowledge Base: AI Agents on Vercel
- LangChain Docs: Multi-Agent
- LangGraph Docs: Use Subgraphs
- LangGraph Docs: Persistence
- LangGraph Python Docs: Durable Execution
- LangSmith Docs: Call Agents from Code
- Agno Docs: Teams
- AgentOS Introduction
- AgentOS Control Plane
- Agno Docs: Remote Team
- Agno Docs: Remote Workflow