Modern Agent Engineering

From Chat to Agent UI: ChatKit, A2UI, and Structured Interaction Surfaces

Agent UI is becoming its own stack. Here's how ChatKit, A2UI, and MCP Apps fit together, where plain chat breaks down, and how to design structured interaction surfaces that actually help users get work done.

24 min read

TL;DR

  • Chat got us into the agent era, but chat alone is a weak workspace for review, approval, comparison, filtering, and multi-step editing.
  • The stack is splitting into three layers: chat shells like ChatKit, declarative UI protocols like A2UI, and host runtimes like MCP Apps.
  • ChatKit is the fastest way to ship agentic chat inside your own product if you want a polished interface, streaming, threads, attachments, and widgets without assembling everything from scratch.
  • A2UI is a good fit when agents are remote, untrusted, multi-platform, or cross-organization and need to describe UI safely as data instead of shipping executable code.
  • MCP Apps standardizes interactive app surfaces inside agent hosts such as Claude, ChatGPT, Goose, and VS Code, combining tool calls with sandboxed UI resources.
  • For first-party web apps, a very practical default is: useChat on the frontend, streamText on the backend, and either typed tool parts or custom data parts whenever the user needs a real surface instead of another paragraph.
  • The winning design pattern is not “replace your product with chat.” It is “use chat for intent and narration, then hand the user into a structured surface when the work becomes visual, stateful, or approval-heavy.”

What You Will Learn Here

  • Why plain chat stops being enough for serious agent workflows
  • How ChatKit, A2UI, and MCP Apps differ at an architectural level
  • When each layer is the right choice
  • What “structured interaction surfaces” actually mean in product terms
  • How to implement two practical structured-surface patterns with AI SDK
  • Design rules for building agent-native interfaces without creating a confusing mess

Why Chat Alone Stops Being Enough

An earlier piece in this series, “The Chat Pivot”, covered that shift, and it is real. Users increasingly expect to ask software for outcomes instead of navigating menus.

But chat has an obvious limit: it is excellent for intent capture and explanation, and pretty awkward for work that depends on inspection, manipulation, or confirmation.

Text is a poor medium for:

  • reviewing a deployment plan with multiple risk flags
  • filtering a table of 200 results
  • approving one clause in a contract but rejecting another
  • comparing two candidate queries side by side
  • adjusting a map, chart, or timeline interactively
  • stepping through a configuration wizard with dependent options

The MCP Apps announcement from January 26, 2026 explains this well: text can summarize a tool result, but users often want to sort, filter, drill down, preview, or approve without turning every tiny action into another prompt.

That is the architectural shift. Agent UI is not just “better chat bubbles.” It is a stack for moving between:

  • conversational intent
  • structured state
  • visible work surfaces
  • tool execution
  • human approval
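
One way to make those layers concrete is to model an agent turn as a stream of typed parts rather than one string. This is an illustrative TypeScript sketch, not an API from ChatKit, A2UI, or MCP Apps; every name in it is made up:

```typescript
// Illustrative only: a discriminated union for the kinds of parts an
// agent turn can produce, mirroring the layers listed above.
type AgentPart =
  | { kind: "intent"; text: string }                             // conversational intent
  | { kind: "state"; id: string; data: Record<string, unknown> } // structured state
  | { kind: "surface"; component: string; props: unknown }       // visible work surface
  | { kind: "tool"; name: string; input: unknown }               // tool execution
  | { kind: "approval"; message: string; actions: string[] };    // human approval

// A turn becomes a sequence of typed parts the UI can dispatch on.
function describePart(part: AgentPart): string {
  switch (part.kind) {
    case "intent":
      return `say: ${part.text}`;
    case "state":
      return `update state ${part.id}`;
    case "surface":
      return `render <${part.component}>`;
    case "tool":
      return `call ${part.name}`;
    case "approval":
      return `wait for user: ${part.message}`;
  }
}
```

The exhaustive switch is the point: each layer gets its own renderer instead of everything collapsing into prose.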

A Better Mental Model

Think about agent UI as three cooperating layers:

  1. The conversation shell captures intent, streams progress, and keeps the user oriented.
  2. The structured interaction surface lets the user inspect, edit, compare, or approve something without reducing everything to prose.
  3. The tool and state layer turns those interactions into real actions in your systems.

In other words, chat becomes the control plane, while structured surfaces become the workspace.

Chat-only flow

User -> Prompt -> Agent -> Text answer -> Another prompt -> Another text answer

Agent UI flow

User -> Chat intent
     -> Agent/tool reasoning
     -> Structured surface (table/form/chart/review card)
     -> User action
     -> Updated model context + tool call
     -> Confirmed result

Once you see that split, ChatKit, A2UI, and MCP Apps stop looking like competitors. They live at different layers of the same emerging stack.

Layer 1: ChatKit as the Embedded Agent Shell

OpenAI ChatKit is the most productized option in this stack. The docs describe it as a framework for building high-quality AI-powered chat experiences, with built-in streaming, attachments, thread management, source annotations, and rich interactive widgets.

This matters because most teams do not actually want to invent chat UX primitives from scratch. They want:

  • a production-ready conversation shell
  • a fast way to embed agent chat inside their app
  • support for tools, workflows, and widgets
  • room to customize the UI so it still feels native to their product

The interesting architectural detail is that ChatKit supports two modes:

  • an OpenAI-hosted backend for the fastest path
  • a self-hosted backend using the Python SDK if you want maximum control over workflows, storage, inference, and data paths

That makes ChatKit a pragmatic choice when:

  • you own the application shell
  • you want agent chat inside your product now
  • you care more about shipping velocity than cross-host portability
  • your main interaction pattern is still conversational, with widgets as supporting UI

Where ChatKit is strongest:

  • embedded SaaS copilots
  • support and operations panels
  • internal tools that need attachments, threads, and guided workflows
  • products where the agent is a first-class feature, not an external plugin

Where ChatKit is weaker:

  • cases where the UI must travel between hosts you do not control
  • cases where remote agents need to describe native UI without iframe-style embedding
  • highly heterogeneous client environments where web, mobile, and desktop must all render the same structured payload differently

So ChatKit is best understood as the shell layer: the fastest way to ship a polished agent conversation experience inside your own app.

Layer 2: A2UI as the Declarative UI Protocol

On December 15, 2025, Google introduced A2UI as an open project for agent-driven interfaces.

The problem statement is important. If the agent lives inside your app, it can manipulate the UI directly. But in many real systems, the agent doing the work is remote:

  • it runs in the background
  • it lives on another server
  • it belongs to a partner organization
  • it should not be allowed to execute arbitrary UI code in your client

Google’s framing is sharp: the industry needed something “safe like data, but expressive like code.”

That is what A2UI tries to do.

Instead of sending HTML or JavaScript from the agent, A2UI sends a structured payload that references trusted components in a client-side catalog. The client keeps control over styling, rendering, and security. The agent describes the structure and data.

According to the launch post, A2UI is designed around three ideas:

  • Security first: the agent can request only pre-approved components from a trusted catalog
  • Incremental updates: the format is friendly to progressive rendering as new state arrives
  • Framework portability: the same payload can map to React, web components, Flutter, SwiftUI, or something else

That makes A2UI appealing when:

  • your agent and your UI live across a trust boundary
  • you want the same agent response to render natively on multiple client platforms
  • you care about declarative rendering rather than remote code execution
  • you expect agents to generate bespoke forms, cards, charts, or flows on the fly

It also changes how we think about “agentic UI.” The agent is no longer the renderer. It becomes a planner that emits structured UI intent.

Here is a simplified example of the kind of thing this enables:

{
  "component": "ApprovalCard",
  "props": {
    "title": "Rotate production API keys",
    "summary": "3 services will restart during the next maintenance window.",
    "riskLevel": "medium",
    "actions": [
      { "id": "approve", "label": "Approve" },
      { "id": "request_changes", "label": "Request changes" }
    ]
  }
}

The client decides what ApprovalCard actually looks like. The agent decides that an approval card is the right surface for this moment.

That separation is powerful.
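
The real A2UI wire format is richer than this, but the trust model it implies can be sketched in a few lines: the client owns a catalog of renderers, and a payload naming anything outside that catalog fails closed instead of being executed. Component names below are illustrative:

```typescript
// Illustrative sketch of the catalog idea, not the real A2UI format.
type UIPayload = { component: string; props: Record<string, unknown> };

// The client owns this catalog; the agent can only pick from it.
const catalog: Record<string, (props: Record<string, unknown>) => string> = {
  ApprovalCard: (p) => `[ApprovalCard] ${String(p.title)}`,
  DataTable: (p) => `[DataTable] ${(p.rows as unknown[]).length} rows`,
};

function render(payload: UIPayload): string {
  const component = catalog[payload.component];
  if (!component) {
    // Unknown component: fail closed instead of executing agent-supplied code.
    throw new Error(`Untrusted component: ${payload.component}`);
  }
  return component(payload.props);
}
```

Swapping the string-returning renderers for React, Flutter, or SwiftUI bindings is exactly the framework-portability claim from the launch post.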

It also means A2UI is not a complete app platform by itself. You still need:

  • a component catalog
  • renderer bindings
  • state synchronization rules
  • approval and permission logic
  • a transport or orchestration layer around it

So A2UI is best seen as the protocol layer for agent-generated UI, especially across trust boundaries.

Layer 3: MCP Apps and the Host Runtime

If ChatKit is the shell and A2UI is the declarative protocol, MCP Apps are the runtime pattern for shipping interactive surfaces inside agent hosts.

This is where the ecosystem gets especially interesting.

The official MCP Apps post describes tools returning UI resources instead of plain text. A tool can include _meta.ui.resourceUri, the host fetches the UI resource, renders it in a sandboxed iframe, and then enables bidirectional communication between the UI and the host through JSON-RPC over postMessage.

That gives us a concrete host architecture:

User
  -> Agent host (ChatGPT / Claude / VS Code / Goose)
  -> MCP tool call
  -> UI resource returned by server
  -> Sandboxed app surface rendered in-conversation
  -> User clicks / edits / filters
  -> App sends structured events back to host
  -> Host updates model context and optionally calls more tools

The important part is not just that a widget appears. It is that the widget becomes a participant in the agent loop.
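
As a rough sketch of what JSON-RPC over postMessage implies at the wire level (the real message shapes and method names are defined by the MCP Apps spec; `ui/userAction` below is invented):

```typescript
// Sketch of a JSON-RPC 2.0 envelope as it might cross the iframe boundary.
// The method name and params shape are illustrative, not from the spec.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
};

let nextId = 1;

function makeRequest(method: string, params?: unknown): JsonRpcRequest {
  return { jsonrpc: "2.0", id: nextId++, method, params };
}

// In a real sandboxed app this would be window.parent.postMessage(request, origin);
// serializing it here just shows the shape that crosses the boundary.
const wire = JSON.stringify(
  makeRequest("ui/userAction", { action: "approve", target: "release-2026.03.30-1" }),
);
```

Because both directions are JSON-RPC, the host can validate, log, and permission-check every event from the widget the same way it handles tool calls.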

The January 26, 2026 MCP Apps announcement highlights exactly the kinds of surfaces that benefit from this model:

  • dashboards
  • configuration wizards
  • document review
  • real-time monitoring
  • other stateful interfaces that are clumsy as pure chat

The same post also notes that the pattern builds on work from MCP-UI and the OpenAI Apps SDK, then standardizes it across multiple hosts.

As of March 30, 2026, OpenAI’s Help Center describes developer mode and MCP apps in ChatGPT as supporting interactive UI plus full MCP actions, with admin controls and RBAC on Business, Enterprise, and Edu plans.

This host/runtime layer is the right fit when:

  • you want your experience to run inside external agent hosts
  • you want tool results to open into interactive app surfaces
  • you care about cross-client distribution more than owning the full shell
  • your product is becoming “something an agent can use with a person in the loop”

It is less ideal when:

  • you need pixel-perfect control of the entire user experience
  • your product needs to feel like one tightly integrated app shell
  • you cannot depend on host support or sandbox behavior evolving over time

A Practical Default for First-Party Apps

If you are shipping inside your own web app this quarter, the cleanest practical setup is often:

  • ChatKit when you want the fastest polished shell
  • AI SDK UI when you want flexible frontend rendering and tight control over custom surfaces
  • typed tool parts when the model should ask for a specific interaction
  • custom data parts when the server should stream a structured artifact into the conversation

I re-checked the current AI SDK UI docs on March 30, 2026 before writing the examples below. The key ideas the docs confirm are:

  • useChat is the frontend state and streaming layer
  • tool parts are the right fit for actions and approvals
  • custom data parts are the right fit for cards, status panes, artifacts, and progressively updated UI
  • generative UI in AI SDK is fundamentally “tool result -> typed UI component”

That gives you a very usable mental model:

User message
  -> useChat()
  -> /api/chat
  -> streamText(...)
  -> tool parts and/or data parts
  -> React components render inside the conversation
  -> user clicks, confirms, edits, or filters
  -> next tool call or next model step

The benefit is simple: you stop treating the assistant response as “one string” and start treating it as a stream of typed interaction parts.

Astro: Static Shell, Streaming Agent Island, Server-Side MCP

Astro is a particularly good home for this pattern because its defaults already match the shape of a good agent UI:

  • most of the page should stay static and fast
  • only the agent surface should hydrate
  • server-side secrets and tool access should stay off the client
  • dynamic or personalized sections should be isolated instead of turning the whole app into a client-rendered SPA

The official docs reinforce this architecture:

  • Astro’s islands model says interactive components should hydrate independently with client:*
  • Astro’s endpoints docs say server endpoints can return a Response and, in static mode, must opt out of prerendering with export const prerender = false
  • Astro’s on-demand rendering docs say server-rendered routes require an adapter and note that output: 'server' makes routes dynamic by default
  • Astro’s server islands docs show how to defer slow or personalized server-rendered components with server:defer

That leads to a very practical Astro architecture for agent features:

Static Astro page
  -> React agent island (client:visible or client:load)
  -> Astro API route (/api/chat)
  -> AI SDK streamText(...) on the server
  -> MCP clients / DB / internal APIs on the server only
  -> typed tool parts and data parts back to the island

Common Astro Pattern 1: Static page + lazy chat island

Keep the surrounding page static and hydrate only the agent panel.

---
// src/pages/copilot.astro
import BaseLayout from "@/layouts/BaseLayout.astro";
import AgentPanel from "@/components/AgentPanel.tsx";
---

<BaseLayout
  title="Release Copilot"
  description="Review deploy plans, approvals, and rollout status."
>
  <section class="prose">
    <h1>Release Copilot</h1>
    <p>
      The docs, checklist, and rollout notes can stay static. Only the agent
      workspace needs to hydrate.
    </p>
  </section>

  <AgentPanel client:visible />
</BaseLayout>

Why this is a good default:

  • your SEO and first paint stay excellent
  • the agent code loads only when needed
  • the agent UI feels like one focused island instead of taking over the entire page

If the assistant is the primary interaction on the page, use client:load. If it is secondary, client:visible or client:idle is usually the better tradeoff.

Common Astro Pattern 2: Astro API route as the agent gateway

This is the cleanest place to integrate AI SDK, auth context, MCP clients, and internal APIs.

// src/pages/api/chat.ts
import type { APIRoute } from "astro";
import { openai } from "@ai-sdk/openai";
import { convertToModelMessages, streamText, type UIMessage } from "ai";

// Needed when the overall site is static or hybrid and this route must stay live.
export const prerender = false;

export const POST: APIRoute = async ({ request, locals, cookies }) => {
  const { messages }: { messages: UIMessage[] } = await request.json();

  // Read auth/session context here, not in the browser.
  const session = cookies.get("session")?.value;
  const tenantId = locals.tenantId;

  const result = streamText({
    model: openai("gpt-4.1-mini"),
    system: `You are a release copilot for tenant ${tenantId}. Never suggest actions outside the user's scope.`,
    messages: await convertToModelMessages(messages),
  });

  return result.toUIMessageStreamResponse();
};

In practice, this route becomes your boundary for:

  • authentication and RBAC
  • provider keys
  • MCP connections
  • database lookups
  • audit logging
  • request shaping before the model sees anything

If your Astro app uses output: 'server', you usually do not need prerender = false on every route. If it stays mostly static, add an adapter and opt out only for the live routes that need streaming.

Common Astro Pattern 3: Keep MCP on the server, not in the island

If your agent uses MCP servers, the React island should talk only to your Astro endpoint. Let the endpoint own the MCP connections and tool exposure.

Browser island
  -> /api/chat
  -> server-side tool wrapper
  -> MCP server(s)
  -> tool result
  -> AI SDK UI message stream
  -> browser island

This is usually the right pattern because it gives you one place to enforce:

  • which MCP servers are reachable
  • which tools are exposed for a given role
  • which tool calls need approval
  • how much raw tool output reaches the model or the UI

It also means you can swap:

  • a local MCP server
  • a remote MCP server
  • a direct internal service call

without rewriting the frontend island.
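
A sketch of what that server-side chokepoint can look like. The role and tool names are invented; the point is that exposure and approval rules live in one table behind the endpoint, not in the island:

```typescript
// Illustrative server-side policy table: which roles see which tools,
// and which tool calls must pause for human approval.
type Role = "viewer" | "operator" | "admin";

const toolPolicy: Record<string, { roles: Role[]; needsApproval: boolean }> = {
  listDeploys: { roles: ["viewer", "operator", "admin"], needsApproval: false },
  restartService: { roles: ["operator", "admin"], needsApproval: true },
  rotateKeys: { roles: ["admin"], needsApproval: true },
};

// The /api/chat route would consult this before exposing tools to the model.
function exposedTools(role: Role): string[] {
  return Object.entries(toolPolicy)
    .filter(([, policy]) => policy.roles.includes(role))
    .map(([name]) => name);
}

function requiresApproval(tool: string): boolean {
  return toolPolicy[tool]?.needsApproval ?? true; // unknown tools fail closed
}
```

Whether a tool is backed by a local MCP server, a remote one, or a direct service call is invisible to the island; only this table changes.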

Common Astro Pattern 4: Use server islands for personalized side panels

Chat itself belongs in a client island, but nearby UI can often stay server-rendered.

Examples:

  • recent agent runs
  • user-specific thread summaries
  • account-scoped quotas
  • approval queue counts

---
// src/pages/copilot.astro
import BaseLayout from "@/layouts/BaseLayout.astro";
import AgentPanel from "@/components/AgentPanel.tsx";
import AgentSidebar from "@/components/AgentSidebar.astro";
---

<BaseLayout title="Ops Copilot">
  <div class="grid gap-6 lg:grid-cols-[1fr_320px]">
    <AgentPanel client:load />

    <AgentSidebar server:defer>
      <div slot="fallback">Loading recent runs…</div>
    </AgentSidebar>
  </div>
</BaseLayout>

This keeps the expensive or personalized server work isolated instead of forcing the whole page into a heavier runtime path.

Astro + AI SDK example: island UI calling an Astro route

// src/components/AgentPanel.tsx
// Note: no "use client" directive here. That is a React Server Components
// convention; Astro islands hydrate via client:* directives instead.

import { useChat } from "@ai-sdk/react";
import { DefaultChatTransport } from "ai";

export default function AgentPanel() {
  const { messages, sendMessage, status } = useChat({
    transport: new DefaultChatTransport({
      api: "/api/chat",
    }),
  });

  return (
    <section className="rounded-2xl border p-4">
      <button
        onClick={() =>
          sendMessage({
            text: "Review release 2026.03.30-1 and tell me if we should deploy.",
          })
        }
      >
        Run review
      </button>

      <p>Status: {status}</p>

      {messages.map((message) => (
        <article key={message.id}>
          {message.parts.map((part, index) =>
            part.type === "text" ? <p key={index}>{part.text}</p> : null,
          )}
        </article>
      ))}
    </section>
  );
}

This is the simplest end-to-end Astro pattern:

  • Astro page provides the shell
  • React island provides the agent UI
  • Astro API route provides the streaming backend
  • AI SDK provides the message protocol
  • MCP and sensitive tools stay on the server

The Astro-specific rule of thumb

If you are building agent UI in Astro, default to:

  1. static page first
  2. one focused client island for the agent surface
  3. one server endpoint for streaming
  4. MCP only on the server
  5. server islands only for nearby personalized or dynamic sections

That gives you most of the upside of an agent-native UI without giving up Astro’s biggest advantage: shipping very little JavaScript outside the part of the page that actually needs it.

Example 1: Stream a Review Card Into the Chat

This is the pattern I would reach for when the assistant should:

  • explain something in natural language
  • attach a structured artifact to the thread
  • let the user inspect visible state before they act

Think release plans, contract review summaries, incident timelines, or onboarding checklists.

Backend: stream a persistent data part plus assistant text

// ai/types.ts
import type { UIMessage } from "ai";

export type MyUIMessage = UIMessage<
  never,
  {
    "deploy-plan": {
      releaseId: string;
      risk: "low" | "medium" | "high";
      services: string[];
      status: "draft" | "approved";
    };
  }
>;

// app/api/chat/route.ts
import { openai } from "@ai-sdk/openai";
import {
  convertToModelMessages,
  createUIMessageStream,
  createUIMessageStreamResponse,
  streamText,
} from "ai";
import type { MyUIMessage } from "@/ai/types";

export async function POST(req: Request) {
  const { messages }: { messages: MyUIMessage[] } = await req.json();

  const stream = createUIMessageStream<MyUIMessage>({
    async execute({ writer }) {
      writer.write({
        type: "data-deploy-plan",
        id: "release-2026-03-30-1",
        data: {
          releaseId: "2026.03.30-1",
          risk: "medium",
          services: ["api", "worker", "web"],
          status: "draft",
        },
      });

      const result = streamText({
        model: openai("gpt-4.1-mini"),
        system:
          "Explain the rollout in plain English and tell the user to inspect the deploy card before approving.",
        messages: await convertToModelMessages(messages),
      });

      writer.merge(result.toUIMessageStream());
    },
  });

  return createUIMessageStreamResponse({ stream });
}

Frontend: render the card directly from message.parts

"use client";

import { useChat } from "@ai-sdk/react";
import { DefaultChatTransport } from "ai";
import type { MyUIMessage } from "@/ai/types";

export default function ReleaseCopilot() {
  const { messages, sendMessage } = useChat<MyUIMessage>({
    transport: new DefaultChatTransport({
      api: "/api/chat",
    }),
  });

  return (
    <>
      <button
        onClick={() =>
          sendMessage({ text: "Review release 2026.03.30-1 before we deploy." })
        }
      >
        Review release
      </button>

      {messages.map((message) => (
        <article key={message.id}>
          {message.parts.map((part, index) => {
            if (part.type === "text") {
              return <p key={index}>{part.text}</p>;
            }

            if (part.type === "data-deploy-plan") {
              return (
                <section key={part.id ?? index} className="rounded-xl border p-4">
                  <h3>Release {part.data.releaseId}</h3>
                  <p>Risk: {part.data.risk}</p>
                  <p>Services: {part.data.services.join(", ")}</p>
                  <p>Status: {part.data.status}</p>
                </section>
              );
            }

            return null;
          })}
        </article>
      ))}
    </>
  );
}

Why this pattern is useful:

  • the card is part of the conversation, not a disconnected modal
  • the assistant can narrate what matters while the UI holds the exact state
  • data-part reconciliation lets you update the same card over time instead of appending noisy text

In spirit, this is very close to the A2UI idea: the model decides that a “deploy plan” artifact should exist, but your app still owns the actual rendering.

Example 2: Pause for Human Approval Without Leaving the Thread

This is the pattern I would use for high-stakes actions:

  • deploy to production
  • rotate credentials
  • send a customer-facing message
  • run a destructive migration

The assistant should not just say “Should I proceed?” in plain text. It should surface an explicit approval interaction, then continue automatically when the result is available.

Backend: expose a confirmation tool

// app/api/chat/route.ts
import { openai } from "@ai-sdk/openai";
import { convertToModelMessages, streamText, type UIMessage } from "ai";
import { z } from "zod";

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();

  const result = streamText({
    model: openai("gpt-4.1"),
    system:
      "Before any production action, call askForConfirmation. Do not proceed until the user has decided.",
    messages: await convertToModelMessages(messages),
    tools: {
      askForConfirmation: {
        description: "Ask the user for confirmation.",
        inputSchema: z.object({
          message: z
            .string()
            .describe("The message to ask for confirmation."),
        }),
      },
    },
  });

  return result.toUIMessageStreamResponse();
}

Frontend: render the confirmation tool as a real UI step

"use client";

import { useChat } from "@ai-sdk/react";
import {
  DefaultChatTransport,
  lastAssistantMessageIsCompleteWithToolCalls,
} from "ai";

export default function ApprovalChat() {
  const { messages, sendMessage, addToolOutput } = useChat({
    transport: new DefaultChatTransport({
      api: "/api/chat",
    }),
    sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithToolCalls,
  });

  return (
    <>
      <button
        onClick={() =>
          sendMessage({ text: "Deploy release 2026.03.30-1 to production." })
        }
      >
        Start deploy
      </button>

      {messages.map((message) => (
        <article key={message.id}>
          {message.parts.map((part, index) => {
            if (part.type === "text") {
              return <p key={index}>{part.text}</p>;
            }

            if (part.type === "tool-askForConfirmation") {
              if (part.state === "input-available") {
                return (
                  <section
                    key={part.toolCallId}
                    className="rounded-xl border border-amber-300 bg-amber-50 p-4"
                  >
                    <p>{part.input.message}</p>
                    <div className="mt-3 flex gap-2">
                      <button
                        onClick={() =>
                          addToolOutput({
                            tool: "askForConfirmation",
                            toolCallId: part.toolCallId,
                            output: "Yes, approved.",
                          })
                        }
                      >
                        Approve
                      </button>

                      <button
                        onClick={() =>
                          addToolOutput({
                            tool: "askForConfirmation",
                            toolCallId: part.toolCallId,
                            output: "No, blocked.",
                          })
                        }
                      >
                        Block
                      </button>
                    </div>
                  </section>
                );
              }

              if (part.state === "output-available") {
                return <p key={part.toolCallId}>Decision: {part.output}</p>;
              }
            }

            return null;
          })}
        </article>
      ))}
    </>
  );
}

Why this pattern is useful:

  • the approval is visible and auditable
  • the user is acting on a bounded surface, not on an ambiguous paragraph
  • sendAutomaticallyWhen lets the conversation continue once all tool results exist
  • your agent loop now has an explicit human checkpoint instead of a vague “please confirm” sentence

This is much better than forcing the user to type “yes, approve the canary rollout for release 2026.03.30-1” into chat.

How These Pieces Fit Together

Here is the cleanest way I have found to think about the stack:

  • ChatKit is a product framework for embedded agent conversation
  • A2UI is a declarative language for agent-specified UI
  • MCP Apps are a standard runtime pattern for interactive surfaces inside host clients

Or more bluntly:

  • ChatKit helps you build an agent-native app
  • A2UI helps remote agents describe native UI safely
  • MCP Apps help tools ship interactive UI into shared agent hosts

These can overlap.

A real product might:

  • use ChatKit inside its own SaaS app
  • let remote subagents emit A2UI-style structured UI descriptions
  • expose selected workflows as MCP Apps for users who prefer ChatGPT or Claude as their main host

That is why “agent UI” is becoming its own stack. The old frontend split of “API + web app” is not enough anymore. We now need to design the relationship between:

  • the model
  • the host
  • the structured surface
  • the tool system
  • the human approval point

Design Rules for Structured Interaction Surfaces

If you are building this kind of system, a few rules already feel durable.

1. Use chat for intent, not for every micro-action

Chat is great for:

  • asking
  • clarifying
  • narrating
  • summarizing

It is bad for:

  • dense comparison
  • repeated filtering
  • approval toggles
  • multi-field editing

Move into a surface the moment the user needs to manipulate state instead of talk about state.

2. Prefer visible state over implied state

If the agent is proposing a change, the user should see:

  • what will happen
  • what data the decision is based on
  • what they are approving
  • what the rollback path is

This is why cards, tables, diffs, and forms matter. They externalize the agent’s reasoning into something inspectable.

3. Keep rendering host-owned and capability-bounded

This is where A2UI and MCP Apps are especially valuable. The agent should not get to execute arbitrary UI code just because it wants a richer interaction. Your client or host should decide:

  • which components are trusted
  • which resources can render
  • which actions require confirmation
  • which tool calls are allowed from the surface

4. Treat user actions as structured events

“User clicked approve” is better than a vague natural-language follow-up.

The more important the action, the more it should look like an event with explicit fields, auditability, and permission checks.
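
For instance, an approval can be modeled as a typed event carrying the fields an audit log needs, with a guard that rejects anything looser. The field names here are illustrative:

```typescript
// Illustrative: an approval as a structured, auditable event rather
// than a free-text chat message. Field names are made up.
type ApprovalEvent = {
  kind: "approval";
  actor: string;        // who clicked
  target: string;       // what they approved
  decision: "approve" | "block";
  at: string;           // ISO timestamp for the audit log
};

function isApprovalEvent(value: unknown): value is ApprovalEvent {
  const v = value as Partial<ApprovalEvent>;
  return (
    v?.kind === "approval" &&
    typeof v.actor === "string" &&
    typeof v.target === "string" &&
    (v.decision === "approve" || v.decision === "block") &&
    typeof v.at === "string"
  );
}
```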

5. Design for fallback

Not every host will support every surface forever. Build your flows so they degrade gracefully:

  • widget if available
  • plain-text explanation if not
  • one-click approval if available
  • explicit confirmation prompt if not
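
The degradation ladder above can be sketched as a single capability check. The capability flags are invented; the shape is what matters:

```typescript
// Illustrative: choose the richest surface the host supports, and
// degrade to an explicit text prompt when it does not.
type HostCapabilities = { widgets: boolean; oneClickApproval: boolean };

function approvalSurface(caps: HostCapabilities, summary: string): string {
  if (caps.widgets && caps.oneClickApproval) {
    // Rich host: render an approval widget with real buttons.
    return `widget:ApprovalCard(${summary})`;
  }
  // Plain host: same decision, expressed as an explicit confirmation prompt.
  return `Reply "approve" or "block": ${summary}`;
}
```

Either branch produces the same decision for the agent loop; only the surface the user sees changes.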

6. Separate narration from execution

One of the cleanest patterns in agent UX is:

  • chat explains what is happening
  • the surface captures the decision
  • the tool performs the action

When those collapse together, users lose trust fast.

ASCII Architecture Flows

1. Embedded app pattern

Your SaaS app
  -> ChatKit shell
  -> Agent runtime
  -> Internal tools / APIs
  -> Widget inside chat
  -> User approves or edits
  -> Tool executes action

2. Remote-agent pattern

User app
  -> Host/orchestrator
  -> Remote agent
  -> A2UI payload
  -> Native client renderer
  -> Structured user event
  -> Agent/tool follow-up

3. Cross-host app pattern

User in ChatGPT / Claude / VS Code
  -> Host calls MCP server
  -> Tool returns UI resource
  -> Host renders sandboxed app
  -> User interacts in conversation
  -> Host updates model context
  -> More tools run if needed

Build vs Adopt: A Practical Decision Guide

Choose ChatKit when:

  • you own the full product experience
  • you want the fastest path to a polished agent chat UI
  • conversation is the center of the UX
  • widgets support the conversation instead of replacing it

Choose A2UI when:

  • your agent is remote or cross-organization
  • native rendering matters more than shared iframe execution
  • security and portability are more important than UI code freedom
  • multiple client platforms need to interpret the same structured response

Choose MCP Apps / Apps SDK when:

  • your users increasingly work inside agent hosts
  • you want tool results to become interactive surfaces
  • you want distribution across multiple hosts
  • your workflow benefits from in-conversation apps, not just text answers

Choose a hybrid when:

  • you have your own product shell and also want external host distribution
  • some workflows are conversational while others clearly need structured review surfaces
  • your roadmap includes both embedded copilots and agent-facing integrations

The Real Shift

The important trend is not that “chat is replacing UI.” The deeper shift is that UI is becoming partially agent-addressable.

That means we now design interfaces for three participants at once:

  • the human user
  • the agent
  • the host runtime that mediates trust, permissions, and rendering

The teams that do this well will stop treating agent UX as a floating chat bubble. They will build systems where conversation, structured surfaces, and tool execution reinforce each other.

That is what makes agent UI its own stack.

Source List