AG-UI protocol: reading it as a wire format


Building an agent UI always ends the same way for me. I wire Server-Sent Events from the backend, invent yet another envelope for "token chunks vs tool calls vs state updates", then spend a week debugging the parts where my frontend and backend disagreed. AG-UI is the first serious attempt I have seen at standardising that envelope instead of shipping another React library.

In this post I treat AG-UI as what it actually is: a wire protocol. I looked past the official SDKs and wrote a minimal Spring WebFlux endpoint that emits AG-UI events directly. The surface you need to know is small, and once the shape clicks the rest is just plumbing.

The event stream is the contract

AG-UI ships as an open MIT-licensed protocol maintained by CopilotKit. A client, usually a browser, sends a single HTTP POST to an agent endpoint with a RunAgentInput body. The server responds with a stream of typed events that ends with either RUN_FINISHED or RUN_ERROR. On the wire that stream is Server-Sent Events by default, one JSON event per data: line, but the abstraction layer is transport-agnostic: the spec also permits WebSockets and binary frames.

The events fall into five families I keep coming back to when I read the docs:

Lifecycle: RUN_STARTED, RUN_FINISHED, RUN_ERROR, STEP_STARTED, STEP_FINISHED
Text messages: TEXT_MESSAGE_START, TEXT_MESSAGE_CONTENT, TEXT_MESSAGE_END
Tool calls: TOOL_CALL_START, TOOL_CALL_ARGS, TOOL_CALL_END
State: STATE_SNAPSHOT, STATE_DELTA, MESSAGES_SNAPSHOT
Escape hatches: CUSTOM and RAW for anything that does not fit
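
As a minimal sketch, the five families above can be written down as a Kotlin enum. The names EventFamily, AgUiEventType, and fromWire are my own illustrative choices and not types from any AG-UI SDK; only the sixteen event names come from the list above.

```kotlin
// Hypothetical grouping type: the family names mirror the list above.
enum class EventFamily { LIFECYCLE, TEXT_MESSAGE, TOOL_CALL, STATE, ESCAPE_HATCH }

// One entry per event type listed above; the enum name equals the wire "type" value.
enum class AgUiEventType(val family: EventFamily) {
    RUN_STARTED(EventFamily.LIFECYCLE),
    RUN_FINISHED(EventFamily.LIFECYCLE),
    RUN_ERROR(EventFamily.LIFECYCLE),
    STEP_STARTED(EventFamily.LIFECYCLE),
    STEP_FINISHED(EventFamily.LIFECYCLE),
    TEXT_MESSAGE_START(EventFamily.TEXT_MESSAGE),
    TEXT_MESSAGE_CONTENT(EventFamily.TEXT_MESSAGE),
    TEXT_MESSAGE_END(EventFamily.TEXT_MESSAGE),
    TOOL_CALL_START(EventFamily.TOOL_CALL),
    TOOL_CALL_ARGS(EventFamily.TOOL_CALL),
    TOOL_CALL_END(EventFamily.TOOL_CALL),
    STATE_SNAPSHOT(EventFamily.STATE),
    STATE_DELTA(EventFamily.STATE),
    MESSAGES_SNAPSHOT(EventFamily.STATE),
    CUSTOM(EventFamily.ESCAPE_HATCH),
    RAW(EventFamily.ESCAPE_HATCH);

    companion object {
        // Maps the "type" field of an incoming event to an enum entry; null if unknown.
        fun fromWire(type: String): AgUiEventType? = values().firstOrNull { it.name == type }
    }
}
```

Returning null for unknown types, rather than throwing, is a deliberate choice here: it lets a client skip event types added in later versions of the spec.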

That is sixteen event types in the version I read. The count matters less than the rule around them: events sharing a messageId or toolCallId must be issued in START → CONTENT/ARGS → END order, and every run must be bracketed by RUN_STARTED and a terminal RUN_FINISHED or RUN_ERROR. Any frontend can rebuild a coherent UI by replaying those events in order.
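
The ordering rule is mechanical enough to check in a few lines. The sketch below is one way to do it, under stated assumptions: WireEvent and firstOrderingViolation are hypothetical names, an event is reduced to its type plus whichever messageId or toolCallId it carries, and a run that ends in RUN_ERROR is allowed to leave messages open (the rule as stated above does not say either way).

```kotlin
// Hypothetical minimal view of an event: just the wire "type" plus the
// messageId or toolCallId it carries (null for events that have neither).
data class WireEvent(val type: String, val id: String? = null)

// Returns null when the sequence obeys the ordering rule, otherwise a short
// description of the first violation found.
fun firstOrderingViolation(events: List<WireEvent>): String? {
    if (events.firstOrNull()?.type != "RUN_STARTED") return "run must begin with RUN_STARTED"
    val terminal = events.last().type
    if (terminal != "RUN_FINISHED" && terminal != "RUN_ERROR") {
        return "run must end with RUN_FINISHED or RUN_ERROR"
    }

    val open = mutableSetOf<String>() // e.g. "TEXT_MESSAGE:m1" while between START and END
    for ((index, event) in events.withIndex()) {
        val family = listOf("TEXT_MESSAGE", "TOOL_CALL")
            .firstOrNull { event.type.startsWith(it + "_") }
        if (family == null) continue // lifecycle, state and escape-hatch events are not tracked here
        val key = "$family:${event.id}"
        when (event.type.removePrefix(family + "_")) {
            "START" -> if (!open.add(key)) return "event $index: $key started twice"
            "END" -> if (!open.remove(key)) return "event $index: $key ended without START"
            else -> if (key !in open) return "event $index: ${event.type} outside START/END of $key"
        }
    }
    // Assumption: a run that ends in RUN_ERROR may leave messages or tool calls open.
    return if (open.isEmpty() || terminal == "RUN_ERROR") null else "unterminated: $open"
}
```

A check like this is cheap to run in a test against a recorded stream, which is where frontend and backend disagreements tend to show up first.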

For state, AG-UI uses a snapshot-delta pattern that will feel familiar if you have ever written a CRDT-adjacent UI. The first STATE_SNAPSHOT is the truth. Every later STATE_DELTA is a JSON Patch (RFC 6902) applied on top. This keeps the stream cheap for long conversations and lets the server re-emit a snapshot whenever clients fall out of sync.
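
To make the snapshot-delta idea concrete, here is a minimal sketch of a client folding a STATE_DELTA into its last snapshot. It is deliberately not a full RFC 6902 implementation: it handles only "add", "replace", and "remove" on nested JSON objects, and skips array indices, "move", "copy", and "test". PatchOp, pointerTokens, applyAt, and applyDelta are hypothetical names; a real client would more likely use an existing JSON Patch library.

```kotlin
// One JSON Patch operation as it appears inside a STATE_DELTA "delta" array.
data class PatchOp(val op: String, val path: String, val value: Any? = null)

// Splits a JSON Pointer such as "/user/name" into tokens, undoing the ~1 and ~0 escapes.
fun pointerTokens(path: String): List<String> {
    require(path.startsWith("/")) { "only non-root object paths are supported: '$path'" }
    return path.removePrefix("/").split("/").map { it.replace("~1", "/").replace("~0", "~") }
}

// Applies one operation at the given pointer tokens and returns a new map;
// the input map is never mutated.
@Suppress("UNCHECKED_CAST")
fun applyAt(node: Map<String, Any?>, tokens: List<String>, op: PatchOp): Map<String, Any?> {
    val key = tokens.first()
    if (tokens.size == 1) {
        return when (op.op) {
            "add" -> node + (key to op.value)
            "replace" -> {
                require(key in node) { "replace target missing: ${op.path}" }
                node + (key to op.value)
            }
            "remove" -> {
                require(key in node) { "remove target missing: ${op.path}" }
                node - key
            }
            else -> throw IllegalArgumentException("unsupported op in this sketch: ${op.op}")
        }
    }
    val child = node[key] as? Map<String, Any?>
        ?: throw IllegalArgumentException("no object at '$key' while resolving ${op.path}")
    return node + (key to applyAt(child, tokens.drop(1), op))
}

// Folds a whole delta over the current state, one operation at a time, in order.
fun applyDelta(state: Map<String, Any?>, delta: List<PatchOp>): Map<String, Any?> =
    delta.fold(state) { acc, op -> applyAt(acc, pointerTokens(op.path), op) }
```

If any operation fails, this sketch throws and the previous state is left untouched, which is the point at which a client would ask for, or wait for, a fresh STATE_SNAPSHOT.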

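The lifecycle of a single run is easier to see as a timeline than as prose: RUN_STARTED → TEXT_MESSAGE_START → TEXT_MESSAGE_CONTENT (repeated per chunk) → TEXT_MESSAGE_END → optional STATE_DELTA → RUN_FINISHED or RUN_ERROR.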

A minimal AG-UI endpoint in Spring WebFlux

To convince myself the protocol was really this small, I wrote a single-file Spring Boot service that streams valid AG-UI events without any AG-UI library at all. This is the whole thing:

kotlin

// AgUiDemo.kt - start with: ./gradlew bootRun
package demo

import com.fasterxml.jackson.databind.ObjectMapper
import org.springframework.boot.autoconfigure.SpringBootApplication
import org.springframework.boot.runApplication
import org.springframework.http.MediaType.TEXT_EVENT_STREAM_VALUE
import org.springframework.web.bind.annotation.PostMapping
import org.springframework.web.bind.annotation.RequestBody
import org.springframework.web.bind.annotation.RestController
import reactor.core.publisher.Flux
import java.time.Duration
import java.util.UUID

@SpringBootApplication
class AgUiDemo
fun main(args: Array<String>) { runApplication<AgUiDemo>(*args) }

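// Only the RunAgentInput fields this demo reads. Real clients send more;
// Spring Boot's default ObjectMapper ignores the properties not declared here.
data class RunAgentInput(
    val threadId: String,
    val messages: List<Map<String, Any>> = emptyList(),
)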

@RestController
class AgentController(private val mapper: ObjectMapper) {

    @PostMapping("/agent", produces = [TEXT_EVENT_STREAM_VALUE])
    fun run(@RequestBody input: RunAgentInput): Flux<String> {
        val runId = UUID.randomUUID().toString()
        val messageId = UUID.randomUUID().toString()

        val events = listOf(
            mapOf("type" to "RUN_STARTED", "threadId" to input.threadId, "runId" to runId),
            mapOf("type" to "TEXT_MESSAGE_START", "messageId" to messageId, "role" to "assistant"),
            mapOf("type" to "TEXT_MESSAGE_CONTENT", "messageId" to messageId, "delta" to "Hello "),
            mapOf("type" to "TEXT_MESSAGE_CONTENT", "messageId" to messageId, "delta" to "from AG-UI."),
            mapOf("type" to "TEXT_MESSAGE_END", "messageId" to messageId),
            mapOf("type" to "STATE_DELTA", "delta" to listOf(
                mapOf("op" to "add", "path" to "/lastTurn", "value" to runId))),
            mapOf("type" to "RUN_FINISHED", "threadId" to input.threadId, "runId" to runId),
        )

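        // With produces = text/event-stream, Spring's SSE writer already frames each
        // element as "data:<json>" plus a blank line, so emit bare JSON here.
        // Adding "data: " and "\n\n" by hand would double-wrap every event.
        return Flux.fromIterable(events)
            .delayElements(Duration.ofMillis(60)) // slowed down only to make the stream visible
            .map { mapper.writeValueAsString(it) }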
    }
}

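Run it with ./gradlew bootRun on a standard Spring Boot 3.x Kotlin project that has spring-boot-starter-webflux and jackson-module-kotlin on the classpath (the latter is what lets Jackson build the RunAgentInput data class), then hit it with: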

bash

curl -N -X POST localhost:8080/agent \
  -H 'Content-Type: application/json' \
  -d '{"threadId":"t1"}'

The events arrive on the wire in order. A CopilotKit React client pointed at this URL renders the streaming response exactly as though it had come from a full LangGraph or CrewAI integration.

Three details in that snippet are worth pausing on. First, produces = TEXT_EVENT_STREAM_VALUE is what turns a Reactor Flux<String> into SSE; delayElements is only so I can watch the stream flow in the terminal. Second, STATE_DELTA carries a JSON Patch array, not a plain diff; this is the single detail I got wrong on my first attempt, because it is easy to confuse with JSON Merge Patch (RFC 7396). Third, the protocol does not mandate the SSE id: or event: fields, only data: with a JSON payload and a terminating blank line. Agents that lean on SSE event names for routing are off-spec.
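
The third point is easiest to see from the consuming side. The following is a minimal sketch of pulling the JSON payloads out of a buffered chunk of stream text, assuming frames are separated by a blank line and that id:, event:, and comment lines can simply be ignored. dataPayloads is a hypothetical helper name, and a production client would parse incrementally rather than from one string.

```kotlin
// Extracts the data payload of every complete SSE frame in the given text.
// Per the SSE format, one optional space after "data:" is dropped and multiple
// data: lines within a frame are joined with a newline.
fun dataPayloads(streamText: String): List<String> =
    streamText.replace("\r\n", "\n")
        .split("\n\n")
        .map { frame ->
            frame.lines()
                .filter { it.startsWith("data:") } // id:, event: and ":" comment lines are skipped
                .joinToString("\n") { it.removePrefix("data:").removePrefix(" ") }
        }
        .filter { it.isNotEmpty() }
```

Each returned string is one JSON event whose type field decides what to do with it; nothing in the routing depends on SSE event names.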

Where it sits next to MCP and A2A, and where it falls short

AG-UI is not a competitor to MCP. MCP (Model Context Protocol) standardises the agent-to-tool edge with JSON-RPC; AG-UI standardises the agent-to-frontend edge with an event stream. A2A (Agent2Agent) sits at the agent-to-agent edge. In a system where a UI talks to an orchestrator that calls two tools and another agent, all three protocols can coexist without overlap.

Hosted runtimes have started picking it up. Amazon Bedrock AgentCore added AG-UI alongside its existing MCP and A2A surfaces in March 2026, which made the three-protocol layering visible in a single managed deployment and gave me a concrete reason to keep treating AG-UI as a stable contract rather than a passing convention. Google's A2UI v0.9, announced a few weeks earlier, stacks a generative-UI vocabulary on top that AG-UI can carry as CUSTOM events — so the protocol stays narrow while UI-description gets moved up a level.

That narrow focus also exposes rough edges. A few from my notes:

SSE is half-duplex. User input mid-run still goes back over a separate HTTP call; the spec permits WebSockets, but no first-party SDK uses them yet, so bidirectional flows like voice interruption are left to you.
Authentication is unopinionated. The spec does not prescribe bearer-token headers, scopes, or tenant claims. Every production deployment I looked at bolts those on top.
The event schema can fight with agent frameworks. Tool schemas carrying $schema meta-fields have triggered validation crashes when bridging from Google ADK into Pydantic AI through AG-UI, a symptom of the protocol passing fully-typed tool payloads end to end.
No published benchmarks for throughput or p99 latency exist. The snapshot-delta pattern also makes large conversation state expensive if you emit a fresh STATE_SNAPSHOT on every reconnect. I had to design my own budget for that case.
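
As an illustration of what a budget for the reconnect case could look like, here is a minimal, hypothetical sketch: replay the missed STATE_DELTA events when they are small, and fall back to a fresh STATE_SNAPSHOT otherwise. Nothing here comes from the spec; ReconnectPlan, planReconnect, and the maxReplayBytes threshold are assumptions made for the example, and the byte sizes are whatever the server measures for its serialised events.

```kotlin
// Hypothetical result type: what to send on reconnect and roughly how many bytes it costs.
data class ReconnectPlan(val sendSnapshot: Boolean, val bytes: Int)

// Chooses between replaying missed deltas and re-emitting a full snapshot.
// Deltas are replayed only if they fit the budget AND are cheaper than the snapshot.
fun planReconnect(snapshotBytes: Int, missedDeltaBytes: List<Int>, maxReplayBytes: Int): ReconnectPlan {
    val replayBytes = missedDeltaBytes.sum()
    val replayIsCheaper = replayBytes <= maxReplayBytes && replayBytes < snapshotBytes
    return if (replayIsCheaper) ReconnectPlan(sendSnapshot = false, bytes = replayBytes)
    else ReconnectPlan(sendSnapshot = true, bytes = snapshotBytes)
}
```

Replaying deltas also assumes the server keeps them around for some window, which is its own storage decision.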

None of these are dealbreakers, but they are the shape of the engineering work AG-UI leaves on your plate.

When I would reach for it, and when I would not

Reach for AG-UI when you want a stable contract between a frontend and one or more agent runtimes that you might swap later, and when a streaming token feed, tool calls, and shared state are the three things the UI needs to see.

Skip it when a single-call, blocking JSON-over-HTTP response would do, when you need full-duplex voice or cursor-level collaboration (pick a WebSocket or WebRTC protocol directly), or when you already own both ends of the wire and the cost of a sixteen-event vocabulary outweighs the portability you gain.

Takeaways:

AG-UI is a small wire protocol, not a framework; the five event families and the lifecycle ordering rule are almost the whole spec.
The HTTP contract is "POST a RunAgentInput, receive an SSE stream of typed events terminated by RUN_FINISHED or RUN_ERROR".
A valid AG-UI stream can come from any backend; the Spring WebFlux example above is a single file of roughly 50 lines, imports included, and needs no SDK.
MCP, AG-UI, and A2A each cover a different edge of an agent system and compose cleanly, and hosted runtimes like Bedrock AgentCore now expose all three side by side.
Treat auth, backpressure, and snapshot size as your problem; the spec will not decide them for you.

Further reading: the canonical list of events lives in the AG-UI docs under concepts/events, and the reference SDKs for TypeScript, Python, and Kotlin sit in the ag-ui-protocol/ag-ui repository on GitHub.
