APISIX in the Trial Ring: Notes on Its etcd-Backed Routing

Volume 34 of the Thoughtworks Technology Radar, published in April 2026, moved Apache APISIX from Assess into the Trial ring of the Platforms quadrant. The radar entry positions it as a serious replacement candidate for legacy Nginx ingress controllers, credits its "fully dynamic, pluggable architecture", and points to its use of etcd as the mechanism that removes the latency cost of config reloads. I have spent a week digging into that claim, working from my own notes, mostly to find out whether the dynamic-routing story is real and whether the etcd dependency is the kind I would happily run.

Both answers are yes, with caveats. This post is what I wish I had read before I started.

The core takeaway I am keeping pinned above my desk: an etcd-backed control plane buys APISIX millisecond config propagation without worker swaps, but the cost is an etcd cluster you must run, monitor, and back up like a primary database, because one watch-connection regression can turn a 0.16-second admin call into a 7-second one.

What "no nginx -s reload" actually means inside a worker

The interesting part of APISIX is not that it is built on Nginx and ngx_lua via OpenResty. It is what they did with the configuration pipeline. The classic Nginx ingress controller renders an nginx.conf from Kubernetes resources, then triggers nginx -s reload. A reload spawns new worker processes, lets the old ones drain, and swaps them out. On a healthy cluster that is invisible. On a hot path with long-lived TLS connections, frequent route changes, or thousands of upstreams, it is a brownout you schedule around.

APISIX never reloads. The mechanism it uses instead is small enough to describe in a sentence and worth understanding before adoption.

Each Nginx worker process, on startup, schedules a Lua function via ngx.timer.at. That function opens a long-lived HTTP watch against an etcd key prefix — /apisix/routes/, /apisix/services/, /apisix/upstreams/, and a few others. When etcd reports a change, the worker mutates its own in-process Lua table and an LRU cache that the request path reads on every match. There is no shared state between workers, no reload, no fork. The new route is live the moment the cache write returns.

The propagation story is straightforward to verify. The same long poll that subscribes APISIX to an etcd directory is exposed by etcd's v3 API as Watch. I wrote a tiny Go program against go.etcd.io/etcd/client/v3 that opens the same kind of watch against the same prefix APISIX uses, then writes a fake route via the etcd KV API. The change shows up on the watch channel in single-digit milliseconds on a local cluster. Pointing the same program at the etcd cluster behind a real APISIX instance and writing the route through the APISIX admin API gives the same result.

Here is the program, single file, runnable as written:

// main.go — watch the same etcd prefix Apache APISIX subscribes to,
// then write one fake route and observe the propagation latency.
//
// Run a local etcd first, e.g.:
//   docker run -d -p 2379:2379 quay.io/coreos/etcd:v3.5.13 \
//     etcd --advertise-client-urls=http://0.0.0.0:2379 \
//          --listen-client-urls=http://0.0.0.0:2379
//
// Then: go run main.go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

const prefix = "/apisix/routes/"

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 2 * time.Second,
	})
	if err != nil {
		log.Fatalf("dial etcd: %v", err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	watch := cli.Watch(ctx, prefix, clientv3.WithPrefix(), clientv3.WithPrevKV())

	go func() {
		time.Sleep(200 * time.Millisecond)
		started := time.Now()
		key := prefix + "demo-route-1"
		val := `{"uri":"/hello","upstream":{"nodes":{"127.0.0.1:8080":1}}}`
		if _, err := cli.Put(ctx, key, val); err != nil {
			log.Fatalf("put: %v", err)
		}
		fmt.Printf("put took %v\n", time.Since(started))
	}()

	for resp := range watch {
		for _, ev := range resp.Events {
			fmt.Printf("event=%s key=%s value_len=%d at=%s\n",
				ev.Type, string(ev.Kv.Key), len(ev.Kv.Value),
				time.Now().Format(time.RFC3339Nano))
		}
	}
}

Run it with go run main.go against a local etcd on port 2379. On my laptop the put returns in roughly 3 ms and the watch event lands in roughly 5 ms, both end-to-end including the round-trip. That is the same primitive the gateway is sitting on. Once I had that picture in my head, the radar's framing — milliseconds for APISIX, seconds for Kong, which polls a database every few seconds — stopped sounding like marketing and started sounding like architecture.

The non-obvious lines are the WithPrefix and WithPrevKV options. WithPrefix widens the watch to every key under /apisix/routes/, which is how APISIX hears about a new route created under any name. WithPrevKV makes each event carry the prior value, so a consumer can compare old against new (for example, to see which upstream nodes were dropped) without an extra read. Strictly speaking, the event type already separates a put from a delete, and a put whose create revision equals its mod revision is an insert rather than an update; the previous value is what adds the diff. Skip either option and you reproduce a class of bugs that show up as "the route updated but the gateway forgot the old upstreams".
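As an illustration of what carrying the previous value buys a consumer, here is a small sketch that diffs the upstream nodes of two route values shaped like the demo route in the program above. It uses only the standard library; removedNodes and the route struct are my own hypothetical helpers, not part of the etcd client or of APISIX.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// route mirrors only the fields of the demo route value used in this post.
type route struct {
	Upstream struct {
		Nodes map[string]int `json:"nodes"`
	} `json:"upstream"`
}

// removedNodes returns the upstream nodes present in prev but absent in cur.
// prev plays the role of the event's previous value, cur the new value.
func removedNodes(prev, cur []byte) ([]string, error) {
	var p, c route
	if len(prev) > 0 { // no previous value means the key was just created
		if err := json.Unmarshal(prev, &p); err != nil {
			return nil, fmt.Errorf("decode previous value: %w", err)
		}
	}
	if err := json.Unmarshal(cur, &c); err != nil {
		return nil, fmt.Errorf("decode current value: %w", err)
	}
	var gone []string
	for node := range p.Upstream.Nodes {
		if _, ok := c.Upstream.Nodes[node]; !ok {
			gone = append(gone, node)
		}
	}
	sort.Strings(gone) // map iteration order is random; sort for stable output
	return gone, nil
}

func main() {
	prev := []byte(`{"uri":"/hello","upstream":{"nodes":{"127.0.0.1:8080":1,"127.0.0.1:8081":1}}}`)
	cur := []byte(`{"uri":"/hello","upstream":{"nodes":{"127.0.0.1:8080":1}}}`)
	gone, err := removedNodes(prev, cur)
	if err != nil {
		panic(err)
	}
	fmt.Println(gone) // prints: [127.0.0.1:8081]
}
```

In the watch loop of the program above, the two inputs would come from ev.PrevKv.Value and ev.Kv.Value, with ev.PrevKv being nil when the key is newly created.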

The next picture worth carrying around is what this looks like at the gateway-cluster level, because the cost of the design lives there.

I want to call out a sequence diagram at this point, because the fan-out is hard to keep straight in prose. The thing to look for in the figure once it is in place: every worker process inside every gateway node holds its own long-lived HTTP/1.1 connection to etcd. The number of watch streams scales as gateway nodes times worker processes times watched prefixes, not as gateway nodes alone.

Where the design starts to bite: watch-connection scaling

This was the part of my reading that surprised me most.

APISIX talks to etcd through lua-resty-etcd, an OpenResty client library. Until 2023 the gateway opened a separate HTTP/1.1 long-poll connection per watched resource type per worker. With four workers per node and six watched prefixes, every gateway node holds 24 long-lived connections, so a handful of nodes already puts dozens of connections on each etcd peer. With a few dozen nodes the total runs from the high hundreds into the low thousands.
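The arithmetic is worth writing down once. The sketch below multiplies nodes by workers by prefixes and spreads the result across etcd peers. The node counts and the three-peer cluster are illustrative assumptions of mine, and the even spread across peers is optimistic.

```go
package main

import "fmt"

// watchConns estimates long-lived watch connections under the
// one-connection-per-prefix-per-worker model described above.
func watchConns(nodes, workersPerNode, prefixes int) int {
	return nodes * workersPerNode * prefixes
}

// perPeer spreads the total across etcd peers, rounding up, assuming
// connections are balanced evenly (an optimistic assumption).
func perPeer(total, peers int) int {
	return (total + peers - 1) / peers
}

func main() {
	// Four workers per node and six prefixes, as in the text;
	// node counts and the three-peer cluster are assumptions.
	for _, nodes := range []int{5, 36, 50} {
		total := watchConns(nodes, 4, 6)
		fmt.Printf("nodes=%d total=%d per_peer(3)=%d\n", nodes, total, perPeer(total, 3))
	}
}
```

For 5, 36, and 50 nodes this gives totals of 120, 864, and 1,200 connections, or 40, 288, and 400 per peer on a three-member cluster.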

The failure mode the project documented in the FAQ is sharp. On one etcd node holding 263 active watch connections, an HTTP request that should have taken 0.159 seconds on a quiet peer took 7 seconds. That is a roughly forty-fold slowdown for the admin path that operators reach for first when something is wrong. There is also a documented Go HTTP/2 server limit of 250 concurrent streams per connection that the etcd peers inherit, which means an APISIX cluster sized for traffic can quietly cross a control-plane ceiling that has nothing to do with traffic.

PR #9456, merged in 2023, moves the design to a single long-poll connection per worker that multiplexes events for all watched prefixes via chunked streaming. A follow-up, PR #9837, brought the HTTP-based watch path to roughly the same efficiency as gRPC by consolidating all resource watches onto one connection. That is the right shape. It also remains an opt-in use_grpc: true path on a few branches, which is worth checking on whatever version you actually deploy. From what I have seen in the wild as of mid-2026, the per-worker fan-out is still the dominant deployment mode, so plan capacity around it rather than around the optimised path.

Two operational habits I now consider non-negotiable on an APISIX install:

Treat etcd as a primary database. Run it on dedicated nodes with its own SLO, fsync-friendly disks, separate observability, and a backup story that is exercised. The radar entry quietly assumes this and most adoption blog posts skip it.
Alert on watch-connection counts per etcd peer, not just RPC error rates. A peer that quietly accumulates connections looks healthy on a CPU dashboard right up until admin calls start timing out. The trip wire I picked is two times the steady-state count.
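The trip wire in the second habit is simple enough to sketch. The function below flags any etcd peer whose current connection count exceeds a factor times its steady-state baseline. How the counts are collected is left out on purpose: the peer names and numbers are made up, and overThreshold is a hypothetical helper, not part of any etcd or APISIX tooling.

```go
package main

import (
	"fmt"
	"sort"
)

// overThreshold returns the peers whose current watch-connection count
// exceeds factor times their steady-state baseline, sorted by name.
func overThreshold(baseline, current map[string]int, factor float64) []string {
	var tripped []string
	for peer, now := range current {
		base, ok := baseline[peer]
		if !ok {
			continue // no baseline recorded yet; nothing to compare against
		}
		if float64(now) > factor*float64(base) {
			tripped = append(tripped, peer)
		}
	}
	sort.Strings(tripped)
	return tripped
}

func main() {
	// Illustrative numbers only.
	baseline := map[string]int{"etcd-0": 40, "etcd-1": 40, "etcd-2": 40}
	current := map[string]int{"etcd-0": 42, "etcd-1": 95, "etcd-2": 80}

	fmt.Println(overThreshold(baseline, current, 2.0)) // prints: [etcd-1]
}
```

The comparison is strict, so a peer sitting at exactly twice its baseline does not fire. Whether that boundary should alert is a judgment call I have not settled.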

There are sharper teeth too. GitHub issue #12580 (opened September 2025) describes APISIX 3.13.0, 3.12.0, and 3.9.1 hanging indefinitely during the init_etcd phase and never completing route synchronisation. That is a hard failure that masquerades as a slow start. The mitigation is unsexy: pin a known-good APISIX release, run a smoke test that blocks on the first successful route fetch, and hold roll-forward until that smoke test passes in staging.
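A smoke test of that kind can be very small. The sketch below probes a canary URL until it answers with HTTP 200 or the attempt budget runs out, and exits non-zero on failure so a pipeline can gate on it. The canary URL, the 60-attempt budget, and the helper names are all assumptions for illustration; it presumes a known route has been configured and is reachable through the gateway under test.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"os"
	"time"
)

var errNotReady = errors.New("route not ready")

// waitForRoute calls probe up to maxAttempts times, sleeping interval
// between attempts, and returns nil on the first success.
func waitForRoute(maxAttempts int, interval time.Duration, probe func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = probe(); lastErr == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(interval)
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, lastErr)
}

// httpProbe returns a probe that expects HTTP 200 from url.
func httpProbe(url string) func() error {
	client := &http.Client{Timeout: 2 * time.Second}
	return func() error {
		resp, err := client.Get(url)
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return fmt.Errorf("%w: status %d", errNotReady, resp.StatusCode)
		}
		return nil
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: smoke <canary-url>")
		return
	}
	// Up to 60 attempts, one second apart: roughly a one-minute budget.
	if err := waitForRoute(60, time.Second, httpProbe(os.Args[1])); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("first route fetch succeeded")
}
```

Keeping the probe as a plain function makes the retry loop testable without a gateway, and the hard attempt budget is what turns an indefinite hang into a failed deploy step.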

There is also a long tail of admin-API timeouts that surface only after several hours of uptime, traced back to client-side connection leaks against etcd. The signature is an APISIX node whose data path stays healthy while every admin call eventually exceeds its deadline. If the admin path is also where your CI pipeline applies route changes, the symptom looks like deployments hanging.

Why the data-plane / control-plane split matters more than it sounds

The other claim I want to weigh from the radar entry is the architectural one: APISIX separates the data plane from the control plane, while the canonical Kubernetes Nginx ingress controller bundles both in the same Pod.

That bundling is the root cause behind the worst Nginx-ingress incident pattern I have read about: the controller crashes inside a Pod, takes the Nginx data plane with it, and the Pod restart counts against the same readiness gate that decides whether traffic flows. APISIX's split puts the gateway processes on their own deployment with their own lifecycle. If the controller is unhappy, the gateways keep serving the last known-good configuration from their in-process cache. There is no automatic update during the outage, but there is no traffic loss either.

That property is worth a lot. It is also exactly the property that makes the etcd dependency expensive: the only way the gateway can serve from a "last known-good configuration" indefinitely is by treating etcd as a soft dependency at request time, which means the entire request hot path must never block on a control-plane fetch. APISIX gets that right by design — every match reads from the worker's in-process LRU — but it is also the reason a misconfigured retry budget or a debug-time etcd:get() call inside a custom plugin is genuinely dangerous. A single Lua plugin that synchronously calls etcd on every request collapses the whole property the radar entry is praising.
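The safe pattern for a plugin that needs control-plane data has the same shape as the gateway itself: refresh in the background, read a local copy on the request path, and keep the last known-good value when a refresh fails. APISIX plugins are written in Lua, so treat the following Go sketch as an analogy only. Settings, Holder, and the fetch callback are hypothetical names of mine.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Settings is a hypothetical plugin configuration.
type Settings struct {
	RateLimit int
}

// Holder keeps the last known-good Settings for the request path.
type Holder struct {
	current atomic.Pointer[Settings] // requires Go 1.19 or newer
}

func NewHolder(initial Settings) *Holder {
	h := &Holder{}
	h.current.Store(&initial)
	return h
}

// Get is the only call the request path makes: a lock-free pointer load.
func (h *Holder) Get() Settings {
	return *h.current.Load()
}

// Refresh runs off the request path, for example from a watch callback
// or a timer. On fetch failure it keeps the previous value.
func (h *Holder) Refresh(fetch func() (Settings, error)) error {
	s, err := fetch()
	if err != nil {
		return fmt.Errorf("refresh failed, serving last known-good: %w", err)
	}
	h.current.Store(&s)
	return nil
}

var errStoreDown = errors.New("control-plane store unreachable")

func main() {
	h := NewHolder(Settings{RateLimit: 100})

	// A successful refresh replaces the value.
	_ = h.Refresh(func() (Settings, error) { return Settings{RateLimit: 250}, nil })

	// A failed refresh leaves the previous value in place.
	err := h.Refresh(func() (Settings, error) { return Settings{}, errStoreDown })

	fmt.Println(h.Get().RateLimit, err != nil) // prints: 250 true
}
```

The anti-pattern is calling the fetch from inside Get. That puts a network round trip, and every control-plane outage, directly on the request path.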

When I would pick APISIX, and when I would not

I am convinced enough to reach for APISIX in two specific shapes:

A multi-tenant gateway in front of a fleet of services where route changes happen tens of times per hour during business hours, schemas churn, and an nginx -s reload per change is a real cost. The hot-reload property pays for the etcd cluster on day one.
A south-facing edge for AI workloads where I need plugin extensibility, weighted routing into model pools, and per-route rate limiting that varies by tenant. The plugin model and the dynamic route table fit that shape better than rendering Nginx config.

I would still reach for the standard Nginx ingress controller in two other shapes:

A service whose route table changes once a week, where reload cost is irrelevant and the operational simplicity of a single binary with no extra cluster wins.
An organisation that does not yet have someone willing to own etcd as a stateful system. The radar's Trial ring means a project worth piloting on something real, not something to drop into a team that will not budget operational room for the dependency it introduces.

The decision is really about who is buying what. APISIX trades a static config file plus a reload signal for a strongly consistent watched key-value store. That trade buys real things. It also moves the operational surface area from "Nginx config templates" to "etcd cluster health", and the second one is harder.

Takeaways

The dynamic-routing claim is real. APISIX worker processes subscribe to etcd prefixes via long-poll watches and update an in-process LRU on each event, with no nginx -s reload and no worker swap. Propagation lands in single-digit milliseconds on a healthy cluster.
The etcd dependency is real too. Plan for an etcd cluster on dedicated infrastructure with its own SLO, observability, and backup discipline, before counting any APISIX wins.
The watch fan-out scales as nodes times workers times prefixes, not as nodes alone. Alert on watch-connection counts per etcd peer, with a trip wire at roughly two times steady state. The 0.159-second to 7-second cliff at 263 connections is the most concrete number I am keeping in my head.
Pin a known-good APISIX release and make your startup smoke test's exit code depend on the first successful route fetch. The init_etcd hang failure mode is real on several recent versions.
The data-plane / control-plane split is the property that makes APISIX worth running. Do not undo it with a custom plugin that calls etcd synchronously on the request path.

Reach for APISIX when route changes happen by the hour and the reload tax adds up, or when plugin extensibility on the data plane is the requirement that picked the gateway in the first place. Stay on the boring Nginx ingress when route changes happen by the week and nobody has signed up to own a stateful coordination service.
