All posts
Long-form notes on software, distributed systems, and the craft of building. Shipping one a week.
What APISIX in the Trial Ring Actually Buys You: Notes on Its etcd-Backed Control Plane
Volume 34 of the Thoughtworks Technology Radar moved Apache APISIX into the Trial ring. I spent a week digging through the docs, source code, and a couple of bug reports to convince myself the etcd-backed dynamic-routing claim was real — and to weigh the operational cost it hides. These are my notes on the watch mechanism, the connection-scaling cliff at 263 long polls, and when I would and would not reach for APISIX in 2026.
Structured Concurrency Looks the Same in Four Runtimes — Until a Child Fails
I wrote the same fan-out four times — Java 25 StructuredTaskScope, Kotlin coroutineScope, Swift withThrowingTaskGroup, Python asyncio.TaskGroup — and the surface API is nearly interchangeable. The cancellation and exception-aggregation semantics are not. These are my notes on what diverges on the failure path and why only Python hands you every failure by default.
A Fitness Function Is Just a Test That Fails the Build When the Architecture Drifts
A fitness function is not a framework artifact — it is a build-failing test that encodes one architectural invariant. I encode a layering rule in about 60 lines of TypeScript using the compiler's own API, test the test against good, bad, and generated-code trees, then draw the line between an invariant worth gating and a metric gate that backfires under Goodhart's law.
Iceberg Schema Evolution: Drop-Then-Add Is Not a Rename
Apache Iceberg tracks every column by a unique numeric id, not by name. From my own digging into the spec and a small Kotlin program against a local catalog, the trap that bit me hardest is this: a drop followed by an add of the same column name is not a rename, and treating it as one quietly orphans your historical data.
Convergence Is a Property of Your Merge Function, Not the Network
I once watched an afternoon of offline edits vanish under a last-writer-wins sync, and the fix was not better networking — it was a better merge function. These are my notes on why CRDT replicas converge: a merge that is commutative, associative, and idempotent. I rebuild a minimal add-wins OR-Set in TypeScript, run it, and weigh what the guarantee costs in tombstones and memory.
Kotlin 2.4: The Three Changes That Moved My Hand on the Keyboard
Kotlin 2.4.0 shipped a long changelog, but only three features changed how I actually type: stable context parameters, explicit backing fields, and (still behind a flag) name-based destructuring. Here is my backend-engineer's cut, verified against the 2.4.0 compiler, plus the K1 removal I had to put on a calendar.
Catching a Retry Race with One Seed: Deterministic Simulation in Rust Using turmoil
I had three flaky retry tests no one could reproduce on a laptop. I rewrote one in Rust on top of turmoil, Tokio's deterministic simulator, and a single 8-byte seed pinned the partition race byte-for-byte. These are my notes on what the seed actually controls, what leaks past it, and when deterministic simulation testing is worth the seam.
Reading AG-UI as a Wire Protocol, Not a Framework
I kept rebuilding the same SSE envelope every time I wrote an agent UI. AG-UI is the first serious attempt I have seen at standardising that envelope. In this post I strip the protocol down to its wire shape and rebuild a minimal Spring WebFlux endpoint that speaks it without an SDK.
Two-Phase Commit on the JVM: The Blocking Problem Nobody Puts in the Diagram
I crashed a Two-Phase Commit coordinator on purpose in a small Kotlin simulation to measure how long participants stay locked when the coordinator vanishes between phases. The result is the part of 2PC the diagrams never show — and the reason I would model most cross-service writes as a saga instead.
Drop the Right Requests First: Priority-Aware Load Shedding Under Overload
Static RPS caps shed the wrong traffic. Concurrency is what saturates a service, not request rate. Drawing on my notes from the InfoQ piece on overload protection, Uber's January writeup on Cinnamon, and Netflix's QCon SF talk on service-level prioritized load shedding, here is why latency is the right control signal, and how a small priority taxonomy plus an adaptive concurrency limit make sure the traffic that is cheapest to lose is shed first.