Writing · № 03

All posts

Long-form notes on software, distributed systems, and the craft of building. Shipping one a week.

7 posts · Distributed Systems
Distributed Systems · 01

Convergence Is a Property of Your Merge Function, Not the Network

I once watched an afternoon of offline edits vanish under a last-writer-wins sync, and the fix was not better networking but a better merge function. These are my notes on why CRDT replicas converge: the merge is commutative, associative, and idempotent, so the order and duplication of deliveries stop mattering. I rebuild a minimal add-wins OR-Set in TypeScript, run it, and weigh what the guarantee costs in tombstones and memory.

Jun 14
Distributed Systems · 02

Two-Phase Commit on the JVM: The Blocking Problem Nobody Puts in the Diagram

I crashed a Two-Phase Commit coordinator on purpose in a small Kotlin simulation to measure how long participants stay locked when the coordinator vanishes between phases. The result is the part of 2PC the diagrams never show — and the reason I would model most cross-service writes as a saga instead.

May 30
Distributed Systems · 03

Drop the Right Requests First: Priority-Aware Load Shedding Under Overload

Static RPS caps shed the wrong traffic. Concurrency is what saturates a service, not request rate. From my notes after reading the InfoQ piece on overload protection, Uber's January writeup on Cinnamon, and Netflix's QCon SF talk on service-level prioritized load shedding, here is why latency is the right control signal, and how a small priority taxonomy plus an adaptive concurrency limit make sure the traffic that is cheapest to lose is shed first.

May 28
Distributed Systems · 04

Actor-per-Entity vs Postgres Optimistic Locking: A Seat-Reservation Bake-off

I ran the same hot-key seat reservation workload two ways: Postgres with a version column and retries, and a single actor per seat. The actor design did not scale better — it moved the hard problem from concurrency control to routing and rebalance correctness, and that trade was the easier one to reason about under hot keys.

May 26
Distributed Systems · 05

Auditing a Scala Service Against Chad Fowler's Four Regenerative Constraints

I walked a Scala order-processing service from my notes through Chad Fowler's four regenerative constraints. Two passed for free; two would force a real redesign. Here is what I learned about where "loosely coupled module" ends and "regenerative component" begins, and which parts of the redesign I would actually pay for.

May 23
Distributed Systems · 06

AckWait Is a Contract: How a 30-Second Default Took Down My JetStream Consumer

I lost an evening to a NATS JetStream pull consumer that doubled its work in production. The cause was three lines of ConsumerConfig I never wrote. These are my notes on what AckWait actually counts, why MaxDeliver = -1 is the silent footgun, and the 70-line Go contract I now ship on every JetStream consumer.

May 12
Distributed Systems · 07

Cell-Based Architecture Isn't Free: What Slack, DoorDash, and Roblox Actually Paid For It

Cell-based architecture contains blast radius, but it is not free. A look at what Slack, DoorDash, and Roblox actually paid for cells in production — and a checklist for the cheaper fault-isolation patterns most teams should reach for first.

Apr 23