#RAG

Every post I've written about RAG: 2 articles across Engineering and AI. Most recent: “Turning LLM Context Engineering Into an Evaluation Loop with DSPy.”

Engineering 01

Turning LLM Context Engineering Into an Evaluation Loop with DSPy

Notes from two weekends of digging into DSPy. I stopped treating prompts as the source of truth and started treating them as compiled output from a typed signature, a metric, and an optimizer. Here is the smallest end-to-end program I kept, how MIPROv2 actually searches, and where the approach breaks down in practice.

May 3

AI 02

Memory Evaluation: Measuring How AI Memory Decays Over a Project's Lifetime

Most AI memory benchmarks grade on recall and stop there. That hides the real failure mode: stale facts quietly poisoning the context window. Here is a lifecycle-based evaluation framework that tests recall, revision, and controlled forgetting across the change points every long-lived project goes through.

Apr 17
