the working prototype

I started a newsletter. It's called The Working Prototype.

I've been building AI agents at Orbio AI for a year now, and doing a lot of self-directed learning along the way, mostly around AI safety and alignment, but also around what it actually takes to build reliable agents in production. Writing is the best way I know to actually learn something, so the newsletter is where that thinking goes.

I aim to publish every few weeks: experiments testing ideas in practice, research papers distilled into something usable, and production insights from running agents at scale.

Here's what's been published so far:

  • how to do agentic evals — an agent that takes real actions in the world can't be tested the way you'd test a function. What it took to build a proper eval system.
  • build for reliability — models keep getting better on benchmarks. Reliability in production barely moves. Why the gap exists and what it takes to close it.
  • the legible agent — you can't trust what you can't trace. Why agent transparency is an alignment problem, not just a debugging convenience.
  • in character — one formatting decision determined whether an agent held its role under pressure: 54% success versus complete failure. Same agent, same instructions.
  • the vulnerable world hypothesis — what if the risk isn't AI being used for harm, but AI making certain things — like hiring at scale — so frictionless it changes the nature of the work entirely?
  • from finding bias to fixing it — everyone acknowledges bias in AI hiring. Actually fixing it is a specific problem with known solutions that most people building with LLMs have never encountered.

If any of this sounds relevant to what you're building or thinking about, subscribe on Substack. It's free.