Our research, building AI in production.
Plain-English field notes on the techniques behind our work — the formulas and trade-offs we've found that actually hold up once real users and real data show up.
RAG: getting the right context, every time
Most wrong answers aren't a model problem — they're a retrieval problem. Here's the formula we landed on for grounding AI in your own data.
Read research
AI agents: when to let the model take actions
Agents are powerful and easy to get wrong. The formula that worked for us: constrain the tools, verify every step, and keep a human in the loop where it counts.
Read research
Vision: turning documents and images into structured data
OCR alone is brittle. The formula that worked: pair a vision model with a strict schema and a confidence check, so messy scans become clean, trustworthy data.
Read research
Evaluation: how we know an AI feature actually works
You can't improve what you don't measure. The formula that changed everything for us: a golden dataset and an LLM-as-judge, run on every change.
Read research
Want this applied to your stack?
Let's scope the build.
We turn these approaches into working software in your repo, on your stack.