Notes from the Compass

We build it. Then we write about it.

FMCSA compliance is hard, AI hallucinates, and we're building a product that has to handle both at once. Below is the inside view: eval baselines, prompt engineering, monitoring lessons. These are the kind of posts we wish other vendors wrote.

ENGINEERING · May 17, 2026 · 8 min read

How we got to 85% citation accuracy on a 60-question FMCSA eval

Before an AI compliance product can be honest with you about its accuracy, the team behind it needs a baseline. We built a 60-question eval set across 15 categories, ran vanilla claude-sonnet-4-6 against it with a minimal system prompt, and got 51 of 60 citations right (85%). Here's where the failures were, why the eCFR round-trip catches the rest, and what the 95% architecture looks like.

Read the post →

See what we're writing about, actually working

Try the Ask Compass demo on the homepage. Real CFR citations, real eCFR verification.

★ Try the live demo →