What it costs. What you save.
Baseline = what the same prompts would cost if every one went to Opus. We log every routing decision so you can verify the numbers.
A concrete case
Solo founder, Claude Code Max plan, RTX 4090, ~80 prompts/day, ~8% of them critical. Most of the day is renames, commits, small edits and “explain this”; those route to T0 local (free). The ~8% that is real debugging or a cross-file refactor goes to Sonnet/Opus. Set those inputs in the calculator above to see this profile's monthly figure. It is computed from the same per-tier costs as the N=34 benchmark below, not a marketing number. Your mix (and your savings) shift with how much you keep local.
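As a rough illustration of the arithmetic behind a profile like this one, here is a minimal Python sketch. It is not the calculator's actual code. The function name and both per-prompt costs are hypothetical placeholders, not benchmark figures, and it assumes that local T0 prompts cost nothing and that every critical prompt is billed at a single blended cloud rate.

```python
def monthly_costs(prompts_per_day, critical_share, cloud_cost, opus_cost, days=30):
    """Return (baseline, routed, savings) for one month.

    baseline: every prompt billed at the Opus-only per-prompt cost.
    routed:   only the critical share is billed; local T0 prompts are free.
    """
    total_prompts = prompts_per_day * days
    baseline = total_prompts * opus_cost
    routed = total_prompts * critical_share * cloud_cost
    return baseline, routed, baseline - routed


# Placeholder per-prompt costs (assumptions for illustration only).
PLACEHOLDER_OPUS_COST = 0.10
PLACEHOLDER_CLOUD_COST = 0.10

# Profile from the case above: ~80 prompts/day, ~8% critical.
baseline, routed, savings = monthly_costs(
    prompts_per_day=80,
    critical_share=0.08,
    cloud_cost=PLACEHOLDER_CLOUD_COST,
    opus_cost=PLACEHOLDER_OPUS_COST,
)
print(f"baseline={baseline:.2f} routed={routed:.2f} savings={savings:.2f}")
```

Swap in your own per-tier costs and prompt mix. The savings figure scales directly with the share of prompts that stay local.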
Benchmark proof
Cost per prompt on a 34-prompt blind-judged validation set.
Reproduce it yourself
The benchmark is pre-registered and open. 34 prompts × 3 arms (mooter, Sonnet-only, Opus-only) with a blind LLM judge — design, prompts, raw rows and per-pack diagnostics all live in the repo. Clone it and run the harness:
See wave1-benchmark/README.md and BENCHMARK_DESIGN.md for the full method, confidence intervals and mis-routing analysis.
N=34 is a small set, so only medium-to-large effects are detectable. On this cloud-only set mooter matches the quality bar at lower cost per prompt; the larger savings come from routing simple work to free local T0, which this set doesn't isolate.
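To put a number on "medium-to-large", here is a back-of-the-envelope Python sketch of the smallest standardized effect that 34 paired observations can reliably detect. It assumes a paired comparison on the same prompts, a two-sided test at alpha = 0.05, 80% power and a normal approximation. The function name is hypothetical, and the analysis in the repo may use a different test.

```python
from math import sqrt
from statistics import NormalDist


def min_detectable_effect(n, alpha=0.05, power=0.80):
    """Smallest standardized paired effect (in SD units of the differences)
    detectable with n pairs, using the normal approximation."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = z.inv_cdf(power)          # quantile for the target power
    return (z_alpha + z_power) / sqrt(n)


mde = min_detectable_effect(34)
print(round(mde, 2))
```

Under these assumptions the threshold is roughly 0.48 standard deviations, close to what is conventionally called a medium effect. Differences smaller than that can easily go undetected at this sample size.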