zeuikli/claude-pilot-suite
24 stars · Last commit 2026-05-08
Claude Code execution playbook with 3 pilot modes: cost-first (Haiku), quality-first (Sonnet), ceiling-elevation (Opus). Quantitative escalation gates.
README preview
# Claude Pilot Suite > **Version**: 0.4.1 (2026-05-08 — opus-pilot added; 5 benchmark-driven fixes for haiku/sonnet-pilot) > **License**: MIT > **Validated on**: Claude Sonnet 4.6 + Haiku 4.5 + Opus 4.7 (6-agent × 10-question pilot-vs-vanilla benchmark) A portable, opinionated execution playbook for Claude Code that provides: - **Haiku 4.5 quality close to Opus 4.7** (Haiku Pilot mode) — with 70–85% cost savings - **Sonnet 4.6 quality floor protection + predictable escalation** (Sonnet Pilot mode) — with 55–65% cost savings - **Opus 4.7 ceiling elevation** (Opus Pilot mode) — 5 structural mechanisms to exceed vanilla Opus baseline on hard analytical tasks **What "quality floor protection" means**: on structured wiki-extraction tasks, Sonnet+SKILL ≈ Sonnet Plain (both ~283/300 vs Opus 286/300 in A/B/C benchmark). The SKILL's value is in *preventing quality drops* on complex tasks (ambiguous requirements, multi-step agentic, cross-module design) — not in boosting already-strong performance on routine extraction. Think of it as insurance: no premium on good days, prevents catastrophic failure on hard days. Built on the empirical finding that **prompt structure + sub-agent delegation often matters more than raw model capability**. Documented in two arXiv-backed studies: > AgentOpt (arxiv 2604.06296): On HotpotQA, Opus alone = 31.71%; Ministral planner + Opus solver = 74.27%. > Augment Code AGENTS.md eval (2026-04): Best-quality docs ≈ one model tier upgrade. Bad docs are worse than none. ---