uppifyagency/claude-harness

3 stars · Last commit 2026-03-29

Multi-agent harness for long-running AI application development. Planner-Generator-Evaluator architecture with progressive simplification. Claude Code plugin.


# claude-harness

Multi-agent harness for long-running application development. Based on [Anthropic's harness design pattern](https://www.anthropic.com/engineering/harness-design-long-running-apps).

## What it does

Orchestrates three specialized agents in a Plan-Build-Evaluate loop:

- **Planner** — expands a brief into a product spec (high-level, no implementation details)
- **Generator** — builds the application in a continuous session
- **Evaluator** — tests the running app like a real user, produces pass/fail verdicts

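Each agent runs in a fresh context and hands work to the next through files (`.harness/*.md`) rather than shared conversation history. Separating the generator from the evaluator, an idea inspired by generative adversarial networks (GANs), means the agent that builds the app is never the one that judges it, which produces dramatically better results than solo generation.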
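
The loop and its file-based handoffs can be pictured with a minimal Python sketch. It is not the plugin's implementation: the file names (`spec.md`, `build.md`, `verdict.md`), the `run_loop` function, the `max_rounds` parameter, and the "verdict starts with PASS" convention are all assumptions made for illustration, and each agent is modelled as a plain function from text to text.

```python
from pathlib import Path
from typing import Callable

# Hypothetical handoff file names; the README only specifies `.harness/*.md`.
SPEC, BUILD, VERDICT = "spec.md", "build.md", "verdict.md"

# An agent is modelled as a function from input text to markdown output.
Agent = Callable[[str], str]


def run_loop(brief: str, planner: Agent, generator: Agent, evaluator: Agent,
             workdir: Path, max_rounds: int = 3) -> bool:
    """Run Plan -> Build -> Evaluate, passing state only through files."""
    harness = workdir / ".harness"
    harness.mkdir(parents=True, exist_ok=True)

    # Plan: expand the brief into a spec and persist it for later agents.
    (harness / SPEC).write_text(planner(brief))

    feedback = ""
    for _ in range(max_rounds):
        # Build: the generator reads the spec from disk, not from memory.
        spec = (harness / SPEC).read_text()
        (harness / BUILD).write_text(generator(spec + "\n" + feedback))

        # Evaluate: a separate agent judges the build against the spec.
        build_notes = (harness / BUILD).read_text()
        report = evaluator(spec + "\n" + build_notes)
        (harness / VERDICT).write_text(report)

        # Assumed convention: a passing verdict starts with "PASS".
        if report.strip().upper().startswith("PASS"):
            return True
        feedback = report  # failed verdict feeds the next build round
    return False
```

Because every handoff goes through a file, any stage can be re-run or inspected on its own, and the evaluator only ever sees what was written to disk rather than the generator's reasoning.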

## Skills

| Skill | Usage | Purpose |
|-------|-------|---------|
| `/harness` | `/harness "build a retro game maker"` | Full Plan-Build-Evaluate loop |
| `/harness-eval` | `/harness-eval` | Standalone evaluation against spec |
