FlineDev/TandemKit
18 stars · Last commit 2026-04-23
Planner/Generator/Evaluator orchestration harness for Claude Code (and Codex)
README preview
<p align="center">
  <img src="https://github.com/FlineDev/TandemKit/blob/main/Logo.png?raw=true" height="256" />
  <br><br>
  <a href="#how-it-works">How It Works</a> · <a href="#installation">Installation</a> · <a href="#mission-lifecycle">Mission Lifecycle</a> · <a href="#faq">FAQ</a>
</p>

# TandemKit

Describe your goal, approve the spec, then step away: Claude and Codex loop together until it's right.

TandemKit is a [Claude Code](https://docs.anthropic.com/en/docs/claude-code) plugin that runs three sessions (Planner, Generator, and Evaluator), with two of them pairing Claude and Codex as independent reviewers. You are only needed at two points: during **planning** (questions and spec approval) and at **review** (when evaluation passes and you give feedback or call it done). Between those two points, the Generator implements and the Evaluator verifies in a tight loop, with no manual review or copy-pasting from you.

In both the Planner and Evaluator sessions, Claude automatically launches [Codex](https://openai.com/index/introducing-codex/) as a background task using the official [Codex plugin](https://github.com/openai/codex-plugin-cc), so two different models independently investigate and converge on a result, all inside Claude Code.

## Why TandemKit?

### Who Is It For?

You have a **Claude Max** subscription (which includes Claude Code) and a **ChatGPT** subscription (which includes Codex). You work on tasks complex enough to warrant the extra cost. TandemKit is not recommended for simple, small, or mechanical tasks, since the multi-session loop uses more tokens than a regular Claude session.

### The Reasoning