shinpr/rashomon
9 stars · Last commit 2026-04-04
Measure prompt and skill improvements with blind A/B comparison.
README preview
<p align="center">
  <img src="assets/rashomon-banner.jpg" width="600" alt="Rashomon">
</p>

<p align="center">
  <a href="https://claude.ai/code"><img src="https://img.shields.io/badge/Claude%20Code-Plugin-purple" alt="Claude Code"></a>
  <a href="LICENSE"><img src="https://img.shields.io/badge/License-MIT-blue" alt="License"></a>
</p>

**Know whether your skills actually improve agent behavior, not just look different.**

## Why rashomon?

> Inspired by the *Rashomon effect*: the idea that the same event can produce different outcomes depending on perspective.
> rashomon makes those differences explicit and comparable.

- Built a skill but unsure if it actually changes agent behavior?
- Iterating on skills and prompts by gut feel instead of evidence?
- Want proof that your changes made things better, not just different?