Run a single experiment iteration. Edit the target file, evaluate, keep or discard.
Run exactly ONE experiment iteration: review history, decide a change, edit, commit, evaluate.
Usage
/ar:run engineering/api-speed # Run one iteration
/ar:run # List experiments, let user pick
What It Does
Step 1: Resolve experiment
If no experiment is specified, run python {skill_path}/scripts/setup_experiment.py --list and ask the user to pick one.
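If the helper script is unavailable, the same list can be derived from the directory layout shown in Step 2. A minimal sketch, assuming one config.cfg per experiment under .autoresearch/{domain}/{name}/:

```python
# Fallback sketch: enumerate experiments straight from the
# .autoresearch/ tree (one config.cfg per domain/name pair).
from pathlib import Path

def list_experiments(root: str = ".autoresearch") -> list[str]:
    return sorted(
        f"{cfg.parent.parent.name}/{cfg.parent.name}"
        for cfg in Path(root).glob("*/*/config.cfg")
    )
```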
Step 2: Load context
# Read experiment config
cat .autoresearch/{domain}/{name}/config.cfg
# Read strategy and constraints
cat .autoresearch/{domain}/{name}/program.md
# Read experiment history
cat .autoresearch/{domain}/{name}/results.tsv
# Checkout the experiment branch
git checkout autoresearch/{domain}/{name}
Step 3: Decide what to try
Review results.tsv:
- What changes were kept? What pattern do they share?
- What was discarded? Avoid repeating those approaches.
- What crashed? Understand why.
- How many runs so far? (Escalate strategy accordingly)
Strategy escalation (a sketch that applies these tiers follows the list):
- Runs 1-5: Low-hanging fruit (obvious improvements)
- Runs 6-15: Systematic exploration (vary one parameter at a time)
- Runs 16-30: Structural changes (algorithm swaps)
- Runs 31+: Radical experiments (completely different approaches)
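A minimal sketch of this review, assuming results.tsv is tab-separated with a header row and a status column holding KEEP / DISCARD / CRASH (the real column names may differ; adjust to your file):

```python
# Sketch: summarize history and pick a strategy tier.
import csv
from collections import Counter

def review(results_path: str) -> None:
    with open(results_path, newline="") as f:
        rows = list(csv.DictReader(f, delimiter="\t"))
    runs = len(rows)
    # Count outcomes; "status" is an assumed column name.
    by_status = Counter(r.get("status", "?") for r in rows)
    if runs <= 5:
        tier = "low-hanging fruit"
    elif runs <= 15:
        tier = "systematic exploration"
    elif runs <= 30:
        tier = "structural changes"
    else:
        tier = "radical experiments"
    print(f"{runs} runs so far {dict(by_status)} -> tier: {tier}")
```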
Step 4: Make ONE change
Edit only the target file specified in config.cfg. Change one thing. Keep it simple.
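config.cfg's schema isn't spelled out in this section; if it is INI-style, locating the target might look like the following sketch (the section and key names are assumptions, not a documented contract):

```python
# Hypothetical: read the target file path from config.cfg. The
# [experiment] section and "target" key are guesses -- use
# whatever keys your config.cfg actually defines.
import configparser

cfg = configparser.ConfigParser()
cfg.read(".autoresearch/engineering/api-speed/config.cfg")
target = cfg.get("experiment", "target", fallback=None)
print(f"Editing only: {target}")
```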
Step 5: Commit and evaluate
git add {target}
git commit -m "experiment: {short description of what changed}"
python {skill_path}/scripts/run_experiment.py \
--experiment {domain}/{name} --single
Step 6: Report result
Read the script output. Tell the user:
- KEEP: "Improvement! {metric}: {value} ({delta} from previous best)"
- DISCARD: "No improvement. {metric}: {value} vs best {best}. Reverted."
- CRASH: "Evaluation failed: {reason}. Reverted."
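run_experiment.py's exact output format isn't documented here; if its result were parsed into a dict, assembling the report could look like this sketch (the field names are assumptions, not the script's actual contract):

```python
# Hypothetical: turn an evaluation result into a user-facing
# report. The "status"/"metric"/"value"/"delta"/"best" fields
# are an assumed shape, not run_experiment.py's real output.
def report(result: dict) -> str:
    s = result["status"]
    if s == "KEEP":
        return (f"Improvement! {result['metric']}: {result['value']} "
                f"({result['delta']} from previous best)")
    if s == "DISCARD":
        return (f"No improvement. {result['metric']}: {result['value']} "
                f"vs best {result['best']}. Reverted.")
    return f"Evaluation failed: {result.get('reason', 'unknown')}. Reverted."
```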
Step 7: Self-improvement check
After every 10th experiment (check results.tsv line count), update the Strategy section of program.md with patterns learned.
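A minimal check, assuming results.tsv has one header line followed by one row per run:

```python
# Sketch: trigger the self-improvement pass every 10th run.
def due_for_strategy_update(results_path: str) -> bool:
    with open(results_path) as f:
        runs = sum(1 for _ in f) - 1  # subtract the header line
    return runs > 0 and runs % 10 == 0
```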
Rules
- ONE change per iteration. Don't change 5 things at once.
- NEVER modify the evaluator (evaluate.py). It's ground truth.
- Simplicity wins. Equal performance with simpler code is an improvement.
- No new dependencies.