Initial commit

Zhongwei Li
2025-11-30 09:00:36 +08:00
commit fa5f444aac
13 changed files with 1103 additions and 0 deletions

@@ -0,0 +1,83 @@
# Baseline Test Scenarios (Without Skill)
These scenarios test agent behavior WITHOUT the skill loaded.
## Scenario 1: Simple Command After cd
**Setup:**
- Repo: ~/workspace/schemaflow
- Subprojects: ruby/, cli/
- Just ran: `cd ruby && bundle install`
**Task:** "Now run rspec"
**Expected baseline failures:**
- `cd ruby && bundle exec rspec` (compounds a second `cd ruby` onto the earlier one)
- `bundle exec rspec` (assumes the shell is still in ruby/)
- `cd ruby && rspec` (still compounds `cd`, and drops `bundle exec`)
**Success criteria:** Agent uses an absolute path
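
One way an agent might meet this criterion is sketched below. `run_in` is a hypothetical helper, not something the skill defines; it rejects relative directories and scopes the `cd` to a subshell so the caller's working directory never changes. The skill itself may prescribe a different form.

```shell
# Hypothetical helper (an assumption for illustration, not part of the skill):
# run a command inside an absolute directory without moving the caller's shell.
run_in() {
  dir=$1
  shift
  case "$dir" in
    /*) ;;  # absolute path: accepted
    *)  echo "run_in: '$dir' is not an absolute path" >&2
        return 2 ;;
  esac
  # The parentheses start a subshell, so the cd does not leak out.
  ( cd "$dir" && "$@" )
}

# Usage for this scenario (paths taken from the Setup above, not executed here):
#   run_in "$HOME/workspace/schemaflow/ruby" bundle exec rspec
```

Because the directory is spelled out in full, the command behaves the same whether or not the earlier `cd ruby` is still in effect.
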
---
## Scenario 2: Multiple Commands in Sequence
**Setup:**
- Repo: ~/workspace/schemaflow
- Just ran: `cd ruby && bundle install`
- Then ran: `cd ruby && rubocop`
**Task:** "Now run the tests"
**Expected baseline failures:**
- Continues compounding cd commands
- Assumes it's in ruby/ directory
**Success criteria:** Each command uses an absolute path built from the repo root
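
The failure mode above can be reproduced in a throwaway directory. This sketch assumes a persistent shell session in which `cd` carries over between commands; the temporary directory stands in for the repo, and the variable names are illustrative only.

```shell
# Stand-in for the repo root with a ruby/ subproject (temporary, illustrative).
root=$(mktemp -d)
mkdir "$root/ruby"
cd "$root"

# First compound command works: the shell starts at the root.
cd ruby && first=ok || first=failed

# Second compound command fails: the shell is already inside ruby/,
# and there is no ruby/ruby directory.
cd ruby 2>/dev/null && second=ok || second=failed

# The absolute form works regardless of where the shell currently is.
cd "$root/ruby" && absolute=ok || absolute=failed

echo "first=$first second=$second absolute=$absolute"
```

The relative `cd ruby` only succeeds when the shell happens to be at the root, which is exactly the assumption the baseline agent is expected to make.
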
---
## Scenario 3: Time Pressure + Sunk Cost
**Setup:**
- You've been working in ruby/ subproject for 2 hours
- Made 10 commits, all using relative paths
- Tests are passing
- It's 5:45pm, meeting at 6pm
**Task:** "Quick, run the linter before the meeting"
**Expected baseline failures:**
- Uses relative path to save time
- "I've been here all session, I know where I am"
- "The shell hasn't changed directories"
**Success criteria:** Agent uses an absolute path despite the time pressure
---
## Scenario 4: Complex Monorepo (zenpayroll pattern)
**Setup:**
- Repo: ~/workspace/zenpayroll
- Root project at .
- Component at components/gusto-deprecation
- rubocop MUST run from root
- rspec in components MUST run from component dir
**Task:** "Run rubocop on the gusto-deprecation component"
**Expected baseline failures:**
- Runs from component directory
- Doesn't check command rules
- Assumes rubocop can run anywhere
**Success criteria:** Agent runs rubocop from the absolute path of the repo root
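
The "command rules" in this scenario could be captured as a small lookup. The sketch below is an assumption for illustration: `workdir_for` is a hypothetical function, and the two rules are copied from the Setup list above rather than from any real project configuration.

```shell
# Paths from the Setup above; $HOME is used because ~ does not expand in quotes.
REPO_ROOT="$HOME/workspace/zenpayroll"
COMPONENT="components/gusto-deprecation"

# Hypothetical rule lookup: print the absolute directory a command must run from.
workdir_for() {
  case "$1" in
    rubocop) printf '%s\n' "$REPO_ROOT" ;;             # must run from the repo root
    rspec)   printf '%s\n' "$REPO_ROOT/$COMPONENT" ;;  # must run from the component dir
    *)       echo "workdir_for: no rule for '$1'" >&2
             return 1 ;;
  esac
}

# The task above might then become (not executed here; passing the component
# path as a rubocop argument is an assumption about how the project lints):
#   ( cd "$(workdir_for rubocop)" && rubocop "$COMPONENT" )
```

Checking the rule before running keeps the agent from assuming that rubocop can run from whichever directory it last visited.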